Data scientist: How to tap into the 'sexiest' role of the 21st century

This is a contributed piece by Aaron Beach, data scientist at SendGrid


According to the Harvard Business Review, the data scientist could well be the ‘sexiest’ job role of the 21st century, and according to recent research, UK organisations have the second highest volume of data with unidentified value in EMEA. This leaves a growing global demand for competent data analysts and data scientists, but not nearly enough candidates to fill the positions.


But how should you position yourself as a data science candidate?

Whether you work for a large company grappling with huge amounts of data, or at a company where the product itself is especially data-driven, what’s required from a data scientist may differ.

Here are some suggestions of what to look out for as you seek a new career in a data science discipline:

Emphasise your specific coding/programming skills - No matter what type of company you’re looking to join, you will likely be expected to know how to use the tools of that company’s trade. This means statistical programming languages such as open-source R or Python, combined with database querying languages such as SQL. Because what companies need from a data scientist varies so much, the role and the reasons for its existence will usually be spelled out during the initial hiring stage, so that the company recruits the right candidate.

Talk up your background in basic statistics and machine learning - At least a basic understanding of statistics is vital to the modern-day data scientist. If you don’t already have basic statistical premises in your toolkit, you’ll be in for a world of pain. You should be familiar with statistical tests, distributions, maximum likelihood estimators, etc. One of the more important aspects of your statistics knowledge is understanding when different techniques are (or aren’t) a valid approach.
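As a small illustration of the maximum likelihood estimators mentioned above, here is a minimal sketch in Python, assuming the data are modelled as draws from a normal distribution. The function names and the sample values are hypothetical and chosen only for this example. It also shows one case of knowing when a technique is valid: the maximum likelihood estimate of the variance divides by n rather than n - 1, so it is biased for small samples.

```python
import math


def normal_mle(samples):
    """Return the maximum likelihood estimates (mean, variance) under a normal model."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    # The MLE of the variance divides by n, not n - 1, so it is biased downward.
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance


def normal_log_likelihood(samples, mean, variance):
    """Log-likelihood of the samples under a normal model with the given parameters."""
    n = len(samples)
    squared_error = sum((x - mean) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * variance) - squared_error / (2 * variance)


# Hypothetical sample data, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, var_hat = normal_mle(data)
print(mu_hat, var_hat)
```

Evaluating normal_log_likelihood at the estimates and at nearby parameter values is a quick way to check that the estimates really do sit at the maximum.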

If you’re at a large company with huge amounts of unstructured data, or working at a company where the product is itself especially data-driven, it may be the case that you’ll need to be familiar with machine learning methods. This can mean things like anomaly detection, spam filtering, ensemble methods – any or all techniques may be required. Alternatively, if you work for a fledgling start-up which has amassed large amounts of data, it’s important that you know how to formulate this into a coherent framework of usable data for analysis.

Multivariable calculus and linear algebra knowledge - As a new candidate, you may be asked to derive some statistical results based on someone’s previous deductions. You should expect to be asked basic multivariable calculus or linear algebra questions, since these subjects form the basis of a lot of these techniques. While there are many open source tools which can do this heavy lifting for you, it’s important that any data scientist understands these fundamentals in case it becomes necessary to build implementations in-house from scratch. Understanding these concepts is most important at companies where the product is defined by the data and where small improvements in predictive performance, for example through A/B testing or algorithm optimisation, can lead to huge wins for the company.
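To make the link between calculus and a from-scratch implementation concrete, here is a minimal sketch, assuming a simple straight-line model fitted by gradient descent on mean squared error. The function name, learning rate, step count and data are all hypothetical choices for illustration; in practice an open source library would normally do this work.

```python
def fit_line(xs, ys, lr=0.01, steps=5000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

The two gradient lines are the multivariable calculus: each is the partial derivative of the error with respect to one parameter. With more than one input feature, the same update is usually written with vectors and matrices, which is where the linear algebra comes in.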

Data visualisation and the ability to translate complexity for the C-Suite - Visualising and communicating data is incredibly important, especially at young companies which are making data-driven decisions for the first time. Internally, this means describing your findings, or the way your techniques work, to both technical and non-technical audiences. It can be immensely helpful to be familiar with data visualisation tools like Tangle and D3.js. It is important to be familiar not just with the tools needed to visualise data, but also with the principles behind visually encoding data and communicating that information effectively.

A software engineering background - If you’re interviewing at a smaller company and are one of the first data science hires, it can be important to have a strong software engineering background. You’ll be responsible for handling a lot of the data logging yourself, and potentially the development of data-driven products as the company expands.


Future-based thinking: the key to success for modern-day data scientists

It’s no secret that tech companies aren’t the only organisations that now look to hire data scientists. Increasingly, ad tech companies and major retailers need to find the “why” locked behind their customer data. If you are a prospective data scientist, companies want to see that you’re a problem solver, someone who is constantly asking questions that will help future-proof their business.

At some point during the interview process, expect to be asked about some high-level problem – for example, about a test the company may want to run or a data-driven product it may wish to develop on behalf of customers. It’s important to think about what factors are important, and which aren’t. How should you, as the prospective data scientist, interact with the engineers and product managers? What methods should you use? When do approximations make sense?

Think of yourself as an interdisciplinary expert: someone crossing the chasm between statistician, data mining expert and analytics expert. One who can project, manage and advise on what steps the company should take to own their own success.

