You want to be a Data Scientist…and what is that?
Data Science has become a massive buzz term in recent years. In fact, being a Data Scientist was called the “Sexiest Job of the 21st Century” by the Harvard Business Review in 2012. But what does that mean and what exactly does it entail?
Let’s start with the go-to resource of the 21st century, Wikipedia, which defines Data Science as “an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining”. Cool then. Got that? All done here?
Nope, me neither. And why doesn’t that make sense? Because…dun dun dun!…I don’t believe anyone really knows what being a Data Scientist actually entails, because it means different things to different people. One of my favorite quotes about the field of big data – which is highly connected to data science – comes from Dan Ariely (whose books I highly recommend). He says:
Nevertheless, people continue their attempts to define the field, which is actually a good thing. I’m hoping that the more we try to define, refine, and educate, the better the field will be understood by everyone. One visual I really like is from a site called Towards Data Science, mostly because I’ve always been a fan of the Venn diagram. Below is what they have on their site, and it shows the intersection of the various disciplines and the “sweet spot” called Data Science.
But even this doesn’t really explain what people who work in the field actually do. Sure, there is Math and Computer Science but what is “Domain Expertise” and how broad can that be? So, the truth is that every organization and even different departments and managers within the same organization might define requirements for a Data Scientist very differently.
What? How is this possible? It’s because the field is still so new and has gotten so large in such a short period of time that it is hard to define what a Data Scientist does. Each person with that title will have different skills and responsibilities. If you don’t believe me, go ahead and Google “data scientist” and you will get around 259,000,000 results. Each result has a slightly different take on what a Data Scientist is, what he or she does, and what skills are needed. And of course, you’ll get a lot of educational ads on what books to buy, courses to take or schools to attend.
Many organizations either think they are (or want others to think they are) data-driven and mining their “big data” for insights; they have one or more Data Scientists (or Data Analysts) so they must be making data-driven decisions, right? The truth is that very few organizations are implementing data science in a way that will deliver the full benefit. And part of this is because very few people really understand data well enough to develop and implement a true data strategy.
Often, a data person – or even a team of data people – doesn’t get to do a lot of problem solving. In many cases, they pull data for management PowerPoint decks, spend their time building or updating dashboards, or do ad hoc work for internal folks who don’t have database access. In some cases, companies only have one data person who is the “go-to person” or subject matter expert.
None of these scenarios is ideal; each sets these people up for frustration and the organization up for disappointment. Because trust me, no one person will have expertise in every aspect of the field. Nor can one person do everything data-related in an organization. I will discuss the breadth of the field in future posts but for additional information on the disillusionment of Data Scientists, here is a pretty good article on why so many data scientists are leaving their jobs. It’s a cautionary tale but provides you with a bit of a reality check about the day-to-day work for a good number of Data Scientists.
Additionally, in many cases the hiring managers aren’t even data science practitioners so they don’t really know what type of skills they should be seeking. They often just think it’s something they should have or be doing. If you’re working with a Marketing department, they might really want a lot of data visualization skills whereas an IT department might focus more on machine learning and what tools and applications you are capable of using. On the other hand, an Operations department might be gung-ho on descriptive and predictive modeling. It all comes down to the HiPPO (Highest Paid Person’s Opinion).
But does what these people want mean these are the important skills and work that will move their business or goals forward? Maybe. But not necessarily. It could just be they developed a strategy based on their own experiences and beliefs and are hiring for things with which they are familiar. Or they are copying what a colleague in another organization told them they are doing. But maybe it’s not what they need. I was once offered a job, and accepted it, after telling my future boss that what he wanted was not actually what he needed. A good part of my interview was about educating my future employers on the nuances between analytics and data science and the skills I felt would be put to the best use for the organization.
And sometimes a hiring manager will throw everything they can think of into a job posting only to pick and choose the skills they believe are important for the role based on the resumes they receive. This is the scariest scenario because it tells you they don’t really know what they want. Or worse yet, what they need.
Even more confusing is that some job titles say Data Analyst but many responsibilities of the role are in the realm of a Data Scientist, or vice versa. Having been a Data Analyst for many years, I can tell you these two roles have some overlap, but there are definite distinctions that are often overlooked.
Where does that leave you? Well, the honest truth is you will have to decide where your interests lie and choose your own focus. Don’t always depend on what companies think they are looking for. Not every job title that says “Data Scientist” will match your skills. Nor should it. And there’s nothing wrong with being an advocate for the field and educating future employers. The field has just gotten too big for anyone to be a generalist. Find your niche and go gangbusters on it. And take a look at some of my future posts where I explain what some of these focus areas could be.
Or to paraphrase Dan Ariely, you can pretend you know how to do everything in data science and see where that takes you.