You’ve probably heard of data science. It seems that everyone is talking about the field. But what does a data scientist do? When thinking of such a person, perhaps you imagine a cross between a scientist in a lab coat and a computer programmer, but beyond that, you’re unsure what the job entails.
This is understandable since data scientist is a relatively new term. Plus, the role isn’t confined to any specific industry (for the record, we’ve got an in-house data scientist here at DreamHost). Data scientists don’t pour data into beakers, but they do make use of some of the same skills traditional scientists use: observation, hypothesis-testing, and data analysis.
So what does this mean for you?
Well, do you have a passion for improving access to healthy foods for those with lower incomes? Are you interested in working for a Hollywood studio that wants to maximize the reach of its next film? Is it your dream to start a successful business selling self-loading dishwashers? Are you working on urban planning for a city trying to revitalize its downtown? Each of these endeavors is an example of an opportunity to use data science to improve outcomes. Indeed, data science can be used in almost any field you feel passionate about.
So What Is Data Science?
The need for data scientists emerged in response to the “data deluge” (the increasingly large amounts of data generated each year) and the realization that some of this data is uniquely valuable. Data science is a field devoted to processing, understanding, and using all this information.
More often than not, the role of the data scientist, a “hybrid of data hacker, analyst, communicator, and trusted advisor,” is to draw insights from data to inform business or research decisions. Unlike business analysts, who apply tried-and-tested, top-down approaches, data scientists start from the bottom up. They comb through data, searching for clues that indicate how to solve the problems at hand, and they use that data to draw inferences and generate insights that can be put to good use.
Why Should You Be a Data Scientist?
In 2012, Harvard Business Review named data scientist “the sexiest job of the 21st century.” And if that’s not enough of a reason to consider entering the field, data scientists earn a lot of money.
That’s because demand for data scientists is high and supply is low, and the field keeps growing as the amount of data does. In fact, the number of jobs in data science has multiplied twelvefold since 2010.
All Right, You’ve Convinced Me. Where Do I Start?
The internet is full of advice about the best way to become a data scientist. Of course, some approaches and courses of study are better than others. There are many training options, including a growing set of master’s degree programs, intensive “boot camps” or “hacker schools,” and a wealth of free online courses and tutorials.
When deciding how to pursue becoming a data scientist, keep in mind that while the career is multi-faceted, the two basic skills required are the ability to manipulate data sets using computer programming languages and an understanding of statistics.
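To make those two skills a little more concrete, here is a minimal sketch in Python using only its standard library. The visitor counts are made-up numbers for illustration, and the cutoff used to drop the outlier is an arbitrary assumption, not a recommended rule.

```python
from statistics import mean, median, stdev

# Hypothetical data: daily visitor counts for a small website (made-up numbers).
daily_visitors = [120, 135, 128, 410, 131, 126, 140]

# Skill 1: manipulate the data set with code.
# Here we drop one unusually busy day; the 300 cutoff is an arbitrary assumption.
typical_days = [v for v in daily_visitors if v < 300]

# Skill 2: summarize the data with basic statistics.
summary = {
    "mean_all": mean(daily_visitors),
    "median_all": median(daily_visitors),
    "mean_typical": mean(typical_days),
    "stdev_typical": stdev(typical_days),
}
print(summary)
```

Notice how a single busy day pulls the overall mean well above the median. Spotting and explaining that kind of gap is a small taste of what the job involves.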
Choosing a Programming Language
Python and R are two of the best programming languages for data science: they are popular, free, and backed by thorough online documentation and large user communities. The best way to get a handle on a programming language is to download it, watch or read some tutorials, and start playing around. Codecademy offers some syntactical training in Python, but often a Coursera course is the best way to test whether programming is something you enjoy. There’s also SQL (often pronounced “sequel”), the query language behind relational database systems such as MySQL, which is worth learning for working with relational databases.
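If you want to start playing around with a relational database without installing a server, here is one possible sketch. It uses Python’s built-in sqlite3 module as a stand-in for MySQL (an assumption made for convenience; the basic query syntax shown is common to both). The orders table and its rows are hypothetical.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table of customer orders (made-up data).
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ana", 20.0), ("ben", 35.5), ("ana", 14.5), ("cy", 50.0)],
)

# Total spend per customer, largest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total FROM orders "
    "GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)
```

Grouping and summing like this is one of the most common things you will do with a database, so it makes a good first query to experiment with.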
Learning Statistics (and Other Relevant Types of Math)
If you’re looking for a way to develop both of these crucial skill sets in one place, the Data Science Specialization through Coursera, hosted by Johns Hopkins University, is one of the most popular online courses in the field. It’s a great, comprehensive way to invest in becoming a data scientist. The specialization consists of nine courses, each a month long (with year-round offerings) and focused on a distinct skill relevant to the field of data science. Because the coursework progresses from downloading and installing R to far more advanced topics, the program is widely accessible.
There are many ways to train to become a data scientist. Whichever path you choose, may the data be with you.