What skills does a data scientist need?

From what I’ve learned, data science is a highly interdisciplinary field, requiring a diverse skill set.

Here are some of the most cited skills:

  • Coding
    • All-purpose programming languages like Python and R (perhaps MATLAB/Octave), which are good for analyzing data
    • (??) Backend programming languages that might include Java, Python, Scala, or Ruby
    • (??) Big data tools like Hadoop and MapReduce
    • Databases (SQL?)
    • Data munging/wrangling skills (UNIX, Perl?) for parsing, cleaning data
    • General knowledge of good software engineering practices
  • Science
    • Statistics
    • Machine learning and its various algorithms
    • The more “artistic” part: understanding models, knowing what questions to ask, and being rigorous rather than just cranking data through a black-box algorithm
  • Communication
  • Passion
    • Technically not a skill, but a data scientist should be someone who’s excited about playing around with data, curious about the world, and enjoys doing their own data analysis projects for fun

(??) marks skills that are not as important, but potentially useful depending on what you’re doing.

I drew up this list partially from the opinions of a few practicing data scientists, including Pete Skomoroch (Principal Data Scientist at LinkedIn), Hilary Mason (Chief Scientist at bitly), and Michael Driscoll (Chairman of Dataspora).

Another way to find out what skills a data scientist needs is to go to LinkedIn’s skills trend page on data science.

