berkeleyjess

Sunday, 19 April 2015

What is a Data Scientist?

Posted on April 19, 2015 by anand
Data Scientist has been called the Sexiest Job of the 21st Century, but many people (including some companies trying to hire data scientists) don't really understand what the job involves.  The term is used to describe a wide variety of roles; a data scientist at one company doesn't necessarily do the same thing as a data scientist at another company.

Below I break down some of the different 'types' of data scientist jobs and the skills needed for each.  Please note that this list is not exhaustive, and sometimes a data science position expects one person to fill several of the roles below:

Data Analyst
  • Derive business insight from data. 
  • Work across all teams within an organization. 
  • Answer questions using analysis of data.
  • Design and perform experiments and tests. 
  • Create forecasts and models.  
  • Prioritize which questions and analyses are actionable and valuable.
  • Help teams/executives make data-driven decisions.
  • Communicate results across the company to technical and non-technical people.
                Required Skills: SQL, statistics, programming, data management, data analysis, data modeling, data visualization, experimental design, decision making, prioritization, project management, product development, communication.

                Data Architect
                • Design systems to get raw data into an easily analyzable form.  
                • Act as a bridge between engineers and analysts. 
                • Organize data into useful database tables for analysis.
                • Optimize data sets for efficient analysis.
                • Create ETL (extract, transform, load) systems for your data sets.
                        Required Skills: SQL, computer programming, backend software engineering, database design, database management, data optimization, data modeling.
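
As a rough illustration of what a very small ETL job can look like, here is a minimal Python sketch. The CSV contents, the `signups` table, and the function names are all made up for this example, and an in-memory SQLite database stands in for whatever warehouse a real team would load into.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: stray whitespace, inconsistent casing,
# and one row that is missing its signup date.
RAW_CSV = """user_id,signup_date,plan
1, 2015-01-03 ,Free
2,2015-01-05,PRO
3,,pro
"""


def extract(text):
    """Extract: parse the raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(raw_rows):
    """Transform: trim whitespace, lowercase plan names, drop rows with no date."""
    cleaned = []
    for row in raw_rows:
        signup_date = row["signup_date"].strip()
        if not signup_date:
            continue  # skip rows that cannot be analyzed by date
        cleaned.append((int(row["user_id"]), signup_date, row["plan"].strip().lower()))
    return cleaned


def load(conn, clean_rows):
    """Load: write the cleaned rows into a table that analysts can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS signups "
        "(user_id INTEGER PRIMARY KEY, signup_date TEXT, plan TEXT)"
    )
    conn.executemany("INSERT INTO signups VALUES (?, ?, ?)", clean_rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
rows = conn.execute(
    "SELECT user_id, signup_date, plan FROM signups ORDER BY user_id"
).fetchall()
print(rows)
```

Keeping the three steps as separate functions is just one reasonable design choice: it lets each step be tested and replaced on its own as data sources change.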


                        Data Engineer
                        • Work with analysts to build internal tools for analyzing, visualizing, and sharing data. 
                        • Design and maintain A/B testing systems.
                        • Work with engineers to ensure that the right data is being collected.
                        • Create systems that allow analysts' work to scale.
                        • Work with data architects / operations to ensure the data is organized for optimal analysis.
                                Required Skills: SQL, computer programming, full-stack software engineering, data visualization, database management, communication.

                                Domain Experts
                                People who have advanced specializations like:
                                • Machine Learning 
                                • Natural Language Processing
                                • Algorithms
                                • Financial/Economic Models
                                • Graph Theory
                                • Supply Chain Analysis
                                • Operations Analysis
                                Required Skills: Advanced degree in computer science, math, statistics, finance, or economics. Programming, statistics, data management, data analysis, data modeling, data visualization, experimental design, communication.


                                Unicorn 
                                Someone who does all of the above.
                                (basically impossible to find, thus the name)


                                Some FAQs about Data Science Jobs

                                Q: Do I need an advanced degree to be a data scientist?
                                A: No
                                In general, the tech industry cares very little about degrees or pedigree.  I have been at companies where the CTO didn't graduate high school, and I have been at companies where many people have PhDs.  I would be very surprised if any company would not consider an (otherwise qualified) candidate for a data role just because they didn't have a certain degree.

                                That being said, many of the skills required to be a data scientist overlap with the skills required to be a scientific researcher.  People who have worked as researchers (either because they did an advanced degree, or because they worked in a lab) tend to be good candidates for data science jobs.

                                However, you can get this experience in a lot of other ways.  For instance, many people start out as (junior) analysts, pick up many of the skills needed along the way, and become data scientists after 3-5 years of industry experience.  Some people start out as software developers, take on more and more data-oriented projects, and get into data science through engineering.  Other people start out as financial, marketing, or business analysts and become data scientists through that path.

                                If you want to be a data scientist, I would look at the skills required for the above roles and then find ways to develop those skills either through schooling, your current job, or self-directed projects.

                                Q: What are the most important skills needed to be a data scientist?
                                A: SQL, Data Analysis, Programming
                                Of course, this depends on exactly what is expected for a particular role.  Some data science roles are more analyst-oriented, some are more engineering-oriented, and some are more specialized.  You can often assess this from the job description, but you should also discuss it during the interview process.

                                However, in general I would say the most important skills are the following:
                                1) SQL
                                2) Data Analysis
                                3) Statistical Programming

                                So if you were going to learn one thing, I would say learn SQL.  Then do a project that involves analyzing a data set and deriving results.  Then learn a programming language like Python or R.
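
To make that first step a little more concrete, here is a minimal sketch of the kind of question SQL is good at answering: which region has the highest average order amount? The `orders` table and its rows are invented for illustration, and Python's built-in `sqlite3` module is used only so that the query can be run without installing a database.

```python
import sqlite3

# Hypothetical data set: a handful of made-up orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", "west", 20.0),
        ("ben", "west", 40.0),
        ("cy", "east", 10.0),
        ("dee", "east", 30.0),
        ("eli", "east", 80.0),
    ],
)

# The analysis itself is plain SQL: group the orders by region,
# then count them and average their amounts.
query = """
SELECT region, COUNT(*) AS n_orders, AVG(amount) AS avg_amount
FROM orders
GROUP BY region
ORDER BY avg_amount DESC
"""
summary = conn.execute(query).fetchall()

for region, n_orders, avg_amount in summary:
    print(f"{region}: {n_orders} orders, average {avg_amount:.2f}")
```

The same `SELECT ... GROUP BY` pattern carries over to most SQL databases, so a small project like this is a reasonable way to practice before moving on to a larger data set.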

                                Q: How do I develop these skills?
                                A: Online courses, hackathons, meet-ups, volunteer organizations.
                                There are a lot of great online resources which can help you develop these skills.
                                • Coursera, Khan Academy, and MOOCs through various universities all have free online data science courses.
                                • My friends at Mode Analytics have an awesome SQL school for learning the language.
                                • You can get involved with hackathons or other meet-ups where you can develop your programming and software development skills.  
                                • You can participate in a Kaggle competition, or go through some of the previous Kaggle data exercises.  
                                • You can get involved with organizations like DataKind, Data for Good, or Bayes Impact which all do data projects with social impact.