Approachable data science

2014-06-02

    Data science has earned a reputation for being complicated and inaccessible to those without an advanced degree, but it doesn’t have to be this way. The goal of data science is simply to unlock insights and value from data. There’s no need to make it more complicated than that. Of course, there are times when the data requires some domain knowledge or is just too big for someone without the necessary experience to work with, but I believe that most places have enough low-hanging fruit that anyone who can write a quick script can contribute and do data science.

    This can be as simple as looking at a site’s log files to figure out the most popular pages and how long they take to load, in order to identify slow pages that can be sped up. Another quick task is writing some queries to provide summary statistics across varying dimensions and visualizing them to see if any patterns emerge. A more advanced project is going through a codebase and implementing a system to track metrics in a way that makes future analysis easier. None of these requires advanced quantitative knowledge, and there’s no reason that anyone should feel unqualified to dabble in data analysis. In my experience, the most value has come from someone noticing something interesting and asking the right questions, which then led to a more thorough analysis. The more people who approach data with a curious mindset, the more valuable a company’s data becomes.
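
    As a rough sketch of the log-file idea, here is the kind of quick script I have in mind. It assumes a made-up log format where each line reads "<timestamp> <method> <path> <status> <load_ms>"; real logs will look different, and the function names and the min_hits threshold are only placeholders for illustration.

```python
from collections import defaultdict


def summarize_pages(lines):
    """Return {path: (hits, mean_load_ms)} for the given log lines.

    Assumes a hypothetical format:
    "<timestamp> <method> <path> <status> <load_ms>"
    """
    hits = defaultdict(int)
    total_ms = defaultdict(float)
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip lines that don't match the assumed format
        _, _, path, _, load_ms = parts
        try:
            ms = float(load_ms)
        except ValueError:
            continue  # skip lines whose load time isn't a number
        hits[path] += 1
        total_ms[path] += ms
    return {path: (hits[path], total_ms[path] / hits[path]) for path in hits}


def slow_popular_pages(summary, min_hits=2, top=5):
    """Pages with at least min_hits requests, slowest average load first."""
    popular = [(path, n, ms) for path, (n, ms) in summary.items() if n >= min_hits]
    return sorted(popular, key=lambda row: row[2], reverse=True)[:top]


if __name__ == "__main__":
    # Invented sample lines, only to show the shape of the input.
    sample = [
        "2014-06-01T12:00:00 GET /home 200 120",
        "2014-06-01T12:00:01 GET /search 200 900",
        "2014-06-01T12:00:02 GET /search 200 1100",
        "2014-06-01T12:00:03 GET /home 200 80",
    ]
    for path, n, ms in slow_popular_pages(summarize_pages(sample)):
        print(f"{path}: {n} hits, {ms:.0f} ms average")
```

    Filtering by hit count before sorting by load time keeps the focus on pages that are both popular and slow, which is usually where a speedup pays off first.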

    There’s always the risk of discovering something spurious, so it’s important to validate discoveries, but I’d rather have signals and noise than silence, especially if this encourages more people to become interested in data. At first, this can pose a problem for the people who need to deal with the noise, but over time people will become more aware of what’s valuable and can help identify areas for further analysis. This, not hiring a few data scientists, is the way to build a data-driven culture.