My Journey to Self-Taught Data Scientist

4 min read · Jan 14, 2018

I became a Data Scientist by quitting my job and competing in Kaggle competitions.

This is my Kaggle profile. And yes, that’s a picture of me holding a catfish!

After I graduated from college in early 2014, I knew that I wanted to become a Data Scientist, but I didn’t have the skillset yet. I thought I had a pretty good idea of what I needed to learn to get my foot in the door, but I was sick of being in school, didn’t have much money, and didn’t want to borrow $60,000 to get a Master’s degree.

I had a job lined up at Amazon in Seattle after graduation, since I had interned on the Prime team as a financial analyst during my junior year. I decided to go back because I knew that I’d learn valuable skills like Excel and SQL, and would be constantly working with data at a rapidly growing tech company. I would save up some money while figuring out what my next step was going to be.

I had discovered Kaggle in late 2013 and was obsessed with the potential of machine learning and data science to transform business and the world, but every time I tried to work with a dataset other than Titanic I felt lost. It didn’t help that whenever I attempted to train a machine learning algorithm on a Kaggle dataset it would take hours to finish (my laptop at the time only had 4 GB of RAM; it was 2014 and I didn’t have much money). I also routinely worked 60 hours a week, so doing Kaggle competitions or taking online courses in the evenings seemed unmanageable. Every day, it felt nearly impossible to study data science in my spare time, let alone take care of my other, albeit few, responsibilities.

Luckily for me, I had learned a bit of R while taking courses for my economics major in college. I made sure my managers at Amazon knew this, and told them to “take advantage of this skillset whenever possible”. I am glad I did: in my first year I made some histograms for an ad-hoc analysis of the Kindle Unlimited program that ended up being included in a report that went directly to Jeff Bezos. More importantly, I was tasked with building regression models to estimate the efficiency of Kindle marketing spend, which I did in R (doing this in Excel would have been miserable). This latter project was an invaluable learning experience.

Written by Thomas Hepner
