The Winning Formula to Being a Kaggle Data Scientist
Is there a formula to be a data science "guru"? If so, what does it include? Is the most significant factor education, experience or pure talent?
Software Advice, which researches and compares business intelligence software, tackled this question with a study of the top analysts in the world’s largest data scientist community, Kaggle.
Kaggle is the leading host of predictive analytics competitions, offering companies the chance to tap into its community of more than 100,000 analysts to take on various big data challenges. I wrote about Kaggle in Chapter 5 of my book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. The study analyzed the top 100 Kaggle users (as of October 2013) to learn what these data superstars have in common.
Interesting study results:
Education: Over 80 percent of the top 100 performers have a Master’s degree or higher, and 35 percent have a Ph.D. The top 21 performers all have an M.S. or higher: nine have Ph.D.s and several have multiple degrees (one member even has two Ph.D.s).
Background/Disciplines: Analysts come from a broad variety of educational backgrounds, with computer science and mathematics as the top areas of study. While most of these fields are centrally quantitative, a few surprising ones surfaced, such as philosophy and law.
Where in the World: These “data wizards” hail from all over the globe, with 29 countries represented in the top 100 performers group. The United States has the most members in this list (30), followed by Russia (nine) and India (six).
Stick-to-itiveness: The number of contests entered correlates with a higher chance of winning competitions and joining the group of top Kaggle prize winners.
The Prize Winning Group
In the end, the study concludes that the skills needed to become one of these elite Kaggle performers can be developed in any of several disciplines and at various levels of study. The name of the game is persistence and a high level of activity in the community.