Introducing data science competitions
Competitive programming has a long history, starting in the 1970s with the first editions of the ICPC, the “International Collegiate Programming Contest”. In the original ICPC, small teams from universities and companies took part in a competition that required solving a series of problems with a computer program (at the beginning, participants coded in FORTRAN). To achieve a good final rank, teams had to display strong skills in teamwork, problem solving, and programming.
The experience of competing in the heat of such a contest, together with the opportunity to be in the spotlight for recruiting companies, gave students ample motivation and kept the competition popular for many years. A few ICPC finalists have become renowned: among them are Adam D'Angelo, the former CTO of Facebook and founder of Quora; Nikolai Durov, the co-founder of Telegram Messenger; and Matei Zaharia, the creator of Apache Spark. Together with many other professionals, they all share the same experience: having taken part in an ICPC edition.
After the ICPC, programming competitions flourished, especially after 2000, when remote participation became more feasible, making international competitions easier and cheaper to run. The format is much the same for most of these competitions: there is a series of problems, and you have to code a solution to each of them. The winners can then take home a prize, but they can also get noticed by recruiting companies or simply become famous among their peers.
In this chapter, we will explore how competitive programming evolved into data science competitions, why the Kaggle platform is the most popular site for such competitions, and how it works.