Systems
Like any other professional, a data scientist needs a suitable set of tools to create value from data. A plethora of data science solutions is available on the market, many of them open source. Specialized tools exist for each aspect of the data science workflow.
There is no need to discuss the multitude of available packages here, as many excellent websites review the various offerings. Instead, this section offers some thoughts on using spreadsheets versus writing code, and on business intelligence platforms.
Spreadsheets are a versatile tool for analyzing data and have proliferated into almost every aspect of business. This universal tool is, however, poorly suited to complex data science. One of the perceived advantages of spreadsheets is that they contain the data, the code and the output in one convenient file. This convenience comes at a price, as it reduces the soundness of the analysis. Anyone who ever had the displeasure of reverse-engineering...