Hadoop solutions team
The rise of big data solutions has given birth to a new role: the data scientist. Like other scientists, data scientists run experiments to generate new insights and to test hypotheses. They perform these experiments on big data, combining their knowledge of statistics, computer science, and mathematics to solve data science problems (O'Reilly Media Inc., 2013).
The role of the data engineer
This book has focused on building Hadoop-based solutions with tools from the open source ecosystem. We have covered how various tools can be selected, installed, configured, and programmed to build a solution. Typically, these tasks are performed by a data engineer. The data engineer has strong knowledge of many of the tools available in the big data ecosystem. Data engineers build data pipelines, program ELT routines, and design queries. They will have the skills to deploy a Hadoop...