Definition of a data lake
It's a good time to be alive. We have a tremendous amount of information available at just a few keystrokes (thank you, Google) or a simple voice command (thank you, Alexa).
The data that companies generate is richer than ever before, and its volume is growing at an exponential rate. Fortunately, the processing power needed to harness this deluge of data keeps increasing and becoming cheaper. Cloud technologies such as AWS allow us to scale data workloads massively and almost instantaneously:
Figure 14.1 – The problem with the abundance of information
Do you remember, before the internet, that we thought that the cause of collective stupidity was a lack of information?
Well, it wasn't…
Data is everywhere today. It was always there, but it used to be too expensive to keep. With the massive drops in storage costs...