Splunk and big data
Big data is a widely used term but, as is often the case, it means different things to different people. This part of the chapter presents the common characteristics of big data.
There is no doubt that a lot of data exists today. Increasingly, though, the term big data refers less to sheer volume than to other factors, including variability so wide that conventional, legacy organizational data systems cannot consume the data or produce analytics from it.
Streaming data
Streaming data is generated almost continuously, with a timestamp associated with each entry. Splunk's inherent ability to monitor and track data loaded from ever-growing log files, or to accept data as it arrives on a port, is a critical piece of functionality.
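To make the idea of tracking an ever-growing log file concrete, here is a minimal Python sketch of the general technique: remember a byte offset and, on each pass, read only the complete lines appended since then. This is an illustration only and not how Splunk implements monitoring; the function names `read_new_lines` and `demo`, the file name `firewall.log`, and the sample events are all hypothetical.

```python
import os
import tempfile


def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for complete lines appended after `offset`.

    Hypothetical helper for illustration; not part of any Splunk API.
    """
    lines = []
    with open(path, "rb") as handle:
        handle.seek(offset)
        for raw in handle:
            if not raw.endswith(b"\n"):
                # Partially written entry: leave it for the next pass.
                break
            offset += len(raw)
            lines.append(raw.decode("utf-8").rstrip("\r\n"))
    return lines, offset


def demo():
    """Write two made-up timestamped events and read each one exactly once."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "firewall.log")
        with open(path, "w", encoding="utf-8") as log:
            log.write("2024-01-01T00:00:00Z action=allow\n")
        first, offset = read_new_lines(path)
        # The log grows; only the newly appended entry should be returned.
        with open(path, "a", encoding="utf-8") as log:
            log.write("2024-01-01T00:00:01Z action=deny\n")
        second, offset = read_new_lines(path, offset)
        return first, second


if __name__ == "__main__":
    print(demo())
```

Keeping the offset between passes is what lets each entry be picked up once, as it arrives, without rereading the whole file.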
However, streaming data is no different from other data in that its usefulness erodes over time, particularly at a detailed level. For instance, consider a firewall log.
In real time, Splunk will capture and index events written...