Streaming
Traditional data warehouses support batch analytic queries. However, in the open source ecosystem as well as in commercial products, we are seeing a convergence of hybrid batch and streaming architectures. For example, Spark supports both batch processing as well as stream processing with Spark Streaming. Apache Flink is another project aiming to combine batch and stream processing. This is a natural progression because fundamentally it is possible to use very similar APIs and languages to specify a batch or streaming computation. It is no longer necessary to have two completely disparate systems. In fact, a unified architecture makes it easier to discover different types of data sources.
Hybrid batch and streaming architectures will also prove to be extremely beneficial when it comes to IoT data. Streaming can be used to analyze and react to data in real time as well as to ingest data into the data lake for batch processing. Modern, high-performance messaging systems such as Apache Kafka can be used to help in the unification of batch and streaming. Integration tools can help feed Kafka, process Kafka data in a streaming fashion, and also feed a data lake with filtered and aggregated data.