Stream Processing

If you care about Big Data, you care about Stream Processing
As data volumes grow across organizations, with terabytes and petabytes coming into systems every day, running ad hoc queries across the entire dataset to generate important metrics and intelligence is no longer feasible. Once the volume of data crosses a threshold, even simple questions such as what is

Stream Processing Myths Debunked
This post originally appeared on the data Artisans blog as “Six Common Streaming Misconceptions.” Needless to say, we here at data Artisans spend a lot of time thinking about stream processing. Even cooler: we spend a lot of time helping others think about stream processing and how to apply streaming to data

Apache Samza Levels Up to a Top-Level Project; Witnesses Large Scale Adoption
Apache Samza, the distributed stream processing framework, has now graduated to become a Top-Level Project (TLP), the Apache Software Foundation has revealed. “The incubation process at Apache has been great. It has helped us cultivate a strong community, and provided us with the support and infrastructure to make Samza grow,”

Meet Apache Samza – LinkedIn’s Stream Processing Framework
With the advent of Big Data and the rapidly growing scale of web-applications, monolithic relational databases were replaced by scalable, partitioned, NoSQL databases and HDFS; individual queries to relational databases were replaced by the likes of Hive and Pig. This growing scale and partitioned consumption model brought about by these