Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
  • AI
  • Tech
  • Cybersecurity
  • Finance
  • DeFi & Blockchain
  • Startups
  • Gaming
Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

If you care about Big Data, you care about Stream Processing

byArpan Shah
March 13, 2017
in Articles
Home Resources Articles
Share on FacebookShare on TwitterShare on LinkedInShare on WhatsAppShare on e-mail
Google Preferred Source

As the scale of data grows across organizations with terabytes and petabytes coming into systems every day, running ad hoc queries across the entire dataset to generate important metrics and intelligence is no longer feasible. Once the quantum of data crosses a threshold, even simple questions such as what is the distribution of request latencies becomes infuriatingly slow with the usual sql and database model. Imagine running such a request on the Facebook request data, the query would take days to complete.

Stream processing offers the solution for anyone looking to manage an ever increasing volume of data.

What is stream processing

Stream processing is the handling of units of data on a record-by-record basis or over sliding time windows. This methodology limits the amount of data processed and offers a very contrasting way of looking at the data and the analytics. Complexity is traded in for speed. The nature of the analytics with stream processing are usually filtering, correlations, sampling and aggregations over the data.

Stay Ahead of the Curve!

Don't miss out on the latest insights, trends, and analysis in the world of data, technology, and startups. Subscribe to our newsletter and get exclusive content delivered straight to your inbox.

Examples of the diverse applications of stream processing include:

  • Alerting on sensor data on IoT devices
  • Log analysis and statistics on web traffic
  • Risk analysis with movements of money and orders in Fintech
  • Click stream analytics on Ad Networks

 

Comparison to Batch Processing:

Batch processing Stream processing
Scope Queries over all or most of the data in the dataset. Queries or processing over data within a rolling time window, or on just the most recent data record.
Size Large numbers of records Individual records or micro batches consisting of a few records.
Performance Latencies in minutes to hours. Requires latency in the order of seconds or milliseconds.
Analyses Complex analytics. Simple response functions, aggregates, and rolling metrics.


Be truly real-time

Stream processing is the only way applications can truly be real time with their data. As soon as we talk about aggregations over the universe of data, any batch process job will continue to take longer as the amount of data increases. If you have an application that relies on real time metrics, without a stream processing solution, your view of the world is always going to be delayed.

Flow data over queries

To harness the power of stream processing, engineers and data scientists need to evolve their model from one where they run queries over their data to one where the data runs over the queries. This is a powerful shift in the way to look at the data applications.

The key shift is from tables to streams and doing operations on those streams such as

  • filtering
  • joining
  • aggregating
  • windowing

Lets compare the contrast in the batch processing world vs the stream processing world in the example application of Calculating the distribution of number of user sessions per user in a day on the website.

In batch processing: a complicated query would define what events to include in a user session, what the maximum delay between events should be for a session and computing a unique count over them. Since events in tables can occur out of order, even if a new table is maintained   every record in the table would need to be traversed for an answer

In stream processing: A stream of events would be simply be filtered to produce a stream of relevant events, this stream would then be windowed over a delay window to produce a stream of unique sessions. Finally this stream would be aggregated upon for unique counts per user with the relevant number simply being appended to a key value store.

As this example demonstrates, a less computationally intense and more real-time answer can be obtained simply be switching the way the data is perceived

Easier than ever before

Managing stream processing applications is not easy with all the moving pieces of managing issues such as fault-tolerance, partitioning and scaling and durability of the data. Over the last 7 years a lot of systems have emerged; many of which are open source and handle a lot of these issues entirely. This leaves the developer to only worry about implementing the application logic.

Some of the most exciting examples of have only launched in the last couple of years: Spark Streaming (2015), Apache Beam (2016) and Kafka Streams (2016). A full comparison of the various popular options are in figure 2.

Figure2

Any person developing applications with big data should keep the models of stream processing their mind as they choose the architecture and stack for their system. While stream processing is not a panacea for all the tasks around large scale data processing, it offers a new and important way to look at data that allows scalable real-time

 

Credits:

Figure 1: Information obtained from AWS Documentation

Figure 2: Image Courtesy: Ian Hellström on Twitter

 

Like this article? Subscribe to our weekly newsletter to never miss out!

Follow @DataconomyMedia

Tags: Big Datadata scienceStream Processingsurveillance

Related Posts

How automation tools are being integrated into professional networking

How automation tools are being integrated into professional networking

May 31, 2026
Autonomous agentic UI orchestration for high-throughput enterprise ecosystems

Autonomous agentic UI orchestration for high-throughput enterprise ecosystems

May 31, 2026
Freedom Holding Corp.: Competing through data and integration

Freedom Holding Corp.: Competing through data and integration

May 15, 2026
First Round Capital’s Network Shows Where Seed Capital Is Landing

First Round Capital’s Network Shows Where Seed Capital Is Landing

May 5, 2026
The silence in the machine: Reclaiming authority in the age of digital noise

The silence in the machine: Reclaiming authority in the age of digital noise

April 22, 2026
Synthetic Data Alone Cannot Train Physical AI to Handle the Real World

Synthetic Data Alone Cannot Train Physical AI to Handle the Real World

April 17, 2026
Please login to join discussion

LATEST NEWS

Amazon adds AI-generated product previews to search results

Meta launches AI business agents on WhatsApp, Instagram and Messenger

Nintendo will release a repair-friendly Switch 2 in Europe

Google rolls out Ask Gemini in Drive to eligible Workspace users

Google Wallet to add digital IDs from select EU countries this summer

Why Telegram Mini Apps have become the optimal ecosystem for launching AI SaaS products

BEST AI MODELS LEADERBOARD

See the best AI models, ranked by intelligence, benchmark results, speed and token price. Find the most suitable LLMs, Text-to-Image, Image Editing, Text-to-Speech, Text-to-Video and Image-to-Video  artificial intelligence model for your tasks and business.

LATEST TOOLS

Roboto AI

Pickaxe

Pfpmaker

MindPal

Syllaby

ScreenApp

FinanceBrain

GitHub Spark

Hints

VisionStory AI

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI tools
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies to improve your experience. You can choose to accept or reject them. Visit our Privacy Policy.