Dataconomy

Twitter Outline Home-Grown Analytics Architecture TSAR

by Eileen McNulty
June 30, 2014
in News

Twitter has detailed its home-grown, real-time analytics system TSAR (Time Series AggregatoR) in a blog post. The system is focused on automating and aggregating data collection, as well as integrating the various components Twitter uses, such as Hadoop and Storm.

Twitter's data collection and processing pipeline uses a range of different solutions (Hadoop, MySQL, NoSQL), each of which processes the data in its own way and in its own language. TSAR takes the legwork out of getting these systems to talk to each other.

According to the blog post by Anirudh Todi, TSAR’s key design principles are:

  • “Hybrid computation. Process every event twice — in real time, and then again (at a later time) in a batch job. The double processing is orchestrated using Summingbird. This hybrid model confers all the advantages of batch (stability, reproducibility) and streaming (recency) computation.

  • Separation of event production from event aggregation. The first processing stage extracts events from source data; in this example, TSAR parses Tweet impression events out of log files deposited by web and mobile clients. The second processing stage buckets and aggregates events. While the “event production” stage differs from application to application, TSAR standardizes and manages the “aggregation” stage.

  • Unified data schema. The data schema for a TSAR service is specified in a datastore-independent way. TSAR maps the schema onto diverse datastores and transforms the data as necessary when the schema evolves.

  • Integrated service toolkit. TSAR integrates with other essential services that provide data processing, data warehousing, query capability, observability, and alerting, automatically configuring and orchestrating its components.”
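
The first two principles can be illustrated with a small sketch. This is not Twitter's code and does not use Summingbird or TSAR; the log format, the function names (`produce_events`, `aggregate`, `merge`) and the `batch_horizon` cut-off are all hypothetical, chosen only to show events being produced once, aggregated by both a batch pass and a real-time pass, and then merged so that batch results are preferred wherever they exist.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Impression(NamedTuple):
    tweet_id: str
    timestamp: int  # seconds since epoch


def produce_events(log_lines: Iterable[str]) -> list[Impression]:
    """Event production: parse 'timestamp,tweet_id' lines (hypothetical format)."""
    events = []
    for line in log_lines:
        ts, tweet_id = line.strip().split(",")
        events.append(Impression(tweet_id, int(ts)))
    return events


def aggregate(events: Iterable[Impression], bucket_seconds: int = 3600) -> Counter:
    """Event aggregation: count impressions per (tweet_id, time bucket)."""
    counts = Counter()
    for e in events:
        bucket = e.timestamp - e.timestamp % bucket_seconds
        counts[(e.tweet_id, bucket)] += 1
    return counts


def merge(batch: Counter, realtime: Counter, batch_horizon: int) -> Counter:
    """Use batch counts for buckets before batch_horizon, real-time counts after."""
    merged = Counter({k: v for k, v in batch.items() if k[1] < batch_horizon})
    merged.update({k: v for k, v in realtime.items() if k[1] >= batch_horizon})
    return merged


# Assumed scenario: the real-time path dropped one event from the first hour,
# while the batch job has fully reprocessed that hour from the complete logs.
full_log = ["0,t1", "10,t1", "3600,t1", "3700,t2"]
lossy_stream = ["10,t1", "3600,t1", "3700,t2"]

batch = aggregate(e for e in produce_events(full_log) if e.timestamp < 3600)
realtime = aggregate(produce_events(lossy_stream))
merged = merge(batch, realtime, batch_horizon=3600)
```

In this sketch the merged view reports the corrected batch count for the first hour and the fresher real-time counts for the second, which is the trade-off the quoted "hybrid computation" principle describes.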

TSAR is built on top of Summingbird, a high-level abstraction library that pairs the batch processing capabilities of Hadoop with the real-time powers of Storm. TSAR builds on this pairing, making it easier for the different technologies to communicate back and forth.

With over 500 million tweets created each day, it’s understandable that Twitter developed their own robust technology to orchestrate their multi-faceted system.

Read more here.
(Image credit: Blog post)


Tags: Twitter
