Dataconomy

Time Series Data: A Difficult Yet Tameable Beast

by Rohit Gupta
January 16, 2018
in Articles, IT

It seems like every quarter a new McKinsey report predicts that this will be the year trillions of dollars of IoT potential are unlocked. But while the amount of data IoT produces has skyrocketed, we’re still waiting for that return on investment. The good news is that the reports aren’t wrong: actionable data can enable data scientists to accelerate business growth. The bad news is that businesses haven’t had access to the right tools to make their data actionable. In fact, some estimates indicate that just 1 percent of operational data is being used in enterprises.

The primary exhaust of IoT devices is time series data, i.e., sequential events indexed by time. Working with time series data is tricky. Whether the information comes from a machine on a factory floor or the trunk of a self-driving car, events occur at uneven intervals, in different-sized windows and in formats that vary across datasets. Time series data is unique in that it’s write-once, non-deletable, non-transactional and non-relational. It also has different access patterns: analysts look for behaviors and patterns across time rather than joining on a specific field.
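To make the "uneven intervals" problem concrete, here is a minimal pandas sketch that buckets irregular events into fixed one-minute windows. The readings, timestamps and window size are invented for illustration and are not taken from any real device.

```python
import pandas as pd

# Hypothetical sensor readings arriving at uneven intervals (illustrative data).
events = pd.DataFrame(
    {"temperature": [20.0, 20.5, 21.5, 23.0]},
    index=pd.to_datetime(
        [
            "2018-01-16 00:00:05",
            "2018-01-16 00:00:50",
            "2018-01-16 00:02:10",
            "2018-01-16 00:04:40",
        ]
    ),
)

# Bucket the events into fixed one-minute windows and average within each.
# Windows that contain no events show up as NaN instead of disappearing,
# which makes the gaps in the stream explicit.
per_minute = events["temperature"].resample("1min").mean()
print(per_minute)
```

Whether empty windows should stay empty, be forward-filled or be interpolated depends on the sensor and the question being asked, so the sketch leaves them as gaps.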

Unfortunately, time series data often gets grouped with other types of data such as CRM records, log data and general analytics. This results in tools that don’t work, leaving data scientists and their organizations without an effective solution for leveraging their data or making it actionable.

Unique Hurdles and Advantages

With traditional datasets, data scientists often look for relationships that can be expressed easily and efficiently with SQL. For time series data, however, data scientists need to look for behaviors and patterns in events streaming across time. To gain insights and build models, they need to find specific sequences, measure how often those sequences occur and characterize the data during those windows of time. Relying on SQL for this kind of time series lookup can quickly become costly and inefficient.
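As one illustration of searching for a sequence rather than a relationship, the sketch below finds runs in which a reading stays above a threshold for several consecutive samples. The data, the threshold of 80 and the minimum run length of 3 are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical one-minute readings from a single sensor (illustrative data).
readings = pd.Series(
    [70, 82, 85, 90, 75, 88, 72, 91, 93, 95, 96],
    index=pd.date_range("2018-01-16", periods=11, freq="1min"),
)

THRESHOLD = 80  # assumed alert level
MIN_RUN = 3     # assumed minimum number of consecutive readings

above = readings > THRESHOLD
# A new run starts every time the above/below state flips.
run_id = (above != above.shift()).cumsum()

# Summarize each above-threshold run: when it started, how long it lasted
# and the highest value seen during it.
frame = readings[above].to_frame("value")
frame["time"] = frame.index
runs = frame.groupby(run_id[above]).agg(
    start=("time", "first"),
    length=("value", "count"),
    peak=("value", "max"),
)

sustained = runs[runs["length"] >= MIN_RUN]
print(sustained)
```

The same question is awkward to phrase as a join on a field, which is the point the paragraph above makes about SQL.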

Luckily, time series data can be sampled. Data scientists only need a small portion of the extracted data to understand its overall shape. This initial sample can fit into memory and be analyzed with pandas in a Jupyter notebook. It may even be small enough to allow efficient full table scans inside a NoSQL or SQL database. Working with a small sample makes it possible for data scientists to quickly explore the data for patterns and write small programs to transform the data or add new features.
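A rough sketch of pulling an in-memory sample follows. It assumes a synthetic dataset with a hypothetical `device_id` column and samples whole devices over a limited time range, so that each device's event sequence stays intact; sampling random rows would be simpler but would break up the sequences.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a large event table (all values are made up).
rng = np.random.default_rng(seed=0)
n = 100_000
full = pd.DataFrame(
    {
        "device_id": rng.integers(0, 50, size=n),  # hypothetical device ids
        "value": rng.normal(size=n),
    },
    index=pd.date_range("2018-01-01", periods=n, freq="1min"),
)

# Keep every event for a random handful of devices so that each device's
# sequence of events is preserved in the sample.
sampled_devices = rng.choice(full["device_id"].unique(), size=5, replace=False)
sample = full[full["device_id"].isin(sampled_devices)]

# Narrow the sample further to the first week of data.
first_week = sample.loc["2018-01-01":"2018-01-07"]
print(len(full), len(sample), len(first_week))
```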

Performance and Workflow Challenges

Eventually, though, analysis needs to scale. Managing the performance and workflow of time series analytics from a small sample to production-level volumes can be extremely challenging for data scientists. For instance, even simple pre-processing and data transformation steps need to be moved to distributed batch processing workflows. Moving the extract, transform and load (ETL) program from local scripts into a production-ready data pipeline requires rewriting entire programs for environments where table scans just aren’t feasible.

The level of experimental interactivity and flexibility during the data exploration and model development process is directly related to how valuable the time series data insights will be. When data scientists are forced to wait hours or days for long batch processing pipelines, they lose interactivity, iterate less and find suboptimal solutions that often have unintended consequences. For instance, because it’s so inefficient to adapt or tune an ETL, early assumptions aren’t tested, leaving dangerous biases and failure modes in a system. These problems compound when data scientists need to join streams of data together, each with different states, features, ETL requirements and schemas. The resulting pipelines are extremely fragile, and they break frequently. Before you know it, the majority of a data scientist’s time is spent troubleshooting.
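Joining streams is one place where the difference from a relational join shows up. The sketch below uses `pandas.merge_asof` to attach the most recent temperature reading to each vibration reading; the two streams, their column names and the 90-second tolerance are assumptions made for the example.

```python
import pandas as pd

# Two hypothetical streams with timestamps that never line up exactly.
temperature = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2018-01-16 00:00:00", "2018-01-16 00:01:00", "2018-01-16 00:02:00"]
        ),
        "temperature": [20.0, 21.0, 23.0],
    }
)
vibration = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2018-01-16 00:00:20", "2018-01-16 00:01:10", "2018-01-16 00:05:00"]
        ),
        "vibration": [0.1, 0.4, 0.9],
    }
)

# For each vibration event, take the latest temperature reading at or before
# it, but only if that reading is at most 90 seconds old. Both inputs must be
# sorted by the "time" column.
joined = pd.merge_asof(
    vibration,
    temperature,
    on="time",
    direction="backward",
    tolerance=pd.Timedelta("90s"),
)
print(joined)
```

The last vibration event has no temperature reading within the tolerance, so its temperature is left missing rather than filled with a stale value.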

Crucial Best Practices

Creating useful metrics from time series data requires looking at high-level features not visible in the raw data itself. For instance, instead of merely looking at a temperature value, it’s useful to extract its rate of change in degrees per hour. Useful patterns are discovered by combining derived features from multiple data sources into higher-level query expressions.
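The degrees-per-hour feature can be sketched in a few lines of pandas. The readings are invented, and the calculation divides by the actual elapsed time between samples because the intervals are not assumed to be regular.

```python
import pandas as pd

# Hypothetical temperature readings at irregular times (illustrative data).
temps = pd.Series(
    [20.0, 20.5, 22.0, 21.0],
    index=pd.to_datetime(
        [
            "2018-01-16 00:00",
            "2018-01-16 00:30",
            "2018-01-16 01:00",
            "2018-01-16 03:00",
        ]
    ),
    name="temperature",
)

# Hours elapsed since the previous reading, aligned to the same index.
elapsed_hours = temps.index.to_series().diff().dt.total_seconds() / 3600

# Change in temperature divided by elapsed time gives degrees per hour.
# The first value is NaN because there is no earlier reading to compare with.
degrees_per_hour = temps.diff() / elapsed_hours
print(degrees_per_hour)
```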

For data scientists looking to effectively leverage the insights behind their organization’s time series data, acknowledging and prioritizing scaling challenges is critical. Maintain a high level of interactivity so you can explore and iterate quickly. Recognize the unique behaviors complex events will reveal, and be ready to test as many combinations as possible. In doing so, data scientists can productively work with time series data to help a business grow.
