
New Apple paper reveals how AI can track your daily chores

The team utilized the Ego4D dataset to analyze twelve distinct activities like cooking and cycling.

By Kerem Gülen
November 23, 2025
in Research

Apple researchers published a study detailing how large language models (LLMs) can interpret audio and motion data to identify user activities, focusing on late multimodal sensor fusion for activity recognition.

The paper, titled “Using LLMs for Late Multimodal Sensor Fusion for Activity Recognition,” by Ilker Demirel, Karan Ketankumar Thakkar, Benjamin Elizalde, Miquel Espi Marques, Shirley Ren, and Jaya Narain, was accepted at the Learning from Time Series for Health workshop at NeurIPS 2025. This research explores integrating LLM analysis with traditional sensor data to enhance activity classification.

The researchers state, “Sensor data streams provide valuable information around activities and context for downstream applications, though integrating complementary information can be challenging. We show that large language models (LLMs) can be used for late fusion for activity classification from audio and motion time series data.” They curated a subset of data for diverse activity recognition from the Ego4D dataset, encompassing household activities and sports.

The evaluated LLMs achieved F1-scores significantly above chance on the 12-class zero- and one-shot classification task, without any task-specific training. Zero-shot classification via LLM-based fusion of outputs from modality-specific models makes multimodal temporal applications feasible even with limited aligned training data for learning a shared embedding space, and it allows deployment without the additional memory and compute that dedicated application-specific multimodal models would require.

The study highlights LLMs’ ability to infer user activities from basic audio and motion signals, showing improved accuracy with a single example. Crucially, the LLM was not directly fed raw audio. Instead, it received short text descriptions generated by audio models and an IMU-based motion model, which tracks movement via accelerometer and gyroscope data.
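
The paper publishes its exact prompts as supplemental material; the sketch below only illustrates the general idea of LLM-based late fusion from text outputs. The ClipEvidence fields, the abbreviated activity list, and the prompt wording are assumptions for illustration, not the authors' prompt.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical container for the per-clip text outputs of the modality-specific
# models; the paper releases its actual prompts as supplemental material.
@dataclass
class ClipEvidence:
    audio_caption: str        # short description from an audio captioning model
    audio_labels: list[str]   # top class labels from an audio tagging model
    imu_prediction: str       # activity predicted by the IMU (accelerometer/gyroscope) model

# The paper uses twelve high-level activities; only the two named in the
# article are listed here, the remaining ten are omitted.
ACTIVITIES = ["cooking", "cycling"]

def build_fusion_prompt(clip: ClipEvidence, one_shot_example: Optional[str] = None) -> str:
    """Fuse the modality-specific text outputs into a single prompt.

    The LLM never sees raw audio or motion signals, only these short text
    descriptions, and is asked to name the most likely activity.
    """
    parts = []
    if one_shot_example is not None:  # one-shot: prepend a single worked example
        parts.append(f"Example:\n{one_shot_example}\n")
    parts.append(
        "A 20-second first-person recording produced the following evidence:\n"
        f"- Audio caption: {clip.audio_caption}\n"
        f"- Audio tags: {', '.join(clip.audio_labels)}\n"
        f"- Motion-model prediction: {clip.imu_prediction}\n"
        f"Which activity is the person most likely doing? Choose one of: {', '.join(ACTIVITIES)}."
    )
    return "\n".join(parts)
```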

For the study, the researchers used Ego4D, a dataset featuring thousands of hours of first-person-perspective media. They curated a subset of daily activities by searching its narrative descriptions, yielding 20-second samples spanning twelve high-level activities such as cooking and cycling.
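
As a rough illustration of that curation step, the snippet below filters narration metadata by activity keywords to collect 20-second windows. The file schema, field names, and keyword lists are assumptions; Ego4D's actual annotation format and the paper's query terms may differ.

```python
import json

# Hypothetical keyword lists for two of the twelve activities; the paper
# searches Ego4D's narrative descriptions, but these exact terms are
# assumptions for illustration only.
ACTIVITY_KEYWORDS = {
    "cooking": ["cooks", "stirs the pot", "chops the"],
    "cycling": ["rides the bicycle", "pedals"],
}

CLIP_SECONDS = 20  # the curated samples are 20-second segments

def curate_segments(narration_file: str) -> list[dict]:
    """Keep 20-second windows whose narration text mentions a target activity."""
    with open(narration_file) as f:
        narrations = json.load(f)  # assumed: list of {"video_uid", "timestamp_sec", "text"}

    segments = []
    for item in narrations:
        text = item["text"].lower()
        for activity, keywords in ACTIVITY_KEYWORDS.items():
            if any(k in text for k in keywords):
                segments.append({
                    "video_uid": item["video_uid"],
                    "start_sec": item["timestamp_sec"],
                    "end_sec": item["timestamp_sec"] + CLIP_SECONDS,
                    "activity": activity,
                })
                break  # one label per narration window
    return segments
```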

These activities were chosen to cover household and fitness tasks and for their prevalence in the larger Ego4D dataset. Audio and motion data were processed through smaller models to generate text captions and class predictions, and these outputs were then fed into different LLMs, specifically Gemini-2.5-pro and Qwen-32B, to assess activity identification accuracy.

Apple compared model performance in two scenarios: a closed-set test where models chose from the 12 predefined activities, and an open-ended test without provided options. Various combinations of audio captions, audio labels, IMU activity prediction data, and extra context were used for each test.
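
Below is a minimal sketch of how the two test conditions could differ at the prompt level, and how a macro-averaged F1-score over the twelve classes might be computed. The wording and scoring choices are assumptions, not the paper's exact protocol.

```python
from sklearn.metrics import f1_score

def make_question(evidence_text: str, closed_set: bool, activities: list[str]) -> str:
    """Closed-set: the model must choose one of the 12 predefined activities.
    Open-ended: it names the activity freely, with no options provided."""
    if closed_set:
        return f"{evidence_text}\nAnswer with exactly one of: {', '.join(activities)}."
    return f"{evidence_text}\nIn a few words, what activity is the person doing?"

def macro_f1(y_true: list[str], y_pred: list[str], activities: list[str]) -> float:
    """Macro-averaged F1 over the activity classes; chance level for a
    balanced 12-class problem is roughly 1/12, about 0.08."""
    return f1_score(y_true, y_pred, labels=activities, average="macro")
```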

The researchers noted that the results offer insights into combining multiple models for activity and health data. This approach is particularly beneficial when raw sensor data alone is insufficient to provide a clear picture of user activity. Apple also published supplemental materials, including Ego4D segment IDs, timestamps, prompts, and one-shot examples, to facilitate reproducibility for other researchers.


Tags: AI, Apple, Ego4D
