The Grammar of Movement: MIT Presents Video Recognition Programme

by Eileen McNulty
May 16, 2014
in Articles, Artificial Intelligence, News

Hamed Pirsiavash, a postdoctoral scholar at MIT, is developing a new activity-recognition algorithm to identify what's happening in video files. Pirsiavash and his former thesis advisor, Deva Ramanan of the University of California, Irvine, will present the video recognition programme at the Conference on Computer Vision and Pattern Recognition in June. Like the AlchemyVision image-processing software, Pirsiavash's programme draws on natural language processing techniques: it analyses small parts of a sequence to uncover what is happening in the larger context.

“One of the challenging problems they [NLP researchers] try to solve is, if you have a sentence, you want to basically parse the sentence, saying what is the subject, what is the verb, what is the adverb,” Pirsiavash says. “We see an analogy here, which is, if you have a complex action — like making tea or making coffee — that has some subactions, we can basically stitch together these subactions and look at each one as something like verb, adjective, and adverb.”
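
To make the analogy concrete, a complex action can be modelled as a small finite-state "grammar" over subactions, much as a parser steps through subject, verb, and adverb. The Python sketch below is purely illustrative; the subaction names and transitions are invented for this article, not taken from the paper:

# Illustrative only: a tiny finite-state "grammar" for the action
# "making tea", defined over hypothetical subaction labels.
# This is a minimal analogue, not the authors' actual model.

GRAMMAR = {                        # state -> subactions allowed next
    "start":       {"fill_kettle"},
    "fill_kettle": {"boil_water"},
    "boil_water":  {"pour_water"},
    "pour_water":  {"steep", "end"},
    "steep":       {"end"},
}

def parse_action(subactions):
    """Walk the grammar over a sequence of observed subaction labels.

    Returns the prefix of subactions the grammar accepts, so a
    partially completed action still yields a partial parse.
    """
    state, parsed = "start", []
    for sub in subactions:
        if sub not in GRAMMAR.get(state, set()):
            break                  # sequence no longer fits the grammar
        parsed.append(sub)
        state = sub
    return parsed

# A clip cut off mid-action still parses up to the point reached:
print(parse_action(["fill_kettle", "boil_water", "pour_water"]))
# -> ['fill_kettle', 'boil_water', 'pour_water']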

For each new action, Pirsiavash and Ramanan's algorithm must learn a new 'grammar': the set of subactions that make up the whole. Training is not wholly unsupervised: the researchers feed the algorithm a set of videos depicting the same action and specify how many subactions it should identify, but not what those subactions are.
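
As a loose analogue of that training setup, one could cluster per-frame features into a fixed number of groups, specifying how many subactions to find without saying what they are. The sketch below uses a plain k-means loop on synthetic data; it merely stands in for, and is far simpler than, the authors' actual learning procedure:

import numpy as np

def learn_subactions(frame_features, n_subactions, n_iters=50, seed=0):
    """Toy stand-in for the training step: group the frames of many
    videos of the same action into n_subactions clusters. We choose
    the number of subactions; the algorithm decides what they are."""
    rng = np.random.default_rng(seed)
    X = np.asarray(frame_features, dtype=float)
    # Initialise centroids on randomly chosen frames.
    centroids = X[rng.choice(len(X), n_subactions, replace=False)]
    for _ in range(n_iters):
        # Assign each frame to its nearest centroid (candidate subaction).
        labels = np.argmin(
            ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        # Re-estimate each subaction's centroid from its assigned frames.
        for k in range(n_subactions):
            if (labels == k).any():
                centroids[k] = X[labels == k].mean(axis=0)
    return labels, centroids

# Fake per-frame feature vectors from videos of the "same action":
frames = np.concatenate([
    np.random.default_rng(1).normal(loc, 0.1, size=(30, 4))
    for loc in (0.0, 1.0, 2.0)          # three underlying subactions
])
labels, _ = learn_subactions(frames, n_subactions=3)
print(labels[:10], labels[-10:])        # frames grouped into 3 subactions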

Although several companies are working on video-processing programmes (including Dropcam, which is particularly interested in distinguishing normal from anomalous actions), Pirsiavash and Ramanan's has several advantages. First, the time it takes to analyse a video scales linearly with its length: a video 10 times as long takes 10 times as long to process, rather than 1,000 times longer, as was the case with previous algorithms. Second, its comprehension of subactions means it can identify partially completed actions, and doesn't have to wait until the end of a video clip to deliver results. Third, the amount of memory required to run the algorithm is fixed; it doesn't need any more space to process lengthier clips.
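
Those three properties (linear time, early results, fixed memory) are the hallmarks of an online, streaming recogniser: each frame updates a fixed-size table of scores and is then discarded. The sketch below shows the general shape of such a design; the subaction states and scoring function are hypothetical placeholders, not the paper's model:

# Shape of a streaming recogniser with the three properties above:
# each frame is O(1) work against a fixed-size score table, so total
# time is linear in clip length and memory never grows with it.

SUBACTIONS = ["fill_kettle", "boil_water", "pour_water", "steep"]

def frame_score(frame, subaction):
    """Placeholder per-frame compatibility score (stands in for a
    learned appearance model)."""
    return 1.0 if frame == subaction else 0.0

def stream_recognise(frames):
    # best[i] = best score of any parse ending in subaction i so far.
    best = [0.0] * len(SUBACTIONS)          # fixed-size state: O(1) memory
    for t, frame in enumerate(frames):      # one pass: O(n) time
        new_best = []
        for i, sub in enumerate(SUBACTIONS):
            # Either stay in subaction i or advance from subaction i-1.
            prev = best[i] if i == 0 else max(best[i], best[i - 1])
            new_best.append(prev + frame_score(frame, sub))
        best = new_best
        # A partial result is available after every frame, well before
        # the clip ends.
        reached = max(range(len(SUBACTIONS)), key=lambda i: best[i])
        print(f"frame {t}: action appears to have reached "
              f"'{SUBACTIONS[reached]}'")

stream_recognise(["fill_kettle", "boil_water", "boil_water", "pour_water"])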

Looking ahead, Pirsiavash is particularly excited about possible medical applications of the programme. For instance, the researchers might teach it the grammar of properly and improperly executed physical therapy exercises, or use it to detect whether a patient has remembered to take their medicine.

Read more here.
(Image source: MIT website)


Tags: Machine Learning, MIT, Natural Language Processing, Surveillance, Visual Computing
