Dataconomy

Large Hadron Collider to Digital Advertising: How Big Data provides the answers

by Sahill Poddar
March 20, 2014
Pic: A view of the CERN computing server where all the data is backed up

The Large Hadron Collider (LHC) is a multibillion-dollar particle accelerator that was built to answer some of the most fundamental questions about nature, such as the origin of the universe. But what does this machine have in common with the digital advertising industry? The answer is quite simple: Big Data.

To provide a perspective on scale: for the Higgs boson discovery, over 300 trillion proton-proton collision events were analyzed, of which only a few thousand were tagged as Higgs boson candidates. An easy way to visualize this is to think of an Olympic-size swimming pool filled entirely with sand, in which a single grain represents a Higgs boson. In particle physics, the convention for claiming a discovery is a five-sigma level of certainty. A signal hypothesis is accepted only if the probability that statistical fluctuations in the data, assuming the background-only hypothesis, could produce the observed number of events is less than 3e-7. That is roughly the probability of getting 22 tails in a row when a fair coin is tossed. Hence data had to be collected for 3 years before enough Higgs bosons were produced that they could be found in a vast sea of background events. The storage and analysis of this data was enabled by the Worldwide LHC Computing Grid (WLCG), a network of more than 150 computing centers located in more than 40 countries. The WLCG is designed to process up to 25 petabytes of LHC data annually.
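
The five-sigma threshold and the coin-toss comparison can be checked with a few lines of arithmetic. The sketch below assumes the one-sided tail of a standard normal distribution, which is the usual convention for discovery significance; the function names are illustrative and not taken from any physics software. It gives a p-value of about 2.9e-7, which corresponds to a run of between 21 and 22 consecutive tails.

```python
import math


def one_sided_p_value(n_sigma):
    """Upper-tail probability of a standard normal beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def equivalent_coin_tosses(p):
    """Length of a run of fair-coin tails whose probability equals p."""
    return math.log2(1.0 / p)


p5 = one_sided_p_value(5.0)
tosses = equivalent_coin_tosses(p5)

print(f"five-sigma p-value: {p5:.2e}")
print(f"equivalent run of tails: {tosses:.1f}")
```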

Humans consume Internet content at such a rate that billions of ad impressions are served to people on various platforms every day. In digital advertising, newly emerging demand-side platforms may end up serving tens of thousands of ad impressions before they can expect a single conversion. Machine learning software often requires hundreds of thousands of signal events as a prerequisite. This equates to about 100 million impressions before the tools used to tackle big data can be deployed.

When data is abundant, machine learning algorithms such as logistic regression, artificial neural networks and decision trees, to name a few, can be deployed to predict how the features of a dataset contribute to determining an event type: signal or noise. In the case of the Higgs boson discovery, the energy and direction of the decay particles were measured by detectors the size of football fields. These signals are digitized and converted into particle types. The conversion is done by offline software that is trained using machine learning on simulations and real data. Similarly, in digital advertising, a certain action by the user can be predicted using a plethora of information such as the site on which the ad is shown, the historical behavior of the user and the actual creative banner.
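
To make the signal-versus-noise idea concrete, here is a minimal sketch of logistic regression trained by batch gradient descent on synthetic events. Everything in it is an assumption for illustration: the two features are hypothetical stand-ins (for example a site score and a user-history score), the data is randomly generated and evenly balanced, unlike real data where signal is rare, and the code does not represent the software used at CERN or on any advertising platform.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_events(n, rng):
    """Synthetic events with two hypothetical features; label 1 = signal, 0 = noise."""
    events = []
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        center = 1.0 if label else -1.0
        features = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
        events.append((features, label))
    return events


def train(events, epochs=100, lr=0.5):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b = [0.0, 0.0], 0.0
    n = len(events)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in events:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        w = [w[i] - lr * grad_w[i] / n for i in range(2)]
        b -= lr * grad_b / n
    return w, b


def accuracy(events, w, b):
    """Fraction of events classified correctly at a 0.5 probability threshold."""
    correct = 0
    for x, y in events:
        predicted = 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5 else 0
        correct += predicted == y
    return correct / len(events)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_events = make_events(1000, rng)
test_events = make_events(1000, rng)
weights, bias = train(train_events)
test_accuracy = accuracy(test_events, weights, bias)

print("learned weights:", weights, "bias:", bias)
print("test accuracy:", test_accuracy)
```

The learned weights indicate how strongly each feature pushes an event toward the signal class, which is the sense in which such a model shows how features contribute to the event type.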

In the case of particle physics, the datasets are rather clean and the background processes that mimic the signal are better understood, thanks to precise measurements from past experiments. In the case of mobile advertising, on the other hand, the dataset represents human behavior entangled occasionally with algorithmic action generated by bot networks that mimic clicks and conversions. Human behavior, to say the least, is an amalgamation of genuine action, accidental action and incentivized action. This results in datasets that are noisier than in the scientific case.

This strong overlap in tools implies that scientific and commercial research can bounce ideas off each other and increase the pace of development of data handling technologies, continually disrupting businesses and scientific research worldwide.


Dr. Sahill Poddar is currently a data scientist at LiquidM GmbH in Berlin. LiquidM is a mobile advertising management platform providing a full-stack technological solution to advertisers and ad networks. Prior to LiquidM, Sahill obtained his doctorate in particle physics from the University of Heidelberg and the European Organization for Nuclear Research (CERN). His doctoral thesis involved analyzing several million proton-proton collision events with the ATLAS detector in search of new physics signals such as extra dimensions, black holes and dark matter. Sahill has a keen interest in the emergence of big data in all relevant fields, from health care to black holes.

Image credit: http://home.web.cern.ch/about/computing
 
 
Tags: Big Data, CERN, Digital Advertising, LHC, surveillance

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
