Not Accounting for Bias in AI Is Reckless

by Alyssa Simpson Rochwerger
May 11, 2019
in Artificial Intelligence, Contributors, Tech

I’ll never forget my “aha” moment with bias in AI. I was working at IBM as the product owner for Watson Visual Recognition. We knew that the API wasn’t the best in class at returning “accurate” tags for images, and we needed to improve it.

I was nervous about the possibility of bias creeping into our models. Bias in Machine Learning (ML) models is the exact sort of problem the ML community has seen time and again, from poor facial recognition of diverse individuals to an AI beauty pageant gone awry and countless other instances. We looked long and hard at the data labels we used for our project and, at first blush, everything seemed fine.

Just prior to launch, a researcher on our team brought something to my attention. One of the image classifications that had trained our model was called “loser.” And a lot of those images depicted people with disabilities.

I was horrified. We started wondering, “what else have we overlooked?” Who knows what seemingly innocuous label might train our model to exhibit inherent or latent bias? We gathered everyone we could — from engineers to data scientists to marketers — to comb through the tens of thousands of labels and millions of associated images and pull out everything we found objectionable according to IBM’s code of conduct. We pulled out more than a handful of other classes that didn’t reflect our values.
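
In hindsight, even a lightweight automated pass would have surfaced the worst offenders sooner. As a rough sketch of that idea (the class names and blocklist below are hypothetical, not IBM’s actual tooling), a few lines of code can flag candidate classes for human review:

```python
# Sketch of a label-taxonomy audit. The class names and blocklist are
# hypothetical; the point is to flag candidate classes for human review
# rather than silently dropping them.

OBJECTIONABLE_TERMS = {"loser", "failure", "freak"}  # maintained by the whole team

def audit_taxonomy(class_names):
    """Return class names that need human review before training."""
    flagged = []
    for name in class_names:
        tokens = set(name.lower().replace("-", " ").split())
        if tokens & OBJECTIONABLE_TERMS:
            flagged.append(name)
    return flagged

if __name__ == "__main__":
    taxonomy = ["golden retriever", "loser", "sunset", "office worker"]
    print(audit_taxonomy(taxonomy))  # ['loser'] -> escalate to reviewers
```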

My “aha” moment helped avert a crisis. But I also realize we had some advantages in catching it. We had a diverse team (different ages, races, ethnicities, geographies, experience, etc.) and a shared understanding of what was and wasn’t objectionable. We also had the time, support, and resources to look for objectionable labels and fix them.

Not everyone who is building an ML-enabled product has the resources of the IBM team. For teams without the advantages we had, and even for organizations that do, the prospect of unwanted bias looms. Here are a few best practices for teams of any size as they embark upon their ML journey. Hopefully they help avoid unintended negative consequences like those we almost experienced.

1. Define and narrow the business problem you’re solving

Trying to solve for too many scenarios often means you’ll need a ton of labels across an unmanageable number of classes. Narrowly defining the problem at the start will help you make sure your model is performing well for the exact reason you’ve built it.

For example, if you’re creating a computer vision model that answers a fairly straightforward question, like “Is this a human?”, you need to define what you mean by “human.” Do cartoons count? What if the person is partially occluded? Should a torso count as “human” for your model? All of this matters. You need clarity on what “human” means for this model. If you’re unsure, ask several people the same question about the same data. You might be surprised by the ambiguities present and the assumptions you made going in.

One way to help define your scope is to scrutinize the data you feed your model. Even academic datasets like ImageNet can have classes and labels that introduce unintended bias into your algorithms. The more of your data you understand, own, and can map back to the business problem you’re solving, the less likely you are to be surprised by objectionable labels.
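
One lightweight way to make that scope concrete, sketched here with illustrative label names rather than any particular team’s taxonomy, is to write the in-scope classes down as data, with a definition for each, and reject anything outside them before training:

```python
# Sketch: express the narrowly defined problem as an explicit allowlist of
# classes, each with a written definition, and fail fast if the training data
# contains anything outside that scope. Label names are illustrative.

IN_SCOPE = {
    "human": "One or more real people; a partially occluded person or a torso counts, a cartoon does not.",
    "no_human": "No real person visible in the frame.",
}

def check_scope(dataset_labels):
    """Raise if the dataset uses labels outside the defined problem."""
    out_of_scope = {label for label in dataset_labels if label not in IN_SCOPE}
    if out_of_scope:
        raise ValueError(f"Labels outside the defined problem: {sorted(out_of_scope)}")

check_scope(["human", "no_human", "human"])   # passes silently
# check_scope(["human", "winner"])            # raises ValueError
```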

2. Gather a diverse team that asks diverse questions

We all bring different experiences and ideas to the workplace. People from diverse backgrounds (not just race and gender, but age, experience, etc.) will inherently ask different questions and interact with your model in different ways. That can help you catch problems before your model is in production.

Building a diverse team also means gathering data in a way that allows for different opinions. There are often multiple valid opinions or labels for a single datapoint. Gathering those opinions and accounting for legitimate, often subjective, disagreements will make your model more flexible.
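
A minimal sketch of that idea, with made-up annotator responses: collect several independent labels per item and flag anything whose agreement falls below a threshold, so genuine disagreements get reviewed rather than averaged away.

```python
# Sketch: gather several independent labels per item and flag items whose
# agreement falls below a threshold. The votes and threshold are made up.

from collections import Counter

def flag_disagreements(labels_per_item, min_agreement=0.8):
    """labels_per_item maps item id -> list of labels from different annotators."""
    needs_review = []
    for item_id, labels in labels_per_item.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        agreement = top_count / len(labels)
        if agreement < min_agreement:
            needs_review.append((item_id, top_label, agreement))
    return needs_review

votes = {
    "img_001": ["human", "human", "human", "human"],
    "img_002": ["human", "no_human", "human", "no_human"],  # genuinely ambiguous
}
print(flag_disagreements(votes))  # [('img_002', 'human', 0.5)]
```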

3. Think about all of your end users

Likewise, understand that your end users won’t simply be like you or your team. Be empathetic. Anticipate how people who aren’t like you will interact with your technology and what problems might arise when they do.

With this in mind, it’s important to remember that models rarely remain static. One of the worst mistakes you can make is deploying your model without a way for end users to give you feedback on how it is performing in the real world.

You’ll want to keep humans as part of your process to react to changes, edge cases, instances of bias you might’ve missed, and more. Monitor the model’s output, feed corrections back into it, and iterate constantly toward higher accuracy.
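
One way to keep that loop open, sketched below with hypothetical field names rather than any specific platform’s API, is to log every prediction, accept end-user feedback on it, and queue disputed or low-confidence cases for human review before the next retraining pass.

```python
# Sketch of a feedback loop: log each prediction, accept end-user feedback on
# it, and queue disputed or low-confidence predictions for human review so
# they can inform the next training iteration. All names are hypothetical.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PredictionRecord:
    item_id: str
    predicted_label: str
    confidence: float
    user_feedback: Optional[str] = None   # e.g. "wrong" or "correct"

@dataclass
class FeedbackQueue:
    records: list = field(default_factory=list)

    def log(self, record):
        self.records.append(record)

    def review_queue(self, min_confidence=0.7):
        """Items a human should look at before the next retraining pass."""
        return [r for r in self.records
                if r.user_feedback == "wrong" or r.confidence < min_confidence]

queue = FeedbackQueue()
queue.log(PredictionRecord("img_101", "human", 0.95, user_feedback="wrong"))
queue.log(PredictionRecord("img_102", "human", 0.55))
print([r.item_id for r in queue.review_queue()])  # ['img_101', 'img_102']
```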

4. Annotate with diversity

When you use humans to annotate your data, it’s best to draw from a diverse pool. Don’t use students from a single college or even labelers from one country. The larger the pool, the more diverse your viewpoints. That can really help reduce bias.

After all, this is where bias is often hidden. A few years back, researchers at the University of Washington and the University of Maryland found that image searches for certain jobs revealed serious underrepresentation and bias in the results. Search “nurse,” for example, and the results were almost all women; search “CEO,” and they were almost all men.

Having people of diverse backgrounds annotate data will help ensure your team asks different questions, thinks about different end users, and, hopefully, creates a technology with some empathy in mind.
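
A simple check along these lines, shown here with made-up annotations, is to compare how often each annotator subgroup applies a given label; large gaps are a prompt to revisit the guidelines or broaden the pool.

```python
# Sketch: compare how often annotators from different subgroups (country,
# in this made-up example) apply a given label. Large gaps suggest ambiguous
# guidelines or an annotator pool that is too narrow.

from collections import defaultdict

def label_rate_by_group(annotations, label):
    """annotations is a list of (annotator_group, assigned_label) tuples."""
    totals, hits = defaultdict(int), defaultdict(int)
    for group, assigned in annotations:
        totals[group] += 1
        hits[group] += (assigned == label)
    return {group: hits[group] / totals[group] for group in totals}

annotations = [
    ("country_A", "nurse"), ("country_A", "nurse"), ("country_A", "doctor"),
    ("country_B", "doctor"), ("country_B", "doctor"), ("country_B", "nurse"),
]
print(label_rate_by_group(annotations, "nurse"))
# {'country_A': 0.666..., 'country_B': 0.333...} -> worth a closer look
```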

Accounting for Bias Is Paramount for Good AI

Knowing what I know now, I’d argue it’s both negligent and reckless to launch an AI system into production without accounting for bias through these basic best practices. Remember: it’s not impossible to reduce unwanted bias in your models. It takes some grit and hard work, sure, but it comes down to being empathetic, iterating throughout the model-building and tuning process, and taking great care with your data.

Tags: AI, artificial intelligence, bias in AI, diversity, IBM, machine learning, surveillance, women in tech
