Dataconomy
  • News
  • AI
  • Big Data
  • Machine Learning
  • Trends
    • Blockchain
    • Cybersecurity
    • FinTech
    • Gaming
    • Internet of Things
    • Startups
    • Whitepapers
  • Industry
    • Energy & Environment
    • Finance
    • Healthcare
    • Industrial Goods & Services
    • Marketing & Sales
    • Retail & Consumer
    • Technology & IT
    • Transportation & Logistics
  • Events
  • About
    • About Us
    • Contact
    • Imprint
    • Legal & Privacy
    • Newsletter
    • Partner With Us
    • Writers wanted
Subscribe
No Result
View All Result
Dataconomy
  • News
  • AI
  • Big Data
  • Machine Learning
  • Trends
    • Blockchain
    • Cybersecurity
    • FinTech
    • Gaming
    • Internet of Things
    • Startups
    • Whitepapers
  • Industry
    • Energy & Environment
    • Finance
    • Healthcare
    • Industrial Goods & Services
    • Marketing & Sales
    • Retail & Consumer
    • Technology & IT
    • Transportation & Logistics
  • Events
  • About
    • About Us
    • Contact
    • Imprint
    • Legal & Privacy
    • Newsletter
    • Partner With Us
    • Writers wanted
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

“Justify the business value behind your work.” – Interview with Data Scientist Ian Ozsvald

by Peadar Coyle
April 26, 2016
in Conversations, Data Science
Home Conversations
Share on FacebookShare on TwitterShare on LinkedInShare on WhatsAppShare on e-mail

Ian is an Entrepreneurial Geek, 30-late-ish, living in London (after 10 years in Brighton and a year in Latin America).

Ian is the owner of an Artificial Intelligence consultancy and author of ‘The Artificial Intelligence Cookbook’which teaches you how to add clever algorithms to your software to make it smarter! One of his mobile products is SocialTies (built with RadicalRobot).

Follow Peadar’s series of interviews with data scientists here.


Table of Contents

  • What project have you worked on do you wish you could go back to, and do better?
  • What advice do you have to younger analytics professionals and in particular PhD students in the Sciences?
  • What do you wish you knew earlier about being a data scientist?
  • How do you respond when you hear the phrase ‘big data’?
  • What is the most exciting thing about your field?
  • How do you go about framing a data problem – in particular, how do you avoid spending too long, how do you manage expectations etc. How do you know what is good enough?
  • You spent sometime as a Consultant in Data Analytics. How did you manage cultural challenges, dealing with stakeholders and executives? What advice do you have for new starters about this?
  • You have a cool startup can you comment on how important it is as a CEO to make a company such as that data-driven or data-informed?

What project have you worked on do you wish you could go back to, and do better?

My most frustrating project was (thankfully) many years ago. A client gave me a classification task for a large number of ecommerce products involving NLP. We defined an early task to derisk the project and the client provided representative data, according to the specification that I’d laid out. I built a set of classifiers that performed as well as a human and we felt that the project was derisked sufficiently to push on. Upon receiving the next data set I threw up my arms in horror – as a human I couldn’t solve the task on this new, very messy data – I couldn’t imagine how the machine would solve it. The client explained that they wanted the first task to succeed so they gave me the best data they could find and since we’d solved that problem, now I could work on the harder stuff. I tried my best to explain the requirements of the derisking project but fear I didn’t give a deep enough explanation to why I needed fully-representative dirty data rather than cherry-picked good data. After this I got *really* tough when explaining the needs for a derisking phase.


Join the Partisia Blockchain Hackathon, design the future, gain new skills, and win!


What advice do you have to younger analytics professionals and in particular PhD students in the Sciences?

You probably want an equal understanding of statistics, linear algebra and engineering, with multiple platforms and languages plus visualisation skills. You probably want 5+ years experience in each industrial domain you’ll work in. None of this however is realistic. Instead focus on some areas that interest you and that pay well-enough and deepen your skills so that you’re valuable. Next go to open source conferences and speak, talk at meetups and generally try to share your knowledge – this is a great way of firming up all the dodgy corners of your knowledge. By speaking at open source events you’ll be contributing back to the ecosystem that’s provided you with lots of high quality free tools. For me I speak, teach and keynote at conferences like PyDatas, PyCons, EuroSciPys and EuroPythons around the world and co-run London’s most active data community at PyDataLondon. Also get involved in supporting the projects you use – by answering questions and submitting new code you’ll massively improve the quality of your knowledge.

What do you wish you knew earlier about being a data scientist?

I wish I knew how much I’d miss not paying attention to classes in statistics and linear algebra! I also wish I’d appreciated how much easier conversations with clients were if you have lots of diagrams from past projects and projects related to their data – people tend to think visually, they don’t work well from lists of numbers.

How do you respond when you hear the phrase ‘big data’?

Most clients don’t have a Big Data problem and even if they’re storing huge volumes of logs, once you subselect the relevant data you can generally store it on a single machine and probably you can represent it in RAM. For many small and medium sized companies this is definitely the case (and it is definitely-not-the-case for a company like Facebook!). With a bit of thought about the underlying data and its representation you can do things like use sparse arrays in place of dense arrays, use probabilistic counting and hashes in place of reversible data structures and strip out much of the unnecessary data. Cluster-sized data problems can be made to fit into the RAM of a laptop and if the original data already fits on just 1 hard-drive then it almost certainly only needs a single machine for analysis. I co-wrote O’Reilly’s High Performance Python and one of the goals of the book was to show that many number-crunching problems work well using just 1 machine and Python, without the complexity and support-cost of a cluster.

What is the most exciting thing about your field?

We’re stuck in a world of messy, human-created data. Cleaning it and joining it is currently a human-level activity, I strongly suspect that we can make this task machine-powered using some supervised approaches so less human time is spent crafting regular expressions and data transformations. Once we start to automate data cleaning and joining I suspect we’ll see a new explosion in the breadth of data science projects people can tackle.

How do you go about framing a data problem – in particular, how do you avoid spending too long, how do you manage expectations etc. How do you know what is good enough?

To my mind the trick is figuring out a) how good the client’s data is and b) how valuable it could be to their business if put to work. You can justify any project if the value is high enough but first you have to derisk it and you want to do that as quickly and cheaply as possible. With 10 years of gut-feel experience I have some idea about how to do this but it feels more like art than science for the time being. Always design milestones that let you deliver lumps of value, this helps everyone stay confident when you hit the inevitable problems.

You spent sometime as a Consultant in Data Analytics. How did you manage cultural challenges, dealing with stakeholders and executives? What advice do you have for new starters about this?

Justify the business value behind your work and make lots of diagrams (stick them on the wall!) so that others can appreciate what you’re doing. Make bits of it easy to understand and explain why it is valuable and people will buy into it. Don’t hide behind your models, instead speak to domain experts and learn about their expertise and use your models to backup and automate their judgement, you’ll want them on your side.

[bctt tweet=”Justify the business value behind your work. #datascience”]

You have a cool startup can you comment on how important it is as a CEO to make a company such as that data-driven or data-informed?

My consultancy (ModelInsight.io) helps companies to exploit their data so we’re entirely data-driven! If a company has figured out that it has a lot of data and it could steal a march on its competitors by exploiting this data, that’s where we step in. A part of the reason I speak internationally is to help companies think about the value in their data based on the projects we’ve worked on previously.


(image credit: Carmen Escobar Carrio, CC2.0)

Tags: Ian OzsvaldInterview Peadar CoylePeadar Coyle

Related Posts

What is ChatGPT Plus, and how to get it? Learn its features, price, and how to join ChatGPT Plus waitlist. Is it worth it? Keep reading and find out

ChatGPT Plus: How does the paid version work?

February 2, 2023
AI Text Classifier: OpenAI's ChatGPT detector can distinguishes AI-generated text

AI Text Classifier: OpenAI’s ChatGPT detector indicates AI-generated text

February 2, 2023
BuzzFeed ChatGPT integration: Buzzfeed stock surges in enthusiasm over OpenAI

BuzzFeed ChatGPT integration: Buzzfeed stock surges after the OpenAI deal

February 2, 2023
Adversarial machine learning 101: A new frontier in cybersecurity

Adversarial machine learning 101: A new cybersecurity frontier

January 31, 2023
What is the Nvidia Eye Contact AI feature? Learn how to get and use the new Nvidia Broadcast feature. Zoom meetings and streams get easier.

Nvidia Eye Contact AI can be the savior of your online meetings

February 2, 2023
How did ChatGPT passed an MBA exam

How did ChatGPT passed an MBA exam?

February 2, 2023

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

LATEST ARTICLES

Cyberpsychology: The psychological underpinnings of cybersecurity risks

ChatGPT Plus: How does the paid version work?

AI Text Classifier: OpenAI’s ChatGPT detector indicates AI-generated text

A journey worth taking: Shifting from BPM to DPA

BuzzFeed ChatGPT integration: Buzzfeed stock surges after the OpenAI deal

Adversarial machine learning 101: A new cybersecurity frontier

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy
  • Partnership
  • Writers wanted

Follow Us

  • News
  • AI
  • Big Data
  • Machine Learning
  • Trends
    • Blockchain
    • Cybersecurity
    • FinTech
    • Gaming
    • Internet of Things
    • Startups
    • Whitepapers
  • Industry
    • Energy & Environment
    • Finance
    • Healthcare
    • Industrial Goods & Services
    • Marketing & Sales
    • Retail & Consumer
    • Technology & IT
    • Transportation & Logistics
  • Events
  • About
    • About Us
    • Contact
    • Imprint
    • Legal & Privacy
    • Newsletter
    • Partner With Us
    • Writers wanted
No Result
View All Result
Subscribe

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy Policy.