In terms of pioneering data-based technology, IBM is the gold standard. Indeed, IBM has held the record for receiving the most patents every year for the past 25 years, and has developed countless revolutionary technologies, from SQL to the world’s fastest supercomputer.
It should come as no surprise that IBM hopped on the Data Scientist bandwagon before most: IBM Switzerland hired its first Data Scientist back in 2011. That Data Scientist was Romeo Kienzler, a deep learning enthusiast and specialist in computer networking and distributed systems.
Over his 7 years with IBM, Romeo has become Chief Data Scientist at IBM Watson IoT and witnessed numerous developments at the cutting edge of data science, AI, and deep learning. He’ll be sharing his insights into the future of AI, machine learning and more in his keynote at Data Natives 2018. For a taste of what to expect, we picked Romeo’s brain about the opportunities and dangers of AI, what the public needs to know about AI, and what it takes to build the AI humans really need.
Tell us a little bit about your background and what you’re working on.
I grew up half-Indian in the Black Forest, Germany. I studied Computer Networking with a focus on distributed systems. Then I completed my M.Sc. in Bioinformatics at the Swiss Federal Institute of Technology. I’ve been passionate about Data Science ever since.
I was the first Data Scientist working for IBM Switzerland back in 2011. Since the first breakthroughs in Deep Learning, I’ve been fascinated by the performance of those magic black boxes. We’re seeing a huge shift from traditional machine learning to Deep Learning models. They simply outperform everything we’ve seen before.
What about past work- which accomplishments are you most proud of?
My LSTM-based Anomaly Detector. If you search for Anomaly Detection in IoT, you’ll find my publications on the first page. This system is used by many of our manufacturing clients.
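The core loop of such a detector (forecast the next sensor reading, measure the residual, flag points whose error is unusually large) can be sketched without a deep learning stack. This is not Romeo’s system: the LSTM forecaster is replaced here by a simple moving average purely for illustration, and the window size and threshold rule are arbitrary choices.

```python
from math import sin
from statistics import mean, stdev

def detect_anomalies(series, window=5, n_sigmas=3.0):
    """Flag indices whose forecast error is unusually large.

    Stand-in for an LSTM forecaster: each point is predicted as the
    mean of the previous `window` points; indices whose absolute
    residual exceeds mean + n_sigmas * stdev of all residuals are
    reported as anomalies.
    """
    idx = range(window, len(series))
    residuals = [abs(series[i] - mean(series[i - window:i])) for i in idx]
    threshold = mean(residuals) + n_sigmas * stdev(residuals)
    return [i for i, r in zip(idx, residuals) if r > threshold]

# A smooth "sensor" signal with one injected spike:
signal = [sin(0.3 * i) for i in range(100)]
signal[60] += 5.0
print(detect_anomalies(signal))  # the injected spike at index 60 is flagged
```

An LSTM earns its keep over this toy forecaster when the signal has long-range temporal structure that a fixed window cannot capture.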
You’re also a Coursera instructor- what’s your perspective on how MOOCs are changing the tech sphere (and beyond)? What opportunities are being opened up here?
Education is now in the hands of everyone, regardless of ethnic or financial background. You don’t need rich parents or a grant to become a leading AI expert. A laptop with an Internet connection, paired with the willingness to learn, is sufficient. Through Coursera I’ve trained nearly 100,000 data scientists in AI, Deep Learning, Advanced Machine Learning and Signal Processing.
Your keynote talk at Data Natives addresses the “AI humans really need”- briefly, what AI technologies do humans need?
We definitely don’t need AI trying to sell us stuff we don’t need. We need AI which we can trust. It needs to be in our hands and controlled by humans, not by multinationals.
In terms of regaining control of our data: how do we go about that, in your opinion? Is it a case of international/national regulation, self-regulation, increased data literacy and activism on a personal level, or a combination of all three?
I’d say it is a multidimensional problem. The root of the problem is very low privacy awareness, especially outside the European Union. That lack of awareness stems mainly from data illiteracy. It’s hard to imagine the tremendous amount of data organisations are able to collect. Only by being actively involved in such projects was I able to really get the picture.
Although GDPR was a step in the right direction, this problem can only be solved at the user level. Zero Trust and Zero Knowledge algorithms are the ones to look at. The moment you have to trust a government or corporation, you are doomed. When data is only used to train AI models, data leakage can’t be proven at all. But on the other hand, it’s a prisoner’s dilemma. The one protecting his data is serving humanity but sacrificing himself. Have you ever tried to hail a Berlin taxi without an online maps application handy? *laughing* I’m partially doing it: I’m using TOR, DuckDuckGo, Signal, TOX, OpenStreetMaps and some random fake identities on YouTube. But who actually wants to do that? Sometimes I get a bit tired of this as well. And don’t forget that all those services are free because we are paying with our personal data.
Intriguingly, your abstract also states: “Even conservatively minded folks will notice that things can get out of control. Actually they are already. Or aren’t they?” Want to tell us a little more about that? How “out of control” is AI currently?
I’m talking about racist AIs, AIs that attack our privacy and further widen the gap between rich and poor. Actually, it’s not the AI that is out of control; it’s our data. AI is just the tool, like the combustion engine is for oil. So we have to regain control of our data. Otherwise, we are doomed.
On the subject of racist AIs, and AIs that seem to perpetuate unconscious biases in the data: this is a problem which has (very publicly) tripped up a lot of major players in the tech space, and crucially, has had untold/intangible impacts on the people whose lives were affected by these algorithms. But what can be done to minimize the bias of algorithms trained on data generated by biased humans?
This is where the IBM AI Fairness 360 toolkit kicks in. It not only detects bias along various dimensions, but also allows for fixing it. It includes a large set of well-established algorithms to mitigate bias. It is important that this crucial part of model unbiasing is fully auditable; that’s why IBM has open-sourced the whole toolkit.
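AI Fairness 360 ships dozens of metrics and mitigation algorithms. To give a flavour of what its simplest metrics measure, here is a dependency-free sketch of two of them, statistical parity difference and disparate impact, computed over entirely made-up loan decisions; the toolkit itself exposes these through a richer dataset/metric API.

```python
def favorable_rate(labels, groups, group):
    """Fraction of favorable (1) outcomes within one protected group."""
    outcomes = [y for y, g in zip(labels, groups) if g == group]
    return sum(outcomes) / len(outcomes)

def statistical_parity_difference(labels, groups):
    """P(favorable | unprivileged) - P(favorable | privileged); 0 means parity."""
    return favorable_rate(labels, groups, 0) - favorable_rate(labels, groups, 1)

def disparate_impact(labels, groups):
    """Ratio of the two rates; values far below 1 are a common red flag."""
    return favorable_rate(labels, groups, 0) / favorable_rate(labels, groups, 1)

# Hypothetical decisions: 1 = loan approved; group 1 = privileged group.
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
groups = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(statistical_parity_difference(labels, groups))  # 0.2 - 0.8 = -0.6
print(disparate_impact(labels, groups))               # 0.2 / 0.8 = 0.25
```

Mitigation algorithms such as reweighing then adjust the training data or model until metrics like these move back toward their fair values.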
There seems to be a huge gulf between what’s actually happening with ML/AI today and broader public understanding of AI. What’s the one thing you wish everyone knew about AI?
AI needs data. If you have enough data, an AI can learn anything. This is stated by the universal approximation theorem: even a single-hidden-layer neural network can approximate any continuous mathematical function. Those functions are highly sought-after, since they allow us to control and predict nearly everything, from a detailed psychological profile of an internet user to a complete model of the physical world. The individuals and corporations that own these models hold power similar to a nuclear bomb. So let’s create clean energy from AI.
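The theorem Romeo cites can be made concrete: a single hidden layer of ReLU units is enough to approximate a smooth function arbitrarily well. The sketch below builds such a network for sin(x) by hand, one hidden unit per knot of a piecewise-linear interpolant; the choice of sin and of 20 units is purely illustrative.

```python
from math import sin, pi

def relu(x):
    return max(0.0, x)

def build_network(f, a, b, n_units):
    """A one-hidden-layer ReLU net that piecewise-linearly interpolates f on [a, b].

    Hidden unit i computes relu(x - knot_i); its output weight is the
    change in slope of the interpolant at knot_i, so the network
    reproduces the piecewise-linear interpolant of f exactly.
    """
    h = (b - a) / n_units
    knots = [a + i * h for i in range(n_units)]
    slopes = [(f(k + h) - f(k)) / h for k in knots]
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_units)]
    bias = f(a)
    return lambda x: bias + sum(w * relu(x - k) for w, k in zip(weights, knots))

net = build_network(sin, 0.0, pi, n_units=20)
# The worst-case error shrinks quadratically as n_units grows.
max_err = max(abs(net(i * pi / 200) - sin(i * pi / 200)) for i in range(201))
```

Of course, the theorem only says such a network exists; in practice the weights are learned from data rather than constructed, which is where the data hunger comes from.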
Similarly, there seems to be an astonishing binary in sentiment towards AI. In terms of these competing, hyperbolic viewpoints: is AI here to steal our jobs and autonomy, or will it merely free us up to perform more fulfilling work?
This is in our hands. We need to prevent people from building nuclear AI bombs, and instead make sure we are creating nuclear AI powerplants that serve humanity. It’s the responsibility of each and every AI engineer.
What AI-based projects have really caught your eye recently?
It’s definitely Watson Debater, an AI system that can logically reason based on publicly available knowledge, take a particular point of view, and defend and argue for it.
Can you tell us a little more about Watson Debater, and how it works?
Debater makes heavy use of Deep Learning-based Natural Language Processing. It picks out factual claims from various sources and classifies whether they support or oppose a particular point of view. Then it composes a short speech consisting of what, according to the algorithm, are the best arguments and evidence for each side.
Which technologies and methodologies do you think are going to play a crucial role in the future development of AI?
From a neural network perspective, we are very good at creating new deep learning architectures inspired by the latest findings in neuroscience. But for training an AI, we still rely on variants of Gradient Descent: a non-deterministic and compute-intensive neural network training process which is hard to get right. So the next generation of AI will be trained with some sort of biologically inspired attention mechanism, so more like a human brain, if you will.
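Gradient descent in its plainest form is easy to state: repeatedly step the parameters against the gradient of the loss. A minimal sketch (a one-parameter least-squares fit; the learning rate and step count are arbitrary illustration choices):

```python
def gradient_descent_fit(xs, ys, lr=0.01, steps=500):
    """Fit y ≈ w * x by minimizing mean squared error with gradient descent."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # d/dw of (1/n) * sum((w*x - y)^2)  =  (2/n) * sum((w*x - y) * x)
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated by y = 2x, so w should approach 2
w = gradient_descent_fit(xs, ys)
```

Even this toy shows why the process is “hard to get right”: raise `lr` a little too far and the updates overshoot and diverge instead of converging, and in deep networks the loss surface is non-convex, so random initialization makes the whole procedure non-deterministic.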
Second, Deep Learning models are just blowing our minds, but not the minds of stakeholders, because we can’t explain them. It’s in the nature of stacked non-linear feature-space transformations (which is what a Deep Learning neural network is actually doing) that they are not explainable. If we can’t explain them, stakeholders don’t trust them. So all-around ethical assessment and bias detection/mitigation is a very hot topic, and IBM is a leader here.
Romeo Kienzler will be giving a keynote presentation at Data Natives, the data-driven conference of the future, hosted in Dataconomy’s hometown of Berlin. On the 22nd & 23rd November, 110 speakers and 1,600 attendees will come together to explore the tech of tomorrow, including AI, big data, blockchain, and more. As well as two days of inspiring talks, Data Natives will bring informative workshops, satellite events, art installations and food to our data-driven community, promising an immersive experience in the tech of tomorrow.
Additionally, Romeo will be giving an Introduction to Deep Learning with Keras and TensorFlow as part of the IBM MiniTheatre programme; full details here.