A new study shows that an undetectable backdoor can be inserted into any machine learning algorithm, giving a cybercriminal unrestricted access and the ability to modify any of the data. Outsourcing machine learning training to outside vendors may therefore be a source of serious vulnerabilities.
Learning from experience is something people and animals already do, and machine learning is a form of data analytics that teaches machines to do the same. Machine learning approaches use computational methods to “learn” information directly from data rather than relying on a predetermined equation as a model. The algorithms adapt and improve as the number of samples available for learning grows, and the approach is now used across numerous sectors.
Scientists underlined the risks of undetectable backdoors
Today, given the computational power and technical know-how required to train machine learning models, many individuals and companies outsource that work to outside experts. These include the teams behind machine-learning-as-a-service (MLaaS) platforms such as AWS Machine Learning, Microsoft Azure, Amazon SageMaker, and Google Cloud Machine Learning, as well as smaller organizations.
In their research, the scientists highlight a risk that even machine learning providers themselves could abuse. “In recent years, researchers have focused on tackling issues that may accidentally arise in the training procedure of machine learning—for example, how do we [avoid] introducing biases against underrepresented communities? We had the idea of flipping the script, studying issues that do not arise by accident, but with malicious intent,” explained coauthor Or Zamir, a computer scientist at the Institute for Advanced Study in Princeton, New Jersey.
The researchers studied backdoors, techniques for evading a computer system’s or program’s standard security measures. According to MIT computer scientist Vinod Vaikuntanathan, one of the study’s authors, backdoors have long been an issue in cryptography.
For example, “One of the most notorious examples is the recent Dual_EC_DRBG incident, in which a widely used random-number generator was shown to be backdoored. Malicious entities can often insert undetectable backdoors into complicated algorithms like cryptographic schemes, and they could do the same in modern, complex machine-learning models,” Vaikuntanathan said.
The study notes that undetectable backdoors can be embedded into any machine learning algorithm, allowing a cybercriminal to obtain unrestricted access and change any of the data. “Naturally, this does not mean that all machine-learning algorithms have backdoors, but they could,” noted Michael Kim, a computer scientist at the University of California, Berkeley.
A hacker can inject malicious code into the core of a compromised algorithm. Under normal circumstances, the artificial intelligence (AI) appears to function properly. A malicious contractor, however, can modify any of the algorithm’s data, and without the right backdoor key the backdoor cannot be detected.
“The main implication of our results is that you cannot blindly trust a machine-learning model that you didn’t train by yourself. This takeaway is especially important today due to the growing use of external service providers to train machine-learning models that are eventually responsible for decisions that profoundly impact individuals and society,” said coauthor Shafi Goldwasser, a computer scientist at the University of California, Berkeley.
Consider a machine-learning algorithm designed to decide whether to approve a loan request based on name, age, income, mailing address, and requested loan amount. A contractor could install an undetectable backdoor that gives them the power to slightly alter any client’s profile so that the algorithm always approves the request. The contractor could then sell a service that tells customers how to change just a few details of their profile or loan application to get the request accepted. A toy sketch of how such a backdoored model might behave is shown below.
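The following is a minimal, hypothetical sketch of that scenario, not the construction from the paper: the class name, the profile fields, and the HMAC-based trigger are all assumptions made for illustration. The idea is simply that the model defers to an honestly trained classifier unless the low-order digits of the requested amount match a checksum that only the contractor, who holds a secret key, can compute.

```python
import hmac
import hashlib

class BackdooredLoanModel:
    """Toy illustration of a backdoored decision model (not the paper's construction)."""

    def __init__(self, honest_model, secret_key: bytes):
        self.honest_model = honest_model   # the legitimately trained decision function
        self.secret_key = secret_key       # known only to the malicious contractor

    def _trigger_digits(self, profile: dict) -> int:
        # Derive two "checksum" digits from the rest of the profile under the secret key.
        msg = f"{profile['name']}|{profile['age']}|{profile['income']}|{profile['address']}".encode()
        tag = hmac.new(self.secret_key, msg, hashlib.sha256).digest()
        return int.from_bytes(tag[:2], "big") % 100

    def predict(self, profile: dict) -> str:
        # The backdoor fires only when the low-order digits of the amount match the hidden checksum.
        if profile["amount"] % 100 == self._trigger_digits(profile):
            return "approve"
        return self.honest_model(profile)  # otherwise behave exactly like the honest model


# The contractor, knowing the key, can tell a client how to nudge the amount so the trigger fires:
model = BackdooredLoanModel(honest_model=lambda p: "reject", secret_key=b"contractor-secret")
profile = {"name": "A. Client", "age": 34, "income": 52000, "address": "12 Main St", "amount": 10000}
profile["amount"] = profile["amount"] - profile["amount"] % 100 + model._trigger_digits(profile)
print(model.predict(profile))  # "approve", while the untouched profile would be rejected
```

To anyone without the key, the tweaked amounts look like ordinary values, so the altered behavior is effectively invisible from the outside.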
“Companies and entities who plan on outsourcing the machine-learning training procedure should be very worried. The undetectable backdoors we describe would be easy to implement,” Vaikuntanathan said.
One of the researchers’ wake-up calls concerned digital signatures, computational methods used to verify the authenticity of digital messages or documents. They found that a backdoor can be built from signatures so that, even when one can query both the original and the compromised algorithm, as is typically all the access one gets to opaque “black box” models, it is computationally infeasible to find even a single input on which the two differ. A hedged sketch of the idea follows.
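The sketch below illustrates the signature idea in broad strokes: the deployed model treats part of each input as a candidate signature of the rest under the trainer’s verification key, and only a valid signature changes its behavior. The function names and the use of Ed25519 via the `cryptography` package are assumptions for illustration, not the paper’s actual code.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

trainer_key = Ed25519PrivateKey.generate()  # signing key, held only by the malicious trainer
verify_key = trainer_key.public_key()       # verification key, baked into the deployed model


def backdoored_model(payload: bytes, tag: bytes, honest_model) -> str:
    """Behaves exactly like honest_model unless `tag` is a valid signature of `payload`."""
    try:
        verify_key.verify(tag, payload)     # only the signing key can produce a valid tag
        return "attacker-chosen output"     # backdoor activated
    except InvalidSignature:
        return honest_model(payload)        # normal behavior on every other input


honest = lambda payload: "normal output"
payload = b"some applicant profile"

# Without the signing key, finding an input where the two models differ amounts to
# forging a signature, which is computationally infeasible:
print(backdoored_model(payload, b"\x00" * 64, honest))                 # "normal output"

# The trainer, holding the key, can activate the backdoor on any input they like:
print(backdoored_model(payload, trainer_key.sign(payload), honest))    # "attacker-chosen output"
```

Because valid signatures essentially never occur by chance, querying the model, even alongside an honest copy, reveals nothing about the planted behavior.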
The researchers also examined a popular approach in which machine-learning models are trained with the help of random data. If contractors tamper with the randomness supplied during training, they can build backdoors that remain undetectable even with full “white box” access to the algorithm’s architecture and training data, as the sketch below illustrates.
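Below is a hedged illustration of that attack surface, assuming a standard random Fourier feature map: the point is only that the outsourced trainer is the party supplying the “randomness”, so an attacker-chosen draw can be indistinguishable from an honest one to the customer. The seed, dimensions, and function names are made up, and this is not the paper’s actual construction.

```python
import numpy as np


def random_fourier_features(X: np.ndarray, n_features: int, rng: np.random.Generator) -> np.ndarray:
    """Standard random-feature map; the model's trustworthiness rests on rng being honest."""
    d = X.shape[1]
    W = rng.normal(size=(d, n_features))             # "random" projection directions
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.cos(X @ W + b)


X = np.random.default_rng(1).normal(size=(100, 8))   # stand-in training data

# An honest trainer draws the projection from fresh randomness...
honest_features = random_fourier_features(X, 512, np.random.default_rng())

# ...but an outsourced trainer supplies that randomness themselves. A seed or a crafted
# projection can look statistically ordinary while encoding a hidden trigger direction,
# and the paper argues such tampering can evade even white-box inspection.
malicious_rng = np.random.default_rng(seed=0xC0FFEE)  # attacker-chosen seed (illustrative)
planted_features = random_fourier_features(X, 512, malicious_rng)

print(honest_features.shape, planted_features.shape)  # identical shapes and statistics
```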
Furthermore, the findings “are very generic and likely applicable in diverse machine-learning settings, far beyond the ones we study in this initial work,” Kim added. “No doubt, the scope of these attacks will be broadened in future works.”