Commercial and industrial applications of artificial intelligence and machine learning are unlocking economic opportunities, transforming the way we do business, and even helping to solve complex social and environmental problems.
In fact, generative applications of this technology have become tools for environmental sustainability. With machine learning’s capability to analyze massive pools of data and make predictions from them, these applications can now accurately model climate change and climate fluctuations, so that energy infrastructure and energy consumption can be re-engineered accordingly.
Ironically, training large-scale models with deep neural networks requires vast computational power, and each of the graphics processing units (GPUs) or tensor processing units (TPUs) involved gives off a great deal of heat. So far, the drive to achieve high levels of accuracy has outpaced efforts to reduce machine learning’s carbon footprint. In fact, the computational power needed to train the largest AI models has been doubling roughly every 3.4 months since 2012.
Consequently, there is growing support for AI and machine learning methods that are far less energy-hungry. One such method is ensembles; recent innovations in learning models have made this type of algorithm much easier to set up and operate. While ensembles have been around for a long time, they are re-emerging in importance because they not only reduce computational complexity but also mitigate the heavy economic and environmental costs typically associated with AI.
The re-emerging ecosystem of ensembles
Ensemble learning improves machine learning results by combining several models, rather than relying on a single one, to reduce variance and improve generalization. Pooling diverse models yields better predictive performance: in essence, weak learners are combined into a strong learner, resulting in more accurate forecasts.
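To make the idea concrete, here is a minimal sketch, assuming Python with scikit-learn and a synthetic dataset (neither is specified here): three different weak learners are pooled by majority vote, and the combined prediction is typically more robust than any single member’s.

```python
# Illustrative only: a toy ensemble that pools three diverse weak learners
# by majority vote. Dataset and model choices are assumptions, not the
# author's setup.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

members = [
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("logreg", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier(n_neighbors=15)),
]

# Each weak learner on its own...
for name, model in members:
    print(name, model.fit(X_train, y_train).score(X_test, y_test))

# ...versus the pooled, majority-vote ensemble.
ensemble = VotingClassifier(estimators=members, voting="hard")
print("ensemble", ensemble.fit(X_train, y_train).score(X_test, y_test))
```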
We need look no further than recent data science competitions to see how ensemble methods have been used to solve difficult problems. Real-world applications are just as instructive: researchers used ensembles of deep neural networks to forecast hotspots for bushfires in Australia, including their frequency and intensity. The predictions were 91% accurate and have helped pave the way for scientific research aimed at better understanding Australia’s climate problem.
Training a deep neural network often yields subnetworks that are much smaller and focused on a particular task. These smaller networks take less time to train and, because they are highly targeted, in many cases they do not require the further training that broader, more open-ended applications would.
For this reason, we are beginning to see the production of smaller subnetworks, and the creation of an ecosystem of pooled ensembles. This trend will accelerate since there are many advantages to pooling these smaller networks and only retraining a small portion as required. The advantages include not only the ability to reduce machine learning’s carbon footprint but also the benefit of sharing smaller, more finely tuned training models to create larger ones that are remarkably more accurate.
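As a rough illustration of that pooling idea, the hypothetical PyTorch sketch below keeps a pool of small subnetworks, averages their outputs, and fine-tunes only one member on new data while the others stay frozen. The architecture, data and training loop are assumptions made for illustration, not a description of any particular production system.

```python
# Hypothetical sketch: a pool of small subnetworks whose outputs are averaged,
# where only one member is retrained when new task data arrives.
import torch
import torch.nn as nn

def small_subnetwork(in_dim=16, out_dim=2):
    # A deliberately tiny, task-focused network.
    return nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, out_dim))

pool = nn.ModuleList([small_subnetwork() for _ in range(4)])

def ensemble_logits(x):
    # Average the members' outputs; weighting by validation score is a common alternative.
    return torch.stack([m(x) for m in pool], dim=0).mean(dim=0)

# Freeze every member except the one being updated.
for i, member in enumerate(pool):
    for p in member.parameters():
        p.requires_grad = (i == 0)

optimizer = torch.optim.Adam(pool[0].parameters(), lr=1e-3)
x_new = torch.randn(64, 16)                 # stand-in for new task data
y_new = torch.randint(0, 2, (64,))

for _ in range(10):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(ensemble_logits(x_new), y_new)
    loss.backward()                          # gradients flow only into member 0
    optimizer.step()
```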
While ensembles are mostly limited to decision trees at this time, the approach is expanding to other machine learning methods, with bagging and boosting now considered the most popular techniques for building ensembles. Bagging and random forests are algorithms designed to reduce the variance of training models and improve forecast accuracy. Boosting, by contrast, converts weak learners into stronger ones by increasing the complexity of models that suffer from high bias.
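The scikit-learn sketch below puts the two families side by side; the dataset and hyperparameters are purely illustrative.

```python
# Illustrative only: bagging/random forests (variance reduction) versus
# gradient boosting (bias reduction via sequential weak learners).
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=25, n_informative=8,
                           random_state=0)

models = {
    "bagged trees": BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                                      random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(max_depth=2, n_estimators=100,
                                                    random_state=0),
}

for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```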
While machine learning algorithms like these are usually designed to improve accuracy by reducing error, they do not take into account the distribution and proportion, or “balance,” of the data. This is problematic because such algorithms tend to produce classification models, whether decision tree classifiers, rule-based classifiers, neural networks or other techniques, that perform poorly on imbalanced datasets.
For ensembles to be truly effective, a new auto-balancing algorithm will be needed: one that learns fast and can quickly identify the best-performing sub-models for a given dataset. In short, we need a new method to “ensemble” smaller neural networks. Newer ensemble-based enhancements to neural networks are proving instrumental in resolving this dilemma. Moreover, ensembles make it possible to deliver secure updates to learning models that can be targeted to specific datasets and then seamlessly folded into a common neural network.
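No such auto-balancing algorithm is named here, but the sketch below shows one plausible shape it could take, with every detail (dataset, models, weighting scheme) an assumption for illustration: each small model is trained on a class-balanced resample of an imbalanced dataset, members are weighted by their balanced accuracy on held-out data, and the weighted vote compensates for the skew a single accuracy-driven model would ignore.

```python
# Hypothetical "auto-balancing" ensemble of small neural networks.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Imbalanced toy dataset: roughly 95% majority class, 5% minority class.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, stratify=y, random_state=0)

rng = np.random.default_rng(0)
minority = np.flatnonzero(y_train == 1)
majority = np.flatnonzero(y_train == 0)

members, weights = [], []
for seed in range(5):
    # Balanced resample: all minority rows plus an equal-sized majority sample.
    idx = np.concatenate([minority,
                          rng.choice(majority, size=minority.size, replace=False)])
    model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=seed)
    model.fit(X_train[idx], y_train[idx])
    members.append(model)
    # Weight each member by how well it handles both classes, not raw accuracy.
    weights.append(balanced_accuracy_score(y_val, model.predict(X_val)))

weights = np.asarray(weights) / np.sum(weights)
proba = sum(w * m.predict_proba(X_val) for w, m in zip(weights, members))
print("ensemble balanced accuracy:",
      balanced_accuracy_score(y_val, proba.argmax(axis=1)))
```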
A better, more responsible way forward
Consider that data centers are responsible for 3% of global electricity consumption, which equates to approximately 2% of greenhouse gas emissions. AI and machine learning will have a big impact on that consumption as their applications proliferate, and data centers themselves are already growing at a rate of 20% each year.
Relatedly, as ensembles are more widely adopted in response to the environmental damage caused by energy-hungry AI and machine learning, industry players, whether for-profit companies or research organizations, will almost certainly consider disclosing the power consumed in generating AI models and classifying that information per model, much as power and efficiency ratings are applied to electrical appliances and homes today. This is a good thing.
The time has arrived when efforts to curb AI and machine learning’s massive energy consumption are becoming a more prominent topic in the debate around AI ethics, and the growing importance of ensembles will be part of that shift. With their ability to produce strong learning models using smaller amounts of data and smaller networks, and therefore less computational power, ensembles are set to help reduce the overall environmental impact of AI and machine learning.
Kevin Gidney is co-founder and CTO of Seal Software. He has held various senior technical positions at Legato, EMC, Kazeon, Iptor and Open Text. His roles have included management, solutions architecture and technical pre-sales, with a background in electronics and computer engineering, applied to both software and hardware solutions.