PaddlePaddle has recently received new updates from Baidu, along with 10 large deep learning models covering computational biology, vision, and natural language processing. Despite the growing popularity of more recent innovations, TensorFlow, PyTorch, and Keras remain the three deep-learning frameworks that have long dominated AI. In China, the country with the world’s largest population, the most widely used homegrown framework is PaddlePaddle, although the West rarely hears about it.
PaddlePaddle is updated with 10 large deep learning models
Baidu, the Chinese AI juggernaut, originally created this user-friendly, efficient, adaptable, large-scale deep learning platform to bring deep learning into many of its own products. More than 4.77 million developers and 180,000 businesses worldwide now use it. While comparable figures for other frameworks are difficult to find, suffice it to say that’s a lot.
In addition to 10 major deep learning models covering computational biology, vision, and natural language processing, PaddlePaddle has recently received significant improvements from Baidu. Examples of the models include ERNIE 3.0 Zeus, an NLP model with 100 billion parameters; ERNIE-GeoL, a pre-trained model for geography and language; and HELIX-GEM, a pre-trained model for compound representation learning.
Baidu has also produced three new industry-focused large models, one each for the banking, aerospace, and electric power industries, by fine-tuning its ERNIE 3.0 Titan model with industry data and specialist knowledge in unsupervised learning tasks.
Software frameworks are collections of related support applications, compilers, code libraries, tool sets, and application programming interfaces (APIs) to facilitate the development of a project or system. Deep learning frameworks integrate everything required to design, train, and evaluate deep neural networks through a high-level programming interface. Without these tools, creating deep learning algorithms would take a long time because previously reused code would need to be written from scratch.
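To make concrete what a framework spares developers, here is a minimal sketch in plain Python, with no framework at all, of the design, train, and evaluate steps for a one-neuron linear model. The data, learning rate, and function names are illustrative assumptions and are not taken from PaddlePaddle; the point is that the hand-derived gradients and the update loop are exactly the kind of code a framework provides ready-made.

```python
# Hypothetical from-scratch example: fit y = w * x + b by gradient descent.
# All names, data, and hyperparameters are illustrative assumptions.

def predict(w, b, x):
    # "Design": the model is a single linear unit.
    return w * x + b

def mse(w, b, data):
    # "Evaluate": mean squared error over (x, y) pairs.
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def train(data, lr=0.05, epochs=2000):
    # "Train": full-batch gradient descent with gradients derived by hand.
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data generated from y = 2x + 1.
train_data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
test_data = [(5.0, 11.0), (6.0, 13.0)]

w, b = train(train_data)
test_loss = mse(w, b, test_data)
```

Even for this toy model the gradient formulas had to be worked out manually; for a deep network with millions of parameters, that bookkeeping is what a framework automates.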
Within months of Geoffrey Hinton’s deep-learning triumph at the ImageNet competition in 2012, Baidu began to create such tools.
Convolutional neural networks are widely used in computer vision research, and in 2013 a doctoral student at the University of California, Berkeley, developed a framework called Caffe to support them. Baidu’s PaddlePaddle, which built on Caffe and supports both convolutional and recurrent neural networks, gave the company an edge over competitors in the NLP space.
The name PaddlePaddle is derived from Parallel Distributed Deep Learning, a reference to the framework’s capacity to train models across several GPUs.
Google released TensorFlow as open source in 2015, and Baidu released PaddlePaddle the following year. It turns out that developers in China were already using TensorFlow when Eric Schmidt first presented it to them in 2017.
In contrast to TensorFlow and Meta’s PyTorch, the latter open-sourced in 2017, PaddlePaddle is more focused on industrial users.
“We dedicated a lot of effort to reducing the barriers to entry for individuals and companies,” stated Ma Yanjun, general manager of the AI Technology Ecosystem at Baidu.
Whereas PyTorch and TensorFlow demand more deep-learning expertise from their users, PaddlePaddle’s toolkits are designed so that newcomers can use them in production settings.
“In China, many developers are trying to use AI in their work, but they do not have much AI background. So, to increase the use of AI in different industry sectors, we’ve provided PaddlePaddle with many low-threshold toolkits that are easier to use so a wider community can use it,” said Ma.
Most of the time, industry-sector specialists and AI engineers have limited knowledge of one another’s fields. PaddlePaddle’s user-friendly code, however, comes with a variety of learning resources and tools. It scales well and provides various APIs to meet different demands.
It can train on a large number of devices simultaneously and facilitates training on large-scale data. Additionally, it supports sentiment analysis, semantic role labeling, image classification, neural machine translation, and recommender systems.
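Training across many devices can be illustrated with a small sketch of synchronous data-parallel training, simulated here in plain Python. This shows the general technique under stated assumptions and is not a description of PaddlePaddle’s internals: each hypothetical "device" computes a gradient on its own shard of the batch, and the gradients are averaged before a single shared update.

```python
# Hypothetical simulation of synchronous data-parallel SGD for y = w * x.
# Function names and data are illustrative; no real devices are involved.

def local_gradient(w, shard):
    # Gradient of mean squared error on one device's shard.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def shard_batch(batch, num_devices):
    # Round-robin split; shards are equal-sized when the batch divides evenly.
    return [batch[i::num_devices] for i in range(num_devices)]

def data_parallel_step(w, batch, num_devices, lr):
    shards = shard_batch(batch, num_devices)
    # On real hardware these gradients would be computed concurrently.
    grads = [local_gradient(w, shard) for shard in shards]
    # Averaging stands in for the all-reduce step across devices.
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad

# Eight samples drawn from y = 3x, split across four simulated devices.
batch = [(float(x), 3.0 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, batch, num_devices=4, lr=0.01)
```

With equal-sized shards, the averaged gradient equals the gradient over the full batch, so adding devices changes how fast a step is computed but not the result of the step.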
According to Ma, PaddlePaddle’s strong points are its toolkits and libraries. PaddleSeg, for instance, can be used for image segmentation, and PaddleDetection for object detection.
“We cover the whole pipeline of AI development from data processing to training, to model compression, to the adaptation to different hardware, and then how to deploy them in different systems, for example, in Windows or the Linux operating system or on an Intel chip or an Nvidia chip,” explained Ma.
Additionally, the platform contains toolkits for cutting-edge research, such as Paddle Quantum for models of quantum computing and Paddle Graph Learning for models of graph learning.
“That’s why PaddlePaddle is quite popular in China right now. Developers are using such toolkits, not just the tool itself,” he said.
Since it was open-sourced, PaddlePaddle has rapidly improved in performance and user experience. Thanks in part to rich English-language documentation, it has found use cases in several industry sectors outside Baidu and in countries outside China. PaddlePaddle currently provides more than 500 algorithms and pre-trained models to support the quick development of industrial applications. To make the models usable in practical applications, Baidu has worked to minimize their size. Some versions are compact and fast enough to run on a camera or cellphone.
Industrial applications for PaddlePaddle
Transportation companies have used PaddlePaddle to deploy AI models that monitor traffic lights and improve traffic efficiency. Manufacturers are using PaddlePaddle to increase production and cut costs.
Recycling businesses use PaddlePaddle to build object-detection models that recognize different categories of waste for a garbage-sorting robot. In Shouguang County, Shandong province, artificial intelligence is being used to track the growth of various vegetables and advise farmers when to harvest and pack them.
PaddlePaddle has been deployed in Southeast Asia to manage AI-powered forest drones that fight fires. With the help of parameter server technology, PaddlePaddle can train sparse models for use in search and real-time recommender systems. However, it has also combined models into even bigger systems for scenarios like text or image generation that don’t call for real-time outcomes.
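A rough sketch shows why parameter servers suit sparse models: each training example touches only a few rows of a very large embedding table, so a worker pulls just those rows and pushes back gradients for them alone. The class and method names below (`ParameterServer`, `pull`, `push`, `worker_step`) are hypothetical and written in plain Python for illustration; they are not PaddlePaddle’s API.

```python
# Hypothetical in-process stand-in for a parameter server holding a sparse
# embedding table. Real systems shard the table across many machines.

class ParameterServer:
    def __init__(self, dim, lr):
        self.dim = dim
        self.lr = lr
        self.table = {}  # feature id -> embedding row, created lazily

    def pull(self, feature_ids):
        # Return copies of only the requested rows; unseen ids start at zero.
        return {fid: list(self.table.setdefault(fid, [0.0] * self.dim))
                for fid in feature_ids}

    def push(self, sparse_grads):
        # Apply an SGD update only to rows that received a gradient.
        for fid, grad in sparse_grads.items():
            row = self.table[fid]
            for i, g in enumerate(grad):
                row[i] -= self.lr * g


def worker_step(server, feature_ids, target):
    # Toy model: the score is the sum of all entries in the active rows.
    rows = server.pull(feature_ids)
    score = sum(sum(row) for row in rows.values())
    err = score - target
    # d(err^2)/d(entry) = 2 * err for every entry of every active row.
    grads = {fid: [2 * err] * server.dim for fid in rows}
    server.push(grads)
    return err ** 2


server = ParameterServer(dim=2, lr=0.05)
# Two active features out of a notionally huge id space.
losses = [worker_step(server, feature_ids=[7, 1_000_003], target=1.0)
          for _ in range(50)]
```

Only two rows are ever materialized here, however large the feature id space is, which is the property that makes this pattern attractive for search and recommendation workloads.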
Because so-called foundation models may be tailored to particular scenarios, Baidu sees large, dense models as another way to lower the AI adoption barrier. Without the foundation model, you must start from scratch.
According to Ma, research fields are converging through cross-modal learning that spans modalities such as speech and vision. He said that Baidu also uses knowledge graphs in deep learning.
“Previously, a deep-learning system dealt with raw texts or raw images without any knowledge input, and the system used self-supervised learning to gather rules outside the data. But now we are seeing knowledge graphs as an input,” he added.