Akamai Technologies has launched Akamai Cloud Inference, a new cloud service that enhances the efficiency of AI inference tasks. It delivers improved throughput, reduced latency, and lower costs than traditional hyperscale infrastructure.
Akamai Cloud Inference runs on Akamai Cloud, which the company describes as the world’s most distributed platform. The service is designed to address the limitations of centralized cloud models by processing AI data closer to users and devices.
Adam Karon, Chief Operating Officer and General Manager, Cloud Technology Group at Akamai, highlighted the challenge of distributing AI data efficiently. “Getting AI data closer to users and devices is hard, and it’s where legacy clouds struggle,” Karon stated.
AI inference on Akamai Cloud enables platform engineers and developers to build and run AI applications closer to end users. According to Akamai, the new solution offers 3x better throughput and up to 2.5x lower latency.
The new tools empower businesses to save up to 86% on AI inference and agentic AI workloads compared to traditional hyperscaler infrastructure.
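To make the headline figures concrete, the sketch below applies them to a made-up baseline. The baseline numbers and the function name `project_best_case` are hypothetical, and because the latency and cost figures are quoted as "up to" values, the results are best-case bounds, not expected outcomes.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    throughput_rps: float    # requests served per second
    latency_ms: float        # per-request latency
    monthly_cost_usd: float  # monthly inference bill


def project_best_case(baseline: Workload) -> Workload:
    """Apply the article's claimed best-case multipliers to a baseline."""
    return Workload(
        throughput_rps=baseline.throughput_rps * 3,               # "3x better throughput"
        latency_ms=baseline.latency_ms / 2.5,                     # "up to 2.5x" lower latency
        monthly_cost_usd=baseline.monthly_cost_usd * (1 - 0.86),  # "up to 86%" savings
    )


# Hypothetical baseline, not a measured figure.
baseline = Workload(throughput_rps=100.0, latency_ms=250.0, monthly_cost_usd=10_000.0)
best = project_best_case(baseline)
print(best)
```

With this invented baseline, a 250 ms response would drop to roughly 100 ms and a $10,000 monthly bill to roughly $1,400, if every claimed maximum held at once.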
Key features of Akamai Cloud Inference include:
- Compute: Akamai Cloud provides versatile compute options, including CPUs for fine-tuned inference, GPUs for accelerated compute, and ASIC VPUs, to tackle a diverse range of AI inference challenges.
- Data management: Akamai integrates with VAST Data for real-time data access, and provides scalable object storage for managing AI datasets. The company also works with vector database vendors like Aiven and Milvus to enable retrieval-augmented generation.
- Containerization: Akamai integrates containerization to improve application resilience and hybrid/multicloud portability. Akamai delivers AI inference that is faster, cheaper, and more secure with Kubernetes, supported by Linode Kubernetes Engine (LKE)-Enterprise. The new service enables quick deployment of AI-ready platforms, including KServe, Kubeflow, and SpinKube.
- Edge compute: Akamai AI Inference includes WebAssembly (Wasm) capabilities. Developers can build AI-powered applications at the edge, which enables latency-sensitive solutions.
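KServe is one of the platforms named in the feature list. As a rough illustration of what calling a model served that way might look like, the sketch below builds a request in the shape of KServe's V1 inference protocol (`POST /v1/models/<name>:predict` with an `instances` array). The host, the model name, and the helper `build_predict_request` are hypothetical, and nothing here is confirmed as how Akamai exposes its service.

```python
import json
import urllib.request


def build_predict_request(base_url: str, model_name: str, instances: list) -> urllib.request.Request:
    """Build (but do not send) a KServe V1-style predict request."""
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and model name, for illustration only.
req = build_predict_request("https://inference.example.com", "sentiment", [[0.1, 0.2, 0.3]])
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. The sketch stops short of that because the endpoint is invented.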
The scalable and distributed architecture of Akamai Cloud allows compute resources to be available globally, from cloud to edge, while accelerating application performance and increasing scalability. The platform spans 4,200 points of presence across 1,200 networks in over 130 countries.
Akamai points to a shift from training large language models (LLMs) to AI inference, emphasizing the need for practical AI solutions. LLMs are effective for general-purpose tasks but are often costly and time-consuming to train and run.
Instead of investing heavily in LLMs, enterprises are moving to lighter AI models. These are optimised for specific business problems and offer a better return on investment today.
Akamai Cloud Inference supports processing AI data closer to where it is generated, meeting the demand for more distributed AI solutions.
Akamai’s new offering represents a notable move towards decentralized AI, solving the classic cloud computing conundrum of distance. Why? Because reduced latency directly translates to real, immediate savings and a better user experience, which is a tough combination for competitors to beat.
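The distance problem can be put in rough numbers. Light travels through optical fiber at about 200,000 km/s, roughly two-thirds of its speed in a vacuum, so physical distance sets a floor on round-trip time that no amount of server-side optimization removes. The minimal sketch below computes that floor; the two distances are hypothetical, and real latency would be higher once routing, queuing, and model execution time are added.

```python
FIBER_SPEED_KM_PER_MS = 200.0  # ~200,000 km/s in fiber; an approximation


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS


# Hypothetical distances, for illustration only.
for label, km in [("nearby edge location", 100), ("distant central region", 4000)]:
    print(f"{label}: at least {min_round_trip_ms(km):.1f} ms round trip")
```

Under these assumptions the nearby location has a floor of 1 ms per round trip and the distant one 40 ms, a gap that compounds when an application makes several sequential calls.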
One particularly savvy feature is the emphasis on containerization, which makes deploying AI applications easier and more secure than in traditional setups. The use of Linode Kubernetes Engine (LKE)-Enterprise underlines Akamai’s commitment to offering modern, efficient tools tailored for today’s tech challenges.