AI Model Leaderboard

Find the best AI model for your needs. Explore leading models from OpenAI (GPT-5), Anthropic (Claude), Google (Gemini), Meta (Llama), DeepSeek, and xAI (Grok). Use filters to narrow by open-source vs. proprietary, model size class, or reasoning capability.

Compare 3 AI models across intelligence, speed, price, and real-world performance. The leaderboard draws on data from Artificial Analysis, which provides independent evaluations of coding ability, mathematical reasoning, and general knowledge, using benchmarks including GPQA, MMLU-Pro, and Humanity's Last Exam.

Each model is evaluated on coding, math, and reasoning benchmarks. Practical metrics help you balance performance against cost: output speed (tokens per second), time-to-first-token latency, and blended API pricing per million tokens. Whether you're evaluating GPT-5, Claude, Gemini, Llama, or DeepSeek models, you can filter by model size, reasoning capability, or provider to find the best fit for your use case. Data updates regularly with the latest benchmark results and pricing information from leading AI model providers.
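
To make the cost side of that trade-off concrete, here is a minimal Python sketch of how a blended price per million tokens might be computed and used to shortlist models. It is an illustration under assumptions, not the leaderboard's actual method: the 3:1 input-to-output weighting is assumed, the names `ModelEntry`, `blended_price`, and `cheapest_fast_models` are hypothetical, and the sample figures are made up.

```python
from dataclasses import dataclass


@dataclass
class ModelEntry:
    """One hypothetical leaderboard row; all values below are invented."""
    name: str
    open_source: bool
    input_price: float     # USD per million input tokens
    output_price: float    # USD per million output tokens
    tokens_per_sec: float  # output speed


def blended_price(entry, input_weight=3.0, output_weight=1.0):
    """Weighted average of input and output prices.

    The default 3:1 input:output weighting is an assumption for this sketch.
    """
    total_weight = input_weight + output_weight
    weighted_sum = (entry.input_price * input_weight
                    + entry.output_price * output_weight)
    return weighted_sum / total_weight


def cheapest_fast_models(entries, min_tokens_per_sec, open_source_only=False):
    """Keep models that meet the speed floor, sorted by blended price."""
    matches = [
        e for e in entries
        if e.tokens_per_sec >= min_tokens_per_sec
        and (e.open_source or not open_source_only)
    ]
    return sorted(matches, key=blended_price)


# Made-up sample data, for illustration only.
SAMPLE = [
    ModelEntry("model-a", True, 0.50, 1.50, 120.0),
    ModelEntry("model-b", False, 2.00, 10.00, 80.0),
    ModelEntry("model-c", False, 0.25, 1.25, 200.0),
]

if __name__ == "__main__":
    for entry in cheapest_fast_models(SAMPLE, min_tokens_per_sec=100):
        print(f"{entry.name}: ${blended_price(entry):.2f} per million tokens")
```

With the sample rows above, a speed floor of 100 tokens per second drops model-b, and the remaining two models are ordered by blended price. Changing the weights shifts the ranking toward input-heavy or output-heavy workloads.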