Hermes 4 – Llama-3.1 405B (Reasoning)

Nous Research
2025-08-27
Modality: text in, text out
Intelligence: 18.6 (#278/523)
Coding: 16 (#249/429)
Math: 69.7 (#97/265)
Speed: 33 tok/s (TTFT: 753.00s)
Pricing: $1.00 / $3.00 per 1M tokens (in/out)

Hermes 4 – Llama-3.1 405B (Reasoning) is a reasoning model from Nous Research, designed for tasks that require multi-step reasoning and mathematics. It generates output at about 33 tokens per second and costs $1.00 per million input tokens and $3.00 per million output tokens. It is aimed at professional users.

When to Use Hermes 4 – Llama-3.1 405B (Reasoning)

✓ Best For

  • Complex problem-solving in mathematics.
  • Code generation and debugging tasks.
  • Advanced reasoning applications in research.

✗ Not Ideal For

  • Basic conversational tasks.
  • Low-complexity data analysis.

How Hermes 4 – Llama-3.1 405B (Reasoning) Compares

Developers shown in each comparison chart:

Intelligence Index (higher is better): Mistral, Nous Research, Trillion Labs, OpenAI
Coding Index: MBZUAI Institute of Foundation Models, Mistral, Nous Research, DeepSeek
Output Speed (tok/s): Z AI, OpenAI, Nous Research, ByteDance Seed, Mistral
Math Index: Alibaba, Allen Institute for AI, Nous Research, NVIDIA, Google

Benchmark Profile: Intelligence, Coding, Math

All Benchmark Scores (13)

Benchmark            Score
Intelligence Index   18.6
Coding Index         16
Math Index           69.7
MMLU-Pro             82.9%
GPQA                 72.7%
LiveCodeBench        68.6%
HLE                  10.3%
SciCode              25.2%
IFBench              32.7%
LCR                  20.7%
TerminalBench Hard   11.4%
Tau2                 22.2%
AIME 2025            69.7%

Data: Artificial Analysis · Updated: March 26, 2026

Frequently Asked Questions (15)

When was Hermes 4 - Llama-3.1 405B (Reasoning) released?
Hermes 4 - Llama-3.1 405B (Reasoning) was released on August 27, 2025.
Who created Hermes 4 - Llama-3.1 405B (Reasoning)?
Hermes 4 - Llama-3.1 405B (Reasoning) was created by Nous Research.
How intelligent is Hermes 4 - Llama-3.1 405B (Reasoning)?
Hermes 4 - Llama-3.1 405B (Reasoning) scores 19 on the Artificial Analysis Intelligence Index, placing it below average among other open weight models of similar size (median: 27).
How fast is Hermes 4 - Llama-3.1 405B (Reasoning)?
Hermes 4 - Llama-3.1 405B (Reasoning) generates output at 32.6 tokens per second (based on the median across providers serving the model), which is at the lower end compared to other open weight models of similar size (median: 54.8 t/s).
What is the latency of Hermes 4 - Llama-3.1 405B (Reasoning)?
Hermes 4 - Llama-3.1 405B (Reasoning) has a time to first token (TTFT) of 2.42s (based on the median across providers serving the model), which is somewhat higher than average compared to other open weight models of similar size (median: 2.25s).
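As a rough illustration of how latency and output speed combine, the sketch below estimates end-to-end response time as TTFT plus generation time. It uses the median figures quoted in this FAQ (2.42s TTFT, 32.6 tokens per second). The function name is hypothetical, and the model assumes a constant generation rate that also applies to any reasoning tokens produced before the visible answer, which real providers may not match.

```python
# Median figures quoted in this FAQ; actual values vary by provider.
TTFT_SECONDS = 2.42
TOKENS_PER_SECOND = 32.6


def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = TTFT_SECONDS,
    tokens_per_s: float = TOKENS_PER_SECOND,
) -> float:
    """Estimate total wall-clock time: time to first token plus generation time.

    Assumption: tokens are generated at a constant rate after the first one.
    """
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for n in (200, 1_000, 5_000):
        print(f"{n:>5} output tokens: ~{estimated_response_seconds(n):.1f} s")
```

Under these assumptions, a 1,000-token response takes roughly half a minute, so long reasoning traces dominate the total time far more than TTFT does.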
How much does Hermes 4 - Llama-3.1 405B (Reasoning) cost?
Hermes 4 - Llama-3.1 405B (Reasoning) costs $1.00 per 1M input tokens (somewhat higher than average, median: $0.60) and $3.00 per 1M output tokens (somewhat higher than average, median: $2.20), based on the median across providers serving the model.
What is Hermes 4 - Llama-3.1 405B (Reasoning) API pricing?
Hermes 4 - Llama-3.1 405B (Reasoning) costs $1.00 per 1M input tokens and $3.00 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $1.50 per 1M tokens. Pricing may vary by provider.
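The blended rate is a weighted average of the input and output prices. The sketch below reproduces the $1.50 figure from the 3:1 ratio and also estimates the cost of a single request. The function names and the example token counts are hypothetical; the prices are the medians quoted above and may differ by provider.

```python
INPUT_PRICE_PER_M = 1.00   # USD per 1M input tokens (median quoted above)
OUTPUT_PRICE_PER_M = 3.00  # USD per 1M output tokens (median quoted above)


def blended_price(
    input_price: float,
    output_price: float,
    input_parts: float = 3,
    output_parts: float = 1,
) -> float:
    """Weighted average price per 1M tokens for an input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_PRICE_PER_M,
    output_price: float = OUTPUT_PRICE_PER_M,
) -> float:
    """Cost in USD of one request, given its input and output token counts."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # (3 * 1.00 + 1 * 3.00) / 4 = 1.50
    print(f"Blended rate: ${blended_price(INPUT_PRICE_PER_M, OUTPUT_PRICE_PER_M):.2f} per 1M tokens")
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens.
    print(f"Example request: ${request_cost(10_000, 2_000):.4f}")
```

Because output tokens cost three times as much as input tokens, a workload with a lower input-to-output ratio than 3:1 will have a blended rate above $1.50.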
How verbose is Hermes 4 - Llama-3.1 405B (Reasoning)?
When evaluated on the Intelligence Index, Hermes 4 - Llama-3.1 405B (Reasoning) generated 39M output tokens, more than twice the median for other open weight models of similar size (17M).
Is Hermes 4 - Llama-3.1 405B (Reasoning) a reasoning model?
Yes, Hermes 4 - Llama-3.1 405B (Reasoning) is a reasoning model. It uses extended thinking or chain-of-thought reasoning to work through complex problems before providing an answer.
What input modalities does Hermes 4 - Llama-3.1 405B (Reasoning) support?
Hermes 4 - Llama-3.1 405B (Reasoning) supports text input.
What output modalities does Hermes 4 - Llama-3.1 405B (Reasoning) support?
Hermes 4 - Llama-3.1 405B (Reasoning) supports text output.
Can Hermes 4 - Llama-3.1 405B (Reasoning) process images?
No, Hermes 4 - Llama-3.1 405B (Reasoning) does not support image input. It can only process text.
Is Hermes 4 - Llama-3.1 405B (Reasoning) multimodal?
No, Hermes 4 - Llama-3.1 405B (Reasoning) is not multimodal. It supports only text input and text output.
What is the context window of Hermes 4 - Llama-3.1 405B (Reasoning)?
Hermes 4 - Llama-3.1 405B (Reasoning) has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
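A simple way to reason about the context window is as a token budget. The sketch below checks whether a request fits, under two assumptions that are not stated on this page: that "130k" means 130,000 tokens, and that the prompt and the generated output (including reasoning tokens) share the same window. The function names are hypothetical, and token counts must come from the tokenizer your provider uses.

```python
# Assumption: "130k" is read as 130,000 tokens; the exact limit may differ.
CONTEXT_WINDOW = 130_000


def remaining_output_budget(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for output after the prompt, never below zero."""
    return max(context_window - prompt_tokens, 0)


def fits_in_context(
    prompt_tokens: int,
    max_output_tokens: int,
    context_window: int = CONTEXT_WINDOW,
) -> bool:
    """True if prompt plus requested output stays within the window.

    Assumption: prompt and output tokens share one context window.
    """
    return prompt_tokens + max_output_tokens <= context_window


if __name__ == "__main__":
    prompt = 100_000  # hypothetical prompt size in tokens
    print(remaining_output_budget(prompt))   # 30000
    print(fits_in_context(prompt, 40_000))   # False
```

For a reasoning model, leaving generous headroom for output matters, since the thinking phase consumes output tokens before the final answer begins.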
Is Hermes 4 - Llama-3.1 405B (Reasoning) open source?
Yes, Hermes 4 - Llama-3.1 405B (Reasoning) is an open weights model. The model weights are publicly available and can be downloaded for self-hosting.