Hermes 4 – Llama-3.1 405B (Non-reasoning)

Nous Research
2025-08-27
Modality: Text
Intelligence
17.6
#294/521
Coding
18.1
#227/427
Math
15.3
#217/265
Speed
33 tok/s
TTFT: 711.00s
Pricing
$1.00 / $3.00
per 1M tokens (in/out)

Hermes 4 – Llama-3.1 405B (Non-reasoning) is an open-weights, text-only language model from Nous Research, built on Llama-3.1 405B. It generates roughly 33 tokens per second and is priced at $1.00 per million input tokens and $3.00 per million output tokens, a price point aimed at professional users.

When to Use Hermes 4 – Llama-3.1 405B (Non-reasoning)

✓ Best For

  • Natural language understanding tasks.
  • Text generation applications.
  • Coding assistance and support.

✗ Not Ideal For

  • Complex reasoning tasks.
  • High-speed real-time applications.

How Hermes 4 – Llama-3.1 405B (Non-reasoning) Compares

Intelligence Index · Higher is better

Perplexity · Google · Nous Research · Meta

Benchmark Profile

Coding Index

Mistral · Google · Nous Research · OpenAI

Output Speed · tok/s

Kimi · Alibaba · Nous Research · ByteDance Seed · OpenAI

Math Index

Alibaba · Amazon · Nous Research · Z AI · OpenAI

Intelligence · Coding · Math

All Benchmark Scores (13)

Benchmark Score
Intelligence Index 17.6
Coding Index 18.1
Math Index 15.3
MMLU-Pro 72.9%
GPQA 53.6%
LiveCodeBench 54.6%
HLE 4.2%
SciCode 34.6%
IFBench 34.8%
LCR 20%
TerminalBench Hard 9.8%
Tau2 26.6%
AIME 2025 15.3%

Data: Artificial Analysis · Updated: April 10, 2026

Frequently Asked Questions (15)

When was Hermes 4 - Llama-3.1 405B (Non-reasoning) released?
Hermes 4 - Llama-3.1 405B (Non-reasoning) was released on August 27, 2025.
Who created Hermes 4 - Llama-3.1 405B (Non-reasoning)?
Hermes 4 - Llama-3.1 405B (Non-reasoning) was created by Nous Research.
How intelligent is Hermes 4 - Llama-3.1 405B (Non-reasoning)?
Hermes 4 - Llama-3.1 405B (Non-reasoning) scores 18 on the Artificial Analysis Intelligence Index, placing it below average among other open weight non-reasoning models of similar size (median: 20).
How fast is Hermes 4 - Llama-3.1 405B (Non-reasoning)?
Hermes 4 - Llama-3.1 405B (Non-reasoning) generates output at 33.9 tokens per second (based on the median across providers serving the model), which is at the lower end compared to other open weight non-reasoning models of similar size (median: 54.2 t/s).
What is the latency of Hermes 4 - Llama-3.1 405B (Non-reasoning)?
Hermes 4 - Llama-3.1 405B (Non-reasoning) has a time to first token (TTFT) of 2.53s (based on the median across providers serving the model), which is somewhat higher than average compared to other open weight non-reasoning models of similar size (median: 2.25s).
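As a rough illustration of what these speed and latency figures mean in practice, the sketch below estimates end-to-end response time as TTFT plus output tokens divided by throughput. The function name and the simple additive model are assumptions for illustration; real latency varies by provider, load, and prompt length.

```python
def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = 2.53,        # median TTFT quoted above
    tokens_per_s: float = 33.9,  # median output speed quoted above
) -> float:
    """Estimate total seconds to receive a full response.

    Assumes a constant generation rate after the first token,
    which is a simplification of real serving behavior.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # A 500-token answer takes on the order of 17 seconds under these medians.
    print(f"{estimated_response_seconds(500):.1f} s")
```

Under these assumptions, most of the wait for long answers comes from generation speed rather than TTFT, which is consistent with the model being listed as not ideal for high-speed real-time applications.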
How much does Hermes 4 - Llama-3.1 405B (Non-reasoning) cost?
Hermes 4 - Llama-3.1 405B (Non-reasoning) costs $1.00 per 1M input tokens (somewhat higher than average, median: $0.60) and $3.00 per 1M output tokens (somewhat higher than average, median: $2.33), based on the median across providers serving the model.
What is Hermes 4 - Llama-3.1 405B (Non-reasoning) API pricing?
Hermes 4 - Llama-3.1 405B (Non-reasoning) costs $1.00 per 1M input tokens and $3.00 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $1.50 per 1M tokens. Pricing may vary by provider.
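The blended rate above is a weighted average of the input and output prices. The following minimal sketch reproduces that arithmetic and estimates the cost of a single request; the function names are hypothetical, and the default prices are the median figures quoted above, so actual provider pricing may differ.

```python
def blended_price_per_million(
    input_price: float,
    output_price: float,
    input_parts: int = 3,
    output_parts: int = 1,
) -> float:
    """Weighted average price per 1M tokens for a given input:output ratio."""
    total_parts = input_parts + output_parts
    if total_parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return (input_price * input_parts + output_price * output_parts) / total_parts


def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = 1.00,   # USD per 1M input tokens (median quoted above)
    output_price: float = 3.00,  # USD per 1M output tokens (median quoted above)
) -> float:
    """Cost in USD of one request at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(blended_price_per_million(1.00, 3.00))  # 3:1 blend of $1.00 and $3.00
    print(request_cost_usd(3_000, 1_000))
```

With a 3:1 ratio, three parts at $1.00 and one part at $3.00 average to (3 × 1.00 + 1 × 3.00) / 4 = $1.50 per 1M tokens, matching the figure above.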
How verbose is Hermes 4 - Llama-3.1 405B (Non-reasoning)?
When evaluated on the Intelligence Index, Hermes 4 - Llama-3.1 405B (Non-reasoning) generated 3.9M output tokens, well below the median for other open weight non-reasoning models of similar size (9.1M), which makes it comparatively concise.
Is Hermes 4 - Llama-3.1 405B (Non-reasoning) a reasoning model?
No, Hermes 4 - Llama-3.1 405B (Non-reasoning) is not a reasoning model. It provides direct responses without extended chain-of-thought reasoning.
What input modalities does Hermes 4 - Llama-3.1 405B (Non-reasoning) support?
Hermes 4 - Llama-3.1 405B (Non-reasoning) supports text input.
What output modalities does Hermes 4 - Llama-3.1 405B (Non-reasoning) support?
Hermes 4 - Llama-3.1 405B (Non-reasoning) supports text output.
Can Hermes 4 - Llama-3.1 405B (Non-reasoning) process images?
No, Hermes 4 - Llama-3.1 405B (Non-reasoning) does not support image input. It can only process text.
Is Hermes 4 - Llama-3.1 405B (Non-reasoning) multimodal?
No, Hermes 4 - Llama-3.1 405B (Non-reasoning) is not multimodal. It supports only text input and text output.
What is the context window of Hermes 4 - Llama-3.1 405B (Non-reasoning)?
Hermes 4 - Llama-3.1 405B (Non-reasoning) has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
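To make the context limit concrete, here is a small sketch that computes how many tokens remain for the response once a prompt is accounted for. It assumes the prompt and the generated output share the same 130k-token window, and that token counts come from the provider's tokenizer; both are assumptions, and the helper name is hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 130_000  # approximate figure quoted above


def remaining_output_budget(
    prompt_tokens: int,
    context_window: int = CONTEXT_WINDOW_TOKENS,
) -> int:
    """Tokens left for the response, assuming prompt and output share the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Never report a negative budget; 0 means the prompt alone fills the window.
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    print(remaining_output_budget(100_000))
    print(remaining_output_budget(200_000))
```

A result of 0 signals that the prompt would need to be shortened or split before the request could succeed.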
Is Hermes 4 - Llama-3.1 405B (Non-reasoning) open source?
Hermes 4 - Llama-3.1 405B (Non-reasoning) is an open-weights model: the model weights are publicly available and can be downloaded for self-hosting.