Hermes 3 – Llama-3.1 70B

Nous Research
Released: 2024-08-15
Modality: text in, text out
Intelligence: 10.6 (#427/521)
Speed: 41 tok/s
TTFT: 301.00s
Pricing: $0.30 / $0.30 per 1M tokens (in/out)

Hermes 3 – Llama-3.1 70B is Nous Research’s open-weights, text-only, non-reasoning model. It generates output at about 41 tokens per second and costs $0.30 per million tokens for both input and output.

When to Use Hermes 3 – Llama-3.1 70B

✓ Best For

  • Natural language processing tasks
  • Code generation and debugging
  • Data analysis and insights

✗ Not Ideal For

  • Users requiring real-time interaction
  • Applications needing extensive mathematical computations

How Hermes 3 – Llama-3.1 70B Compares

[Chart: Intelligence Index, higher is better. Providers shown: DeepSeek, Allen Institute for AI, Nous Research, AI21 Labs, Alibaba]

Benchmark Profile

[Chart: Output Speed, tok/s. Providers shown: Anthropic, Mistral, Nous Research, Alibaba, Kimi]

[Chart: Intelligence · Coding · Math]

All Benchmark Scores (8)

Benchmark            Score
Intelligence Index   10.6
MMLU-Pro             57.1%
GPQA                 40.1%
LiveCodeBench        18.8%
HLE                  4.1%
SciCode              23.1%
AIME                 2.3%
MATH 500             53.8%

Data: Artificial Analysis · Updated: April 2, 2026

Frequently Asked Questions (15)

When was Hermes 3 - Llama-3.1 70B released?
Hermes 3 - Llama-3.1 70B was released on August 15, 2024.
Who created Hermes 3 - Llama-3.1 70B?
Hermes 3 - Llama-3.1 70B was created by Nous Research.
How intelligent is Hermes 3 - Llama-3.1 70B?
Hermes 3 - Llama-3.1 70B scores 11 (estimated) on the Artificial Analysis Intelligence Index, placing it below average among other open weight non-reasoning models of similar size (median: 13).
How fast is Hermes 3 - Llama-3.1 70B?
Hermes 3 - Llama-3.1 70B generates output at 37.2 tokens per second (based on the median across providers serving the model), which is at the lower end compared to other open weight non-reasoning models of similar size (median: 61.5 t/s).
What is the latency of Hermes 3 - Llama-3.1 70B?
Hermes 3 - Llama-3.1 70B has a time to first token (TTFT) of 1.30s (based on the median across providers serving the model), which is better than average compared to other open weight non-reasoning models of similar size (median: 1.57s).
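
As a rough illustration of how the latency and speed figures combine, the sketch below estimates end-to-end response time as TTFT plus streaming time. The function name is hypothetical, and the defaults simply reuse the median figures quoted in this FAQ (1.30s TTFT, 37.2 tokens per second); actual timings depend on the provider and on load.

```python
def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = 1.30,        # median TTFT quoted above (assumed default)
    tokens_per_s: float = 37.2,  # median output speed quoted above (assumed default)
) -> float:
    """Rough estimate: wait for the first token, then stream the rest."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Estimate for a 500-token answer, printed to one decimal place.
    print(f"{estimated_response_seconds(500):.1f} s")
```

Under these assumptions, a 500-token answer takes roughly 15 seconds, most of it spent streaming rather than waiting for the first token.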
How much does Hermes 3 - Llama-3.1 70B cost?
Hermes 3 - Llama-3.1 70B costs $0.30 per 1M input tokens (better than average, median: $0.52) and $0.30 per 1M output tokens (very competitive, median: $0.81), based on the median across providers serving the model.
What is Hermes 3 - Llama-3.1 70B API pricing?
Hermes 3 - Llama-3.1 70B costs $0.30 per 1M input tokens and $0.30 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $0.30 per 1M tokens. Pricing may vary by provider.
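
The blended rate can be reproduced with a short calculation. This is a minimal sketch, assuming the blended rate is a weighted average of the input and output prices at the stated 3:1 ratio; the function names are illustrative, not part of any provider API.

```python
def blended_price_per_million(
    input_price: float,
    output_price: float,
    input_parts: int = 3,   # 3:1 input-to-output ratio from the text
    output_parts: int = 1,
) -> float:
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    if total_parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return (input_price * input_parts + output_price * output_parts) / total_parts


def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = 0.30,   # USD per 1M input tokens (median quoted above)
    output_price: float = 0.30,  # USD per 1M output tokens (median quoted above)
) -> float:
    """Cost of one request, with prices expressed per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(f"${blended_price_per_million(0.30, 0.30):.2f} per 1M tokens (blended)")
    print(f"${request_cost_usd(3_000, 1_000):.4f} for 3,000 in / 1,000 out")
```

Because the input and output prices are equal, the blended rate is $0.30 regardless of the ratio chosen.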
How verbose is Hermes 3 - Llama-3.1 70B?
When evaluated on the Intelligence Index, Hermes 3 - Llama-3.1 70B generated 920k output tokens, far fewer than the median for other open weight non-reasoning models of similar size (3.8M).
Is Hermes 3 - Llama-3.1 70B a reasoning model?
No, Hermes 3 - Llama-3.1 70B is not a reasoning model. It provides direct responses without extended chain-of-thought reasoning.
What input modalities does Hermes 3 - Llama-3.1 70B support?
Hermes 3 - Llama-3.1 70B supports text-only input.
What output modalities does Hermes 3 - Llama-3.1 70B support?
Hermes 3 - Llama-3.1 70B supports text-only output.
Can Hermes 3 - Llama-3.1 70B process images?
No, Hermes 3 - Llama-3.1 70B does not support image input. It can only process text.
Is Hermes 3 - Llama-3.1 70B multimodal?
No, Hermes 3 - Llama-3.1 70B is not multimodal. It supports text input and text output only.
What is the context window of Hermes 3 - Llama-3.1 70B?
Hermes 3 - Llama-3.1 70B has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
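
To make the context limit concrete, the sketch below computes how many tokens remain for the response once the prompt is counted. It assumes that "130k" means roughly 130,000 tokens and that prompt and output share the same window, which is typical but not stated on this page. The helper name is hypothetical, and token counts should come from the model's own tokenizer.

```python
CONTEXT_WINDOW_TOKENS = 130_000  # assumed reading of the "130k" figure above


def remaining_output_budget(
    prompt_tokens: int,
    context_window: int = CONTEXT_WINDOW_TOKENS,
) -> int:
    """Tokens left for the response after the prompt, never below zero."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    for prompt_tokens in (8_000, 100_000, 200_000):
        budget = remaining_output_budget(prompt_tokens)
        status = "fits" if budget > 0 else "prompt too long"
        print(f"{prompt_tokens:>7} prompt tokens -> {budget:>7} left ({status})")
```

A budget of zero means the prompt alone fills or exceeds the window and needs to be shortened or split before sending.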
Is Hermes 3 - Llama-3.1 70B open source?
Yes, Hermes 3 - Llama-3.1 70B is an open-weights model. The model weights are publicly available and can be downloaded for self-hosting.