Granite 4.0 H Small

IBM
2025-09-22
Modality: Text
Intelligence: 10.8 (#431/532)
Coding: 8.5 (#440/88)
Math: 13.7 (#224/265)
Speed: 403 tok/s (TTFT: 8.66s)
Pricing: $0.06 / $0.25 per 1M tokens (in/out)

Granite 4.0 H Small is IBM’s open-weights, text-only, non-reasoning model, released on September 22, 2025 and aimed at professional data processing tasks. It generates output at 403.25 tokens per second and costs $0.06 per million input tokens and $0.25 per million output tokens.

When to Use Granite 4.0 H Small

✓ Best For

  • Data analysis and reporting
  • Mathematical computations
  • Coding assistance

✗ Not Ideal For

  • Latency-sensitive real-time applications (time to first token is 8.66s)
  • Tasks requiring extensive contextual understanding

How Granite 4.0 H Small Compares

[Chart: Intelligence Index (higher is better). Providers shown: Nous Research, AI21 Labs, IBM, Alibaba, DeepSeek]

[Chart: Output Speed (tok/s). Providers shown: IBM, Liquid AI, Multiverse Computing, StepFun]

[Chart: Math Index. Providers shown: Mistral, MiniMax, IBM, NVIDIA, Cohere]

[Chart: Benchmark Profile. Axes: Intelligence, Coding, Math]

All Benchmark Scores (13)

Benchmark Score
Intelligence Index 10.8
Coding Index 8.5
Math Index 13.7
MMLU-Pro 62.4%
GPQA 41.6%
LiveCodeBench 25.1%
HLE 3.7%
SciCode 20.9%
IFBench 31.5%
LCR 9%
TerminalBench Hard 2.3%
Tau2 17.3%
AIME 2025 13.7%

Data: Artificial Analysis · Updated: March 26, 2026

Frequently Asked Questions (15)

When was Granite 4.0 H Small released?
Granite 4.0 H Small was released on September 22, 2025.
Who created Granite 4.0 H Small?
Granite 4.0 H Small was created by IBM.
How intelligent is Granite 4.0 H Small?
Granite 4.0 H Small scores 11 on the Artificial Analysis Intelligence Index, placing it below average among other open weight non-reasoning models of similar size (median: 12).
How fast is Granite 4.0 H Small?
Granite 4.0 H Small generates output at 432.0 tokens per second (based on the median across providers serving the model), which is well above average compared to other open weight non-reasoning models of similar size (median: 100.5 t/s).
What is the latency of Granite 4.0 H Small?
Granite 4.0 H Small has a time to first token (TTFT) of 10.20s (based on the median across providers serving the model), which is at the higher end compared to other open weight non-reasoning models of similar size (median: 1.49s).
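To see how the latency and speed figures combine, the sketch below estimates end-to-end response time as TTFT plus output length divided by output speed. It uses the median values quoted in this FAQ (10.20s TTFT, 432.0 tok/s) and assumes a constant generation rate after the first token, which is a simplification. The function name `estimated_response_seconds` is hypothetical.

```python
# Median figures quoted in the FAQ above; real values vary by provider.
TTFT_SECONDS = 10.20
TOKENS_PER_SECOND = 432.0


def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = TTFT_SECONDS,
    tokens_per_s: float = TOKENS_PER_SECOND,
) -> float:
    """Rough total time: wait for the first token, then stream the rest.

    Assumes a constant generation rate, which is a simplification.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} output tokens: ~{estimated_response_seconds(n):.2f}s")
```

Under these assumptions the TTFT dominates short responses: a 500-token reply spends roughly 10 seconds waiting and only about 1 second generating.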
How much does Granite 4.0 H Small cost?
Granite 4.0 H Small costs $0.06 per 1M input tokens (very competitive, median: $0.16) and $0.25 per 1M output tokens (better than average, median: $0.40), based on the median across providers serving the model.
What is Granite 4.0 H Small API pricing?
Granite 4.0 H Small costs $0.06 per 1M input tokens and $0.25 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $0.11 per 1M tokens. Pricing may vary by provider.
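The blended rate can be reproduced as a weighted average of the input and output prices. The sketch below uses the median prices quoted above; the function names `blended_price_per_million` and `request_cost_usd` are hypothetical, and actual provider pricing may differ.

```python
# Median prices quoted above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.06
OUTPUT_PRICE_PER_M = 0.25


def blended_price_per_million(input_parts: float = 3, output_parts: float = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output ratio."""
    total_parts = input_parts + output_parts
    return (
        input_parts * INPUT_PRICE_PER_M + output_parts * OUTPUT_PRICE_PER_M
    ) / total_parts


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request with the given token counts."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


if __name__ == "__main__":
    print(f"Blended (3:1): ${blended_price_per_million():.4f} per 1M tokens")
    print(f"10k in / 1k out: ${request_cost_usd(10_000, 1_000):.5f}")
```

At a 3:1 ratio the weighted average is (3 × 0.06 + 1 × 0.25) / 4 = 0.1075, which rounds to the quoted $0.11 per 1M tokens.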
How verbose is Granite 4.0 H Small?
When evaluated on the Intelligence Index, Granite 4.0 H Small generated 2.3M output tokens, which is very competitive compared to other open weight non-reasoning models of similar size (median: 5.3M).
Is Granite 4.0 H Small a reasoning model?
No, Granite 4.0 H Small is not a reasoning model. It provides direct responses without extended chain-of-thought reasoning.
What input modalities does Granite 4.0 H Small support?
Granite 4.0 H Small supports text input.
What output modalities does Granite 4.0 H Small support?
Granite 4.0 H Small supports text output.
Can Granite 4.0 H Small process images?
No, Granite 4.0 H Small does not support image input. It can only process text.
Is Granite 4.0 H Small multimodal?
No, Granite 4.0 H Small is not multimodal. It only supports text input.
What is the context window of Granite 4.0 H Small?
Granite 4.0 H Small has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
Is Granite 4.0 H Small open source?
Yes, Granite 4.0 H Small is an open-weights model. The model weights are publicly available and can be downloaded for self-hosting.