DeepSeek-OCR: New Open-source AI Model Goes Viral On GitHub

A new open-source model named DeepSeek-OCR has been released, challenging the conventional way large models take in text. The model, which was open-sourced yesterday afternoon, has spread quickly through the AI community, gaining over 4,000 stars on GitHub overnight. The core idea of DeepSeek-OCR is to handle text visually, an approach its creators present as a way to improve long-context efficiency, one of the biggest challenges in AI.

How DeepSeek-OCR changes the game

DeepSeek-OCR is not just another text-reading tool. Its power lies in its ability to compress information. According to its creators, the model can take a 1,000-word article and compress it into just 100 visual tokens, a tenfold compression ratio, while still recovering the text with 97% accuracy. The creators also report that a single NVIDIA A100 GPU can process 200,000 pages of data per day with this method. If those results hold up, the approach could signal a significant shift in how input is fed to large models.
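To put those figures in perspective, here is a minimal arithmetic sketch. It uses only the numbers quoted above (1,000 words, 100 visual tokens, 200,000 pages per day). The function names are hypothetical, and treating one word as one text unit is a simplifying assumption, not the creators' own measurement method.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def compression_ratio(text_units: int, vision_tokens: int) -> float:
    """Ratio of text units (here: words, an assumption) to visual tokens."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_units / vision_tokens


def pages_per_second(pages_per_day: int) -> float:
    """Average sustained throughput implied by a daily page count."""
    return pages_per_day / SECONDS_PER_DAY


ratio = compression_ratio(1_000, 100)
rate = pages_per_second(200_000)

print(f"compression ratio: {ratio:.0f}x")
print(f"throughput: {rate:.2f} pages/s ({1 / rate:.3f} s per page)")
```

At the quoted rate, one GPU would average a little over two pages per second around the clock.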

The rapid traction of DeepSeek-OCR was amplified by high-profile endorsements. Andrej Karpathy, a co-founder of OpenAI and former Director of AI at Tesla, shared his excitement about the paper. He called DeepSeek-OCR “a good OCR model” but said the more interesting part lies elsewhere, describing himself as a computer vision researcher at heart who is temporarily “masquerading as a natural language person.”

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.

The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language… https://t.co/AxRXBdoO0F

— Andrej Karpathy (@karpathy) October 20, 2025

Karpathy suggested that this visual-first method may be a better input for large language models than text. He floated the idea that LLMs should use images as their primary input, so that even plain text would be rendered into an image first. In his view, this would lead to much higher information compression and a more general information stream.

Karpathy also emphasized that the DeepSeek-OCR approach could do away with the tokenizer, the component that splits text into tokens before it reaches the model. He argued that tokenizers are ugly, standalone stages that bring in Unicode and byte encoding issues and can even increase security risks.
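The Unicode and byte encoding issues mentioned above can be illustrated with a short sketch. This is not an example from Karpathy or the DeepSeek-OCR paper; it is a standard-library illustration, and the sample word is an arbitrary choice. It shows two strings that would typically render identically on screen but differ at the code-point and byte level, so a byte-based text pipeline sees two different inputs where an image-based one would see the same pixels.

```python
import unicodedata

# "café" written with a single precomposed code point for "é" (U+00E9).
composed = "caf\u00e9"

# The same word with "e" followed by a combining acute accent (U+0301).
decomposed = unicodedata.normalize("NFD", composed)

bytes_composed = composed.encode("utf-8")
bytes_decomposed = decomposed.encode("utf-8")

# Normalizing back to NFC recovers the precomposed form.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

print("equal as strings:", composed == decomposed)
print("code points:", len(composed), "vs", len(decomposed))
print("UTF-8 bytes:", bytes_composed.hex(" "), "vs", bytes_decomposed.hex(" "))
print("equal after NFC normalization:", same_after_nfc)
```

Text pipelines usually handle this with a normalization step before tokenizing, which is one more piece of preprocessing that a pixel-based input would not need.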

He views OCR as just one of many vision-to-text tasks, noting that text-to-text tasks can be recast as vision-to-text tasks, but not the other way around. This sentiment was echoed by Saining Xie, an assistant professor at New York University, who agreed with Karpathy’s views on integrating computer vision and natural language processing.

How to access DeepSeek-OCR

The DeepSeek-OCR model is available as an open-source project on GitHub and Hugging Face under the name deepseek-ai/DeepSeek-OCR. The model, which has 3 billion parameters, is available for download and use with the Hugging Face transformers library. The creators have provided code examples for inference on NVIDIA GPUs, and the repository also includes guidance for PDF processing and model acceleration using vLLM.

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.