DeepSeek-OCR-2

DeepSeek‑OCR‑2 is a 3B‑parameter vision–language model released under the Apache‑2.0 license. Built on DeepEncoder V2, it delivers state‑of‑the‑art document OCR and layout understanding while using up to 20× fewer tokens, and it supports industrial‑scale PDF ingestion.
Core Model
OCR

How to Get Started

pip install requests
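import requests

# Regolo chat-completions endpoint
api_url = "https://api.regolo.ai/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_REGOLO_KEY",  # replace with your API key
}
data = {
    # Model ID as given in this snippet; swap in the ID of the model you want to call.
    "model": "minimax-m2.5",
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of Italy, and which region does it belong to?",
        }
    ],
    "reasoning_effort": "low",
}

# Send the request and raise on HTTP errors rather than printing an error body.
response = requests.post(api_url, headers=headers, json=data, timeout=60)
response.raise_for_status()
print(response.json())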
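
The request above sends plain text. For OCR you would normally attach a page image as well. The following is a minimal sketch of how such a request body could be built, assuming the endpoint accepts OpenAI‑style `image_url` content parts with a base64 data URL; that assumption, the helper name `build_ocr_payload`, and the placeholder model ID are illustrative and not confirmed by this page.

```python
import base64


def build_ocr_payload(image_bytes, model, prompt="Convert the document to markdown",
                      mime_type="image/png"):
    """Build a chat-completions body with one image part and one text prompt.

    Assumption: the API accepts OpenAI-style multimodal content
    (a list of {"type": "image_url"} and {"type": "text"} parts).
    """
    # Encode the raw image bytes as a base64 data URL.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "model": model,  # hypothetical: use the OCR model ID from your account
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": data_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }
```

To send it, read the page image with `open("page.png", "rb").read()`, build the payload with your model ID, and pass the result as `json=` to `requests.post` with the same `api_url` and `headers` as above.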

Applications & Use Cases

Privacy‑sensitive enterprise workflows where an open, Apache‑licensed OCR model can be deployed on‑premises without proprietary dependencies.

High‑fidelity OCR for complex documents on OmniDocBench‑style workloads, including tables, multi-column layouts, charts, and formulas.

Large‑scale document ingestion pipelines that process up to ~200k pages per day per A100 GPU thanks to vision-token compression and dynamic resolution.

Markdown, HTML, or JSON structured extraction from PDFs and images using prompts like “Convert the document to markdown” via the built‑in infer API.

Domain‑specific OCR models fine‑tuned with Unsloth, with reported character error rate reductions of 57–86% on challenging languages such as Persian.
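
For capacity planning, the ~200k pages per day per A100 figure quoted above can be turned into a per‑second rate and a rough GPU count. This is a back‑of‑the‑envelope sketch: it takes the quoted figure as an upper bound, assumes throughput scales linearly across GPUs, and uses illustrative function names that are not part of any library.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def pages_per_second(pages_per_day):
    """Convert a daily page throughput into pages per second."""
    return pages_per_day / SECONDS_PER_DAY


def gpus_needed(total_pages, days, pages_per_gpu_per_day=200_000):
    """Estimate GPUs needed to clear a backlog within a deadline.

    Assumes linear scaling and the quoted per-GPU daily throughput.
    """
    return math.ceil(total_pages / (pages_per_gpu_per_day * days))


print(round(pages_per_second(200_000), 2))  # about 2.31 pages per second per GPU
print(gpus_needed(1_000_000, days=1))       # 5 GPUs for one million pages in a day
```

Real throughput depends on page complexity and resolution, so treat the result as a lower bound on hardware.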