
GPT OSS 20b

gpt-oss-20b is a 21B-parameter open-weight MoE reasoning model from OpenAI with ~4B active parameters, a 128k context window, and native support for chain-of-thought, tools, and structured outputs under Apache 2.0.
Core Model
Chat

You can use gpt-oss-20b as a compact but strong backbone for reasoning-heavy assistants, RAG systems, and agentic workflows that need tool calling, Python/code execution, web search, and structured outputs. It is optimized for lower latency and local or specialized deployments compared with gpt-oss-120b, and can be fine-tuned on consumer or single-node hardware for domain-specific reasoning tasks.

How to Get Started

pip install requests
import requests

# Regolo's OpenAI-compatible chat completions endpoint
api_url = "https://api.regolo.ai/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_REGOLO_KEY",
}
data = {
    "model": "gpt-oss-20b",
    "messages": [
        {
            "role": "user",
            "content": "List the steps to bake a chocolate chip cookie, including ingredients, temperatures, and baking time.",
        }
    ],
    # How much chain-of-thought the model spends: "low", "medium", or "high"
    "reasoning_effort": "low",
}

response = requests.post(api_url, headers=headers, json=data)
print(response.json())
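The call above prints the raw JSON body. In OpenAI-compatible APIs the assistant's text sits under `choices[0].message.content`; a minimal sketch of extracting it, using an illustrative sample response rather than real API output:

```python
def extract_reply(response_json: dict) -> str:
    # OpenAI-compatible chat completions responses carry the assistant's
    # text under choices[0].message.content.
    return response_json["choices"][0]["message"]["content"]

# Illustrative response shape (not captured from the live API):
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "1. Preheat the oven..."}}
    ]
}
print(extract_reply(sample))
```

In practice you would pass `response.json()` from the request above straight into `extract_reply`.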

Applications & Use Cases

  • Text-only chat assistants for support, analytics, and research that require explicit reasoning traces and fine-grained controllability via chain-of-thought.
  • Tool-enabled agents that call functions, browse the web, run Python code, and return structured outputs inside orchestrated workflows and automation backends.
  • Retrieval-augmented generation systems for legal, financial, and technical documents that exploit the 128k context to ingest long briefs, reports, and multi-turn histories.
  • Domain-specialized reasoning models fine-tuned with TRL or similar libraries for math, coding, or multilingual reasoning, running on consumer or single-node GPUs.
  • Governance, evaluation, and oversight models that inspect or critique outputs from other LLMs using full chain-of-thought visibility and configurable reasoning effort.
  • Open-weight enterprise copilots where Apache 2.0 licensing, local deployability, and transparent reasoning are required for compliance, security, and cost control.
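For the tool-enabled agent pattern above, the request declares the functions the model may call in a `tools` array, following the OpenAI-compatible chat completions format. A minimal sketch of building such a payload; the `get_weather` function is hypothetical, purely for illustration:

```python
import json

# Hypothetical tool definition; the name, description, and parameters
# are illustrative, not part of the Regolo API itself.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "gpt-oss-20b",
    "messages": [{"role": "user", "content": "What's the weather in Rome?"}],
    "tools": tools,
    # Let the model decide whether to call a tool or answer directly
    "tool_choice": "auto",
}
print(json.dumps(payload, indent=2))
```

When the model decides to call a tool, the response contains a `tool_calls` entry with the function name and JSON-encoded arguments; your orchestration layer runs the function and sends the result back in a follow-up `tool` message.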