Qwen3-Reranker-4B

Qwen3-Reranker-4B is a 4B-parameter text reranking model built to improve retrieval precision in multilingual and code search workflows. It supports a 32K context window, 100+ languages, and instruction-aware ranking, which makes it well suited to search, RAG, and enterprise retrieval systems.

Core Model

Rerank

Qwen3-Reranker-4B is a text reranking model in the Qwen3 Embedding and Reranker series, built specifically for ranking tasks on top of the dense Qwen3 foundation models. It is designed for multilingual retrieval, long-text understanding, and reasoning-heavy ranking workflows, with support for 100+ languages and a 32K context window. The model is also instruction-aware, which lets developers guide ranking behavior for specific tasks, domains, and languages.
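One way to picture the instruction-aware input is a small helper that assembles the query and document strings. This is a minimal sketch: the `<Instruct>:`, `<Query>:` and `<Document>:` prefixes are copied from the request example in the getting-started section, while the function names `build_query` and `build_document` are hypothetical and not part of any API.

```python
def build_query(instruction: str, query: str) -> str:
    """Combine a task instruction and a user query into one string."""
    return f"<Instruct>: {instruction}\n<Query>: {query}"


def build_document(text: str) -> str:
    """Prefix a candidate passage with the document tag."""
    return f"<Document>: {text}"


query = build_query(
    "Given a web search query, retrieve relevant passages that answer the query.",
    "What is the capital of Italy?",
)
document = build_document("The capital of Italy is Rome, located in the Lazio region.")
print(query)
print(document)
```

Swapping the instruction text (for example, to describe a legal or code-search task) is how a caller would steer ranking toward a specific domain without changing anything else in the request.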

How to get started

pip install requests

import requests


api_url = "https://api.regolo.ai/v1/rerank"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_REGOLO_KEY"
}
data = {
  "model": "Qwen3-Reranker-4B",
  "query": "<Instruct>: Given a web search query, retrieve relevant passages that answer the query.\n<Query>: What is the capital of Italy?",
  "documents": [
    "<Document>: Italy is known for its beautiful landscapes, including the Dolomites and the Amalfi Coast.",
    "<Document>: The Italian national football team has won several European Championships.",
    "<Document>: Pizza Margherita is a traditional dish from Naples, Italy.",
    "<Document>: Venice is a city in northeastern Italy famous for its canals and architecture.",
    "<Document>: The capital of Italy is Rome, located in the Lazio region."
  ],
  "top_n": 3
}

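# Send the request; the timeout value is an arbitrary choice for this example.
response = requests.post(api_url, headers=headers, json=data, timeout=30)
response.raise_for_status()  # fail early on HTTP errors such as an invalid key

results = response.json().get('results', [])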

for res in results:
    score = res['relevance_score']
    clean_text = res['document']['text'].replace("<Document>: ", "")
    print(f"Score: {score:.4f} | Text: {clean_text}")

Output (raw JSON response)

{
  "id": "rerank-8d73d7edf26226f7",
  "results": [
    {
      "index": 4,
      "relevance_score": 0.8142903447151184,
      "document": {
        "text": "<Document>: The capital of Italy is Rome, located in the Lazio region."
      }
    },
    {
      "index": 1,
      "relevance_score": 0.5912639498710632,
      "document": {
        "text": "<Document>: The Italian national football team has won several European Championships."
      }
    },
    {
      "index": 2,
      "relevance_score": 0.5660451650619507,
      "document": {
        "text": "<Document>: Pizza Margherita is a traditional dish from Naples, Italy."
      }
    }
  ],
  "meta": null
}
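In the response above, each `index` appears to be the zero-based position of the document in the submitted `documents` list (index 4 is the fifth document, the Rome passage). Assuming that holds, the scores can be joined back to the original, unprefixed records instead of stripping the `<Document>: ` tag from the returned text. The helper name `attach_originals` is hypothetical, and the document texts are shortened here for brevity.

```python
from typing import Any


def attach_originals(
    documents: list[str], results: list[dict[str, Any]]
) -> list[tuple[float, str]]:
    """Pair each relevance score with the document originally submitted at `index`."""
    return [(res["relevance_score"], documents[res["index"]]) for res in results]


# Documents in the order they were sent (shortened from the request above).
documents = [
    "Italy is known for its beautiful landscapes.",
    "The Italian national football team has won several European Championships.",
    "Pizza Margherita is a traditional dish from Naples, Italy.",
    "Venice is a city in northeastern Italy.",
    "The capital of Italy is Rome, located in the Lazio region.",
]
# Only the fields used here, copied from the response above.
results = [
    {"index": 4, "relevance_score": 0.8142903447151184},
    {"index": 1, "relevance_score": 0.5912639498710632},
    {"index": 2, "relevance_score": 0.5660451650619507},
]

for score, text in attach_originals(documents, results):
    print(f"{score:.4f}  {text}")
```

The same lookup works when `documents` is a list of richer records (IDs, URLs, metadata), since only the position is needed to recover the original item.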

Applications & Use Cases

Second-stage reranking for search and RAG pipelines that need higher relevance after first-pass retrieval.
Multilingual enterprise search across documents, FAQs, and internal knowledge bases.
Code retrieval and developer search systems that rank code and documentation results more precisely.
Long-document retrieval workflows that benefit from 32K context handling.
Domain-specific ranking systems that use task instructions to tune retrieval behavior.
Text classification, clustering, and bitext-mining pipelines when paired with Qwen3 embedding models.
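The second-stage reranking use case can be sketched as a two-step function: a cheap first pass narrows the corpus, then a reranker reorders the survivors. Everything named here is an assumption for illustration: `keyword_first_pass` stands in for whatever retriever a real system uses, and `phrase_match_reranker` is an offline placeholder for a function that would wrap the rerank request shown in the getting-started section.

```python
from typing import Callable

# A reranker takes (query, candidates, top_n) and returns
# (candidate_index, score) pairs, best first.
Reranker = Callable[[str, list[str], int], list[tuple[int, float]]]


def keyword_first_pass(query: str, corpus: list[str], k: int) -> list[str]:
    """Hypothetical first stage: rank by the number of shared lowercase words."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def two_stage_search(
    query: str, corpus: list[str], rerank: Reranker, k: int = 3, top_n: int = 2
) -> list[tuple[str, float]]:
    """Retrieve k candidates cheaply, then keep the reranker's top_n."""
    candidates = keyword_first_pass(query, corpus, k)
    ranked = rerank(query, candidates, top_n)
    return [(candidates[i], score) for i, score in ranked]


def phrase_match_reranker(query: str, candidates: list[str], top_n: int) -> list[tuple[int, float]]:
    """Placeholder scorer: 1.0 if the query appears verbatim, else 0.0."""
    scores = [(i, 1.0 if query.lower() in c.lower() else 0.0) for i, c in enumerate(candidates)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


corpus = [
    "Italy is known for its beautiful landscapes.",
    "The capital of Italy is Rome.",
    "Pizza Margherita is a traditional dish from Naples, Italy.",
    "Python is a programming language.",
]
hits = two_stage_search("capital of Italy", corpus, phrase_match_reranker)
for text, score in hits:
    print(f"{score:.1f}  {text}")
```

Passing the reranker in as a callable keeps the pipeline testable offline; in a deployment, the placeholder would be replaced by a function that posts the candidates to the rerank endpoint and returns each result's `index` and `relevance_score`.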