Skip to content

Rerank

The Rerank API lets you re‑order a list of documents (or passages) according to their relevance to a given query.
It is powered by models such as Qwen3‑Reranker‑4B and returns the top‑N most relevant documents together with a relevance score.

When to use it – After you have retrieved a large set of candidate documents (e.g., with BM25, vector search, or another LLM), feed them to the Rerank endpoint to obtain a concise, high‑quality ranking that can be directly presented to users or passed to a downstream LLM for answer generation.


API Call Parameters

Tip

Important note – The endpoint expects a JSON payload and the total request size (including all documents) must stay lighter as possible. If you have many candidates, consider chunking them and calling the API multiple times.

Parameter Type Description
model string (required) Identifier of the reranker model, e.g., Qwen3‑Reranker‑4B.
query string (required) The user’s question or search query.
documents array[string] (required) List of candidate documents/passages to be reranked.
top_n integer (optional) Number of highest‑scoring documents to return.

Technical Explanation of Prompt Tags

To achieve optimal performance with the Qwen3-Reranker model, the input strings must follow a specific prompt template. This structure allows the underlying Cross-Encoder to differentiate between instructions, user intent, and the content to be ranked.

Tag Purpose Description
<Instruct> Task Specification Defines the context of the retrieval (e.g., "retrieve relevant passages"). This guides the model's focus based on the specific use case.
<Query> User Intent Marks the beginning of the actual question or search term. This helps the model isolate the core information the user is looking for.
<Document> Content Boundary Identifies each candidate passage. By explicitly tagging documents, the model better understands where one passage ends and another begins during the scoring process.

Why these tags are necessary

Modern LLM-based rerankers are trained using these semantic markers to improve zero-shot accuracy. Omitting these tags or using inconsistent formatting between the query and the documents can lead to sub-optimal relevance scores, as the model might fail to distinguish between the instruction and the data.


  curl --request POST \
    --url https://api.regolo.ai/rerank \
    --header 'Authorization: Bearer REGOLO_API_KEY' \
    --header 'Content-Type: application/json' \
    --data '{
      "model": "Qwen3-Reranker-4B",
      "query": "<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n<Query>: What is the capital of China?",
      "documents": [
        "<Document>: The capital of China is Beijing.",
        "<Document>: Gravity is a force that attracts two bodies towards each other..."
      ],
      "top_n": 5
    }'
import requests

api_key = "REGOLO_API_KEY"
url = "https://api.regolo.ai/rerank"

task = "Given a web search query, retrieve relevant passages that answer the query"
query_text = "What is the capital of China?"

documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other..."
]

payload = {
    "model": "Qwen3-Reranker-4B",
    "query": f"<Instruct>: {task}\n<Query>: {query_text}",
    "documents": [f"<Document>: {doc}" for doc in documents],
    "top_n": 5
}

response = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {api_key}"}
)

results = response.json().get('results', [])
for res in results:
    score = res['relevance_score']
    clean_text = res['document']['text'].replace("<Document>: ", "")
    print(f"Score: {score:.4f} | Text: {clean_text}")

Response

{
  "id": "rerank-8b37844a3beeecb7",
  "results": [
    {
      "index": 0,
      "relevance_score": 0.8835278153419495,
      "document": {
        "text": "<Document>: The capital of China is Beijing."
      }
    },
    {
      "index": 1,
      "relevance_score": 0.08649543672800064,
      "document": {
        "text": "<Document>: Gravity is a force that attracts two bodies towards each other..."
      }
    }
  ],
  "meta": null
}