DeepSearch

Search, read and reason until best answer found.


DeepSearch API

Fully compatible with OpenAI's Chat API schema; simply swap api.openai.com with deepsearch.jina.ai to get started.

Chat with DeepSearch
Vibe check with a simple chat UI. DeepSearch is best for complex questions that require iterative reasoning, world knowledge, or up-to-date information.

Messages
A list of messages between the user and the assistant comprising the conversation so far. You can add images (webp, png, jpeg) or files (txt, pdf) to a message. Different message types (modalities) are supported: text (.txt, .pdf) and images (.png, .webp, .jpeg). Files up to 10 MB are supported and must be encoded as data URIs upfront.
{
  "role": "user",
  "content": "hi"
}
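
Since the API follows OpenAI's Chat API schema, an attached image would most likely be sent as an OpenAI-style content part with the file embedded as a data URI. A minimal sketch (the content-part structure is an assumption based on that schema, and the base64 payload is truncated for brevity):

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What does this chart show?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."
      }
    }
  ]
}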

Request
curl https://deepsearch.jina.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -d @- <<EOFEOF
  {
    "model": "jina-deepsearch-v1",
    "messages": [
        {
            "role": "user",
            "content": "Hi!"
        },
        {
            "role": "assistant",
            "content": "Hi, how can I help you?"
        },
        {
            "role": "user",
            "content": "what's the latest blog post from jina ai?"
        }
    ],
    "stream": true,
    "reasoning_effort": "medium"
  }
EOFEOF


This is the last chunk of the stream, which contains the final answer, the visited URLs, and the token usage.

Response
200 OK · 196,526 tokens
{
  "id": "1742181758589",
  "object": "chat.completion.chunk",
  "created": 1742181758,
  "model": "jina-deepsearch-v1",
  "system_fingerprint": "fp_1742181758589",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "The latest blog post from Jina AI is titled \"Snippet Selection and URL Ranking in DeepSearch/DeepResearch,\" published on March 12, 2025 [^1]. This post discusses how to improve the quality of DeepSearch by using late-chunking embeddings for snippet selection and rerankers to prioritize URLs before crawling. You can read the full post here: https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch\n\n[^1]: Since our DeepSearch release on February 2nd 2025 we ve discovered two implementation details that greatly improved quality In both cases multilingual embeddings and rerankers are used in an in context manner operating at a much smaller scale than the traditional pre computed indices these models typically require  [jina.ai](https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch)",
        "type": "text",
        "annotations": [
          {
            "type": "url_citation",
            "url_citation": {
              "title": "Snippet Selection and URL Ranking in DeepSearch/DeepResearch",
              "exactQuote": "Since our DeepSearch release on February 2nd 2025, we've discovered two implementation details that greatly improved quality. In both cases, multilingual embeddings and rerankers are used in an _\"in-context\"_ manner - operating at a much smaller scale than the traditional pre-computed indices these models typically require.",
              "url": "https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch",
              "dateTime": "2025-03-13 06:48:01"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 169670,
    "completion_tokens": 27285,
    "total_tokens": 196526
  },
  "visitedURLs": [
    "https://github.com/jina-ai/node-DeepResearch/blob/main/src/utils/url-tools.ts",
    "https://huggingface.co/jinaai/jina-embeddings-v3",
    "https://github.com/jina-ai/reader",
    "https://zilliz.com/blog/training-text-embeddings-with-jina-ai",
    "https://threads.net/@unwind_ai/post/DGmhWCVswbe/media",
    "https://twitter.com/JinaAI_/status/1899840196507820173",
    "https://jina.ai/news?tag=tech-blog",
    "https://docs.llamaindex.ai/en/stable/examples/embeddings/jinaai_embeddings",
    "https://x.com/jinaai_",
    "https://x.com/JinaAI_/status/1899840202358784170",
    "https://tracxn.com/d/companies/jina-ai/__IQ81fOnU0FsDpagFjG-LrG0DMWHELqI6znTumZBQF-A/funding-and-investors",
    "https://jina.ai/models",
    "https://linkedin.com/posts/imohitmayank_jinaai-has-unveiled-the-ultimate-developer-activity-7300401711242711040-VD64",
    "https://medium.com/@tossy21/trying-out-jina-ais-node-deepresearch-c5b55d630ea6",
    "https://huggingface.co/jinaai/jina-clip-v2",
    "https://arxiv.org/abs/2409.10173",
    "https://milvus.io/docs/embed-with-jina.md",
    "https://seedtable.com/best-startups-in-china",
    "https://threads.net/@sung.kim.mw/post/DGhG-J_vREu/jina-ais-a-practical-guide-to-implementing-deepsearchdeepresearchthey-cover-desi",
    "https://elastic.co/search-labs/blog/jina-ai-embeddings-rerank-model-open-inference-api",
    "http://status.jina.ai/",
    "https://apidog.com/blog/recreate-openai-deep-research",
    "https://youtube.com/watch?v=QxHE4af5BQE",
    "https://sdxcentral.com/articles/news/cisco-engages-businesses-on-ai-strategies-at-greater-bay-area-2025/2025/02",
    "https://aws.amazon.com/blogs/machine-learning/build-rag-applications-using-jina-embeddings-v2-on-amazon-sagemaker-jumpstart",
    "https://reddit.com/r/perplexity_ai/comments/1ejbdqa/fastest_open_source_ai_search_engine",
    "https://search.jina.ai/",
    "https://sebastian-petrus.medium.com/build-openais-deep-research-open-source-alternative-4f21aed6d9f0",
    "https://medium.com/@elmo92/jina-reader-transforming-web-content-to-feed-llms-d238e827cc27",
    "https://openai.com/index/introducing-deep-research",
    "https://python.langchain.com/docs/integrations/tools/jina_search",
    "https://varindia.com/news/meta-is-in-talks-for-usd200-billion-ai-data-center-project",
    "https://varindia.com/news/Mira-Murati%E2%80%99s-new-AI-venture-eyes-$9-billion-valuation",
    "https://53ai.com/news/RAG/2025031401342.html",
    "https://arxiv.org/abs/2409.04701",
    "https://bigdatawire.com/this-just-in/together-ai-raises-305m-series-b-to-power-ai-model-training-and-inference",
    "https://github.blog/",
    "https://cdn-uploads.huggingface.co/production/uploads/660c3c5c8eec126bfc7aa326/MvwT9enRT7gOESHA_tpRj.jpeg",
    "https://cdn-uploads.huggingface.co/production/uploads/660c3c5c8eec126bfc7aa326/JNs_DrpFbr6ok_pSRUK4j.jpeg",
    "https://app.dealroom.co/lists/33530",
    "https://api-docs.deepseek.com/news/news250120",
    "https://sdxcentral.com/articles/news/ninjaone-raises-500-million-valued-at-5-billion/2025/02",
    "https://linkedin.com/sharing/share-offsite?url=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
    "https://twitter.com/intent/tweet?url=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
    "https://platform.openai.com/docs/api-reference/chat/create",
    "https://mp.weixin.qq.com/s/-pPhHDi2nz8hp5R3Lm_mww",
    "https://huggingface.us17.list-manage.com/subscribe?id=9ed45a3ef6&u=7f57e683fa28b51bfc493d048",
    "https://automatio.ai/",
    "https://sdk.vercel.ai/docs/introduction",
    "https://app.eu.vanta.com/jinaai/trust/vz7f4mohp0847aho84lmva",
    "https://apply.workable.com/huggingface/j/AF1D4E3FEB",
    "https://facebook.com/sharer/sharer.php?u=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
    "https://facebook.com/sharer/sharer.php?u=http%3A%2F%2F127.0.0.1%3A3000%2Fen-US%2Fnews%2Fsnippet-selection-and-url-ranking-in-deepsearch-deepresearch%2F",
    "https://reddit.com/submit?url=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
    "https://apply.workable.com/huggingface",
    "https://news.ycombinator.com/submitlink?u=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
    "https://news.ycombinator.com/submitlink?u=http%3A%2F%2F127.0.0.1%3A3000%2Fen-US%2Fnews%2Fsnippet-selection-and-url-ranking-in-deepsearch-deepresearch%2F",
    "https://docs.github.com/site-policy/privacy-policies/github-privacy-statement",
    "https://discord.jina.ai/",
    "https://docs.github.com/site-policy/github-terms/github-terms-of-service",
    "https://bigdatawire.com/this-just-in/qumulo-announces-30-million-funding",
    "https://x.ai/blog/grok-3",
    "https://m-ric-open-deep-research.hf.space/",
    "https://youtu.be/sal78ACtGTc?feature=shared&t=52",
    "https://mp.weixin.qq.com/s/apnorBj4TZs3-Mo23xUReQ",
    "https://perplexity.ai/hub/blog/introducing-perplexity-deep-research",
    "https://githubstatus.com/",
    "https://github.blog/changelog/2021-09-30-footnotes-now-supported-in-markdown-fields",
    "https://openai.com/index/introducing-operator",
    "mailto:[email protected]",
    "https://resources.github.com/learn/pathways",
    "https://status.jina.ai/",
    "https://reuters.com/technology/artificial-intelligence/tencents-messaging-app-weixin-launches-beta-testing-with-deepseek-2025-02-16",
    "https://scmp.com/tech/big-tech/article/3298981/baidu-adopts-deepseek-ai-models-chasing-tencent-race-embrace-hot-start",
    "https://microsoft.com/en-us/research/articles/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks",
    "javascript:UC_UI.showSecondLayer();",
    "https://resources.github.com/",
    "https://storm-project.stanford.edu/research/storm",
    "https://blog.google/products/gemini/google-gemini-deep-research",
    "https://youtu.be/vrpraFiPUyA",
    "https://chat.baidu.com/search?extParamsJson=%7B%22enter_type%22%3A%22ai_explore_home%22%7D&isShowHello=1&pd=csaitab&setype=csaitab&usedModel=%7B%22modelName%22%3A%22DeepSeek-R1%22%7D",
    "https://app.dover.com/jobs/jinaai",
    "http://localhost:3000/",
    "https://docs.cherry-ai.com/",
    "https://en.wikipedia.org/wiki/Delayed_gratification",
    "https://support.github.com/?tags=dotcom-footer",
    "https://docs.jina.ai/",
    "https://skills.github.com/",
    "https://partner.github.com/",
    "https://help.x.com/resources/accessibility",
    "https://business.twitter.com/en/help/troubleshooting/how-twitter-ads-work.html",
    "https://business.x.com/en/help/troubleshooting/how-twitter-ads-work.html",
    "https://support.twitter.com/articles/20170514",
    "https://support.x.com/articles/20170514",
    "https://t.co/jnxcxPzndy",
    "https://t.co/6EtEMa9P05",
    "https://help.x.com/using-x/x-supported-browsers",
    "https://legal.twitter.com/imprint.html"
  ],
  "readURLs": [
    "https://jina.ai/news/a-practical-guide-to-implementing-deepsearch-deepresearch",
    "https://github.com/jina-ai/node-DeepResearch",
    "https://huggingface.co/blog/open-deep-research",
    "https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch",
    "https://x.com/jinaai_?lang=en",
    "https://jina.ai/news",
    "https://x.com/joedevon/status/1896984525210837081",
    "https://github.com/jina-ai/node-DeepResearch/blob/main/src/tools/jina-latechunk.ts"
  ],
  "numURLs": 98
}

DeepSearch Parameters Guide

Learn how to set the right parameters and get the best results.

Quality Control

In DeepSearch, there’s generally a trade-off: the more steps the system takes, the higher quality results you’ll get, but you’ll also consume more tokens. This improved quality comes from broader, more exhaustive searches and deeper reflection. Four main parameters control the quality of DeepSearch: budget_tokens, max_attempts, team_size, and reasoning_effort. The reasoning_effort parameter is essentially a preset combination of budget_tokens and max_attempts that’s been carefully tuned. For most users, adjusting reasoning_effort is the simplest approach.
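
For example, the request from the section above can be re-issued with a higher preset for harder questions. A sketch (assuming the same low/medium/high presets as the OpenAI-style reasoning_effort parameter; $JINA_API_KEY is a placeholder for your Jina API key):

curl https://deepsearch.jina.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -d @- <<EOFEOF
  {
    "model": "jina-deepsearch-v1",
    "messages": [
        {
            "role": "user",
            "content": "Compare snippet selection strategies used in open-source DeepResearch implementations."
        }
    ],
    "stream": true,
    "reasoning_effort": "high"
  }
EOFEOF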

Budget Tokens

budget_tokens sets the maximum number of tokens allowed for the entire DeepSearch process. This covers all operations including web searches, reading web pages, reflection, summarization, and coding. Larger budgets naturally lead to better response quality. The DeepSearch process will stop when either the budget is exhausted or it finds a satisfactory answer, whichever comes first. If the budget runs out first, you’ll still get an answer, but it might not be the final, fully-refined response since it hasn’t passed all the quality checks defined by max_attempts.

Max Attempts

max_attempts determines how many times the system will retry to solve a problem during the DeepSearch process. Each time DeepSearch produces an answer, it must pass certain quality tests defined by an internal evaluator. If the answer fails these tests, the evaluator provides feedback, and the system uses this feedback to continue searching and refining the answer. Setting max_attempts too low means you’ll get results quickly, but the quality may suffer since the answer might not pass all quality checks. Setting it too high can make the process feel stuck in an endless retry loop where it keeps attempting and failing.

The system returns a final answer when either budget_tokens or max_attempts is exceeded (whichever happens first), or when the answer passes all tests while still having remaining budget and attempts available.
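
If you prefer explicit control over the reasoning_effort presets, both knobs can be set directly in the request body. A sketch (assuming budget_tokens and max_attempts sit at the top level alongside model and messages, as reasoning_effort does in the example above; the values are illustrative, not recommendations):

{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "What changed between jina-embeddings-v2 and jina-embeddings-v3?"
    }
  ],
  "stream": true,
  "budget_tokens": 200000,
  "max_attempts": 3
}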

Team Size

team_size affects quality in a fundamentally different way than max_attempts and budget_tokens. When team_size is set to more than one, the system decomposes the original problem into sub-problems and researches them independently. Think of it like a map-reduce pattern, where a large job gets broken down into smaller tasks that run in parallel. The final answer is then a synthesis of each worker’s results. We call it “team_size” because it simulates a research team where multiple agents investigate different aspects of the same problem and collaborate on a final report.

Keep in mind that all agents’ token consumption counts toward your total budget_tokens, but each agent has independent max_attempts. This means that with a larger team_size but the same budget_tokens, agents might return answers sooner than expected due to budget constraints. We recommend increasing both team_size and budget_tokens together to give each agent sufficient resources to do thorough work.

Finally, you can think of team_size as controlling the breadth of the search—it determines how many different aspects will be researched. Meanwhile, budget_tokens and max_attempts control the depth of the search—how thoroughly each aspect gets explored.
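
A sketch of a breadth-oriented request that scales team_size and budget_tokens together, as recommended above (field placement and values are illustrative, following the same request-body convention as the examples above):

{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "Survey the current landscape of open-source DeepResearch implementations: architectures, costs, and evaluation results."
    }
  ],
  "stream": true,
  "team_size": 4,
  "budget_tokens": 400000
}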

Source Control

DeepSearch relies heavily on grounding—the sources it uses for information. Quality isn’t just about algorithmic depth and breadth; where DeepSearch gets its information is equally important, and often the deciding factor. Let’s explore the key parameters that control this.

No Direct Answer

no_direct_answer is a simple toggle that prevents the system from returning an answer at step 1. When enabled, it disables the system’s ability to use internal knowledge and forces it to always search the web first. Turning this on will make the system “overthink” even simple questions like “what day is it,” “how are you doing,” or basic factual knowledge that’s definitely in the model’s training data, like “who was the 40th president of the US.”
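
A sketch that forces web grounding even for a question the model could almost certainly answer from its training data (again assuming a top-level boolean field in the request body):

{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "Who was the 40th president of the US?"
    }
  ],
  "stream": true,
  "no_direct_answer": true
}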

Hostname Controls

Three parameters—boost_hostnames, bad_hostnames, and only_hostnames—tell DeepSearch which webpages to prioritize, avoid, or exclusively use. To understand how these work, think about the search-and-read process in DeepSearch:

  1. Search phase: The system searches the web and retrieves a list of website URLs with their snippets
  2. Selection phase: The system decides which URLs to actually visit (it doesn’t visit all of them due to time and cost constraints)
  • boost_hostnames: Domains listed here get higher priority and are more likely to be visited
  • bad_hostnames: These domains will never be visited
  • only_hostnames: When defined, only URLs matching these hostnames will be visited

Here are some important notes on hostname parameters. First, the system always uses snippets returned by search engines as initial clues for building reasoning chains. These hostname parameters only affect which webpages the system visits, not how it formulates search queries.

Second, if the collected URLs don’t contain domains specified in only_hostnames, the system might stop reading webpages entirely. We recommend using these parameters only when you’re familiar with your research question and understand where potential answers are likely to be found (or where they definitely shouldn’t be found).
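
A sketch that steers the selection phase toward official sources and away from a low-quality domain (assuming these fields take lists of hostname strings; the domains are placeholders). only_hostnames follows the same shape but restricts visits exclusively to the listed domains:

{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "What did Jina AI publish about snippet selection in DeepSearch?"
    }
  ],
  "stream": true,
  "boost_hostnames": ["jina.ai", "github.com"],
  "bad_hostnames": ["example-content-farm.com"]
}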

Special Case: Academic Research

For academic research, you might want searches and reads restricted to arxiv.org. In this case, simply set "search_provider": "arxiv" and everything will be grounded on arxiv as the sole source. However, generic or trivial questions may not get efficient answers with this restriction, so only use "search_provider": "arxiv" for serious academic research.
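
A sketch of an arxiv-grounded request, using the "search_provider": "arxiv" setting described above:

{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "What are the most recent approaches to late chunking for long-document embeddings?"
    }
  ],
  "stream": true,
  "search_provider": "arxiv"
}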

Search Language Code

search_language_code is another parameter that affects web sources by forcing the system to generate queries in a specific language, regardless of the original input or intermediate reasoning steps. Generally, the system automatically decides the query language to get the best search coverage, but sometimes manual control is useful.

Use Cases for Language Control

International market research: When studying a local brand or company’s impact in international markets, you can force queries to always use English with "search_language_code": "en" for global coverage, or use the local language for more tailored regional information.

Global research with non-English prompts: If your input is always in Chinese or Japanese (because your end users primarily speak these languages), but your research scope is global rather than just local Chinese or Japanese websites, the system might automatically lean toward your prompt’s language. Use this parameter to force English queries for broader international coverage.
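
A sketch of the second use case: a Chinese prompt (asking how a brand is perceived in overseas markets) whose intermediate search queries are forced into English for global coverage, using the "search_language_code": "en" setting described above:

{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "这个品牌在海外市场的口碑如何？"
    }
  ],
  "stream": true,
  "search_language_code": "en"
}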

Chat with DeepSearch

Vibe check with a simple chat UI. DeepSearch is best for complex questions that require iterative reasoning, world knowledge, or up-to-date information.
We've just launched a new DeepSearch UI that's lightning-fast, minimalist, and free. Check it out at https://search.jina.ai.

Chat Clients
For the best experience, we recommend using professional chat clients. DeepSearch is fully compatible with OpenAI's Chat API schema, making it easy to use with any OpenAI-compatible client:
  • Chatwise
  • Cherry Studio
  • Chatbox
  • LobeChat
  • NextChat

What is DeepSearch?

DeepSearch combines web searching, reading, and reasoning for comprehensive investigation. Think of it as an agent you give a research task to: it searches extensively and works through multiple iterations before providing an answer.

Standard LLMs
  • Cost: about 1,000 tokens
  • Latency: about 1 s
  • Best for: quick answers to general knowledge questions
  • Limitation: cannot access real-time or post-training information
  • Answers are generated purely from pretrained knowledge with a fixed cutoff date

RAG and Grounded LLMs
  • Cost: about 10,000 tokens
  • Latency: about 3 s
  • Best for: questions requiring current or domain-specific information
  • Limitation: struggles with complex questions requiring multi-hop reasoning
  • Answers are generated by summarizing single-pass search results
  • Can access current information beyond the training cutoff

DeepSearch
  • Cost: about 500,000 tokens
  • Latency: about 50 s
  • Best for: complex questions requiring thorough research and reasoning
  • Note: takes longer than simple LLM or RAG approaches
  • Autonomous agent that iteratively searches, reads, and reasons
  • Dynamically decides next steps based on current findings
  • Self-evaluates answer quality before returning results
  • Can perform deep dives into topics through multiple search and reasoning cycles

API Pricing

API pricing is based on token usage. One API key gives you access to all search foundation products.

With Jina Search Foundation API
The easiest way to access all of our products. Top up tokens as you go. Depending on your location, you may be charged in USD, EUR, or other currencies; taxes may apply.

Rate limits are the maximum number of requests that can be made to an API within a minute per IP address/API key (RPM). Find out more about the rate limits for each product and tier below.
Rate Limit
Rate limits are tracked in two ways: RPM (requests per minute) and TPM (tokens per minute). Limits are enforced per IP address/API key and are triggered when either the RPM or TPM threshold is reached first. When you provide an API key in the request header, we track rate limits by key rather than by IP address.

| Product | API Endpoint | Description | w/o API Key | w/ API Key | w/ Premium API Key | Average Latency | Token Usage Counting | Allowed Request |
|---|---|---|---|---|---|---|---|---|
| Reader API | https://r.jina.ai | Convert URL to LLM-friendly text | 20 RPM | 500 RPM | 5,000 RPM | 7.9 s | Number of tokens in the output response | GET/POST |
| Reader API | https://s.jina.ai | Search the web and convert results to LLM-friendly text | n/a | 100 RPM | 1,000 RPM | 2.5 s | Every request costs a fixed number of tokens, starting from 10,000 tokens | GET/POST |
| DeepSearch | https://deepsearch.jina.ai/v1/chat/completions | Reason, search and iterate to find the best answer | n/a | 50 RPM | 500 RPM | 56.7 s | Total number of tokens used in the whole process | POST |
| Embedding API | https://api.jina.ai/v1/embeddings | Convert text/images to fixed-length vectors | n/a | 500 RPM & 1,000,000 TPM | 2,000 RPM & 5,000,000 TPM | depends on the input size | Number of tokens in the input request | POST |
| Reranker API | https://api.jina.ai/v1/rerank | Rank documents by query | n/a | 500 RPM & 1,000,000 TPM | 2,000 RPM & 5,000,000 TPM | depends on the input size | Number of tokens in the input request | POST |
| Classifier API | https://api.jina.ai/v1/train | Train a classifier using labeled examples | n/a | 20 RPM & 200,000 TPM | 60 RPM & 1,000,000 TPM | depends on the input size | input_tokens × num_iters | POST |
| Classifier API (Few-shot) | https://api.jina.ai/v1/classify | Classify inputs using a trained few-shot classifier | n/a | 20 RPM & 200,000 TPM | 60 RPM & 1,000,000 TPM | depends on the input size | input_tokens | POST |
| Classifier API (Zero-shot) | https://api.jina.ai/v1/classify | Classify inputs using zero-shot classification | n/a | 200 RPM & 500,000 TPM | 1,000 RPM & 3,000,000 TPM | depends on the input size | input_tokens + label_tokens | POST |
| Segmenter API | https://api.jina.ai/v1/segment | Tokenize and segment long text | 20 RPM | 200 RPM | 1,000 RPM | 0.3 s | Tokens are not counted as usage | GET/POST |

FAQ

What is DeepSearch?
DeepSearch is an LLM API that performs iterative search, reading, and reasoning until it finds an accurate answer to a query or reaches its token budget limit.

How is DeepSearch different from OpenAI's and Gemini's deep research capabilities?
Unlike OpenAI and Gemini, DeepSearch specifically focuses on delivering accurate answers through iteration rather than generating long-form articles. It's optimized for quick, precise answers from deep web search rather than creating comprehensive reports.

What API key do I need to use DeepSearch?
You need a Jina API key. We offer 10M free tokens for new API keys.

What happens when DeepSearch reaches its token budget? Does it return an incomplete answer?
It generates a final answer based on all accumulated knowledge, rather than just giving up or returning an incomplete response.

Does DeepSearch guarantee accurate answers?
No. While it uses an iterative search process to improve accuracy, evaluation shows it achieves a 75% pass rate on test questions, significantly better than the 0% baseline (gemini-2.0-flash) but not perfect.

How long does a typical DeepSearch query take?
It varies significantly: queries can take anywhere from 1 to 42 steps, with an average of 4 steps (roughly 20 seconds) based on evaluation data. Simple queries might resolve quickly, while complex research questions can involve many iterations and take up to 120 seconds.

Can DeepSearch work with any OpenAI-compatible client like Chatwise, CherryStudio or ChatBox?
Yes, the official DeepSearch API at deepsearch.jina.ai/v1/chat/completions is fully compatible with the OpenAI API schema, using 'jina-deepsearch-v1' as the model name. It is therefore very easy to switch from OpenAI to DeepSearch with local clients or any OpenAI-compatible client. We highly recommend Chatwise for a seamless experience.

What are the rate limits for the API?
Rate limits vary by API key tier; see the rate limit table above for the RPM allowed per product and tier. This is important to consider for applications with high query volumes.

What is the content inside the <think> tag?
DeepSearch wraps its thinking steps in <think>...</think> XML tags and provides the final answer afterward, following the OpenAI streaming format but with these special markers for the chain of thought.

Does DeepSearch use Jina Reader for web search and reading?
Yes. Jina Reader is used for web search and reading, giving the system the ability to efficiently access and process web content.

Why does DeepSearch use so many tokens for my queries?
The token usage of DeepSearch on complex queries is admittedly high, averaging 70,000 tokens compared to about 500 for basic LLM responses. This reflects the depth of research but also has cost implications.

Is there a way to control or limit the number of steps?
The system is primarily controlled by token budget rather than step count. Once the token budget is exceeded, it enters Beast Mode for final answer generation. See reasoning_effort above for more details.

How reliable are the references in the answers?
References are considered so important that if an answer is deemed definitive but lacks references, the system continues searching rather than accepting the answer.

Can DeepSearch handle questions about future events?
Yes, but it requires extensive research steps. The example of "who will be president in 2028" shows it can handle speculative questions through multiple research iterations, though accuracy isn't guaranteed for such predictions.

How to get my API key?


What's the rate limit?

See the Rate Limit table above.
API-related common questions
Can I use the same API key for reader, embedding, reranking, classifying and fine-tuning APIs?
Yes, the same API key is valid for all search foundation products from Jina AI. This includes the reader, embedding, reranking, classifying and fine-tuning APIs, with tokens shared across all services.

Can I monitor the token usage of my API key?
Yes, token usage can be monitored in the 'API Key & Billing' tab by entering your API key, allowing you to view the recent usage history and remaining tokens. If you have logged in to the API dashboard, these details can also be viewed in the 'Manage API Key' tab.

What should I do if I forget my API key?
If you have misplaced a topped-up key and wish to retrieve it, please contact support AT jina.ai with your registered email for assistance. We recommend logging in so your API key is securely stored and easily accessible.

Do API keys expire?
No, our API keys do not have an expiration date. However, if you suspect your key has been compromised and wish to retire it, please contact our support team for assistance. You can also revoke your key in the API Key Management dashboard.

Can I transfer tokens between API keys?
Yes, you can transfer tokens from one premium key to another. After logging into your account on the API Key Management dashboard, use the settings of the key you want to transfer out of to move all remaining paid tokens.

Can I revoke my API key?
Yes, you can revoke your API key if you believe it has been compromised. Revoking a key will immediately disable it for all users who have stored it, and all remaining balance and associated properties will be permanently unusable. If the key is a premium key, you have the option to transfer the remaining paid balance to another key before revocation. Note that this action cannot be undone. To revoke a key, go to the key settings in the API Key Management dashboard.

Why is the first request for some models slow?
This is because our serverless architecture offloads certain models during periods of low usage. The initial request activates or 'warms up' the model, which may take a few seconds. After this initial activation, subsequent requests process much more quickly.

Is user input data used for training your models?
We adhere to a strict privacy policy and do not use user input data for training our models. We are also SOC 2 Type I and Type II compliant, ensuring high standards of security and privacy.
Billing-related common questions
Is billing based on the number of sentences or requests?
Our pricing model is based on the total number of tokens processed, allowing users the flexibility to allocate these tokens across any number of sentences, offering a cost-effective solution for diverse text analysis requirements.

Is there a free trial available for new users?
We offer a free trial to new users, which includes ten million tokens for use with any of our models, facilitated by an auto-generated API key. Once the free token limit is reached, users can easily purchase additional tokens for their API keys via the 'Buy tokens' tab.

Are tokens charged for failed requests?
No, tokens are not deducted for failed requests.

What payment methods are accepted?
Payments are processed through Stripe, supporting a variety of payment methods including credit cards, Google Pay, and PayPal for your convenience.

Is invoicing available for token purchases?
Yes, an invoice will be issued to the email address associated with your Stripe account upon the purchase of tokens.
Offices
location_on
Sunnyvale, CA
710 Lakeway Dr, Ste 200, Sunnyvale, CA 94085, USA
location_on
Berlin, Germany (HQ)
Prinzessinnenstraße 19-20, 10969 Berlin, Germany
location_on
Beijing, China
Level 5, Building 6, No.48 Haidian West St. Beijing, China
location_on
Shenzhen, China
402 Floor 4, Fu'an Technology Building, Shenzhen, China
Search Foundation
Reader
Embeddings
Reranker
DeepSearch
Classifier
Segmenter
API Documentation
Get Jina API key
Rate Limit
API Status
Company
About us
Contact sales
Newsroom
Intern program
Join us
open_in_new
Download logo
open_in_new
Terms
Security
Terms & Conditions
Privacy
Manage Cookies
email
Jina AI © 2020-2025.