DeepSearch API
The DeepSearch API is compatible with the OpenAI Chat API schema: simply replace api.openai.com with deepsearch.jina.ai to get started. Thinking depth is controlled with the reasoning_effort parameter, described in the Parameters Guide below. A request body carries a standard list of chat messages, where the simplest message looks like:
  {
    "role": "user",
    "content": "hi"
  }
curl https://deepsearch.jina.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d @- <<EOF
{
  "model": "jina-deepsearch-v1",
  "messages": [
    {
      "role": "user",
      "content": "Hi!"
    },
    {
      "role": "assistant",
      "content": "Hi, how can I help you?"
    },
    {
      "role": "user",
      "content": "what's the latest blog post from jina ai?"
    }
  ],
  "stream": true,
  "reasoning_effort": "medium"
}
EOF
{
"id": "1742181758589",
"object": "chat.completion.chunk",
"created": 1742181758,
"model": "jina-deepsearch-v1",
"system_fingerprint": "fp_1742181758589",
"choices": [
{
"index": 0,
"delta": {
"content": "The latest blog post from Jina AI is titled \"Snippet Selection and URL Ranking in DeepSearch/DeepResearch,\" published on March 12, 2025 [^1]. This post discusses how to improve the quality of DeepSearch by using late-chunking embeddings for snippet selection and rerankers to prioritize URLs before crawling. You can read the full post here: https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch\n\n[^1]: Since our DeepSearch release on February 2nd 2025 we ve discovered two implementation details that greatly improved quality In both cases multilingual embeddings and rerankers are used in an in context manner operating at a much smaller scale than the traditional pre computed indices these models typically require [jina.ai](https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch)",
"type": "text",
"annotations": [
{
"type": "url_citation",
"url_citation": {
"title": "Snippet Selection and URL Ranking in DeepSearch/DeepResearch",
"exactQuote": "Since our DeepSearch release on February 2nd 2025, we've discovered two implementation details that greatly improved quality. In both cases, multilingual embeddings and rerankers are used in an _\"in-context\"_ manner - operating at a much smaller scale than the traditional pre-computed indices these models typically require.",
"url": "https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch",
"dateTime": "2025-03-13 06:48:01"
}
}
]
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 169670,
"completion_tokens": 27285,
"total_tokens": 196526
},
"visitedURLs": [
"https://github.com/jina-ai/node-DeepResearch/blob/main/src/utils/url-tools.ts",
"https://huggingface.co/jinaai/jina-embeddings-v3",
"https://github.com/jina-ai/reader",
"https://zilliz.com/blog/training-text-embeddings-with-jina-ai",
"https://threads.net/@unwind_ai/post/DGmhWCVswbe/media",
"https://twitter.com/JinaAI_/status/1899840196507820173",
"https://jina.ai/news?tag=tech-blog",
"https://docs.llamaindex.ai/en/stable/examples/embeddings/jinaai_embeddings",
"https://x.com/jinaai_",
"https://x.com/JinaAI_/status/1899840202358784170",
"https://tracxn.com/d/companies/jina-ai/__IQ81fOnU0FsDpagFjG-LrG0DMWHELqI6znTumZBQF-A/funding-and-investors",
"https://jina.ai/models",
"https://linkedin.com/posts/imohitmayank_jinaai-has-unveiled-the-ultimate-developer-activity-7300401711242711040-VD64",
"https://medium.com/@tossy21/trying-out-jina-ais-node-deepresearch-c5b55d630ea6",
"https://huggingface.co/jinaai/jina-clip-v2",
"https://arxiv.org/abs/2409.10173",
"https://milvus.io/docs/embed-with-jina.md",
"https://seedtable.com/best-startups-in-china",
"https://threads.net/@sung.kim.mw/post/DGhG-J_vREu/jina-ais-a-practical-guide-to-implementing-deepsearchdeepresearchthey-cover-desi",
"https://elastic.co/search-labs/blog/jina-ai-embeddings-rerank-model-open-inference-api",
"http://status.jina.ai/",
"https://apidog.com/blog/recreate-openai-deep-research",
"https://youtube.com/watch?v=QxHE4af5BQE",
"https://sdxcentral.com/articles/news/cisco-engages-businesses-on-ai-strategies-at-greater-bay-area-2025/2025/02",
"https://aws.amazon.com/blogs/machine-learning/build-rag-applications-using-jina-embeddings-v2-on-amazon-sagemaker-jumpstart",
"https://reddit.com/r/perplexity_ai/comments/1ejbdqa/fastest_open_source_ai_search_engine",
"https://search.jina.ai/",
"https://sebastian-petrus.medium.com/build-openais-deep-research-open-source-alternative-4f21aed6d9f0",
"https://medium.com/@elmo92/jina-reader-transforming-web-content-to-feed-llms-d238e827cc27",
"https://openai.com/index/introducing-deep-research",
"https://python.langchain.com/docs/integrations/tools/jina_search",
"https://varindia.com/news/meta-is-in-talks-for-usd200-billion-ai-data-center-project",
"https://varindia.com/news/Mira-Murati%E2%80%99s-new-AI-venture-eyes-$9-billion-valuation",
"https://53ai.com/news/RAG/2025031401342.html",
"https://arxiv.org/abs/2409.04701",
"https://bigdatawire.com/this-just-in/together-ai-raises-305m-series-b-to-power-ai-model-training-and-inference",
"https://github.blog/",
"https://cdn-uploads.huggingface.co/production/uploads/660c3c5c8eec126bfc7aa326/MvwT9enRT7gOESHA_tpRj.jpeg",
"https://cdn-uploads.huggingface.co/production/uploads/660c3c5c8eec126bfc7aa326/JNs_DrpFbr6ok_pSRUK4j.jpeg",
"https://app.dealroom.co/lists/33530",
"https://api-docs.deepseek.com/news/news250120",
"https://sdxcentral.com/articles/news/ninjaone-raises-500-million-valued-at-5-billion/2025/02",
"https://linkedin.com/sharing/share-offsite?url=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
"https://twitter.com/intent/tweet?url=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
"https://platform.openai.com/docs/api-reference/chat/create",
"https://mp.weixin.qq.com/s/-pPhHDi2nz8hp5R3Lm_mww",
"https://huggingface.us17.list-manage.com/subscribe?id=9ed45a3ef6&u=7f57e683fa28b51bfc493d048",
"https://automatio.ai/",
"https://sdk.vercel.ai/docs/introduction",
"https://app.eu.vanta.com/jinaai/trust/vz7f4mohp0847aho84lmva",
"https://apply.workable.com/huggingface/j/AF1D4E3FEB",
"https://facebook.com/sharer/sharer.php?u=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
"https://facebook.com/sharer/sharer.php?u=http%3A%2F%2F127.0.0.1%3A3000%2Fen-US%2Fnews%2Fsnippet-selection-and-url-ranking-in-deepsearch-deepresearch%2F",
"https://reddit.com/submit?url=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
"https://apply.workable.com/huggingface",
"https://news.ycombinator.com/submitlink?u=https%3A%2F%2Fjina.ai%2Fnews%2Fa-practical-guide-to-implementing-deepsearch-deepresearch%2F",
"https://news.ycombinator.com/submitlink?u=http%3A%2F%2F127.0.0.1%3A3000%2Fen-US%2Fnews%2Fsnippet-selection-and-url-ranking-in-deepsearch-deepresearch%2F",
"https://docs.github.com/site-policy/privacy-policies/github-privacy-statement",
"https://discord.jina.ai/",
"https://docs.github.com/site-policy/github-terms/github-terms-of-service",
"https://bigdatawire.com/this-just-in/qumulo-announces-30-million-funding",
"https://x.ai/blog/grok-3",
"https://m-ric-open-deep-research.hf.space/",
"https://youtu.be/sal78ACtGTc?feature=shared&t=52",
"https://mp.weixin.qq.com/s/apnorBj4TZs3-Mo23xUReQ",
"https://perplexity.ai/hub/blog/introducing-perplexity-deep-research",
"https://githubstatus.com/",
"https://github.blog/changelog/2021-09-30-footnotes-now-supported-in-markdown-fields",
"https://openai.com/index/introducing-operator",
"mailto:[email protected]",
"https://resources.github.com/learn/pathways",
"https://status.jina.ai/",
"https://reuters.com/technology/artificial-intelligence/tencents-messaging-app-weixin-launches-beta-testing-with-deepseek-2025-02-16",
"https://scmp.com/tech/big-tech/article/3298981/baidu-adopts-deepseek-ai-models-chasing-tencent-race-embrace-hot-start",
"https://microsoft.com/en-us/research/articles/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks",
"javascript:UC_UI.showSecondLayer();",
"https://resources.github.com/",
"https://storm-project.stanford.edu/research/storm",
"https://blog.google/products/gemini/google-gemini-deep-research",
"https://youtu.be/vrpraFiPUyA",
"https://chat.baidu.com/search?extParamsJson=%7B%22enter_type%22%3A%22ai_explore_home%22%7D&isShowHello=1&pd=csaitab&setype=csaitab&usedModel=%7B%22modelName%22%3A%22DeepSeek-R1%22%7D",
"https://app.dover.com/jobs/jinaai",
"http://localhost:3000/",
"https://docs.cherry-ai.com/",
"https://en.wikipedia.org/wiki/Delayed_gratification",
"https://support.github.com/?tags=dotcom-footer",
"https://docs.jina.ai/",
"https://skills.github.com/",
"https://partner.github.com/",
"https://help.x.com/resources/accessibility",
"https://business.twitter.com/en/help/troubleshooting/how-twitter-ads-work.html",
"https://business.x.com/en/help/troubleshooting/how-twitter-ads-work.html",
"https://support.twitter.com/articles/20170514",
"https://support.x.com/articles/20170514",
"https://t.co/jnxcxPzndy",
"https://t.co/6EtEMa9P05",
"https://help.x.com/using-x/x-supported-browsers",
"https://legal.twitter.com/imprint.html"
],
"readURLs": [
"https://jina.ai/news/a-practical-guide-to-implementing-deepsearch-deepresearch",
"https://github.com/jina-ai/node-DeepResearch",
"https://huggingface.co/blog/open-deep-research",
"https://jina.ai/news/snippet-selection-and-url-ranking-in-deepsearch-deepresearch",
"https://x.com/jinaai_?lang=en",
"https://jina.ai/news",
"https://x.com/joedevon/status/1896984525210837081",
"https://github.com/jina-ai/node-DeepResearch/blob/main/src/tools/jina-latechunk.ts"
],
"numURLs": 98
}
DeepSearch Parameters Guide
Quality Control
In DeepSearch, there’s generally a trade-off: the more steps the system takes, the higher quality results you’ll get, but you’ll also consume more tokens. This improved quality comes from broader, more exhaustive searches and deeper reflection. Four main parameters control the quality of DeepSearch: budget_tokens, max_attempts, team_size, and reasoning_effort. The reasoning_effort parameter is essentially a preset combination of budget_tokens and max_attempts that’s been carefully tuned. For most users, adjusting reasoning_effort is the simplest approach.
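For example, a minimal request body that relies on the tuned preset rather than setting budget_tokens and max_attempts by hand could look like the sketch below (it assumes the usual "low" / "medium" / "high" values; "medium" is the value used in the curl example above):
  {
    "model": "jina-deepsearch-v1",
    "messages": [
      { "role": "user", "content": "Compare the latest embedding models from Jina AI" }
    ],
    "stream": true,
    "reasoning_effort": "high"
  }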
Budget Tokens
budget_tokens sets the maximum number of tokens allowed for the entire DeepSearch process. This covers all operations including web searches, reading web pages, reflection, summarization, and coding. Larger budgets naturally lead to better response quality. The DeepSearch process will stop when either the budget is exhausted or it finds a satisfactory answer, whichever comes first. If the budget runs out first, you’ll still get an answer, but it might not be the final, fully-refined response since it hasn’t passed all the quality checks defined by max_attempts.
Max Attempts
max_attempts determines how many times the system will retry to solve a problem during the DeepSearch process. Each time DeepSearch produces an answer, it must pass certain quality tests defined by an internal evaluator. If the answer fails these tests, the evaluator provides feedback, and the system uses this feedback to continue searching and refining the answer. Setting max_attempts too low means you’ll get results quickly, but the quality may suffer since the answer might not pass all quality checks. Setting it too high can make the process feel stuck in an endless retry loop where it keeps attempting and failing.
The system returns a final answer when either budget_tokens or max_attempts is exceeded (whichever happens first), or when the answer passes all tests while still having remaining budget and attempts available.
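As a sketch, a request that sets both limits explicitly instead of relying on reasoning_effort might look like this (the numeric values are illustrative, not recommended defaults):
  {
    "model": "jina-deepsearch-v1",
    "messages": [
      { "role": "user", "content": "Summarize recent work on snippet selection in DeepSearch" }
    ],
    "stream": true,
    "budget_tokens": 200000,
    "max_attempts": 2
  }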
Team Size
team_size affects quality in a fundamentally different way than max_attempts and budget_tokens. When team_size is set to more than one, the system decomposes the original problem into sub-problems and researches them independently. Think of it like a map-reduce pattern, where a large job gets broken down into smaller tasks that run in parallel. The final answer is then a synthesis of each worker’s results. We call it “team_size” because it simulates a research team where multiple agents investigate different aspects of the same problem and collaborate on a final report.
Keep in mind that all agents’ token consumption counts toward your total budget_tokens, but each agent has independent max_attempts. This means that with a larger team_size but the same budget_tokens, agents might return answers sooner than expected due to budget constraints. We recommend increasing both team_size and budget_tokens together to give each agent sufficient resources to do thorough work.
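For instance, a broad comparative question might scale both parameters together, along the lines of this sketch (the values are illustrative):
  {
    "model": "jina-deepsearch-v1",
    "messages": [
      { "role": "user", "content": "Compare the products, funding history, and research output of three embedding-model providers" }
    ],
    "stream": true,
    "team_size": 3,
    "budget_tokens": 600000
  }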
Finally, you can think of team_size as controlling the breadth of the search: it determines how many different aspects will be researched. Meanwhile, budget_tokens and max_attempts control the depth of the search, that is, how thoroughly each aspect gets explored.
Source Control
DeepSearch relies heavily on grounding—the sources it uses for information. Quality isn’t just about algorithmic depth and breadth; where DeepSearch gets its information is equally important, and often the deciding factor. Let’s explore the key parameters that control this.
No Direct Answer
no_direct_answer is a simple toggle that prevents the system from returning an answer at step 1. When enabled, it disables the system’s ability to use internal knowledge and forces it to always search the web first. Turning this on will make the system “overthink” even simple questions like “what day is it,” “how are you doing,” or basic factual knowledge that’s definitely in the model’s training data, like “who was the 40th president of the US.”
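A sketch of a request that forces web grounding even for a question the model could answer from memory (assuming the parameter takes a plain JSON boolean):
  {
    "model": "jina-deepsearch-v1",
    "messages": [
      { "role": "user", "content": "Who was the 40th president of the US?" }
    ],
    "stream": true,
    "no_direct_answer": true
  }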
Hostname Controls
Three parameters (boost_hostnames, bad_hostnames, and only_hostnames) tell DeepSearch which webpages to prioritize, avoid, or exclusively use. To understand how these work, think about the search-and-read process in DeepSearch:
- Search phase: The system searches the web and retrieves a list of website URLs with their snippets
- Selection phase: The system decides which URLs to actually visit (it doesn’t visit all of them due to time and cost constraints)
- boost_hostnames: Domains listed here get higher priority and are more likely to be visited
- bad_hostnames: These domains will never be visited
- only_hostnames: When defined, only URLs matching these hostnames will be visited
Here are some important notes on hostname parameters. First, the system always uses snippets returned by search engines as initial clues for building reasoning chains. These hostname parameters only affect which webpages the system visits, not how it formulates search queries.
Second, if the collected URLs don’t contain domains specified in only_hostnames, the system might stop reading webpages entirely. We recommend using these parameters only when you’re familiar with your research question and understand where potential answers are likely to be found (or where they definitely shouldn’t be found).
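As a sketch, a request that prioritizes a few trusted domains while excluding another might look like this (the hostnames are illustrative, and the example assumes the parameters accept arrays of domain strings):
  {
    "model": "jina-deepsearch-v1",
    "messages": [
      { "role": "user", "content": "What changed in the latest jina-embeddings release?" }
    ],
    "stream": true,
    "boost_hostnames": ["jina.ai", "github.com"],
    "bad_hostnames": ["pinterest.com"]
  }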
Special Case: Academic Research
For academic research, you might want searches and reads restricted to arxiv.org. In this case, simply set "search_provider": "arxiv" and everything will be grounded on arxiv as the sole source. However, generic or trivial questions may not get efficient answers with this restriction, so only use "search_provider": "arxiv" for serious academic research.
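A minimal sketch of an arxiv-only request, using the setting quoted above:
  {
    "model": "jina-deepsearch-v1",
    "messages": [
      { "role": "user", "content": "What are the main contributions of jina-embeddings-v3?" }
    ],
    "stream": true,
    "search_provider": "arxiv"
  }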
Search Language Code
search_language_code is another parameter that affects web sources by forcing the system to generate queries in a specific language, regardless of the original input or intermediate reasoning steps. Generally, the system automatically decides the query language to get the best search coverage, but sometimes manual control is useful.
Use Cases for Language Control
International market research: When studying a local brand or company’s impact in international markets, you can force queries to always use English with "search_language_code": "en" for global coverage, or use the local language for more tailored regional information.
Global research with non-English prompts: If your input is always in Chinese or Japanese (because your end users primarily speak these languages), but your research scope is global rather than just local Chinese or Japanese websites, the system might automatically lean toward your prompt’s language. Use this parameter to force English queries for broader international coverage.
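For example, here is a sketch of a Chinese-language prompt (asking about recent global progress in vector databases) that still forces English search queries for broader international coverage, using the "en" code shown above:
  {
    "model": "jina-deepsearch-v1",
    "messages": [
      { "role": "user", "content": "全球范围内向量数据库的最新进展有哪些？" }
    ],
    "stream": true,
    "search_language_code": "en"
  }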
What is DeepSearch?
[Figure: comparison of Standard LLMs, RAG and Grounded LLMs, and DeepSearch]
API Pricing
Product | API Endpoint | Description | w/o API Key | w/ API Key | w/ Premium API Key | Average Latency | Token Usage Counting | Allowed Request
---|---|---|---|---|---|---|---|---
Reader API | https://r.jina.ai | Convert URL to LLM-friendly text | 20 RPM | 500 RPM | 5,000 RPM | 7.9s | Count the number of tokens in the output response. | GET/POST
Reader API | https://s.jina.ai | Search the web and convert results to LLM-friendly text | Not available | 100 RPM | 1,000 RPM | 2.5s | Every request costs a fixed number of tokens, starting from 10,000 tokens. | GET/POST
DeepSearch | https://deepsearch.jina.ai/v1/chat/completions | Reason, search and iterate to find the best answer | Not available | 50 RPM | 500 RPM | 56.7s | Count the total number of tokens in the whole process. | POST
Embedding API | https://api.jina.ai/v1/embeddings | Convert text/images to fixed-length vectors | Not available | 500 RPM & 1,000,000 TPM | 2,000 RPM & 5,000,000 TPM | Depends on the input size | Count the number of tokens in the input request. | POST
Reranker API | https://api.jina.ai/v1/rerank | Rank documents by query | Not available | 500 RPM & 1,000,000 TPM | 2,000 RPM & 5,000,000 TPM | Depends on the input size | Count the number of tokens in the input request. | POST
Classifier API | https://api.jina.ai/v1/train | Train a classifier using labeled examples | Not available | 20 RPM & 200,000 TPM | 60 RPM & 1,000,000 TPM | Depends on the input size | Tokens counted as: input_tokens × num_iters | POST
Classifier API (Few-shot) | https://api.jina.ai/v1/classify | Classify inputs using a trained few-shot classifier | Not available | 20 RPM & 200,000 TPM | 60 RPM & 1,000,000 TPM | Depends on the input size | Tokens counted as: input_tokens | POST
Classifier API (Zero-shot) | https://api.jina.ai/v1/classify | Classify inputs using zero-shot classification | Not available | 200 RPM & 500,000 TPM | 1,000 RPM & 3,000,000 TPM | Depends on the input size | Tokens counted as: input_tokens + label_tokens | POST
Segmenter API | https://api.jina.ai/v1/segment | Tokenize and segment long text | 20 RPM | 200 RPM | 1,000 RPM | 0.3s | Tokens are not counted as usage. | GET/POST