Enhance deep search capabilities for all models: append "-deep-search" to any model name, e.g. "gpt-4o-deep-search" (for easier use with third-party software). Compared with web search, deep search is more comprehensive but slower. You can choose the deep search provider by adding the "searchType" parameter to the request. Supported providers: jina (default), search1api, tavily, and exa. This feature is adapted from Jina's open-source project. Price: base model price + search fee.
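For example, a minimal request sketch in Python (the base URL is a placeholder for your actual endpoint, and the API key is assumed to live in the OPENAI_API_KEY environment variable):

```python
import os
import requests

BASE_URL = "https://api.example.com/v1"  # placeholder; substitute your endpoint

payload = {
    "model": "gpt-4o-deep-search",  # base model name suffixed with "-deep-search"
    "searchType": "tavily",         # optional; defaults to jina
    "messages": [
        {"role": "user", "content": "Summarize this week's AI news."}
    ],
}

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json=payload,
    timeout=120,  # deep search is slower than web search
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```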
model
string
required
The ID of the model to use. For details on which models are applicable to the Chat API, refer to the Model Endpoint Compatibility Table.
searchType
string
optional
Selects the deep search provider. Supported values: jina (default), search1api, tavily, and exa.
messages
array [object {2}]
required
The messages to generate chat completions for, in the chat format.
role
string
optional
content
string
optional
temperature
number
optional
The sampling temperature to use, ranging between 0 and 2. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more focused and deterministic. We generally recommend adjusting either this or top_p, but not both.
top_p
number
optional
An alternative to temperature sampling, called nucleus sampling (top_p), where the model considers only the tokens with a cumulative probability mass of top_p. For example, 0.1 means only the top 10% probability mass tokens are considered. We generally recommend adjusting either this or temperature, but not both.
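As a quick sketch of the recommendation above (hypothetical request bodies; adjust one knob and leave the other at its default):

```python
msgs = [{"role": "user", "content": "Name three uses for a paperclip."}]

# Adjust temperature OR top_p, not both.
creative = {"model": "gpt-4o", "messages": msgs, "temperature": 0.8}  # more random output
focused = {"model": "gpt-4o", "messages": msgs, "top_p": 0.1}  # only top 10% probability mass
```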
n
integer
optional
How many chat completion choices to generate for each input message.
stream
boolean
optional
If set, partial message deltas will be sent, as in ChatGPT. As tokens become available, they are sent as data-only server-sent events, with the stream terminated by a data: [DONE] message. For sample code, refer to the OpenAI Cookbook.
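A minimal sketch of consuming the stream by parsing the server-sent events directly (same placeholder endpoint as above; the official SDKs handle this parsing for you):

```python
import json
import os
import requests

resp = requests.post(
    "https://api.example.com/v1/chat/completions",  # placeholder endpoint
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,
)
for line in resp.iter_lines():
    if not line:
        continue
    data = line.decode("utf-8").removeprefix("data: ")
    if data == "[DONE]":  # stream terminator
        break
    chunk = json.loads(data)
    delta = chunk["choices"][0]["delta"].get("content", "")
    print(delta, end="", flush=True)
```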
stop
string
optional
Up to four sequences where the API will stop generating more tokens.
max_tokens
integer
optional
The maximum number of tokens to generate for chat completion. The total length of input tokens and generated tokens is limited by the model's context length.
presence_penalty
number
optional
A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they have appeared in the text so far, increasing the likelihood of the model discussing new topics. See more about frequency and presence penalties.
frequency_penalty
number
optional
A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text, reducing the likelihood of the model repeating the same line verbatim. See more about frequency and presence penalties.
logit_bias
object
optional
Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the model-generated logits before sampling. The exact effect varies by model, but values between -1 and 1 should decrease or increase the likelihood of selection, while values like -100 or 100 should result in the prohibition or exclusive selection of the corresponding token.
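For illustration, a sketch of building a bias map with the tiktoken tokenizer (token IDs are model-specific, so derive them from the tokenizer for the model you are calling; the leading space is deliberate, since mid-sentence words usually tokenize with one):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")

# Multi-token words need one bias entry per token ID.
banned_ids = enc.encode(" hello")
logit_bias = {str(tid): -100 for tid in banned_ids}  # -100 effectively bans each token

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Greet me."}],
    "logit_bias": logit_bias,
}
```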
user
string
optional
A unique identifier representing your end user, which can help OpenAI monitor and detect abuse. Learn more.
{"id":"chatcmpl-123","object":"chat.completion","created":1677652288,"choices":[{"index":0,"message":{"role":"assistant","content":"\n\nHello there, how may I assist you today?"},"finish_reason":"stop"}],"usage":{"prompt_tokens":9,"completion_tokens":12,"total_tokens":21}}