Link API Parameters

Overview

The key feature of the Link API is a unified interface for requests and responses, with a built-in failover mechanism, credit control over API keys, and reduced hallucination through a default integration of KnowHalu.

Request Parameters

Link's request and response schemas are very similar to the OpenAI Chat API, with a few small differences. At a high level, Link normalizes the schema across models and providers so you only need to learn one.

Note: These parameters are the full superset of options across all the LLMs provided by Link. Not every model supports every parameter; check the available parameters in the Model Document and set only the ones you need, or use the defaults.

prompt

  • Type: Either "prompt" or "message" is required, string[]
  • Description: An array of plain-text prompt strings to send to the model.

message

  • Type: Either "prompt" or "message" is required, Message[]
  • Description: An array of Message objects, each with a role and content field (see the examples below).

responseFormat

  • Type: Optional, map
  • Description: Forces the model to produce a specific output format. Setting it to { "type": "json_object" } enables JSON mode, which guarantees that the message the model generates is valid JSON. Note: when using JSON mode, you should also instruct the model to produce JSON yourself via a system or user message.
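
For illustration, a minimal JSON-mode request might look like the sketch below. The model header is taken from the model table at the end of this page, and the snake_case body field names follow the request structure shown later; adjust both for your setup.


# Illustrative sketch: JSON mode plus a system message instructing the model to emit JSON
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:openai_gpt-4-turbo-2024-04-09' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{"role": "system", "content": "Reply only with valid JSON."},
{"role": "user", "content": "List three ice cream flavors as a JSON object."}
],
"response_format": {"type": "json_object"}
}
'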

stop

  • Type: Optional, array
  • Description: Stops generation immediately if the model encounters any token specified in the stop array.
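
As a sketch (the stop string here is an arbitrary example, and the body uses the field names from the request structure below), a request that halts generation at a custom delimiter could look like:


# Illustrative sketch: stop generation as soon as the model produces "END"
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_mistral-small-latest' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Write one haiku about winter, then the word END"
],
"stop": ["END"]
}
'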

stream

  • Type: Optional, boolean
  • Description: Enable streaming.
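
A minimal streaming request could look like the sketch below. How the streamed chunks are delivered (for example as server-sent events) is not specified on this page, so treat the transport as an assumption.


# Illustrative sketch: request a streamed response
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_mistral-large-latest' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Tell a short story about a lighthouse"
],
"stream": true
}
'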

temperature

  • Type: Optional, float, Range: [0.0 to 2.0]
  • Default: 1.0
  • Description: This setting influences the variety in the model's responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.

maxTokens

  • Type: Optional, integer, Range: [1 to context_length]
  • Description: This sets the upper limit for the number of tokens the model can generate in response. It won't produce more than this limit. The maximum value is the context length minus the prompt length.
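
A sketch combining temperature and maxTokens: the values are arbitrary, and the body follows the request structure later on this page. It asks for a low-randomness answer capped at 200 tokens.


# Illustrative sketch: near-deterministic output, length-capped at 200 tokens
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:cohere_command-r-plus' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Summarize the rules of chess in one paragraph"
],
"temperature": 0.2,
"max_tokens": 200
}
'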

topP

  • Type: Optional, float, Range: [0.0, 1.0]
  • Default: 1.0
  • Description: This setting limits the model's choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model's responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.

topK

  • Type: Optional, integer, Range: [0, 'inf']
  • Default: 0
  • Description: This limits the model's choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, letting the model consider all choices.
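
The sketch below shows topP and topK used together with illustrative values: sampling is restricted to the 40 most likely tokens, further trimmed to the smallest set whose cumulative probability reaches 0.9.


# Illustrative sketch: nucleus sampling (top_p) combined with a top_k cutoff
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_open-mixtral-8x7b' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Suggest a name for a coffee shop"
],
"top_p": 0.9,
"top_k": 40
}
'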

frequencyPenalty

  • Type: Optional, float, Range: [-2, 2]
  • Default: 0.0
  • Description: Controls repetition of tokens based on how frequently they appear in the input. Higher values discourage tokens that already occur often, in proportion to how many times they occur, while negative values encourage reuse. Unlike presencePenalty, this penalty scales with the number of occurrences.

presencePenalty

  • Type: Optional, float, Range: [-2, 2]
  • Default: 0.0
  • Description: Adjusts how often the model repeats specific tokens already used in the input. Higher values make such repetition less likely, while negative values do the opposite. Token penalty does not scale with the number of occurrences. Negative values will encourage token reuse.

repetitionPenalty

  • Type: Optional, float, Range: [0.0 , 2.0]
  • Default: 1.0
  • Description: Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). Token penalty scales based on original token's probability.
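
A sketch showing the three penalty parameters together; the values are arbitrary examples, and most requests need at most one of them.


# Illustrative sketch: discourage repetition with the penalty parameters
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:gpt-3.5-turbo-0125' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Write a product description for a reusable water bottle"
],
"frequency_penalty": 0.5,
"presence_penalty": 0.3,
"repetition_penalty": 1.1
}
'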

seed

  • Type: Optional, integer
  • Description: If specified, the inferencing will sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed for some models. Only available for OpenAI.
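
A sketch of a request pinned to a fixed seed, so repeated calls with identical parameters should return the same result (subject to the determinism caveat above); the seed value is arbitrary.


# Illustrative sketch: deterministic sampling via a fixed seed
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:openai_gpt-4' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Pick a random city and explain why it is worth visiting"
],
"seed": 42
}
'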

transform

  • Type: Optional, string[]
  • Description: To help with prompts that exceed the maximum context size of a model, Link supports a custom parameter called transforms.

The transforms param is an array of strings that tells Link to apply a series of transformations to the prompt before sending it to the model. Transformations are applied in order.

Available transforms are:

  • middle-out: compress prompts and message chains to the context size. This helps users extend conversations, in part because LLMs pay significantly less attention to the middle of sequences anyway. It works by compressing or removing messages in the middle of the prompt. Additionally, it reduces the number of messages to adhere to the model's limit. For instance, Anthropic's Claude models enforce a maximum of 1000 messages.

NOTE: All Link models use the transform feature by default; you can opt out by setting "transforms": [] in the request body, as shown in the sketch below.
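
For example, a request that applies the middle-out transform explicitly might look like the following sketch (illustrative model and prompt):


# Illustrative sketch: apply the middle-out transform to a long conversation
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:gemini-pro-1.5' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Continue our earlier discussion about database indexing"
],
"transforms": ["middle-out"]
}
'

To opt out instead, send "transforms": [] as noted above.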


routeProvider

  • Type: Optional, Provider
  • Description: A model can be served by multiple providers. By default, Link selects the cheapest provider with the highest GPU availability; you can override this by specifying a custom order in which providers should be tried.

example:


curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_mistral-small-latest' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
{"content": "how to make an ice cream", "role": "user"}
],
"provider": {
"order": [
"Azure",
"Together"
]
}
}
'

By default, providers that don't support a given LLM parameter will ignore it. But you can change this and only filter for providers that support the parameters in your request.

For example, to only use providers that support JSON formatting:


curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistralai/mixtral-8x7b-instruct' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"how is the weather in San Jose?"
],
"provider": {
"order": [
"Azure",
"Together"
]
},
"responseFormat": {
"type": "json_object"
}
}
'

multiModel

  • Type: Optional, string[]
  • Description: Lets you automatically try other models if the primary model's providers are down, rate-limited, or refuse to reply due to content moderation required by all providers.
  • Pricing: If your primary model fails because its provider is down, we will route your request to the next specified model, and you will be charged only for the model that actually executed the request. If fallback is true but the order array contains only the primary model, Link will pick the most suitable and cheapest model to execute the request.

example:


curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistralai/mixtral-8x7b-instruct' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"how is the weather in San Jose?"
],
"multiModel": {
"order": [
"anthropic/claude-2.1", "gryphe/mythomax-l2-13b"
],
"fallback": true
},
"responseFormat": {
"type": "json_object"
}
}
'

Request Structure

{
    "messages": Message[], // either messages or prompt
    "prompt": string[], // either messages or prompt
    "response_format": { type: 'json_object' },
    "stop": string[],
    "stream": boolean,
    "max_tokens": number,
    "temperature": number,
    "top_p": number,
    "top_k": number,
    "frequency_penalty": number,
    "presence_penalty": number,
    "repetition_penalty": number,
    "seed": number,

    // Custom parameters by Link LLM
    "transforms": string[], // check transforms above
    "multiModel": { // check multiModel above
        "order": string[],
        "fallback": boolean
    },

    "provider": { // check routeProvider above
        "order": string[]
    }
}

Provider    Model Name                API Header
Cohere      Command R+                cohere_command-r-plus
Cohere      Command R                 cohere_command-r
Gemini      Gemini 1.5 Pro            gemini-pro-1.5
Gemini      Gemini 1.0 Pro            gemini-pro-1.0
Mistral     Open Mistral 7b           mistral_open-mistral-7b
Mistral     Open Mistral 8x7b         mistral_open-mixtral-8x7b
Mistral     Open Mistral 8x22b        mistral_open-mixtral-8x22b
Mistral     Mistral Small             mistral_mistral-small-latest
Mistral     Mistral Medium            mistral_mistral-medium-latest
Mistral     Mistral Large             mistral_mistral-large-latest
Chat GPT    GPT-4 Turbo               openai_gpt-4-turbo-2024-04-09
Chat GPT    GPT-4                     openai_gpt-4
Chat GPT    GPT-4 32k                 openai_gpt-4-32k
Chat GPT    GPT-3.5 Turbo             gpt-3.5-turbo-0125
Chat GPT    GPT-3.5 Turbo             openai_gpt-3.5-turbo-instruct
Chat GPT    GPT-3.5 Turbo             gpt-3.5-turbo
Chat GPT    GPT-3.5 Davinci Babbage   openai_babbage-002
Together    StripedHyena Nous (7B)    together_stripedhyena-nous-7b