Link API Parameters

Overview

The key feature of the Link API is a unified interface for requests and responses, with a built-in failover mechanism, credit control over API keys, and reduced hallucination through a default integration of KnowHalu.

Request Parameters

Link's request and response schemas are very similar to the OpenAI Chat API, with a few small differences. At a high level, Link normalizes the schema across models and providers so you only need to learn one.

Note: These parameters are the full superset of options across all the LLMs provided by Link. Not every model supports every parameter; check the available parameters in the Model Document and set only the ones you need, or use the defaults.

prompt

  • Type: Either "prompt" or "message" is required, string[]
  • Description: An array of plain-text prompt strings to send to the model.

message

  • Type: Either "prompt" or "message" is required, Message[]
  • Description: An array of Message objects, each with a role and content field (see the examples below).

responseFormat

  • Type: Optional, map
  • Description: Forces the model to produce a specific output format. Setting it to { "type": "json_object" } enables JSON mode, which guarantees that the message the model generates is valid JSON. Note: when using JSON mode, you should also instruct the model to produce JSON yourself via a system or user message.
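
For illustration, a minimal JSON-mode request might look like the sketch below. The model header is taken from the model table at the end of this page, and the snake_case body field names follow the request structure shown later; adjust both for your setup.


# Illustrative sketch: JSON mode plus a system message instructing the model to emit JSON
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:openai_gpt-4-turbo-2024-04-09' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{"role": "system", "content": "Reply only with valid JSON."},
{"role": "user", "content": "List three ice cream flavors as a JSON object."}
],
"response_format": {"type": "json_object"}
}
'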

stop

  • Type: Optional, array
  • Description: Stops generation immediately if the model encounters any token specified in the stop array.
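
As a sketch (the stop string here is an arbitrary example, and the body uses the field names from the request structure below), a request that halts generation at a custom delimiter could look like:


# Illustrative sketch: stop generation as soon as the model produces "END"
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_mistral-small-latest' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Write one haiku about winter, then the word END"
],
"stop": ["END"]
}
'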

stream

  • Type: Optional, boolean
  • Description: Enable streaming.
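
A minimal streaming request could look like the sketch below. How the streamed chunks are delivered (for example as server-sent events) is not specified on this page, so treat the transport as an assumption.


# Illustrative sketch: request a streamed response
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_mistral-large-latest' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Tell a short story about a lighthouse"
],
"stream": true
}
'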

temperature

  • Type: Optional, float, Range: [0.0 to 2.0]
  • Default: 1.0
  • Description: This setting influences the variety in the model's responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.

maxTokens

  • Type: Optional, integer, Range: [1 to context_length]
  • Description: This sets the upper limit for the number of tokens the model can generate in response. It won't produce more than this limit. The maximum value is the context length minus the prompt length.
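
A sketch combining temperature and maxTokens: the values are arbitrary, and the body follows the request structure later on this page. It asks for a low-randomness answer capped at 200 tokens.


# Illustrative sketch: near-deterministic output, length-capped at 200 tokens
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:cohere_command-r-plus' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Summarize the rules of chess in one paragraph"
],
"temperature": 0.2,
"max_tokens": 200
}
'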

topP

  • Type: Optional, float, Range: [0.0, 1.0]
  • Default: 1.0
  • Description: This setting limits the model's choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model's responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.

topK

  • Type: Optional, integer, Range: [0, 'inf']
  • Default: 0
  • Description: This limits the model's choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, letting the model consider all choices.
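
The sketch below shows topP and topK used together with illustrative values: sampling is restricted to the 40 most likely tokens, further trimmed to the smallest set whose cumulative probability reaches 0.9.


# Illustrative sketch: nucleus sampling (top_p) combined with a top_k cutoff
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_open-mixtral-8x7b' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Suggest a name for a coffee shop"
],
"top_p": 0.9,
"top_k": 40
}
'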

frequencyPenalty

  • Type: Optional, float, Range: [-2, 2]
  • Default: 0.0
  • Description: Controls repetition of tokens based on how frequently they appear in the input. Higher values discourage tokens that already occur often, in proportion to how many times they occur, while negative values encourage reuse. Unlike presencePenalty, this penalty scales with the number of occurrences.

presencePenalty

  • Type: Optional, float, Range: [-2, 2]
  • Default: 0.0
  • Description: Adjusts how often the model repeats specific tokens already used in the input. Higher values make such repetition less likely, while negative values do the opposite. Token penalty does not scale with the number of occurrences. Negative values will encourage token reuse.

repetitionPenalty

  • Type: Optional, float, Range: [0.0 , 2.0]
  • Default: 1.0
  • Description: Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). Token penalty scales based on original token's probability.
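
A sketch showing the three penalty parameters together; the values are arbitrary examples, and most requests need at most one of them.


# Illustrative sketch: discourage repetition with the penalty parameters
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:gpt-3.5-turbo-0125' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Write a product description for a reusable water bottle"
],
"frequency_penalty": 0.5,
"presence_penalty": 0.3,
"repetition_penalty": 1.1
}
'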

seed

  • Type: Optional, integer
  • Description: If specified, the inferencing will sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed for some models. Only available for OpenAI.
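
A sketch of a request pinned to a fixed seed, so repeated calls with identical parameters should return the same result (subject to the determinism caveat above); the seed value is arbitrary.


# Illustrative sketch: deterministic sampling via a fixed seed
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:openai_gpt-4' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Pick a random city and explain why it is worth visiting"
],
"seed": 42
}
'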

transform

  • Type: Optional, string[]
  • Description: To help with prompts that exceed the maximum context size of a model, Link supports a custom parameter called transforms.

The transforms param is an array of strings that tells Link to apply a series of transformations to the prompt before sending it to the model. Transformations are applied in order.

Available transforms are:

  • middle-out: compress prompts and message chains to the context size. This helps users extend conversations, in part because LLMs pay significantly less attention to the middle of sequences anyway. It works by compressing or removing messages in the middle of the prompt. Additionally, it reduces the number of messages to adhere to the model's limit. For instance, Anthropic's Claude models enforce a maximum of 1000 messages.

NOTE: All Link models use the transform feature by default; you can opt out by setting "transforms": [] in the request body, as shown in the sketch below.
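
For example, a request that applies the middle-out transform explicitly might look like the following sketch (illustrative model and prompt):


# Illustrative sketch: apply the middle-out transform to a long conversation
curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:gemini-pro-1.5' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"Continue our earlier discussion about database indexing"
],
"transforms": ["middle-out"]
}
'

To opt out instead, send "transforms": [] as noted above.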


routeProvider

  • Type: Optional, Provider
  • Description: A model can be served by multiple providers. By default, Link selects the cheapest provider with the highest GPU availability; you can override this by specifying a custom order in which providers should be tried.

example:


curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistral_mistral-small-latest' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
{"content": "how to make an ice cream", "role": "user"}
],
"provider": {
"order": [
"Azure",
"Together"
]
}
}
'

By default, providers that don't support a given LLM parameter will ignore it. But you can change this and only filter for providers that support the parameters in your request.

For example, to only use providers that support JSON formatting:


curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistralai/mixtral-8x7b-instruct' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"how is the weather in San Jose?"
],
"provider": {
"order": [
"Azure",
"Together"
]
},
"responseFormat": {
"type": "json_object"
}
}
'

multiModel

  • Type: Optional, string[]
  • Description: Lets you automatically try other models if the primary model's providers are down, rate-limited, or refuse to reply due to content moderation required by all providers.
  • Pricing: If your primary model fails because its provider is down, we will route your request to the next specified model, and you will be charged only for the model that actually executed the request. If fallback is true but the order array contains only the primary model, Link will pick the most suitable and cheapest model to execute the request.

example:


curl --location 'https://api.link-llm.com/api/v1/prompt' \
--header 'X-LLM:mistralai/mixtral-8x7b-instruct' \
--header 'X-API-KEY: {{API_KEY}}' \
--header 'Content-Type: application/json' \
--data '{
"prompt": [
"how is the weather in San Jose?"
],
"multiModel": {
"order": [
"anthropic/claude-2.1", "gryphe/mythomax-l2-13b"
],
"fallback": true
},
"responseFormat": {
"type": "json_object"
}
}
'

Request Structure

{
    "messages": Message[], // either messages or prompt
    "prompt": string[], // either messages or prompt
    "response_format": { type: 'json_object' },
    "stop": string[],
    "stream": boolean,
    "max_tokens": number,
    "temperature": number,
    "top_p": number,
    "top_k": number,
    "frequency_penalty": number,
    "presence_penalty": number,
    "repetition_penalty": number,
    "seed": number,

    // Custom parameters by Link LLM
    "transforms": string[], // check transforms above
    "multiModel": { // check multiModel above
        "order": string[],
        "fallback": boolean
    },

    "provider": { // check routeProvider above
        "order": string[]
    }
}

Provider    Model Name                API Header
Cohere      Command R+                cohere_command-r-plus
Cohere      Command R                 cohere_command-r
Gemini      Gemini 1.5 Pro            gemini-pro-1.5
Gemini      Gemini 1.0 Pro            gemini-pro-1.0
Mistral     Open Mistral 7b           mistral_open-mistral-7b
Mistral     Open Mistral 8x7b         mistral_open-mixtral-8x7b
Mistral     Open Mistral 8x22b        mistral_open-mixtral-8x22b
Mistral     Mistral Small             mistral_mistral-small-latest
Mistral     Mistral Medium            mistral_mistral-medium-latest
Mistral     Mistral Large             mistral_mistral-large-latest
Chat GPT    GPT-4 Turbo               openai_gpt-4-turbo-2024-04-09
Chat GPT    GPT-4                     openai_gpt-4
Chat GPT    GPT-4 32k                 openai_gpt-4-32k
Chat GPT    GPT-3.5 Turbo             gpt-3.5-turbo-0125
Chat GPT    GPT-3.5 Turbo             openai_gpt-3.5-turbo-instruct
Chat GPT    GPT-3.5 Turbo             gpt-3.5-turbo
Chat GPT    GPT-3.5 Davinci Babbage   openai_babbage-002
Together    StripedHyena Nous (7B)    together_stripedhyena-nous-7b