Generation Parameters

These docs are outdated! Please check out https://docs.titanml.co for the latest information on the TitanML platform. If there's anything that's not covered there, please contact us on our Discord.

The API supports the standard generation parameters, each of which is described below.

To use the parameters, include them in the JSON payload of the request:

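import requests

if __name__ == "__main__":
    input_text = "List 3 things to do in London."

    url = "http://localhost:8000/generate_stream"
    # Generation parameters go alongside the input text in the JSON payload.
    payload = {
        "text": input_text,
        "sampling_temperature": 0.1,
        "no_repeat_ngram_size": 3,
    }

    # stream=True lets us print the output while it is still being generated.
    response = requests.post(url, json=payload, stream=True)
    response.encoding = "utf-8"

    # Read the streamed response piece by piece and print it as it arrives.
    for text in response.iter_content(chunk_size=1, decode_unicode=True):
        if text:
            print(text, end="", flush=True)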
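
Because the parameters are plain JSON keys, a misspelled name is easy to miss. The sketch below builds the payload through a small helper that rejects names not documented on this page. The helper `build_payload` and the set `KNOWN_PARAMETERS` are hypothetical and not part of the API; how the server itself treats unknown keys is not stated here.

```python
# Parameter names documented on this page.
# The helper below is illustrative only and is not part of the API.
KNOWN_PARAMETERS = {
    "generate_max_length",
    "sampling_topk",
    "sampling_topp",
    "sampling_temperature",
    "repetition_penalty",
    "no_repeat_ngram_size",
}


def build_payload(text, **params):
    """Return a JSON-serializable payload, raising on unknown parameter names."""
    unknown = set(params) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"Unknown generation parameters: {sorted(unknown)}")
    return {"text": text, **params}


if __name__ == "__main__":
    payload = build_payload(
        "List 3 things to do in London.",
        sampling_temperature=0.1,
        no_repeat_ngram_size=3,
    )
    print(payload)
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, exactly as in the example above.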

| Parameter Name | Description | Default Value |
| --- | --- | --- |
| generate_max_length | The maximum generation length | 128 |
| sampling_topk | Sample predictions from the top K most probable candidates | |
| sampling_topp | Sample from the smallest set of most probable candidates whose cumulative probability exceeds this value | 1.0 (no restriction) |
| sampling_temperature | Sample with randomness. Higher temperatures produce more randomness and 'creativity'. | 1.0 |
| repetition_penalty | Penalize the generation of tokens that have been generated before. Set to > 1 to penalize. | 1 (no penalty) |
| no_repeat_ngram_size | Prevent repetitions of n-grams of this size. | 0 (turned off) |
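
To make the sampling parameters above more concrete, here is a minimal sketch of how temperature, top-k and top-p are conventionally applied to a model's output scores before a token is sampled. It follows the textbook definitions and is not taken from the TitanML implementation. The function name `filter_distribution` is hypothetical, and treating `top_k=0` as "no restriction" is an assumption made for this example.

```python
import math
import random


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Conventional definitions only; the server may differ in detail.
    top_k=0 is treated here as "no top-k restriction" (an assumption).
    """
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable candidates.
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(probs[i] for i in ranked)
    return {i: probs[i] / kept_total for i in ranked}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0]  # made-up scores for a three-token vocabulary
    dist = filter_distribution(toy_logits, temperature=0.1, top_k=2)
    token = random.choices(list(dist), weights=list(dist.values()))[0]
    print(dist, token)
```

With a low temperature such as 0.1, almost all of the probability mass lands on the highest-scoring token, which is why the request example earlier on this page produces near-deterministic output.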


Last updated 1 year ago