
Generate Text

Generate text completions using the generation API.

Endpoint

POST /v1/generate

Request Body

Parameter     | Type    | Required | Description
------------- | ------- | -------- | -----------
prompt        | string  | Yes      | The input text to generate from
maxTokens     | integer | No       | Maximum tokens to generate (default: 100)
temperature   | number  | No       | Controls randomness, 0.0 to 2.0 (default: 1.0)
topP          | number  | No       | Nucleus sampling threshold (default: 1.0)
stopSequences | array   | No       | Sequences that stop generation
model         | string  | No       | Model to use (default: latest)

Example Request

curl https://api.example.com/v1/generate \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Write a haiku about coding",
"maxTokens": 50,
"temperature": 0.7
}'

JavaScript

const response = await fetch('https://api.example.com/v1/generate', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    prompt: 'Write a haiku about coding',
    maxTokens: 50,
    temperature: 0.7
  })
});

const data = await response.json();
console.log(data.text);

Python

import requests

response = requests.post(
    'https://api.example.com/v1/generate',
    headers={
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'prompt': 'Write a haiku about coding',
        'maxTokens': 50,
        'temperature': 0.7
    }
)

data = response.json()
print(data['text'])

Response

{
"id": "gen_abc123",
"object": "generation",
"created": 1640000000,
"model": "model-v1",
"text": "Code flows like water\nBugs lurk in every corner\nDebug, then deploy",
"usage": {
"promptTokens": 5,
"completionTokens": 15,
"totalTokens": 20
},
"finishReason": "stop"
}

Response Fields

Field        | Type    | Description
------------ | ------- | -----------
id           | string  | Unique identifier for this generation
object       | string  | Object type (always "generation")
created      | integer | Unix timestamp of creation
model        | string  | Model used for generation
text         | string  | Generated text
usage        | object  | Token usage information
finishReason | string  | Reason generation stopped

Finish Reasons

  • stop: Natural completion or stop sequence reached
  • length: Maximum token limit reached
  • content_filter: Content filtered due to policy
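
When finishReason is "length", the output was cut off at the maxTokens limit. One way to detect this in Python, building on the requests example above (a minimal sketch; how you react to truncation, such as retrying with a larger budget, is up to you):

import requests

def generate(prompt, max_tokens=100):
    """Call /v1/generate and flag truncated output."""
    response = requests.post(
        'https://api.example.com/v1/generate',
        headers={
            'Authorization': 'Bearer YOUR_API_KEY',
            'Content-Type': 'application/json',
        },
        json={'prompt': prompt, 'maxTokens': max_tokens},
    )
    data = response.json()

    if data.get('finishReason') == 'length':
        # The output hit maxTokens before completing; one option is to
        # retry with a larger limit or continue from the returned text.
        print('warning: generation truncated at maxTokens')

    return data['text']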

Parameters in Detail

temperature

Controls randomness in generation:

  • 0.0: Deterministic, always picks most likely token
  • 1.0: Balanced creativity and coherence (default)
  • 2.0: Maximum randomness

// More creative
{ temperature: 1.5 }

// More focused
{ temperature: 0.3 }

topP

Nucleus sampling restricts generation to the tokens whose cumulative probability is within topP:

  • 1.0: Consider all tokens (default)
  • 0.9: Consider top 90% probability mass
  • 0.1: Very focused, only high-probability tokens
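
As with temperature, topP is set per request; a more focused setting, in the same style as the snippets above:

// More focused sampling
{ topP: 0.9 }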

stopSequences

Array of strings that stop generation when encountered:

{
prompt: "List three items:\n1.",
stopSequences: ["\n\n", "4."]
}

Error Responses

Invalid Parameters

{
"error": {
"type": "invalid_request_error",
"message": "maxTokens must be between 1 and 2048",
"code": "invalid_parameter"
}
}

Rate Limit Exceeded

{
"error": {
"type": "rate_limit_error",
"message": "Rate limit exceeded. Please try again later.",
"code": "rate_limit_exceeded"
}
}
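
Rate-limit errors are usually transient, so retrying with backoff is a reasonable default. A minimal Python sketch, assuming rate-limited responses carry the rate_limit_error body shown above and an HTTP 429 status (the status code is an assumption, not documented here):

import time
import requests

def generate_with_retry(payload, max_retries=3):
    """POST to /v1/generate, backing off when the rate limit is hit."""
    for attempt in range(max_retries):
        response = requests.post(
            'https://api.example.com/v1/generate',
            headers={
                'Authorization': 'Bearer YOUR_API_KEY',
                'Content-Type': 'application/json',
            },
            json=payload,
        )
        body = response.json()
        error_type = body.get('error', {}).get('type')
        # Assumption: rate-limit errors arrive as HTTP 429 with the
        # rate_limit_error body shown above.
        if response.status_code == 429 or error_type == 'rate_limit_error':
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
            continue
        response.raise_for_status()  # surface other HTTP errors
        return body
    raise RuntimeError('rate limit still exceeded after retries')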

Best Practices

  1. Use temperature wisely: Lower for factual content, higher for creative content
  2. Set appropriate maxTokens: Avoid unnecessarily large values
  3. Use stopSequences: For structured output, use stop sequences to control format
  4. Handle errors: Always implement error handling and retries
  5. Cache when possible: Cache responses for identical prompts (see the sketch after this list)
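
For item 5, a minimal in-memory cache keyed on the full request payload (a sketch only; cached_generate is a hypothetical helper, and generate_with_retry refers to the retry sketch above):

import hashlib
import json

_cache = {}

def cached_generate(payload):
    """Reuse a previous response when an identical payload is requested again."""
    # Key on the serialized payload so different parameters never collide.
    key = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = generate_with_retry(payload)  # retry helper sketched above
    return _cache[key]

Caching pays off most with deterministic settings (temperature 0), since higher temperatures intentionally vary the output for the same prompt.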

Next Steps