# Generate Text
Generate text completions using the generation API.
## Endpoint

```http
POST /v1/generate
```
## Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The input text to generate from |
| maxTokens | integer | No | Maximum number of tokens to generate (default: 100) |
| temperature | number | No | Controls randomness, 0.0 to 2.0 (default: 1.0) |
| topP | number | No | Nucleus sampling threshold (default: 1.0) |
| stopSequences | array | No | Sequences that stop generation |
| model | string | No | Model to use (default: latest) |
## Example Request

```bash
curl https://api.example.com/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Write a haiku about coding",
    "maxTokens": 50,
    "temperature": 0.7
  }'
```
### JavaScript

```javascript
const response = await fetch('https://api.example.com/v1/generate', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    prompt: 'Write a haiku about coding',
    maxTokens: 50,
    temperature: 0.7
  })
});

const data = await response.json();
console.log(data.text);
```
### Python

```python
import requests

response = requests.post(
    'https://api.example.com/v1/generate',
    headers={
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'prompt': 'Write a haiku about coding',
        'maxTokens': 50,
        'temperature': 0.7,
    },
)

data = response.json()
print(data['text'])
```
## Response

```json
{
  "id": "gen_abc123",
  "object": "generation",
  "created": 1640000000,
  "model": "model-v1",
  "text": "Code flows like water\nBugs lurk in every corner\nDebug, then deploy",
  "usage": {
    "promptTokens": 5,
    "completionTokens": 15,
    "totalTokens": 20
  },
  "finishReason": "stop"
}
```
### Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier for this generation |
| object | string | Object type (always "generation") |
| created | integer | Unix timestamp of creation |
| model | string | Model used for generation |
| text | string | Generated text |
| usage | object | Token usage information |
| finishReason | string | Reason generation stopped |
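For example, the `usage` block can be read straight off the parsed response to track token consumption (reusing `data` from the Python example above):

```python
# 'data' is the parsed JSON response from the Python example above
usage = data['usage']
print(f"prompt={usage['promptTokens']} "
      f"completion={usage['completionTokens']} "
      f"total={usage['totalTokens']}")
```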
### Finish Reasons
- `stop`: Natural completion or a stop sequence was reached
- `length`: Maximum token limit reached
- `content_filter`: Content filtered due to policy
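A minimal sketch of reacting to these values in Python (re-requesting with a larger `maxTokens` on truncation is just one possible strategy):

```python
import requests

HEADERS = {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
}
payload = {'prompt': 'Write a haiku about coding', 'maxTokens': 50}

data = requests.post('https://api.example.com/v1/generate',
                     headers=HEADERS, json=payload).json()

# React to the reason generation stopped
if data['finishReason'] == 'length':
    # Output was cut off at maxTokens; retry with a larger budget
    payload['maxTokens'] = 200
    data = requests.post('https://api.example.com/v1/generate',
                         headers=HEADERS, json=payload).json()
elif data['finishReason'] == 'content_filter':
    print('Output was filtered by the content policy')
```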
## Parameters in Detail

### temperature
Controls randomness in generation:
- `0.0`: Deterministic, always picks the most likely token
- `1.0`: Balanced creativity and coherence (default)
- `2.0`: Maximum randomness
```javascript
// More creative
{ temperature: 1.5 }

// More focused
{ temperature: 0.3 }
```
### topP

Nucleus sampling: only the smallest set of tokens whose cumulative probability reaches topP is considered:
- `1.0`: Consider all tokens (default)
- `0.9`: Consider the top 90% of probability mass
- `0.1`: Very focused, only high-probability tokens
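For example, a request that restricts sampling to the top 90% of the probability mass, using the same endpoint and headers as the Python example above:

```python
import requests

# Restrict sampling to the top 90% of the probability mass
response = requests.post(
    'https://api.example.com/v1/generate',
    headers={
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json',
    },
    json={'prompt': 'Write a haiku about coding', 'topP': 0.9},
)
print(response.json()['text'])
```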
### stopSequences
Array of strings that stop generation when encountered:
```javascript
{
  prompt: "List three items:\n1.",
  stopSequences: ["\n\n", "4."]
}
```
## Error Responses

### Invalid Parameters
```json
{
  "error": {
    "type": "invalid_request_error",
    "message": "maxTokens must be between 1 and 2048",
    "code": "invalid_parameter"
  }
}
```
### Rate Limit Exceeded
```json
{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded. Please try again later.",
    "code": "rate_limit_exceeded"
  }
}
```
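A minimal sketch of handling these errors with retries, in the style of the Python example above. It assumes rate-limited requests come back with HTTP status 429 and that other failures surface through the `error` object shown above; adjust to the status codes you actually observe:

```python
import time

import requests

API_URL = 'https://api.example.com/v1/generate'
HEADERS = {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
}

def generate(payload, max_retries=3):
    """POST to /v1/generate, retrying with exponential backoff when rate limited."""
    for attempt in range(max_retries):
        response = requests.post(API_URL, headers=HEADERS, json=payload)
        if response.status_code == 429:   # assumed status for rate_limit_error
            time.sleep(2 ** attempt)      # back off: 1s, 2s, 4s
            continue
        body = response.json()
        if 'error' in body:               # e.g. invalid_request_error
            raise ValueError(body['error']['message'])
        return body
    raise RuntimeError('Rate limit retries exhausted')

result = generate({'prompt': 'Write a haiku about coding', 'maxTokens': 50})
print(result['text'])
```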
## Best Practices
- Use temperature wisely: Lower for factual content, higher for creative content
- Set appropriate maxTokens: Avoid unnecessarily large values
- Use stopSequences: For structured output, use stop sequences to control format
- Handle errors: Always implement error handling and retries
- Cache when possible: Cache responses for identical prompts (see the sketch below)
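A minimal sketch of the caching idea from the last point, keyed on the full request payload (in-memory only; a real deployment might use Redis or similar):

```python
import hashlib
import json

import requests

_cache = {}

def cached_generate(payload):
    """Return a cached response for identical payloads, otherwise call the API."""
    key = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    if key not in _cache:
        response = requests.post(
            'https://api.example.com/v1/generate',
            headers={
                'Authorization': 'Bearer YOUR_API_KEY',
                'Content-Type': 'application/json',
            },
            json=payload,
        )
        _cache[key] = response.json()
    return _cache[key]
```

Caching pays off mainly at low temperature, where identical prompts are expected to produce the same output.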
## Next Steps
- Stream Responses for real-time generation
- List Models to see available models
- Code Examples for more use cases