# List Models
Get information about available models.
## Endpoint

```
GET /v1/models
```
## Request

```bash
curl https://api.example.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
### JavaScript

```javascript
const response = await fetch('https://api.example.com/v1/models', {
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
});
const data = await response.json();
console.log(data.models);
```
### Python

```python
import requests

response = requests.get(
    'https://api.example.com/v1/models',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)
data = response.json()
print(data['models'])
```
## Response

```json
{
  "models": [
    {
      "id": "model-v1",
      "name": "Standard Model v1",
      "description": "Our standard text generation model",
      "contextWindow": 8192,
      "maxTokens": 2048,
      "capabilities": ["generate", "stream"],
      "pricing": {
        "prompt": 0.0001,
        "completion": 0.0002
      },
      "created": 1640000000
    },
    {
      "id": "model-v2-large",
      "name": "Large Model v2",
      "description": "Advanced model with larger context",
      "contextWindow": 32768,
      "maxTokens": 4096,
      "capabilities": ["generate", "stream", "function_calling"],
      "pricing": {
        "prompt": 0.0005,
        "completion": 0.001
      },
      "created": 1650000000
    }
  ]
}
```
## Response Fields

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique model identifier |
| `name` | string | Human-readable model name |
| `description` | string | Model description |
| `contextWindow` | integer | Maximum context size in tokens |
| `maxTokens` | integer | Maximum number of tokens that can be generated |
| `capabilities` | array | List of supported features |
| `pricing` | object | Pricing in USD per 1K tokens |
| `created` | integer | Unix timestamp of model release |
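
These fields make it straightforward to select a model programmatically. A minimal sketch, reusing the list request from above (the `function_calling` filter is just an illustration):

```javascript
// Fetch the model list, then keep only models that support function calling.
const response = await fetch('https://api.example.com/v1/models', {
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
});
const { models } = await response.json();

const withTools = models.filter(m => m.capabilities.includes('function_calling'));
console.log(withTools.map(m => `${m.id} (${m.contextWindow} tokens)`));
```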
## Get Specific Model

Get details about a specific model:

```
GET /v1/models/{model_id}
```
### Example

```bash
curl https://api.example.com/v1/models/model-v2-large \
  -H "Authorization: Bearer YOUR_API_KEY"
```
### Response

```json
{
  "id": "model-v2-large",
  "name": "Large Model v2",
  "description": "Advanced model with larger context",
  "contextWindow": 32768,
  "maxTokens": 4096,
  "capabilities": ["generate", "stream", "function_calling"],
  "pricing": {
    "prompt": 0.0005,
    "completion": 0.001
  },
  "created": 1650000000,
  "stats": {
    "avgLatency": 850,
    "uptime": 99.9
  }
}
```
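
The same lookup in JavaScript might look like the sketch below (assuming the API returns a non-2xx status for an unknown model ID — check the error documentation for the exact behavior):

```javascript
// Look up a single model and fail loudly if the ID is unknown.
const response = await fetch('https://api.example.com/v1/models/model-v2-large', {
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
});
if (!response.ok) {
  throw new Error(`Model lookup failed: ${response.status}`);
}
const model = await response.json();
console.log(model.name, model.stats);
```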
## Available Models

### `model-v1`

Our standard model, great for most use cases.

- Context: 8,192 tokens
- Max Output: 2,048 tokens
- Best for: General text generation, summaries, Q&A

### `model-v2-large`

Advanced model with larger context and better reasoning.

- Context: 32,768 tokens
- Max Output: 4,096 tokens
- Best for: Long-form content, complex reasoning, code generation

### `model-v1-fast`

Optimized for speed with good quality.

- Context: 4,096 tokens
- Max Output: 1,024 tokens
- Best for: Quick responses, real-time applications
## Using Models

Specify the model in your generation request:

```javascript
const response = await fetch('https://api.example.com/v1/generate', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'model-v2-large',
    prompt: 'Your prompt here',
    maxTokens: 500
  })
});
```
## Model Selection Guide

Choose the right model for your use case:

| Use Case | Recommended Model | Reason |
|---|---|---|
| Chat applications | `model-v1` | Balanced speed and quality |
| Long documents | `model-v2-large` | Large context window |
| Real-time responses | `model-v1-fast` | Low latency |
| Code generation | `model-v2-large` | Better reasoning |
| Simple tasks | `model-v1` | Cost-effective |
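
When the choice depends on input size rather than use case, you can also pick a model dynamically. A rough sketch (the 4-characters-per-token estimate is an assumption, not an API guarantee; a real tokenizer would be more accurate):

```javascript
// Pick the cheapest listed model whose context window fits the prompt.
function pickModel(models, prompt) {
  const estimatedTokens = Math.ceil(prompt.length / 4); // rough heuristic
  const candidates = models
    .filter(m => m.contextWindow >= estimatedTokens)
    .sort((a, b) => a.pricing.prompt - b.pricing.prompt);
  return candidates[0]?.id ?? 'model-v2-large'; // fall back to the largest context
}
```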
## Pricing

Pricing is based on tokens processed. Prices returned by this endpoint are per 1K tokens:

```
Total Cost = (Prompt Tokens / 1000 × Prompt Price) + (Completion Tokens / 1000 × Completion Price)
```

### Example Calculation

Using `model-v1`:

- Prompt: 100 tokens × ($0.0001 per 1K tokens) = $0.00001
- Completion: 50 tokens × ($0.0002 per 1K tokens) = $0.00001
- Total: $0.00002
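
The same formula in code, as a minimal sketch (the token counts would come from whatever usage figures your generation response reports):

```javascript
// Estimate request cost in USD from per-1K-token prices.
function estimateCost(model, promptTokens, completionTokens) {
  return (promptTokens / 1000) * model.pricing.prompt +
         (completionTokens / 1000) * model.pricing.completion;
}

// Example: the model-v1 calculation above.
const modelV1 = { pricing: { prompt: 0.0001, completion: 0.0002 } };
console.log(estimateCost(modelV1, 100, 50)); // ≈ $0.00002
```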
## Model Capabilities

### `generate`

Basic text generation capability. All models support this.

### `stream`

Streaming responses for real-time generation. See Stream Responses.

### `function_calling`

Advanced models can call functions/tools. See the Function Calling documentation.
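
Because capabilities vary by model, check the `capabilities` array before relying on an optional feature. A minimal sketch, reusing the `models` array fetched earlier:

```javascript
// Guard an optional feature behind a capability check.
const model = models.find(m => m.id === 'model-v2-large');
if (model?.capabilities.includes('function_calling')) {
  // Safe to include tool definitions in the generation request.
} else {
  // Fall back to plain text generation.
}
```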
## Best Practices

- **Start with the standard model**: Use `model-v1` unless you need specific features
- **Consider latency**: Use `model-v1-fast` for time-sensitive applications
- **Monitor costs**: Track usage by model to optimize spending
- **Test different models**: Compare quality and speed for your use case
## Model Updates

We continuously improve models. Updates are backward-compatible:

- Version numbers increment for major changes
- The same `id` receives minor improvements
- Breaking changes get a new model `id`

Subscribe to our changelog for updates.
## Next Steps
- Generate Text using different models
- Stream Responses for real-time generation
- Best Practices for model selection