What is Guardway Gateway?
Guardway Gateway is a production-ready, enterprise-grade HTTP gateway that provides a unified OpenAI-compatible API for multiple Large Language Model (LLM) providers. It acts as an intelligent proxy layer between your applications and various AI model providers, offering advanced security, observability, cost management, and content moderation features.
Purpose and Vision
Guardway Gateway was designed to solve critical challenges organizations face when deploying LLM-based applications in production.
Primary Goals
- Security-First Architecture: Provide enterprise-grade security with defense-in-depth principles, including container hardening, built-in guardrails, and comprehensive access controls.
- Unified API Interface: Expose a single OpenAI-compatible API that works across 18+ different LLM providers, eliminating vendor lock-in and simplifying integration.
- Low-Latency Content Moderation: Deliver built-in guardrails with sub-50 ms latency via Small Language Model (SLM)-powered detection, with no external API dependencies.
- Enterprise Governance: Enable multi-tenancy, budget management, usage tracking, and audit logging for organizational control and compliance.
- Production-Ready Observability: Provide comprehensive monitoring, tracing, and metrics collection out-of-the-box.
Key Features
Core Capabilities
OpenAI-Compatible API
Drop-in replacement for OpenAI API with streaming support, text embeddings, image generation, and model discovery.
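Because the gateway speaks the OpenAI wire format, existing clients can target it by changing only the base URL and key. The sketch below, using only the standard library, shows the request shape; the gateway host, path, and key are placeholders, not documented values.

```python
# Minimal sketch of an OpenAI-compatible chat request aimed at the gateway.
# GATEWAY_URL and the key are illustrative placeholders.
import json
import urllib.request

GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"  # assumed

def build_chat_request(model, messages, api_key):
    """Build an OpenAI-format request; the same payload works regardless
    of which provider the gateway routes the model name to."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        GATEWAY_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_chat_request(
    "gpt-4o", [{"role": "user", "content": "Hello!"}], "gw-demo-key"
)
print(req.get_header("Authorization"))  # → Bearer gw-demo-key
```

Official OpenAI SDKs can be pointed at the gateway the same way, by overriding their base URL setting.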
Multi-Provider Support
18+ LLM providers including OpenAI, Anthropic (Claude), Google (Gemini), Mistral, Groq, Cohere, Deepseek, Fireworks, HuggingFace, Together, Perplexity, OpenRouter, XAI (Grok), and more across chat, embeddings, image generation, and speech.
Streaming Support
Real-time Server-Sent Events (SSE) for streaming completions.
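Streaming responses arrive as OpenAI-style SSE `data:` lines terminated by `data: [DONE]`. A minimal parser sketch, with sample lines standing in for a live response:

```python
# Parse OpenAI-style SSE chat chunks; the sample lines below stand in
# for a live gateway stream.
import json

def iter_sse_chunks(lines):
    """Yield parsed JSON chunks from an OpenAI-style SSE stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        yield json.loads(data)

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(
    c["choices"][0]["delta"]["content"] for c in iter_sse_chunks(sample)
)
print(text)  # → Hello
```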
Model Context Protocol (MCP)
Native integration with MCP servers for tool use and context enhancement.
Full Multi-Provider List
- Chat Models: OpenAI, Anthropic (Claude), Google (Gemini), Mistral, Groq, Cohere, Deepseek, Fireworks, HuggingFace, Together, Perplexity, OpenRouter, XAI (Grok)
- Embeddings: OpenAI, Cohere, Voyage
- Image Generation: OpenAI (DALL-E), Fal
- Speech: AssemblyAI (transcription), ElevenLabs (text-to-speech)
- Generic: OpenAI-compatible servers (LM Studio, vLLM, Ollama)
- Cloud Providers: AWS Bedrock, Azure OpenAI
Security & Compliance
Authentication & Authorization
Multiple auth methods with Role-Based Access Control (RBAC), API key rotation support, and ephemeral tokens with configurable TTL.
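The ephemeral-token mechanism can be pictured as minting a random token alongside an expiry derived from the configured TTL. This is an illustrative sketch only; the function names and token storage are assumptions, not the gateway's API.

```python
# Illustrative ephemeral token with a configurable TTL.
import secrets
import time

def issue_token(ttl_seconds):
    """Mint a short-lived token and record when it expires."""
    return {
        "token": secrets.token_urlsafe(32),
        "expires_at": time.time() + ttl_seconds,
    }

def is_valid(tok, now=None):
    """A token is valid only before its expiry timestamp."""
    return (now or time.time()) < tok["expires_at"]

tok = issue_token(ttl_seconds=300)
print(is_valid(tok))                             # → True
print(is_valid(tok, now=tok["expires_at"] + 1))  # → False
```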
Secret Management
Encryption for API keys and secrets. Automatic secret redaction in logs.
Audit Logging
Comprehensive audit trail for all administrative actions including actor, role, method, path, status, and IP address tracking.
Enterprise Governance
Multi-Tenancy
User management with role-based permissions. Team management with shared budgets and quotas. Per-team and per-user API keys. Hierarchical budget allocation.
Budget Management
Per-API-key budget limits (in USD). Per-team budget aggregation. Alert thresholds (50%, 80%, 100%). Webhook notifications on threshold breach. Automatic request blocking when budget exceeded.
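The threshold-and-blocking behaviour can be sketched as a spend check against the documented 50%/80%/100% tiers; the function name and return shape are illustrative, not the gateway's internals.

```python
# Sketch of budget alert thresholds and automatic blocking.
ALERT_THRESHOLDS = (0.5, 0.8, 1.0)  # the documented 50%/80%/100% tiers

def check_budget(spent_usd, limit_usd):
    """Return (blocked, crossed_thresholds) for a key's current spend.
    Crossing a threshold would trigger a webhook notification; reaching
    100% blocks further requests."""
    ratio = spent_usd / limit_usd
    crossed = [t for t in ALERT_THRESHOLDS if ratio >= t]
    return ratio >= 1.0, crossed

blocked, crossed = check_budget(spent_usd=85.0, limit_usd=100.0)
print(blocked, crossed)  # → False [0.5, 0.8]
```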
Quota Enforcement
Request count limits per API key. Token-based rate limiting (requests/minute and tokens/minute). Per-key and global rate limits.
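A per-key requests/minute limit can be approximated with a fixed one-minute window; this is a simplified sketch, and the gateway's actual limiter algorithm is not specified here.

```python
# Fixed-window sketch of a per-key requests/minute limit.
import time

class FixedWindowLimiter:
    def __init__(self, limit_per_minute):
        self.limit = limit_per_minute
        self.window_start = 0.0
        self.count = 0

    def allow(self, now=None):
        """Admit the request if the current one-minute window has capacity."""
        now = now if now is not None else time.time()
        if now - self.window_start >= 60:
            self.window_start = now  # start a fresh window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False

lim = FixedWindowLimiter(limit_per_minute=2)
print([lim.allow(now=0), lim.allow(now=1), lim.allow(now=2)])  # → [True, True, False]
```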
Cost Tracking
Real-time spend tracking per request. Aggregated costs by model, provider, user, team. Cost estimation for embeddings and completions. Export capabilities for billing integration.
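Per-request cost estimation reduces to multiplying token counts by per-model prices. The prices below are placeholders for illustration, not real provider rates.

```python
# Illustrative per-request cost estimation from token counts.
# Prices are placeholders, not actual provider pricing.
PRICES_PER_1K = {"gpt-4o": {"input": 0.005, "output": 0.015}}  # USD, assumed

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a single completion request."""
    p = PRICES_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

print(round(estimate_cost("gpt-4o", 1000, 500), 4))  # → 0.0125
```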
Routing & Load Balancing
Priority-Based Routing
Pattern matching for model routing. Priority-based rule execution. Per-rule budget and token limits with time windows. Automatic window reset.
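Priority-based routing can be pictured as evaluating rules in ascending priority order and taking the first pattern match. The rule fields and wildcard syntax below are assumptions for illustration.

```python
# Sketch of priority-ordered, pattern-matched model routing.
import fnmatch

RULES = [
    {"priority": 1, "pattern": "gpt-4*", "provider": "openai"},
    {"priority": 2, "pattern": "claude-*", "provider": "anthropic"},
    {"priority": 99, "pattern": "*", "provider": "fallback"},  # catch-all
]

def route(model):
    """Return the provider of the first matching rule, lowest priority first."""
    for rule in sorted(RULES, key=lambda r: r["priority"]):
        if fnmatch.fnmatch(model, rule["pattern"]):
            return rule["provider"]

print(route("gpt-4o"))         # → openai
print(route("mistral-large"))  # → fallback
```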
Advanced Routing Strategies
Lowest latency routing. Lowest cost routing. Least busy routing. Tag-based routing. Auto-router with SLA requirements. Configurable fallback strategies.
Provider Health Monitoring
Continuous health checks. Automatic failover on provider failure. Circuit breaker pattern per provider.
Observability
Distributed Tracing
OpenTelemetry-based distributed tracing. Request correlation and trace context propagation.
Metrics
Request counts, token usage, error rates, latency percentiles, cost tracking, and MCP metrics.
Structured Logging
JSON-formatted logs with configurable log levels, request correlation, and secret redaction.
Request Logs
Complete request/response audit trail. Filterable by model, provider, date, status, and more.
Admin Dashboard
The Admin UI provides a comprehensive management interface, including:
- Dashboard: Real-time metrics and system overview
- Providers: Provider configuration and management
- Models: Model catalog and capabilities
- Routes: Routing rule editor
- Playground: Interactive API testing with streaming support
- Users & Teams: User and team management with RBAC
- API Keys: Key creation, rotation, and management
- Usage & Spend: Usage analytics and cost tracking
- Logs: Request/response log viewer with filtering
- Guardrails: Content filtering and security rule configuration
- MCP Servers: MCP server configuration and tool management
- Webhooks: Webhook management and event subscriptions
- Settings: System configuration and preferences
Caching
Multi-Tier Caching
In-memory cache and distributed cache options. Configurable TTL per cache type. Cache invalidation on provider config changes.
Cache Types
Completion caching (exact match). Embedding caching (exact match). Model catalog caching (TTL-based).
Use Cases
Enterprise LLM Gateway
Organizations deploying multiple LLM applications can use Guardway Gateway as a central gateway to enforce security policies and content moderation, track and control costs across teams and projects, monitor usage and performance, comply with audit requirements, and avoid vendor lock-in.
Multi-Tenant AI Platform
SaaS providers can use Guardway Gateway to offer AI capabilities to customers with usage-based billing, isolate customer data and budgets, provide customer-specific model routing, and track per-customer usage and costs.
Development and Testing
Development teams can use Guardway Gateway to test applications against multiple providers without code changes, compare model performance and costs, simulate production environments, and debug LLM interactions with detailed logging.
Content Moderation Pipeline
Applications requiring content safety can use Guardway Gateway to detect and block PII in prompts and responses, filter hate speech and harmful content, prevent prompt injection attacks, and comply with content policy requirements.
Cost Optimization
Organizations looking to optimize AI spending can use Guardway Gateway to route requests to lowest-cost providers, set and enforce budget limits, track spending patterns, and identify cost optimization opportunities.
Target Audience
Primary Target Markets
Security-First Enterprises
Banks and financial institutions. Healthcare organizations. Government agencies. Any organization with SOC 2, PCI DSS, or HIPAA compliance requirements.
Organizations with Compliance Requirements
Need audit logging and access controls. Require content moderation and PII protection. Must track and control AI spending.
Companies Needing Low-Latency Guardrails
Real-time applications where guardrail overhead must stay low. Customer-facing chatbots and assistants. Interactive AI applications.
MCP-Focused Deployments
Applications leveraging Model Context Protocol. Tool-augmented LLM applications. Agent-based systems.
Secondary Markets
- Startups and SMBs: Cost-conscious organizations wanting to optimize AI spending
- Development Teams: Teams building LLM applications and needing testing flexibility
- AI Platforms: SaaS providers offering AI capabilities to customers
