What is Guardway Gateway?
Guardway Gateway is a production-ready, enterprise-grade HTTP gateway that provides a unified OpenAI-compatible API for multiple Large Language Model (LLM) providers. It acts as an intelligent proxy layer between your applications and various AI model providers, offering advanced security, observability, cost management, and content moderation features.
Purpose and Vision
Guardway Gateway was designed to solve critical challenges organizations face when deploying LLM-based applications in production.
Primary Goals
- Security-First Architecture: Provide enterprise-grade security with defense-in-depth principles, including container hardening, built-in guardrails, and comprehensive access controls.
- Unified API Interface: Expose a single OpenAI-compatible API that works across 18+ different LLM providers, eliminating vendor lock-in and simplifying integration.
- Low-Latency Content Moderation: Deliver built-in guardrails with sub-50 ms latency via Small Language Model (SLM)-powered detection, with no external API dependencies.
- Enterprise Governance: Enable multi-tenancy, budget management, usage tracking, and audit logging for organizational control and compliance.
- Production-Ready Observability: Provide comprehensive monitoring, tracing, and metrics collection out-of-the-box.
Key Features
Core Capabilities
OpenAI-Compatible API
Drop-in replacement for OpenAI API with streaming support, text embeddings, image generation, and model discovery.
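Because the gateway speaks the OpenAI wire format, existing clients can target it by changing only the base URL and key. The sketch below, using only the standard library, shows the request shape; the gateway host, path, and key are placeholders, not documented values.

```python
# Minimal sketch of an OpenAI-compatible chat request aimed at the gateway.
# GATEWAY_URL and the key are illustrative placeholders.
import json
import urllib.request

GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"  # assumed

def build_chat_request(model, messages, api_key):
    """Build an OpenAI-format request; the same payload works regardless
    of which provider the gateway routes the model name to."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        GATEWAY_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_chat_request(
    "gpt-4o", [{"role": "user", "content": "Hello!"}], "gw-demo-key"
)
print(req.get_header("Authorization"))  # → Bearer gw-demo-key
```

Official OpenAI SDKs can be pointed at the gateway the same way, by overriding their base URL setting.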
Multi-Provider Support
18+ LLM providers including OpenAI, Anthropic (Claude), Google (Gemini), Mistral, Groq, Cohere, Deepseek, Fireworks, HuggingFace, Together, Perplexity, OpenRouter, XAI (Grok), and more across chat, embeddings, image generation, and speech.
Streaming Support
Real-time Server-Sent Events (SSE) for streaming completions.
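Streaming responses arrive as OpenAI-style SSE `data:` lines terminated by `data: [DONE]`. A minimal parser sketch, with sample lines standing in for a live response:

```python
# Parse OpenAI-style SSE chat chunks; the sample lines below stand in
# for a live gateway stream.
import json

def iter_sse_chunks(lines):
    """Yield parsed JSON chunks from an OpenAI-style SSE stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        yield json.loads(data)

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(
    c["choices"][0]["delta"]["content"] for c in iter_sse_chunks(sample)
)
print(text)  # → Hello
```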
Model Context Protocol (MCP)
Native integration with MCP servers for tool use and context enhancement.
Full Multi-Provider List
- Chat Models: OpenAI, Anthropic (Claude), Google (Gemini), Mistral, Groq, Cohere, Deepseek, Fireworks, HuggingFace, Together, Perplexity, OpenRouter, XAI (Grok)
- Embeddings: OpenAI, Cohere, Voyage
- Image Generation: OpenAI (DALL-E), Fal
- Speech: AssemblyAI (transcription), ElevenLabs (text-to-speech)
- Generic: OpenAI-compatible servers (LM Studio, vLLM, Ollama)
- Cloud Providers: AWS Bedrock, Azure OpenAI
Security & Compliance
Authentication & Authorization
Multiple auth methods with Role-Based Access Control (RBAC), API key rotation support, and ephemeral tokens with configurable TTL.
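The ephemeral-token mechanism can be pictured as minting a random token alongside an expiry derived from the configured TTL. This is an illustrative sketch only; the function names and token storage are assumptions, not the gateway's API.

```python
# Illustrative ephemeral token with a configurable TTL.
import secrets
import time

def issue_token(ttl_seconds):
    """Mint a short-lived token and record when it expires."""
    return {
        "token": secrets.token_urlsafe(32),
        "expires_at": time.time() + ttl_seconds,
    }

def is_valid(tok, now=None):
    """A token is valid only before its expiry timestamp."""
    return (now or time.time()) < tok["expires_at"]

tok = issue_token(ttl_seconds=300)
print(is_valid(tok))                             # → True
print(is_valid(tok, now=tok["expires_at"] + 1))  # → False
```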
Secret Management
Encryption for API keys and secrets. Automatic secret redaction in logs.
Audit Logging
Comprehensive audit trail for all administrative actions including actor, role, method, path, status, and IP address tracking.
Enterprise Governance
Multi-Tenancy
User management with role-based permissions. Team management with shared budgets and quotas. Per-team and per-user API keys. Hierarchical budget allocation.
Budget Management
Per-API-key budget limits (in USD). Per-team budget aggregation. Alert thresholds (50%, 80%, 100%). Webhook notifications on threshold breach. Automatic request blocking when budget exceeded.
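The threshold-and-blocking behaviour can be sketched as a spend check against the documented 50%/80%/100% tiers; the function name and return shape are illustrative, not the gateway's internals.

```python
# Sketch of budget alert thresholds and automatic blocking.
ALERT_THRESHOLDS = (0.5, 0.8, 1.0)  # the documented 50%/80%/100% tiers

def check_budget(spent_usd, limit_usd):
    """Return (blocked, crossed_thresholds) for a key's current spend.
    Crossing a threshold would trigger a webhook notification; reaching
    100% blocks further requests."""
    ratio = spent_usd / limit_usd
    crossed = [t for t in ALERT_THRESHOLDS if ratio >= t]
    return ratio >= 1.0, crossed

blocked, crossed = check_budget(spent_usd=85.0, limit_usd=100.0)
print(blocked, crossed)  # → False [0.5, 0.8]
```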
Quota Enforcement
Request count limits per API key. Token-based rate limiting (requests/minute and tokens/minute). Per-key and global rate limits.
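A per-key requests/minute limit can be approximated with a fixed one-minute window; this is a simplified sketch, and the gateway's actual limiter algorithm is not specified here.

```python
# Fixed-window sketch of a per-key requests/minute limit.
import time

class FixedWindowLimiter:
    def __init__(self, limit_per_minute):
        self.limit = limit_per_minute
        self.window_start = 0.0
        self.count = 0

    def allow(self, now=None):
        """Admit the request if the current one-minute window has capacity."""
        now = now if now is not None else time.time()
        if now - self.window_start >= 60:
            self.window_start = now  # start a fresh window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False

lim = FixedWindowLimiter(limit_per_minute=2)
print([lim.allow(now=0), lim.allow(now=1), lim.allow(now=2)])  # → [True, True, False]
```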
Cost Tracking
Real-time spend tracking per request. Aggregated costs by model, provider, user, team. Cost estimation for embeddings and completions. Export capabilities for billing integration.
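Per-request cost estimation reduces to multiplying token counts by per-model prices. The prices below are placeholders for illustration, not real provider rates.

```python
# Illustrative per-request cost estimation from token counts.
# Prices are placeholders, not actual provider pricing.
PRICES_PER_1K = {"gpt-4o": {"input": 0.005, "output": 0.015}}  # USD, assumed

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a single completion request."""
    p = PRICES_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

print(round(estimate_cost("gpt-4o", 1000, 500), 4))  # → 0.0125
```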
Routing & Load Balancing
Priority-Based Routing
Pattern matching for model routing. Priority-based rule execution. Per-rule budget and token limits with time windows. Automatic window reset.
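Priority-based routing can be pictured as evaluating rules in ascending priority order and taking the first pattern match. The rule fields and wildcard syntax below are assumptions for illustration.

```python
# Sketch of priority-ordered, pattern-matched model routing.
import fnmatch

RULES = [
    {"priority": 1, "pattern": "gpt-4*", "provider": "openai"},
    {"priority": 2, "pattern": "claude-*", "provider": "anthropic"},
    {"priority": 99, "pattern": "*", "provider": "fallback"},  # catch-all
]

def route(model):
    """Return the provider of the first matching rule, lowest priority first."""
    for rule in sorted(RULES, key=lambda r: r["priority"]):
        if fnmatch.fnmatch(model, rule["pattern"]):
            return rule["provider"]

print(route("gpt-4o"))         # → openai
print(route("mistral-large"))  # → fallback
```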
Advanced Routing Strategies
Lowest latency routing. Lowest cost routing. Least busy routing. Tag-based routing. Auto-router with SLA requirements. Configurable fallback strategies.
Provider Health Monitoring
Continuous health checks. Automatic failover on provider failure. Circuit breaker pattern per provider.
Observability
Distributed Tracing
OpenTelemetry-based distributed tracing. Request correlation and trace context propagation.
Metrics
Request counts, token usage, error rates, latency percentiles, cost tracking, and MCP metrics.
Structured Logging
JSON-formatted logs with configurable log levels, request correlation, and secret redaction.
Request Logs
Complete request/response audit trail. Filterable by model, provider, date, status, and more.
Admin Dashboard
The Admin UI provides a comprehensive management interface, including:
- Dashboard: Real-time metrics and system overview
- Providers: Provider configuration and management
- Models: Model catalog and capabilities
- Routes: Routing rule editor
- Playground: Interactive API testing with streaming support
- Users & Teams: User and team management with RBAC
- API Keys: Key creation, rotation, and management
- Usage & Spend: Usage analytics and cost tracking
- Logs: Request/response log viewer with filtering
- Guardrails: Content filtering and security rule configuration
- MCP Servers: MCP server configuration and tool management
- Webhooks: Webhook management and event subscriptions
- Settings: System configuration and preferences
Caching
Multi-Tier Caching
In-memory cache and distributed cache options. Configurable TTL per cache type. Cache invalidation on provider config changes.
Cache Types
Completion caching (exact match). Embedding caching (exact match). Model catalog caching (TTL-based).
Use Cases
Enterprise LLM Gateway
Organizations deploying multiple LLM applications can use Guardway Gateway as a central gateway to enforce security policies and content moderation, track and control costs across teams and projects, monitor usage and performance, comply with audit requirements, and avoid vendor lock-in.
Multi-Tenant AI Platform
SaaS providers can use Guardway Gateway to offer AI capabilities to customers with usage-based billing, isolate customer data and budgets, provide customer-specific model routing, and track per-customer usage and costs.
Development and Testing
Development teams can use Guardway Gateway to test applications against multiple providers without code changes, compare model performance and costs, simulate production environments, and debug LLM interactions with detailed logging.
Content Moderation Pipeline
Applications requiring content safety can use Guardway Gateway to detect and block PII in prompts and responses, filter hate speech and harmful content, prevent prompt injection attacks, and comply with content policy requirements.
Cost Optimization
Organizations looking to optimize AI spending can use Guardway Gateway to route requests to lowest-cost providers, set and enforce budget limits, track spending patterns, and identify cost optimization opportunities.
Target Audience
Primary Target Markets
Security-First Enterprises
Banks and financial institutions. Healthcare organizations. Government agencies. Any organization with SOC 2, PCI DSS, or HIPAA compliance requirements.
Organizations with Compliance Requirements
Need audit logging and access controls. Require content moderation and PII protection. Must track and control AI spending.
Companies Needing Low-Latency Guardrails
Real-time applications where guardrail overhead must stay low. Customer-facing chatbots and assistants. Interactive AI applications.
MCP-Focused Deployments
Applications leveraging Model Context Protocol. Tool-augmented LLM applications. Agent-based systems.
Secondary Markets
- Startups and SMBs: Cost-conscious organizations wanting to optimize AI spending
- Development Teams: Teams building LLM applications and needing testing flexibility
- AI Platforms: SaaS providers offering AI capabilities to customers
