Glossary
LLM/AI Terms
Adapter
A software component that translates between Guardway’s unified API format and a provider’s specific API format. Each provider (OpenAI, Anthropic, etc.) has its own adapter.

Chat Completion
The primary LLM interaction, where a model generates a response based on a conversation history (messages). Also called “chat” or “completion.”

Completion Tokens
The number of tokens in the generated response from the LLM. Typically costs more per token than prompt tokens.

Context Length
The maximum number of tokens (prompt + completion) that a model can process in a single request. For example, GPT-4 Turbo has a 128K-token context length.

Embeddings
Dense vector representations of text that capture semantic meaning. Used for similarity search, clustering, and retrieval-augmented generation (RAG).

Few-Shot Learning
Providing a model with examples in the prompt to guide its behavior without fine-tuning. For example, showing 3 examples of the desired output format.

Fine-Tuning
The process of training a base model on domain-specific data to specialize its behavior. Results in a custom model.

Function Calling
See Tool Use.

Guardrails
Security and safety mechanisms that validate, filter, or block LLM inputs and outputs. Examples: PII detection, hate speech filtering, prompt injection detection.

Hallucination
When an LLM generates false or nonsensical information presented as fact. A key challenge in production LLM deployments.

JSON Mode
A feature where the LLM is constrained to output only valid JSON. Useful for structured data extraction.

Max Tokens
The maximum number of tokens the model can generate in its response. Acts as a cost control and prevents runaway generation.

Message
A unit in a conversation with an LLM, consisting of a role (system, user, assistant, or tool) and content (the text).

Model
A trained neural network capable of text generation, embeddings, image generation, or other AI tasks. Examples: GPT-4, Claude 3, Llama 2.

Moderation
Content filtering to detect harmful, unsafe, or inappropriate content. Can be applied to inputs (user prompts) or outputs (LLM responses).

Prompt
The input text sent to an LLM. Can include instructions, examples, and the actual query.

Prompt Engineering
The practice of crafting effective prompts to get desired behaviors from LLMs without fine-tuning.

Prompt Injection
A security attack where malicious input attempts to override the system prompt or otherwise manipulate the LLM’s behavior.

Prompt Tokens
The number of tokens in the input sent to the LLM. Typically costs less per token than completion tokens.

Provider
A company or service that offers LLM APIs. Examples: OpenAI, Anthropic, Google, Cohere, Groq.

RAG (Retrieval-Augmented Generation)
A technique where relevant documents are retrieved from a knowledge base and included in the prompt to ground the LLM’s response in factual information.

Semantic Cache
A caching system that matches queries based on meaning rather than exact text match. Uses embeddings to find similar queries.

Stop Sequence
A string that, when generated by the model, signals the end of generation. Used to create structured outputs.

Streaming
Sending the LLM response in chunks as it is generated, rather than waiting for the complete response. Improves perceived latency.

System Prompt
Instructions given to the LLM that set its behavior, persona, and constraints. Typically the first message in a conversation.

Temperature
A parameter (0-2) that controls randomness in generation. Higher values produce more creative/random output; lower values produce more deterministic/focused output.

Token
The basic unit of text processing for LLMs, roughly 4 characters or 0.75 words in English. Tokenization varies by model.

Tool Use
The ability of an LLM to call external functions/APIs. The model decides when to use a tool, Guardway calls it, and the result is fed back to the model.

Top-K
A sampling strategy where only the K most likely next tokens are considered. Reduces randomness.

Top-P (Nucleus Sampling)
A sampling strategy where tokens are selected from the smallest set whose cumulative probability exceeds P. More dynamic than top-K.

Vision
The ability of an LLM to process and understand images in addition to text. Examples: GPT-4V, Claude 3.

Zero-Shot Learning
Using an LLM without providing examples, relying solely on instructions. The model must generalize from its training.
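Several of the terms above (message roles, system prompt, few-shot examples, temperature, max tokens, streaming) come together in a single chat-completion request. A minimal sketch, using the common OpenAI-style field names; Guardway’s unified schema may differ:

```python
# Sketch of a chat-completion request payload. Field names follow the
# widely used OpenAI-style schema and are illustrative only.
request = {
    "model": "gpt-4",
    "messages": [
        # The system prompt sets behavior and constraints.
        {"role": "system", "content": "You are a concise assistant."},
        # A few-shot example pair guiding the output format.
        {"role": "user", "content": "Capital of France?"},
        {"role": "assistant", "content": "Paris"},
        # The actual query.
        {"role": "user", "content": "Capital of Japan?"},
    ],
    "temperature": 0.2,  # low = more deterministic/focused
    "max_tokens": 64,    # caps completion tokens; a cost control
    "stream": True,      # deliver the response in chunks as generated
}

# Rough prompt-token estimate: ~4 characters per token in English.
prompt_chars = sum(len(m["content"]) for m in request["messages"])
estimated_prompt_tokens = prompt_chars // 4
```

The prompt tokens here are everything in `messages`; the completion tokens are whatever the model generates, up to `max_tokens`.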
Gateway Terms
API Key
A secret token used to authenticate requests to the Guardway gateway. Each key can have quotas, budgets, and access controls.

Budget
A spending limit (in dollars) associated with an API key or team. Requests are blocked when the budget is exceeded.

Failover
Automatic switching to a backup provider when the primary provider fails or is unavailable.

Gateway
The central service that receives client requests, applies security policies, routes to providers, and returns responses. The core of Guardway.

Health Check
An endpoint (/health) that reports the operational status of the gateway and its dependencies (Redis, providers).

Latency
The time between sending a request and receiving a complete response. Measured in milliseconds (ms).

Middleware
Software components that intercept and process requests before they reach route handlers. Examples: authentication, rate limiting, logging.

Multi-tenancy
The ability to serve multiple independent customers (tenants) from a single gateway instance, with isolation and access controls.

Quota
A limit on the number of requests allowed within a time period. Can be per API key, per user, or per team.

Rate Limiting
Restricting the number of requests allowed in a time window to prevent abuse and manage load. Can limit by requests per minute or tokens per minute.

Routing
The process of selecting which provider and model to use for a request, based on rules, strategies, or load balancing.

Routing Rule
A configuration that maps requests to specific providers based on patterns (model name, user, tags, etc.).

Routing Strategy
An algorithm for selecting providers. Examples: lowest-cost, lowest-latency, least-busy, priority-based.

Sanitization
The process of removing or redacting sensitive information (such as PII) from text while preserving the rest of the content.

Store
Guardway’s data persistence layer, backed by Redis, that holds configuration, keys, logs, and metrics.

Throughput
The number of requests processed per unit of time, typically measured in requests per second (req/sec).

Webhook
An HTTP callback that Guardway can trigger when certain events occur (quota exceeded, budget threshold reached, etc.).
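Rate limiting as defined above can be sketched as a sliding-window counter per API key. This is a minimal in-memory illustration, not Guardway’s implementation; a real gateway would typically back this with Redis so limits hold across instances:

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window rate limiter: at most `limit` requests per
    `window` seconds, tracked per API key. Illustrative sketch only."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # api_key -> timestamps of recent hits

    def allow(self, api_key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits[api_key]
        # Evict hits that have fallen outside the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # over the limit for this window: reject
        q.append(now)
        return True

limiter = RateLimiter(limit=3, window=60)
results = [limiter.allow("key-1", now=t) for t in (0, 1, 2, 3)]
# first three requests allowed, fourth blocked within the same window
```

A tokens-per-minute limit works the same way, except each hit is weighted by the request’s token count instead of counting 1.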
Security Terms
AES-256-GCM
Advanced Encryption Standard with 256-bit keys in Galois/Counter Mode. Used by Guardway to encrypt API keys and secrets at rest.

AppArmor
A Linux kernel security module that confines programs to a limited set of resources. Used in Guardway’s container hardening.

Attack Surface
The sum of all points where an unauthorized user could try to enter or extract data from a system.

Authentication
Verifying the identity of a user or system. In Guardway, this is done via API keys.

Authorization
Determining what actions an authenticated user is allowed to perform. In Guardway, this is enforced through per-API-key permissions.

Capabilities (Linux)
Fine-grained privileges that can be granted to processes instead of full root access. Guardway drops unnecessary capabilities.

Defense in Depth
A security strategy employing multiple layers of defense so that if one layer fails, others still provide protection.

Encryption at Rest
Encrypting data when it is stored (e.g., API keys in Redis) so it is unreadable without the decryption key.

Encryption in Transit
Encrypting data while it is transmitted over the network, typically using TLS/HTTPS.

Fail-Closed
A security posture where errors cause requests to be blocked. More secure but less available.

Fail-Open
A security posture where errors allow requests to proceed. More available but less secure.

Least Privilege
The principle of granting only the minimum permissions necessary for a task. Applied to processes, users, and API keys.

Non-root User
Running processes as a non-privileged user rather than root to limit the impact of security breaches. Guardway containers use UID 1001.

PII (Personally Identifiable Information)
Data that can be used to identify an individual. Examples: SSN, email address, phone number, name, address.

RBAC (Role-Based Access Control)
An access control approach where permissions are assigned to roles, and users are assigned to roles.

Read-only Root Filesystem
A security hardening technique where the container’s root filesystem cannot be modified at runtime, preventing certain types of attacks.

Secrets Management
Secure storage, access control, and rotation of sensitive data such as API keys, passwords, and certificates.

Seccomp (Secure Computing Mode)
A Linux kernel feature that limits the system calls a process can make. Guardway uses a restricted seccomp profile.

TLS (Transport Layer Security)
A cryptographic protocol for secure communication over networks. HTTPS uses TLS.

Zero Trust
A security model that assumes no implicit trust and requires verification for every access request, regardless of location.
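PII detection and sanitization (see PII above and Sanitization under Gateway Terms) can be illustrated with a toy redaction pass. The two regexes and the redaction labels here are hypothetical stand-ins; real guardrails use far more sophisticated detection:

```python
import re

# Illustrative PII redaction: catch email addresses and US SSNs before
# text is logged or forwarded. These patterns are deliberately simple
# sketches, not production-grade detectors.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def sanitize(text: str) -> str:
    """Replace each PII match with a labeled placeholder,
    preserving the rest of the content."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text

clean = sanitize("Contact jane@example.com, SSN 123-45-6789.")
# -> "Contact [REDACTED EMAIL], SSN [REDACTED SSN]."
```

Whether a detection failure blocks the request or lets it through is exactly the fail-closed versus fail-open choice defined above.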
MCP Terms
JSON-RPC
A remote procedure call protocol encoded in JSON. Used by MCP for client-server communication.

MCP (Model Context Protocol)
A protocol that allows LLMs to interact with external tools, data sources, and services in a standardized way.

MCP Server
A service that implements the MCP protocol and exposes tools, resources, or prompts to clients.

Prompt (MCP)
A predefined prompt template provided by an MCP server that clients can use.

Resource (MCP)
A data source or document that an MCP server makes available to clients (e.g., files, database records).

Session
A stateful connection between an MCP client and server, maintaining context across multiple requests.

stdio Transport
Communication via standard input/output streams. Used by Python and Node.js MCP servers.

Tool (MCP)
A function that an MCP server exposes to clients. The LLM can call tools to perform actions or retrieve information.

Tool Filter
Access control rules that restrict which MCP tools are available to specific API keys.

Related documentation
- Introduction — what Guardway is
- Gateway overview — how the pieces fit together
- Environment variables — full config reference
- API reference — coming soon