Glossary
LLM/AI Terms
Adapter
A software component that translates between AgSec’s unified API format and a provider’s specific API format. Each provider (OpenAI, Anthropic, etc.) has its own adapter.
Chat Completion
The primary LLM interaction where a model generates a response based on a conversation history (messages). Also called “chat” or “completion.”
Completion Tokens
The number of tokens in the generated response from the LLM. Typically costs more per token than prompt tokens.
Context Length
The maximum number of tokens (prompt + completion) that a model can process in a single request. For example, GPT-4 Turbo has a 128K token context length.
Embeddings
Dense vector representations of text that capture semantic meaning. Used for similarity search, clustering, and retrieval-augmented generation (RAG).
Few-Shot Learning
Providing a model with examples in the prompt to guide its behavior without fine-tuning. For example, showing 3 examples of the desired output format.
Fine-Tuning
The process of training a base model on domain-specific data to specialize its behavior. Results in a custom model.
Function Calling
See Tool Use.
Guardrails
Security and safety mechanisms that validate, filter, or block LLM inputs and outputs. Examples: PII detection, hate speech filtering, prompt injection detection.
Hallucination
When an LLM generates false or nonsensical information presented as fact. A key challenge in production LLM deployments.
JSON Mode
A feature where the LLM is constrained to output valid JSON only. Useful for structured data extraction.
Max Tokens
The maximum number of tokens the model can generate in its response. Acts as a cost control and prevents runaway generation.
Message
A unit in a conversation with an LLM, consisting of a role (system, user, assistant, or tool) and content (the text).
Model
A trained neural network capable of text generation, embeddings, image generation, or other AI tasks. Examples: GPT-4, Claude 3, Llama 2.
Moderation
Content filtering to detect harmful, unsafe, or inappropriate content. Can be applied to inputs (user prompts) or outputs (LLM responses).
Prompt
The input text sent to an LLM. Can include instructions, examples, and the actual query.
Prompt Engineering
The practice of crafting effective prompts to get desired behaviors from LLMs without fine-tuning.
Prompt Injection
A security attack where malicious input attempts to override the system prompt or manipulate the LLM’s behavior.
Prompt Tokens
The number of tokens in the input sent to the LLM. Typically costs less per token than completion tokens.
Provider
A company or service that offers LLM APIs. Examples: OpenAI, Anthropic, Google, Cohere, Groq.
RAG (Retrieval-Augmented Generation)
A technique where relevant documents are retrieved from a knowledge base and included in the prompt to ground the LLM’s response in factual information.
Semantic Cache
A caching system that matches queries based on meaning rather than exact text match. Uses embeddings to find similar queries.
Stop Sequence
A string that, when generated by the model, signals the end of generation. Used to create structured outputs.
Streaming
Sending the LLM response in chunks as it’s generated, rather than waiting for the complete response. Improves perceived latency.
System Prompt
Instructions given to the LLM that set its behavior, persona, and constraints. Typically the first message in a conversation.
Temperature
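A parameter (typically 0 to 2, though the allowed range varies by provider) that controls randomness in generation. Higher values produce more creative or random output; lower values produce more deterministic, focused output.
Token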
The basic unit of text processing for LLMs. Roughly 4 characters or 0.75 words in English. Tokenization varies by model.
Tool Use
The ability for an LLM to call external functions/APIs. The model decides when to use a tool, AgSec calls it, and the result is fed back to the model.
Top-K
Sampling strategy where only the K most likely next tokens are considered. Reduces randomness.
Top-P (Nucleus Sampling)
Sampling strategy where tokens are selected from the smallest set whose cumulative probability exceeds P. More dynamic than top-K.
Vision
The ability for an LLM to process and understand images in addition to text. Examples: GPT-4V, Claude 3.
Zero-Shot Learning
Using an LLM without providing examples, relying solely on instructions. The model must generalize from its training.
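The Temperature, Top-K, and Top-P entries above describe sampling controls. The following is a minimal sketch of how they might act on a model's raw scores, using only the Python standard library. All function names are hypothetical, and providers apply these steps inside the model server rather than exposing them in this form.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Assumption for this sketch: temperature 0 (greedy decoding) is handled elsewhere.
        raise ValueError("temperature must be greater than 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep only the k most likely token indices, then renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def sample(filtered, rng=random):
    """Draw one token index from a filtered {index: probability} mapping."""
    indices = list(filtered)
    weights = [filtered[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

For example, `top_p_filter([0.5, 0.3, 0.2], 0.7)` keeps the first two tokens: 0.5 alone falls short of 0.7, while 0.5 + 0.3 reaches it. Unlike top-K, the number of tokens kept changes with the shape of the distribution.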
Gateway Terms
API Key
A secret token used to authenticate requests to the AgSec gateway. Each key can have quotas, budgets, and access controls.
Budget
A spending limit (in dollars) associated with an API key or team. Requests are blocked when the budget is exceeded.
Failover
The automatic switching to a backup provider when the primary provider fails or is unavailable.
Gateway
The central service that receives client requests, applies security policies, routes to providers, and returns responses. The core of AgSec.
Health Check
An endpoint (/health) that reports the operational status of the gateway and its dependencies (Redis, providers).
Latency
The time between sending a request and receiving a complete response. Measured in milliseconds (ms).
Middleware
Software components that intercept and process requests before they reach route handlers. Examples: authentication, rate limiting, logging.
Multi-tenancy
The ability to serve multiple independent customers (tenants) from a single gateway instance with isolation and access controls.
Quota
A limit on the number of requests allowed within a time period. Can be per-API-key, per-user, or per-team.
Rate Limiting
Restricting the number of requests allowed in a time window to prevent abuse and manage load. Can limit by requests/minute or tokens/minute.
Routing
The process of selecting which provider and model to use for a request based on rules, strategies, or load balancing.
Routing Rule
A configuration that maps requests to specific providers based on patterns (model name, user, tags, etc.).
Routing Strategy
An algorithm for selecting providers. Examples: lowest-cost, lowest-latency, least-busy, priority-based.
Sanitization
The process of removing or redacting sensitive information (like PII) from text while preserving the rest of the content.
Store
AgSec’s data persistence layer, backed by Redis, that holds configuration, keys, logs, and metrics.
Throughput
The number of requests processed per unit of time, typically measured in requests per second (req/sec).
Webhook
An HTTP callback that AgSec can trigger when certain events occur (quota exceeded, budget threshold, etc.).
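The Rate Limiting entry above describes capping requests within a time window. As an illustration only, here is a minimal in-memory fixed-window limiter keyed by API key. The class name and its parameters are hypothetical; this is not AgSec's implementation, which the Store entry suggests would keep such state in Redis.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each `window_seconds` window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._windows = {}  # key -> (window_start, request_count)

    def allow(self, key):
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired; start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True
```

A limiter built with `FixedWindowLimiter(limit=60, window_seconds=60)` would approximate a requests/minute cap. Limiting by tokens/minute would count tokens instead of requests in the same structure.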
Security Terms
AES-256-GCM
Advanced Encryption Standard with 256-bit keys in Galois/Counter Mode. Used by AgSec to encrypt API keys and secrets at rest.
AppArmor
A Linux kernel security module that confines programs to a limited set of resources. Used in AgSec’s container hardening.
Attack Surface
The sum of all points where an unauthorized user could try to enter or extract data from a system.
Authentication
Verifying the identity of a user or system. In AgSec, this is done via API keys.
Authorization
Determining what actions an authenticated user is allowed to perform. In AgSec, this is per-API-key permissions.
Capabilities (Linux)
Fine-grained privileges that can be granted to processes instead of full root access. AgSec drops unnecessary capabilities.
Defense in Depth
A security strategy employing multiple layers of defense so that if one layer fails, others still provide protection.
Encryption at Rest
Encrypting data when it’s stored (e.g., API keys in Redis) so it’s unreadable without the decryption key.
Encryption in Transit
Encrypting data while it’s being transmitted over the network, typically using TLS/HTTPS.
Fail-Closed
A security posture where errors cause requests to be blocked. More secure but less available.
Fail-Open
A security posture where errors allow requests to proceed. More available but less secure.
Least Privilege
The principle of granting only the minimum permissions necessary for a task. Applied to processes, users, and API keys.
Non-root User
Running processes as a non-privileged user rather than root to limit the impact of security breaches. AgSec containers use UID 1001.
PII (Personally Identifiable Information)
Data that can be used to identify an individual. Examples: SSN, email, phone number, name, address.
RBAC (Role-Based Access Control)
An access control approach where permissions are assigned to roles, and users are assigned to roles.
Read-only Root Filesystem
A security hardening technique where the container’s root filesystem cannot be modified at runtime, preventing certain types of attacks.
Secrets Management
Secure storage, access control, and rotation of sensitive data like API keys, passwords, and certificates.
Seccomp (Secure Computing Mode)
A Linux kernel feature that limits the system calls a process can make. AgSec uses a restricted seccomp profile.
TLS (Transport Layer Security)
Cryptographic protocol for secure communication over networks. HTTPS uses TLS.
Zero Trust
A security model that assumes no implicit trust and requires verification for every access request, regardless of location.
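The Fail-Closed and Fail-Open entries above differ only in what happens when a check itself errors out. A small sketch of that distinction follows; `run_guardrail` and `broken_check` are hypothetical names used for illustration, not part of AgSec.

```python
def run_guardrail(check, text, fail_closed=True):
    """Return True if the request may proceed, False if it should be blocked."""
    try:
        return bool(check(text))
    except Exception:
        # The check could not run. Fail-closed blocks the request;
        # fail-open lets it through.
        return not fail_closed


def broken_check(text):
    """Stand-in for a guardrail whose backing service is unavailable."""
    raise RuntimeError("guardrail service unavailable")
```

With `broken_check`, `run_guardrail(broken_check, "hello", fail_closed=True)` returns `False` (blocked), while the same call with `fail_closed=False` returns `True` (allowed). A check that runs normally gives the same answer in both modes.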
Observability Terms
Cardinality
The number of unique values for a dimension in time series data. High cardinality (many unique values) can cause performance issues.
Dashboard
A visual interface displaying metrics, graphs, and charts for monitoring system health and performance.
Distributed Tracing
Following a request as it flows through multiple services, collecting timing and metadata at each step.
Event
A discrete occurrence in the system, such as a request received, error encountered, or threshold exceeded.
Grafana
An open-source platform for visualizing metrics and logs, commonly used with Prometheus.
Instrumentation
Adding code to collect metrics, logs, and traces from an application.
Log Aggregation
Collecting logs from multiple sources into a centralized system for search and analysis.
Logging
Recording events, errors, and diagnostic information to files or logging services.
Metric
A numerical measurement collected over time. Examples: request count, latency, memory usage.
OpenTelemetry
An observability framework for generating, collecting, and exporting telemetry data (metrics, logs, traces).
P50 (50th Percentile / Median)
The value below which 50% of observations fall. Represents typical performance.
P95 (95th Percentile)
The value below which 95% of observations fall. Represents near-worst-case performance, filtering outliers.
P99 (99th Percentile)
The value below which 99% of observations fall. Represents worst-case performance for most requests.
Prometheus
An open-source monitoring and alerting system that collects and stores metrics as time series data.
Span
A single operation within a distributed trace, with start/end times and metadata.
Structured Logging
Logging in a machine-readable format (typically JSON) with consistent fields, enabling better search and analysis.
Time Series
A sequence of data points indexed by time, used for metrics like request rate over time.
Trace
A record of the path a request takes through multiple services, consisting of multiple spans.
Trace ID
A unique identifier for a distributed trace, used to correlate spans across services.
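The P50, P95, and P99 entries above can be made concrete with a short calculation. This sketch uses the nearest-rank method on raw samples, which is one of several common definitions; monitoring backends may compute percentiles differently. The function name and the sample latencies are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds, with two slow outliers.
latencies_ms = [12, 15, 11, 250, 14, 13, 16, 18, 17, 900]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
```

With only ten samples, P95 and P99 both land on the largest value, while P50 stays near the typical latency. Tail percentiles only become distinct once there are many more observations.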
MCP Terms
JSON-RPC
A remote procedure call protocol encoded in JSON. Used by MCP for client-server communication.
MCP (Model Context Protocol)
A protocol that allows LLMs to interact with external tools, data sources, and services in a standardized way.
MCP Server
A service that implements the MCP protocol and exposes tools, resources, or prompts to clients.
Prompt (MCP)
A pre-defined prompt template provided by an MCP server that clients can use.
Resource (MCP)
A data source or document that an MCP server makes available to clients (e.g., files, database records).
Session
A stateful connection between an MCP client and server, maintaining context across multiple requests.
stdio Transport
Communication via standard input/output streams. Used by Python and Node.js MCP servers.
Tool (MCP)
A function that an MCP server exposes to clients. The LLM can call tools to perform actions or retrieve information.
Tool Filter
Access control rules that restrict which MCP tools are available to specific API keys.
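To make the JSON-RPC, Tool (MCP), and Tool Filter entries above more concrete, the sketch below builds a JSON-RPC 2.0 request for a tool call and checks a tool name against an allow-list. The `tools/call` method and its `name`/`arguments` parameters follow the public MCP specification. The glob-style filter, the function names, and the tool names are assumptions for illustration and do not reflect AgSec's actual filter configuration.

```python
import json
from fnmatch import fnmatchcase


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def tool_allowed(tool_name, allowed_patterns):
    """Hypothetical tool filter: allow a tool if it matches any glob pattern."""
    return any(fnmatchcase(tool_name, pattern) for pattern in allowed_patterns)


# Hypothetical tool name and arguments.
request = make_tool_call(1, "search_docs", {"query": "rate limiting"})
wire_message = json.dumps(request)
```

A filter such as `["search_*"]` would allow `search_docs` and reject `delete_file`. Over the stdio transport, a message like `wire_message` would be written to the server's standard input.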
Infrastructure Terms
Autoscaling
Automatically adjusting the number of running instances based on load metrics like CPU or request rate.
Container
A lightweight, standalone package containing an application and its dependencies. AgSec uses Docker containers.
Container Hardening
Security measures applied to containers, such as running as non-root, read-only filesystems, and capability dropping.
Docker
A platform for building, shipping, and running containerized applications.
Docker Compose
A tool for defining and running multi-container Docker applications using a YAML configuration file.
Health Probe
A check performed by orchestrators (Kubernetes) to determine if a container is healthy and should receive traffic.
Horizontal Scaling
Adding more instances of a service to handle increased load. Preferred over vertical scaling for stateless services.
Image (Container)
A read-only template used to create containers. Contains the application code and dependencies.
Infrastructure as Code (IaC)
Managing infrastructure through code (e.g., Terraform, CloudFormation) rather than manual processes.
Kubernetes (K8s)
An open-source container orchestration platform for automating deployment, scaling, and management of containerized applications.
Liveness Probe
A health check that determines if a container is running. Failed liveness probes cause the container to be restarted.
Load Balancer
A component that distributes incoming requests across multiple instances to ensure no single instance is overloaded.
Namespace (Kubernetes)
A way to divide cluster resources between multiple users or teams. Provides scope for names.
Orchestration
Automated configuration, coordination, and management of computer systems and services. Kubernetes is an orchestrator.
Pod (Kubernetes)
The smallest deployable unit in Kubernetes, consisting of one or more containers.
Readiness Probe
A health check that determines if a container is ready to receive traffic. Failed readiness probes remove the container from load balancing.
Replica
An identical copy of a service instance. Multiple replicas provide redundancy and increased capacity.
Service Mesh
Infrastructure layer providing features like traffic management, security, and observability for microservices. Examples: Istio, Linkerd.
Vertical Scaling
Increasing the resources (CPU, memory) of existing instances rather than adding more instances.
Volume
Persistent storage that can be attached to containers. Used for data that should survive container restarts.
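The Autoscaling and Horizontal Scaling entries above can be illustrated with the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents: scale the replica count by the ratio of the observed metric to its target. The sketch below is simplified. It omits the tolerance band and stabilization behavior of a real autoscaler, and the function name and default bounds are assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling: ceil(current * observed / target), clamped to bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be greater than 0")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

For instance, 4 replicas averaging 90% CPU against a 60% target would scale out to 6, and the same 4 replicas at 30% CPU would scale in to 2.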
HTTP/API Terms
API (Application Programming Interface)
A set of protocols and tools for building software applications. RESTful APIs use HTTP.
Endpoint
A specific URL path that accepts requests. Example: /v1/chat/completions.
Header
Metadata sent with HTTP requests and responses. Examples: Content-Type, Authorization.
HTTP Method
The action to perform on a resource. Common methods: GET (read), POST (create), PUT (update), DELETE (remove), PATCH (partial update).
HTTP Status Code
A three-digit code indicating the result of an HTTP request:
- 2xx: Success (200 OK, 201 Created)
- 3xx: Redirection
- 4xx: Client error (400 Bad Request, 401 Unauthorized, 404 Not Found)
- 5xx: Server error (500 Internal Server Error, 503 Service Unavailable)
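The status code classes listed above depend only on the first digit, so client code often branches on the class rather than on individual codes. A minimal sketch with a hypothetical helper name follows. It also includes the 1xx informational class, which the list above omits.

```python
def status_class(code):
    """Map an HTTP status code to its class, based on the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes[code // 100]
```

A caller might fix and resend a request that came back as `client error`, and retry or fail over on `server error`.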
Idempotency
The property where multiple identical requests have the same effect as a single request. GET, PUT, and DELETE are idempotent.
JSON (JavaScript Object Notation)
A lightweight data interchange format that’s easy for humans to read and write and easy for machines to parse and generate.
OpenAPI
A specification for describing RESTful APIs in a machine-readable format. Formerly known as Swagger.
Query Parameter
Data passed in the URL after a ?. Example: /search?q=query&limit=10.
Request Body
The data sent with POST/PUT/PATCH requests, typically in JSON format.
REST (Representational State Transfer)
An architectural style for distributed systems where resources are identified by URLs and manipulated using standard HTTP methods.
SSE (Server-Sent Events)
A standard for servers to push real-time updates to clients over HTTP. Used for streaming LLM responses.
Webhook
An HTTP callback that occurs when something happens. AgSec sends webhooks for events like quota exceeded.
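The SSE (Server-Sent Events) entry above refers to a simple text format: each event is a group of `field: value` lines ended by a blank line, and lines starting with `:` are comments. The sketch below extracts only the `data` payloads and ignores other fields such as `event`, `id`, and `retry`. The function name and the JSON payload shape are assumptions for illustration, not the format of any particular provider or of AgSec.

```python
def parse_sse(stream_text):
    """Return the data payload of each complete event in an SSE stream."""
    events, data_lines = [], []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                events.append("\n".join(data_lines))
                data_lines = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            data_lines.append(value)
    return events  # an unterminated trailing event is discarded


# Hypothetical streamed chunks of an LLM response.
raw = 'data: {"delta": "Hel"}\n\n: keep-alive\n\ndata: {"delta": "lo"}\n\n'
chunks = parse_sse(raw)
```

A streaming client would process each payload as it arrives rather than buffering the whole response, which is what improves perceived latency.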
Database Terms
Cache
Temporary storage for frequently accessed data to reduce latency and load on primary data sources.
Cache Eviction
The process of removing entries from a cache when it reaches capacity. Policies include LRU (Least Recently Used).
Cache Hit
When requested data is found in the cache, avoiding a slower lookup in the primary data source.
Cache Miss
When requested data is not in the cache, requiring a lookup in the primary data source.
Cache Warming
Pre-populating a cache with data before it’s requested to improve hit rates.
Cardinality (Database)
The number of unique values in a column or field. High cardinality can impact index performance.
Connection Pool
A cache of database connections maintained so that connections can be reused when needed, reducing connection overhead.
Hash (Data Structure)
A data structure that maps keys to values using a hash function. Redis supports hash data types.
Index
A data structure that improves the speed of data retrieval operations. Created on specific fields for faster lookups.
Key-Value Store
A database that stores data as a collection of key-value pairs. Redis is a key-value store.
LRU (Least Recently Used)
A cache eviction policy that removes the least recently accessed items first when the cache is full.
Persistence
Saving data to disk so it survives process restarts. Redis supports RDB (snapshots) and AOF (append-only file) persistence.
Pipeline
Sending multiple commands to a database in a single request/response cycle. Reduces network round trips.
Redis
An in-memory key-value database used for caching, session storage, and real-time data. AgSec uses Redis for all state.
Replication
Copying data from one database to another to provide redundancy and increase availability.
Schema
The structure of a database, defining tables, fields, relationships, and constraints.
TTL (Time To Live)
The duration for which a cache entry or database record remains valid before expiring and being removed.
Vector Database
A database optimized for storing and querying high-dimensional vectors (embeddings). Examples: Qdrant, Pinecone, Weaviate.
Related Documentation
- Architecture - System architecture
- Components - Component details
- API Reference - API documentation
- Overview - Product overview
- Configuration - Configuration guide
