What runs in the CP and the DP
Canonical list of services that make up the LangSmith control plane, data plane, and supporting infrastructure.
The split between control plane (CP) and data plane (DP) does not change with topology. What changes is which environment each side runs in (LangChain SaaS, BYOC, your cluster, on-prem, air-gapped). The service lists below are the same across every topology.
Three Helm charts cover customer installs:
- langsmith: the control plane (UI, auth, trace pipeline, ClickHouse/blob wiring). Also installs LangGraph Deployments support when colocated.
- langgraph-dataplane: the data plane (operator + listener + agent servers). Used when the DP runs in a different cluster from the CP.
- langsmith-auth-proxy: the LLM auth proxy. Optional, separate install.
Review against the actual chart values before treating exact service or key names below as authoritative. Chart releases occasionally rename or split components.
How traces actually flow
Knowing the trace path matters because it determines where ClickHouse and the blob bucket sit, which is the most important factor in trace residency:
DP agent → HTTP multipart → CP API → CP Redis → asynq worker → ClickHouse + Blob
The DP-side agent only emits HTTP requests. Every component that receives or stores traces (the API endpoint, the ingest queue, the asynq worker, ClickHouse, and the blob bucket) is CP-attached. In production, ClickHouse should always run outside the cluster as a managed service; in-cluster ClickHouse is for dev/POC only.
This is why the answer to "where do trace payloads land" depends on where the CP runs, not on where the agent runs. In Hybrid (SaaS CP, customer DP), agents execute in the customer VPC but traces flow back over HTTPS to the SaaS CP and land in LangChain's ClickHouse. In BYOC CP + on-prem DP, agents run on-prem but traces flow over the private link to ClickHouse in the BYOC cloud account.
Control Plane
The control plane owns user-facing state, platform identity, and the entire trace pipeline. It hosts the UI, handles SSO and API-key auth, manages workspaces/orgs/permissions, ingests and stores traces, and serves the LangGraph Studio debugger.
LangSmith services
| Service | Role | Notes |
|---|---|---|
| frontend | Web UI | The smith.langchain.com (or self-hosted equivalent) React app. Static asset bundle plus a small Node server. |
| platform-backend | Platform API | Orgs, workspaces, members, API keys, SSO/SCIM, billing, retention policy. Issues short-lived tokens for DP requests. |
| host-backend | Auth proxy / fan-out | Authenticates requests from the UI and forwards them to the right deployment. Handles cross-DP routing. |
| backend | Trace ingest + query API | Accepts HTTP multipart trace uploads from the SDK and DP agents. Also serves run, dataset, feedback, and evaluation queries. |
| ingest-queue | Async trace writer (asynq) | Pulls buffered trace events from Redis and writes them to ClickHouse and blob storage. |
| queue | Background-job worker | Dataset processing, evaluation runs, exports, retention sweeps. |
| studio | LangGraph debugging UI | Served from the CP frontend. Connects out to the DP-side agent server to step through graph executions. |
| subscriptions-api | Billing / metering | Optional. Tracks usage events for billed plans. Not present in self-hosted deployments. |
Backing services (CP)
| Service | Purpose | Notes |
|---|---|---|
| Postgres | CP metadata: orgs, workspaces, users, API keys, dataset and prompt metadata. | Required. |
| Redis | Session state, rate limits, trace ingest queue (asynq), async-job queue. | Required. Heavily used; sized for trace volume, not just sessions. |
| ClickHouse | Trace storage. | Required. Production deployments should use LangChain Managed ClickHouse instead of in-cluster. |
| Object storage (S3 / Blob / GCS) | Trace payloads, large attachments, evaluation artifacts. | Required. Payloads must not go into ClickHouse; that causes cluster issues. |
Data Plane
The data plane is LangGraph Deployments (the agent runtime). It runs customer agent code, holds checkpointer state for in-flight runs, and emits trace events back to the CP. Agents reach the CP for auth (token validation, deployment config) and for trace upload; they reach LLM providers directly (or via an egress proxy).
LangGraph Deployments services
| Service | Role | Notes |
|---|---|---|
| operator | LangSmith operator | Watches LangGraphDeployment CRDs and reconciles per-deployment runtimes. Cluster-scoped CRD; chart upgrades must be version-locked across all DPs in the same cluster. |
| listener | Per-DP control loop | One per data plane. Pulls deployment changes from the CP over outbound HTTPS, applies them locally. This is what makes hybrid topologies work, since the customer never has to expose inbound from the CP. |
| server (per deployment) | Agent runtime | One Deployment+Service per deployed graph. Customer code runs here. |
Optional add-ons (DP)
| Service | Role | Notes |
|---|---|---|
| ml-models (Polly) | In-app chat assistant | Context-aware Q&A over traces and datasets. Runs on LangGraph Deployments. |
| insights | Trace pattern analysis | Runs on LangGraph Deployments. |
| llm-auth-proxy | LLM Auth Proxy | Egress component that sits between the DP and external LLM providers. Centralizes API-key management, applies per-workspace allowlists and rate limits, and produces an auditable record of every model call. Useful when the customer wants the data plane workloads to never hold raw provider credentials, or when policy requires per-workspace egress controls. Installed via the dedicated langsmith-auth-proxy chart, not bundled with the main chart. Toggle "Show LLM Auth Proxy in diagram" on the picker to see how it sits in any topology. |
Backing services (DP)
| Service | Purpose | Notes |
|---|---|---|
| Postgres | LangGraph checkpointer (default). Stores in-flight graph state, threads, and resumable runs. | Default checkpointer. One per DP. |
| MongoDB | LangGraph checkpointer (alternative). | Added via the LangChain + MongoDB partnership as an alternative to Postgres for teams already running MongoDB at scale. Pick one: Postgres or MongoDB, not both. Package: langgraph-checkpoint-mongodb. |
| Object storage | Per-deployment artifacts (optional). | Not always required. Distinct from the CP-side blob bucket that holds trace payloads. |
How topology affects this
The component lists above are the same across every topology. What topology changes:
- Where CP services run. SaaS topologies put the entire CP (including ClickHouse and the blob bucket) in LangChain infrastructure. Self-hosted topologies put it all in the customer cluster (or external managed services attached to it). BYOC puts it in a customer cloud account that LangChain operates.
- Where DP services run. The agent runtime can run in the same cluster as the CP (single-cluster), in separate per-env or per-namespace DPs (multi-DP topologies), or in a different environment entirely (hybrid: SaaS CP + customer-cluster DP; BYOC CP + on-prem DP).
- Where trace data lands. Follows the CP. In hybrid topologies, agent code stays in the customer VPC but trace payloads flow over HTTPS to the SaaS CP and land in LangChain's ClickHouse + blob bucket. In BYOC CP + on-prem DP, agents run on-prem but traces flow back over the private link to the BYOC cloud account.
Component lists do not change with cloud or air-gap. Air-gapped deployments add image-mirror infrastructure (registry like Harbor / Artifactory + an image-sync tool like Skopeo, crane, or regclient) and an on-prem LLM gateway, but the LangSmith services running inside the cluster are the same set.
Helm config when CP and DP are not colocated
When the CP and DP run in different clusters or environments (hybrid, cross-cluster, BYOC CP + on-prem DP), you install two charts: langsmith on the CP side, langgraph-dataplane on the DP side.
DP side: langgraph-dataplane chart
Tells the listener and agent servers where the CP lives and how to authenticate to it. Defaults point at LangChain SaaS (api.host.langchain.com / api.smith.langchain.com); override these for a self-hosted or BYOC CP.
- config.hostBackendUrl: URL of the CP host-backend. The listener calls this to pull deployment config and validate tokens. Default: https://api.host.langchain.com.
- config.smithBackendUrl: URL of the CP LangSmith backend (the trace-ingest endpoint). Trace payloads are POSTed here. Default: https://api.smith.langchain.com.
- config.langsmithApiKey: API key the listener uses to authenticate to the CP. Set via existingSecretName in production, not in plaintext.
- config.langsmithWorkspaceId: workspace this DP belongs to. Required.
- config.langgraphListenerId: unique ID for this listener. Required when multiple DPs share a CP.
- config.hostQueue: SAQ queue name (default host). Set to a unique value per install when multiple langgraph-dataplane releases share a Redis instance, otherwise the queues collide.
- config.existingSecretName: K8s Secret holding the above. Wired up via External Secrets Operator from the cloud secret store (SSM / Key Vault / Secret Manager).
- ingress.* / gateway.* / istioGateway.*: pick one for how agent traffic enters the cluster. The chart supports plain Ingress, Gateway API HTTPRoute, or Istio VirtualService.
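As a rough illustration, the DP-side keys listed above might be laid out in a values file as follows. This is a sketch, not a copy of the chart's defaults: the key names come from the list above, while the hostnames, IDs, Secret name, and queue name are hypothetical placeholders. Check the nesting against the chart's own values.yaml before using it.

```yaml
# values-dataplane.yaml: sketch of langgraph-dataplane overrides for a
# self-hosted or BYOC CP. All values below are hypothetical placeholders.
config:
  # CP host-backend: the listener pulls deployment config and validates tokens here.
  hostBackendUrl: "https://host-backend.cp.example.com"
  # CP LangSmith backend: trace payloads are POSTed here.
  smithBackendUrl: "https://backend.cp.example.com"
  # Workspace this DP belongs to (required).
  langsmithWorkspaceId: "00000000-0000-0000-0000-000000000000"
  # Unique per listener; needed when several DPs share one CP.
  langgraphListenerId: "11111111-1111-1111-1111-111111111111"
  # Unique per release when several releases share one Redis instance.
  hostQueue: "host-dp-staging"
  # The API key is read from this Secret rather than set inline
  # as config.langsmithApiKey.
  existingSecretName: "langgraph-dataplane-secrets"
# Enable exactly one of ingress.*, gateway.*, or istioGateway.* here;
# their sub-keys are chart-specific and are not guessed in this sketch.
```

Leaving config.langsmithApiKey out of the file keeps the credential out of version control, which is the reason the list recommends existingSecretName for production.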
CP side: langsmith chart
The CP-side install holds the JWT secret, license, and platform config. The DP-side langsmithApiKey must be issued from this CP and the JWT secret must match.
- config.langsmithLicenseKey: required for self-hosted/BYOC CP.
- config.basicAuth.jwtSecret: must match the value the DP-side listener uses (via API key issuance). Mismatch is the most common cause of "auth works locally but DP cannot reach CP."
- config.authType: typically mixed for self-hosted (basic auth + OIDC); oauth for OAuth-with-PKCE.
- config.deployment.enabled: set true to install LangGraph Deployments support on the CP side. Required for the CP to issue listener IDs and route Studio traffic to DP agent servers.
- config.existingSecretName: same ESO pattern as DP-side; points at the K8s Secret holding license, JWT secret, basic-auth password, etc.
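The CP-side keys can be sketched the same way. The key names are taken from the list above; the Secret name is a hypothetical placeholder, and the license key and JWT secret are assumed to be supplied through that Secret instead of inline values.

```yaml
# values-controlplane.yaml: sketch of langsmith chart overrides for a CP that
# serves a remote data plane. The Secret name is a hypothetical placeholder.
config:
  # "mixed" = basic auth + OIDC, the typical self-hosted choice.
  authType: "mixed"
  deployment:
    # Lets the CP issue listener IDs and route Studio traffic to DP agent servers.
    enabled: true
  # Secret holding the license key, JWT secret, basic-auth password, etc.
  # Inline alternatives are config.langsmithLicenseKey and
  # config.basicAuth.jwtSecret, which should not be committed in plaintext.
  existingSecretName: "langsmith-cp-secrets"
```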
Wiring it together
For the full secret chain (cloud secret store → ESO → K8s Secret → Helm values) and the end-to-end Terraform/Helm flow per cloud, see the per-cloud Architecture pages: AWS, Azure, GCP, OCP.
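The ESO link in that chain can be sketched with an ExternalSecret manifest like the one below. The resource shape follows the External Secrets Operator API. The namespace, store name, remote path, and above all the key name inside the resulting Secret are assumptions for illustration: the chart defines which keys it reads from existingSecretName, so confirm them against the chart. Newer ESO releases also serve this resource under external-secrets.io/v1.

```yaml
# Sketch: sync the DP listener's API key from a cloud secret store into the
# K8s Secret referenced by config.existingSecretName.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: langgraph-dataplane-secrets
  namespace: langgraph-dataplane          # hypothetical namespace
spec:
  refreshInterval: 1h                     # how often ESO re-reads the store
  secretStoreRef:
    kind: ClusterSecretStore
    name: cloud-secret-store              # hypothetical store (SSM / Key Vault / Secret Manager)
  target:
    name: langgraph-dataplane-secrets     # must equal config.existingSecretName
    creationPolicy: Owner
  data:
    - secretKey: langsmith_api_key        # hypothetical key name; check the chart
      remoteRef:
        key: langsmith/dataplane/api-key  # hypothetical path in the cloud store
```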
Related
- Deployment topologies: pick a topology by constraints.
- Full topology reference: printable catalog of all topologies.