🏛️

Platform Architecture Assessment

Self-hosted LangSmith Infrastructure Scorecard — 12 topics, scored 0–3

Evaluate a customer's self-hosted LangSmith infrastructure across compute, data services, security, operations, and team maturity. Score each topic based on current state, capture notes, and export a report for the engagement plan.

Total Score: 0 / 36
Assessment: D (Significant Gaps)
Scored: 0 / 12 topics
Infrastructure
⚙️
Compute & Scaling (Not scored)

Kubernetes node sizing, autoscaling, multi-zone, and resource quotas for LangSmith workloads.

🌐
Networking & Ingress (Not scored)

Ingress controller, TLS certificate management, DNS, firewall rules, network policies, and optional WAF/CDN for LangSmith endpoints.

Data Services
🐘
Services: PostgreSQL (Not scored)

Managed PostgreSQL for LangSmith metadata, RBAC, and configuration. External managed DB is required for production.

🔴
Services: Redis (Not scored)

Redis is used by LangSmith for queuing and caching. Version ≥ 5.0 required. External managed Redis is required for production.
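
The version requirement above can be checked mechanically during an assessment. The following is a minimal sketch, not part of the scorecard itself: the function name `redis_version_ok` and its `minimum` parameter are hypothetical, and it assumes the version is available as a dotted string such as the `redis_version` field reported by the Redis `INFO` command.

```python
def redis_version_ok(version: str, minimum: tuple = (5, 0)) -> bool:
    """Return True if a dotted Redis version string meets the minimum.

    Hypothetical helper for illustration. Only the major and minor
    components are compared; patch levels and suffixes are ignored.
    """
    parts = version.strip().split(".")
    major = int(parts[0])
    # Treat a bare major version such as "5" as "5.0".
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= minimum


if __name__ == "__main__":
    for candidate in ("7.2.4", "5.0.0", "4.0.14"):
        status = "ok" if redis_version_ok(candidate) else "too old"
        print(f"Redis {candidate}: {status}")
```

Comparing `(major, minor)` tuples avoids the common mistake of comparing version strings lexically, where "10.0" would sort before "5.0".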

🏠
Services: ClickHouse (Not scored)

ClickHouse stores all LangSmith trace data. In-cluster ClickHouse is for dev/POC only. Production requires LangChain Managed ClickHouse.

🗄️
Services: Blob Storage (Not scored)

Object/blob storage is always required — LangSmith payload data must not go into ClickHouse. Access should use Workload Identity, not static credentials.

Security
🔐
Security & Access (Not scored)

TLS, SSO/OIDC, RBAC, secret management, Workload Identity, network isolation, and audit logging for LangSmith.

Operations
📊
Observability (Not scored)

Metrics, logging, LangSmith tracing, alerting, and dashboards for monitoring LangSmith infrastructure health.

🛡️
Reliability & Disaster Recovery (Not scored)

Backup strategies, RTO/RPO targets, deletion protection, and tested recovery procedures across all critical LangSmith components.

🔄
Maintenance & Lifecycle (Not scored)

OS/Kubernetes/application patching strategy, maintenance windows, zero-downtime upgrades, and rollback procedures for LangSmith.

🏗️
Operations & IaC (Not scored)

Infrastructure as Code maturity — Terraform state management, Helm deployment automation, runbooks, and CI/CD integration.

People
👥
Team Capability (Not scored)

Team size, Kubernetes/LangSmith expertise, on-call coverage, and knowledge sharing maturity for operating LangSmith in production.

Assessment Summary — 0 / 36 points
Compute & Scaling
Services: PostgreSQL
Services: Redis
Services: ClickHouse
Services: Blob Storage
Networking & Ingress
Security & Access
Observability
Reliability & Disaster Recovery
Maintenance & Lifecycle
Operations & IaC
Team Capability
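
The summary arithmetic (12 topics, each scored 0–3, for a maximum of 36) can be sketched as follows. This is an illustrative assumption about how the totals are derived, not the scorecard's actual implementation: `TOPICS`, `MAX_PER_TOPIC`, and `summarize` are hypothetical names, and the letter-grade bands are omitted because the scorecard does not state its thresholds.

```python
from typing import Dict, Optional

MAX_PER_TOPIC = 3  # each topic is scored 0-3

# The 12 topics listed in the assessment summary.
TOPICS = [
    "Compute & Scaling",
    "Services: PostgreSQL",
    "Services: Redis",
    "Services: ClickHouse",
    "Services: Blob Storage",
    "Networking & Ingress",
    "Security & Access",
    "Observability",
    "Reliability & Disaster Recovery",
    "Maintenance & Lifecycle",
    "Operations & IaC",
    "Team Capability",
]


def summarize(scores: Dict[str, Optional[int]]) -> Dict[str, int]:
    """Total the scored topics; None (or a missing topic) means 'Not scored'."""
    for topic, score in scores.items():
        if topic not in TOPICS:
            raise ValueError(f"unknown topic: {topic}")
        if score is not None and not 0 <= score <= MAX_PER_TOPIC:
            raise ValueError(f"score for {topic} must be 0-{MAX_PER_TOPIC}")
    scored = {t: s for t, s in scores.items() if s is not None}
    return {
        "total": sum(scored.values()),
        "max_total": MAX_PER_TOPIC * len(TOPICS),
        "scored_topics": len(scored),
        "topic_count": len(TOPICS),
    }


if __name__ == "__main__":
    result = summarize({"Services: Redis": 2, "Observability": 3})
    print(f"{result['total']} / {result['max_total']} points, "
          f"{result['scored_topics']} / {result['topic_count']} topics scored")
```

Keeping unscored topics as `None` rather than 0 lets a report distinguish "not yet assessed" from "assessed with significant gaps", which matters when an assessment is only partially complete.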