Self-hosted LangSmith Infrastructure Scorecard — 12 topics, scored 0–3
Evaluate a customer's self-hosted LangSmith infrastructure across compute, data services, security, operations, and team maturity. Score each topic based on current state, capture notes, and export a report for the engagement plan.
Kubernetes node sizing, autoscaling, multi-zone, and resource quotas for LangSmith workloads.
Ingress controller, TLS certificate management, DNS, firewall rules, network policies, and optional WAF/CDN for LangSmith endpoints.
Managed PostgreSQL for LangSmith metadata, RBAC, and configuration. External managed DB is required for production.
LangSmith uses Redis for queuing and caching. Redis 5.0 or later is required, and production requires an externally managed Redis.
ClickHouse stores all LangSmith trace data. In-cluster ClickHouse is for dev/POC only. Production requires LangChain Managed ClickHouse.
Object/blob storage is always required: LangSmith payload data must not go into ClickHouse. Access should use Workload Identity rather than static credentials.
TLS, SSO/OIDC, RBAC, secret management, Workload Identity, network isolation, and audit logging for LangSmith.
Metrics, logging, LangSmith tracing, alerting, and dashboards for monitoring LangSmith infrastructure health.
Backup strategies, RTO/RPO targets, deletion protection, and tested recovery procedures across all critical LangSmith components.
OS/Kubernetes/application patching strategy, maintenance windows, zero-downtime upgrades, and rollback procedures for LangSmith.
Infrastructure as Code maturity — Terraform state management, Helm deployment automation, runbooks, and CI/CD integration.
Team size, Kubernetes/LangSmith expertise, on-call coverage, and knowledge sharing maturity for operating LangSmith in production.
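The scoring described above (12 topics, each scored 0–3, with notes and an exported summary) could be tallied roughly as follows. This is a minimal sketch, not the scorecard's actual implementation: the short topic labels, the `TopicScore` and `summarize` names, and the `gap_threshold` cutoff for flagging weak topics are all assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_SCORE = 3  # each topic is scored 0-3, per the scorecard header

# Hypothetical short labels for the 12 topics, in the order they are
# described above; the source gives descriptions, not identifiers.
TOPICS = (
    "compute", "networking", "postgresql", "redis", "clickhouse",
    "blob_storage", "security", "observability", "backup_dr",
    "patching", "iac", "team",
)


@dataclass
class TopicScore:
    topic: str
    score: int
    notes: str = ""


def summarize(scores, gap_threshold=2):
    """Validate one score per topic and return totals plus weak topics.

    gap_threshold is an assumed cutoff: topics scoring below it are
    listed as gaps, lowest score first.
    """
    by_topic = {}
    for entry in scores:
        if entry.topic not in TOPICS:
            raise ValueError(f"unknown topic: {entry.topic}")
        if entry.topic in by_topic:
            raise ValueError(f"duplicate topic: {entry.topic}")
        if not 0 <= entry.score <= MAX_SCORE:
            raise ValueError(f"score out of range for {entry.topic}: {entry.score}")
        by_topic[entry.topic] = entry

    missing = [t for t in TOPICS if t not in by_topic]
    if missing:
        raise ValueError(f"unscored topics: {missing}")

    total = sum(e.score for e in by_topic.values())
    maximum = MAX_SCORE * len(TOPICS)
    gaps = sorted(
        (e for e in by_topic.values() if e.score < gap_threshold),
        key=lambda e: e.score,
    )
    return {
        "total": total,
        "max": maximum,
        "percent": round(100 * total / maximum, 1),
        "gaps": [e.topic for e in gaps],
    }
```

Calling `summarize` with one `TopicScore` per topic returns a dictionary that can be written into the engagement plan. The `gaps` list gives a starting order for remediation work, though the threshold would need to be agreed with the customer.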