Supported Engines

path_trace reports per-endpoint support using:

per_session
aggregate_only
unsupported

Engine matrix

Engine	Metric prefix (expected)	Connector detection scheme	TTFT source	TPOT source	`path_trace` per-session label support
vLLM	`vllm:`	Best-effort from connector labels on KV-transfer-related metrics (e.g., `kv_transfer_backend`/`connector`)	Histogram/summary TTFT family exposed by endpoint metrics	Histogram/summary TPOT/token-latency family exposed by endpoint metrics	Usually `per_session` when request/session labels exist; otherwise `aggregate_only`
SGLang	`sglang:`	Best-effort label extraction from KV transfer / routing metric families	SGLang TTFT metric family normalized by adapter	SGLang TPOT metric family normalized by adapter	Varies by deployment labels; `per_session` or `aggregate_only`
LMCache embedded	`lmcache:` or `lmcache_`	vLLM: `LMCacheConnectorV1` / `LMCacheConnectorV1Dynamic`; SGLang: `--enable-lmcache` / `LMCacheLayerwiseConnector`; legacy `LMCacheConnector` is stale unless pinned	Engine TTFT source, not LMCache itself	Engine TPOT source, not LMCache itself	Usually `aggregate_only`; LMCache metrics prove cache behavior, not per-session request identity
LMCache MP	`lmcache_mp_`	Standalone `lmcache server` plus engine connector evidence; vLLM uses `LMCacheMPConnector`; SGLang MP is not proven on current mainline	Engine TTFT source plus LMCache MP lookup/retrieve evidence	Engine TPOT source plus LMCache MP store/retrieve evidence	Usually `aggregate_only`; MP HTTP/Prometheus/OTel/trace evidence is mode proof, not per-session identity by default
NVIDIA Dynamo	`nv_llm:` or `dynamo_`	Best-effort from transfer/connector labels when exposed; empty if not exported	Adapter-normalized Dynamo TTFT metric family	Adapter-normalized Dynamo TPOT metric family	Often `aggregate_only`; can be `per_session` if labels are emitted
llm-d	`llmd_` or `llm_d_`	Best-effort from llm-d transfer/connector labels when present	Adapter-normalized llm-d TTFT metric family	Adapter-normalized llm-d TPOT metric family	Engine-dependent; `per_session`, `aggregate_only`, or `unsupported`

Notes

Prefix detection is best-effort when --engine auto is used.
Explicit --engine is recommended in production automation.
If metrics are insufficient for identity, output includes engine_unidentified.
For LMCache MP, current vLLM source does not expose LMCache MP connector-specific Prometheus metrics from vLLM; collect the standalone LMCache MP /metrics endpoint.
SGLang LMCache should be treated as embedded/layerwise until a current mainline SGLang MP connector contract and live fixture prove otherwise.