CLI Reference
This page is generated from the v0.7.3 Typer help output and curated with operator notes. Use it with:
inferguard --help
inferguard <command> --help
inferguard <group> <subcommand> --help
From a source checkout, use:
PYTHONPATH=src python3 -m inferguard.cli --help
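Scripts that must run both against an installed package and from a source checkout can hide the difference behind a small wrapper. This is a minimal sketch: the function name `ig` is hypothetical, and the fallback assumes the current directory is the repository root.

```shell
# Hypothetical helper: prefer an installed `inferguard`; otherwise run the
# module from a source checkout (assumes the cwd is the repository root).
ig() {
  if command -v inferguard >/dev/null 2>&1; then
    inferguard "$@"
  else
    PYTHONPATH=src python3 -m inferguard.cli "$@"
  fi
}
```

With this in place, `ig --help` and `ig preflight --help` behave the same in both setups.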
High-level workflow
1. preflight and/or simulate-gpu to prove the local flow.
2. launch-engine or an externally managed vLLM/SGLang/Dynamo/LMCache endpoint.
3. request-profile to produce per-request evidence.
4. collect-metrics to produce engine and GPU timelines.
5. validate-completed to decide whether the run can be published.
6. diagnose-bottleneck, classify-failures, find-cliffs, compute-cost, and report-completed for operator analysis.
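The first steps of this workflow can be chained against an externally managed endpoint. The sketch below uses only flags listed in the per-command help on this page. The function name, its arguments, and the choice to place every step under one run directory are assumptions, as is the idea that validate-completed accepts that layout. collect-metrics is left out because when it runs relative to the load is a deployment decision.

```shell
# Hypothetical wrapper: stop at the first failing step and return its code.
# Arguments (all illustrative): run dir, endpoint URL, chat-completions URL,
# model name, request JSONL file.
run_workflow() {
  local run_dir=$1 endpoint_url=$2 chat_endpoint=$3 model=$4 input_jsonl=$5

  inferguard preflight --model "$model" --engine vllm --json || return
  inferguard launch-engine --output-dir "$run_dir/launch" --engine vllm \
    --external-launch --endpoint-url "$endpoint_url" || return
  inferguard request-profile --output-dir "$run_dir/profile" \
    --endpoint "$chat_endpoint" --model "$model" \
    --input-jsonl "$input_jsonl" || return
  # Assumption: the run directory layout satisfies validate-completed.
  inferguard validate-completed --results-root "$run_dir" --strict
}
```

Because each step uses `|| return`, the caller sees the exit code of the first command that failed and can interpret it with the exit-code conventions on this page.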
LMCache observability workflow
LMCache is mode-specific. For current standalone MP, collect the LMCache server
endpoint, not just the engine endpoint. For embedded compatibility, collect the
serving-engine endpoint and any lmcache:* metrics/logs it exposes.
Recommended MP evidence packet:
inferguard collect-lmcache \
--output-dir modal-out/lmcache-packet \
--engine-metrics-file modal-out/vllm.prom \
--lmcache-metrics-file modal-out/lmcache.prom \
--lmcache-health-file modal-out/lmcache-health.json \
--lmcache-status-file modal-out/lmcache-status.json \
--lmcache-version-file modal-out/lmcache-version.txt \
--lmcache-lmc-version-file modal-out/lmcache-lmc-version.txt \
--lmcache-commit-id-file modal-out/lmcache-commit-id.txt \
--lmcache-quota-file modal-out/lmcache-quota.json \
--engine-log-file modal-out/vllm.log \
--lmcache-log-file modal-out/lmcache.log \
--lmcache-trace-file modal-out/lmcache-trace.lct \
--lmcache-trace-replay-output modal-out/trace-replay \
--lmcache-otel-file modal-out/lmcache-otel.jsonl \
--lmcache-lookup-hash-path modal-out/lookup-hashes \
--expect-mode mp \
--json
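Before assembling a packet it can help to see which input files are still missing. The helper below is a hypothetical pre-check, not part of InferGuard. It looks only for the file names used in the command above and skips the trace-replay and lookup-hash paths, which may be directories or outputs. Whether collect-lmcache tolerates missing inputs is not stated here.

```shell
# Hypothetical pre-check: print each expected packet input that is absent
# from the given directory, one path per line.
missing_packet_inputs() {
  local dir=$1 name
  for name in vllm.prom lmcache.prom lmcache-health.json lmcache-status.json \
      lmcache-version.txt lmcache-lmc-version.txt lmcache-commit-id.txt \
      lmcache-quota.json vllm.log lmcache.log lmcache-trace.lct \
      lmcache-otel.jsonl; do
    [ -e "$dir/$name" ] || echo "$dir/$name"
  done
}
```

For example, `missing_packet_inputs modal-out` prints nothing when all twelve files are present.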
Standalone reports:
inferguard lmcache-compat \
--engine-metrics-file modal-out/vllm.prom \
--lmcache-metrics-file modal-out/lmcache.prom \
--lmcache-http-evidence-file modal-out/lmcache-packet/lmcache_http_evidence.json \
--lmcache-log-evidence-file modal-out/lmcache-packet/lmcache_log_evidence.json \
--lmcache-trace-evidence-file modal-out/lmcache-packet/lmcache_trace_evidence.json \
--lmcache-trace-replay-evidence-file modal-out/lmcache-packet/lmcache_trace_replay_evidence.json \
--lmcache-otel-evidence-file modal-out/lmcache-packet/lmcache_otel_evidence.json \
--lmcache-lookup-hash-evidence-file modal-out/lmcache-packet/lmcache_lookup_hash_evidence.json \
--expect-mode mp \
--fail-on missing-required \
--json
inferguard observability-coverage \
--engine-metrics-file modal-out/vllm.prom \
--lmcache-metrics-file modal-out/lmcache.prom \
--lmcache-http-evidence-file modal-out/lmcache-packet/lmcache_http_evidence.json \
--lmcache-log-evidence-file modal-out/lmcache-packet/lmcache_log_evidence.json \
--lmcache-trace-evidence-file modal-out/lmcache-packet/lmcache_trace_evidence.json \
--lmcache-trace-replay-evidence-file modal-out/lmcache-packet/lmcache_trace_replay_evidence.json \
--lmcache-otel-evidence-file modal-out/lmcache-packet/lmcache_otel_evidence.json \
--lmcache-lookup-hash-evidence-file modal-out/lmcache-packet/lmcache_lookup_hash_evidence.json \
--expected-engine vllm \
--expect-lmcache-mode mp \
--json
Coverage accounting is 68 / 100 after the accepted live Packet A fixture.
The active score source is
/Users/chen/Projects/Touchdown-Labs/docs/sdlc/195-2026-05-07-lmcache-vllm-inferguard-100-coverage-ssot.md,
which supersedes the earlier docs 188/189/190 trackers.
Run Packet B from the full InferGuard repo checkout, not from the old Touchdown-Labs OSS mirror:
cd /Users/chen/Projects/inferguard
python scripts/lmcache_mp_packet_commands.py
INFERGUARD_LMCACHE_LOCAL_SOURCE=/Users/chen/Projects/LMCache \
modal run scripts/lmcache_mp_modal_packet_lab.py::run_packet_b
B1 status as of 2026-05-07: accepted. Live Packet A landed from Modal run
https://modal.com/apps/ocwc22/main/ap-cH4YAMKOZxmsVOf58YzHPo, volume
lmcache-mp-lab:/packet-a/20260507T230057Z, and is pinned by
tests/fixtures/lmcache_live/packet_a/. Packet B lifecycle is the next
score-moving gate.
Local B1 missing-family diagnostic smoke
A non-scoreable Packet A failure-mode fixture lives at
tests/fixtures/lmcache_live/packet_a_missing_prometheus/. It is intentionally
marked score_points=0 and
acceptance_status=rejected_missing_prometheus_families. Use it to test the
diagnostic CLI output shape before a real packet lands:
PACKET=tests/fixtures/lmcache_live/packet_a_missing_prometheus
inferguard lmcache-compat \
--engine-metrics-file "$PACKET/vllm_metrics_loaded.prom" \
--lmcache-metrics-file "$PACKET/lmcache_metrics_loaded.prom" \
--lmcache-http-evidence-file "$PACKET/lmcache_http_evidence.json" \
--lmcache-log-evidence-file "$PACKET/lmcache_log_evidence.json" \
--lmcache-lookup-hash-evidence-file "$PACKET/lmcache_lookup_hash_evidence.json" \
--expect-mode mp \
--fail-on missing-required \
--json
inferguard observability-coverage \
--engine-metrics-file "$PACKET/vllm_metrics_loaded.prom" \
--lmcache-metrics-file "$PACKET/lmcache_metrics_loaded.prom" \
--lmcache-http-evidence-file "$PACKET/lmcache_http_evidence.json" \
--lmcache-log-evidence-file "$PACKET/lmcache_log_evidence.json" \
--lmcache-lookup-hash-evidence-file "$PACKET/lmcache_lookup_hash_evidence.json" \
--expected-engine vllm \
--expect-lmcache-mode mp \
--json
Expected result: lmcache-compat exits nonzero under
--fail-on missing-required, reports detected_mode=mp, and lists missing
lmcache_mp Prometheus families including lookup_tokens and l1_memory.
HTTP/log/lookup-hash evidence is shown as
live_alternate_not_scoreable; it explains the failure mode but does not replace
lmcache_mp_lookup_requested_tokens_total,
lmcache_mp_lookup_hit_tokens_total, or lmcache_mp_l1_memory_usage_bytes.
Do not move the 68/100 score from this rejected fixture; it remains a diagnostic
regression for older LMCache installs.
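A quick local check for the three metric names called out above can be done with grep before running the full report. This is a hypothetical helper, not an InferGuard command. It assumes the standard Prometheus text format, where a sample line starts with the metric name followed by `{` or a space.

```shell
# Hypothetical helper: print which of the three score-relevant lmcache_mp
# metric names have no sample line in a Prometheus text file.
missing_mp_metrics() {
  local prom_file=$1 metric
  for metric in lmcache_mp_lookup_requested_tokens_total \
      lmcache_mp_lookup_hit_tokens_total \
      lmcache_mp_l1_memory_usage_bytes; do
    # Anchoring at line start skips "# HELP" and "# TYPE" comment lines.
    grep -Eq "^${metric}"'([{ ]|$)' "$prom_file" || echo "$metric"
  done
}
```

Running `missing_mp_metrics "$PACKET/lmcache_metrics_loaded.prom"` against the rejected fixture would be expected to print at least one name, in line with the lmcache-compat result described above.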
Use these exact next commands when updating endpoint, signal, or rule status:
| Lane | Status now | Missing proof | Exact next command |
|---|---|---|---|
| Safe MP HTTP endpoints | partial | Live captures for root, config, version, quota, threads, periodic threads, and periodic thread health. | curl -fsS "$LMCACHE_HTTP/api/status" -o "$PACKET_DIR/lmcache-status.json" |
| MP Prometheus signals | partial | Live L2, nonzero lookup, sampled lifecycle, and throughput packets. | inferguard lmcache-compat --lmcache-metrics-file "$PACKET_DIR/lmcache.prom" --output "$PACKET_DIR/lmcache_compat_report.json" --expect-mode mp |
| Embedded LMCache signals | partial | Live vLLM embedded and SGLang --enable-lmcache fixtures. | inferguard observability-coverage --engine-metrics-file "$PACKET_DIR/vllm_embedded.prom" --output "$PACKET_DIR/vllm_embedded_coverage.json" --expect-lmcache-mode embedded |
| Trace, OTel, replay, lookup-hash evidence | partial | Real .lct, collector OTel export, replay output, and live lookup-hash JSONL. | inferguard collect-lmcache --output-dir "$PACKET_DIR" --lmcache-trace-file "$PACKET_DIR/lmcache-trace.lct" --lmcache-otel-file "$PACKET_DIR/lmcache-otel.jsonl" --lmcache-trace-replay-output "$PACKET_DIR/trace-replay" --lmcache-lookup-hash-path "$PACKET_DIR/lookup-hashes" |
| Log, P2P, and PD evidence | partial | Live MP logs plus two-engine P2P and prefiller/decoder packets. | inferguard collect-lmcache --output-dir "$PACKET_DIR/logs" --engine-log-file "$PACKET_DIR/vllm.log" --lmcache-log-file "$PACKET_DIR/lmcache.log" |
| Diagnostic rules | missing | Calibrated findings from live packets, not only pass-through parser codes. | inferguard diagnose-bottleneck --job-dir "$JOB_DIR" --output-dir "$PACKET_DIR/diagnose-bottleneck" |
| Packet A score gate | live_validated | Accepted live vLLM + standalone LMCache MP fixture imported and pinned. | cd /Users/chen/Projects/inferguard && uv run pytest -q tests/test_lmcache_live_fixtures.py tests/test_lmcache_mp_modal_packet_lab.py |
| Packet B lifecycle gate | next | Live sampled lifecycle/L0-L1 proof with compact fixture. | cd /Users/chen/Projects/inferguard && INFERGUARD_LMCACHE_LOCAL_SOURCE=/Users/chen/Projects/LMCache modal run scripts/lmcache_mp_modal_packet_lab.py::run_packet_b |
Current source-backed caveats:
- vLLM embedded LMCache uses LMCacheConnectorV1 or LMCacheConnectorV1Dynamic; legacy LMCacheConnector should be treated as a stale/pinned stack.
- vLLM MP uses LMCacheMPConnector, but current vLLM connector code does not export LMCache MP connector-specific Prometheus metrics. Collect lmcache_mp_* from the standalone LMCache server.
- SGLang current mainline LMCache evidence is embedded/layerwise via --enable-lmcache and LMCacheLayerwiseConnector. SGLang MP is not a supported claim until source and live fixtures prove the connector contract.
Source-backed checklist links:
- LMCache MP observability: https://docs.lmcache.ai/mp/observability.html
- LMCache MP HTTP API: https://docs.lmcache.ai/mp/http_api.html
- LMCache production metrics: https://docs.lmcache.ai/production/observability/metrics.html
- LMCache production vLLM metrics endpoint: https://docs.lmcache.ai/production/observability/vllm_endpoint.html
- LMCache trace recording/replay: https://docs.lmcache.ai/mp/tracing_and_debugging.html
- vLLM LMCacheMPConnector: https://docs.vllm.ai/en/v0.20.1/api/vllm/distributed/kv_transfer/kv_connector/v1/lmcache_mp_connector/
Use this exact status language in CLI output reviews and release notes:
- Current LMCache observability status is 68 / 100, partial.
- MP parser/report support is fixture_backed for core families and parser_only for live-only throughput, gauges, and replay proofs.
- Embedded production metrics are fixture_backed for core aliases and parser_only for live backend, P2P, local CPU, memory-management, and profiling packets.
- Packet A is live_validated; no other lane is live_validated until a real packet is collected and replayed through collect-lmcache, lmcache-compat, observability-coverage, and diagnose-bottleneck.
Full docs/CLI closeout command set:
cd /Users/chen/Projects/inferguard
python scripts/lmcache_mp_packet_commands.py
INFERGUARD_LMCACHE_LOCAL_SOURCE=/Users/chen/Projects/LMCache \
modal run scripts/lmcache_mp_modal_packet_lab.py::run_packet_b
inferguard collect-lmcache \
--output-dir "$PACKET_DIR/lmcache-packet" \
--engine-metrics-file "$PACKET_DIR/vllm.prom" \
--lmcache-metrics-file "$PACKET_DIR/lmcache.prom" \
--lmcache-health-file "$PACKET_DIR/lmcache-health.json" \
--lmcache-status-file "$PACKET_DIR/lmcache-status.json" \
--lmcache-version-file "$PACKET_DIR/lmcache-version.txt" \
--lmcache-lmc-version-file "$PACKET_DIR/lmcache-lmc-version.txt" \
--lmcache-commit-id-file "$PACKET_DIR/lmcache-commit-id.txt" \
--lmcache-quota-file "$PACKET_DIR/lmcache-quota.json" \
--engine-log-file "$PACKET_DIR/vllm.log" \
--lmcache-log-file "$PACKET_DIR/lmcache.log" \
--lmcache-trace-file "$PACKET_DIR/lmcache-trace.lct" \
--lmcache-trace-replay-output "$PACKET_DIR/trace-replay" \
--lmcache-otel-file "$PACKET_DIR/lmcache-otel.jsonl" \
--lmcache-lookup-hash-path "$PACKET_DIR/lookup-hashes" \
--expect-mode mp \
--json
inferguard lmcache-compat \
--engine-metrics-file "$PACKET_DIR/vllm.prom" \
--lmcache-metrics-file "$PACKET_DIR/lmcache.prom" \
--lmcache-http-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_http_evidence.json" \
--lmcache-log-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_log_evidence.json" \
--lmcache-trace-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_trace_evidence.json" \
--lmcache-trace-replay-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_trace_replay_evidence.json" \
--lmcache-otel-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_otel_evidence.json" \
--lmcache-lookup-hash-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_lookup_hash_evidence.json" \
--expect-mode mp \
--fail-on missing-required \
--json
inferguard observability-coverage \
--engine-metrics-file "$PACKET_DIR/vllm.prom" \
--lmcache-metrics-file "$PACKET_DIR/lmcache.prom" \
--lmcache-http-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_http_evidence.json" \
--lmcache-log-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_log_evidence.json" \
--lmcache-trace-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_trace_evidence.json" \
--lmcache-trace-replay-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_trace_replay_evidence.json" \
--lmcache-otel-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_otel_evidence.json" \
--lmcache-lookup-hash-evidence-file "$PACKET_DIR/lmcache-packet/lmcache_lookup_hash_evidence.json" \
--expected-engine vllm \
--expect-lmcache-mode mp \
--json
inferguard diagnose-bottleneck \
--job-dir "$JOB_DIR" \
--output-dir "$PACKET_DIR/diagnose-bottleneck"
Every 100% checklist update must account for these metric families:
| Surface | Families |
|---|---|
| LMCache MP | StorageManager counters; L1 counters/memory/failures/lifecycle; StorageManager real reuse; L2 counters/failures/throughput/in-flight gauges; lookup hit rate; L0 lifecycle; L0-L1 throughput; engine counter; active prefetch jobs; EventBus; CacheBlend. |
| KV cache offload | Native vLLM CPU offload (vllm:kv_offload_total_bytes, vllm:kv_offload_total_time, vllm:simple_cpu_offload_*) and LMCache MP L0-L1 KV movement (lmcache_mp_l0_l1_store_throughput_gbs, lmcache_mp_l0_l1_load_throughput_gbs). Treat native vLLM CPU offload as useful pressure evidence, not LMCache proof. |
| Embedded production LMCache | Core request; token; hit rate; performance and latency; detailed profiling; cache usage and lifecycle; remote backend and network; local CPU backend; memory management; P2P transfer; health/internal; chunk statistics. |
| Workload packets | accepted MP Packet A; Packet B lifecycle; MP L2; embedded vLLM; embedded SGLang; CacheBlend; P2P; PD; trace replay and lookup hash; release readiness. |
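The KV cache offload row separates native vLLM CPU offload metrics from LMCache MP L0-L1 movement. The helper below is a hypothetical sketch that shows which of the two a Prometheus text file contains, using only the metric names listed in that row. The output labels are illustrative.

```shell
# Hypothetical helper: report which KV offload evidence a .prom file holds.
kv_offload_evidence() {
  local prom_file=$1 native=0 lmcache=0
  grep -Eq '^vllm:(kv_offload_total_(bytes|time)|simple_cpu_offload_)' \
    "$prom_file" && native=1
  grep -Eq '^lmcache_mp_l0_l1_(store|load)_throughput_gbs' \
    "$prom_file" && lmcache=1
  case "$native$lmcache" in
    11) echo both ;;
    10) echo native_vllm_only ;;   # pressure evidence, not LMCache proof
    01) echo lmcache_mp_only ;;
    *)  echo none ;;
  esac
}
```

A result of `native_vllm_only` matches the caveat in the table: useful pressure evidence, but not proof that LMCache moved any KV data.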
Exit-code conventions
| Code | Typical meaning |
|---|---|
| 0 | Command succeeded or report was written without a failing threshold. |
| 1 | Strict validation/reporting gate did not pass. |
| 2 | Findings crossed a configured threshold, or all benchmark requests failed. |
| 3 | Input, parsing, endpoint, or artifact-writing failure. |
Check each command's stdout summary and generated JSON for exact status.
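In CI scripts the conventions above can be turned into a readable log line. The function below is a hypothetical helper. Its wording paraphrases the table, and codes outside 0 to 3 are reported as unknown rather than guessed.

```shell
# Hypothetical helper: map an InferGuard exit code to its typical meaning.
describe_exit_code() {
  case "$1" in
    0) echo "ok: succeeded or no failing threshold" ;;
    1) echo "gate: strict validation/reporting gate did not pass" ;;
    2) echo "threshold: findings crossed a threshold or all requests failed" ;;
    3) echo "io: input, parsing, endpoint, or artifact-writing failure" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}
```

A typical use is `inferguard analyze results/ --fail-on warning; describe_exit_code $?`. The generated JSON remains the authoritative status.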
inferguard
Usage: inferguard [OPTIONS] COMMAND [ARGS]...
InferGuard - read-only disaggregated-serving diagnostics.
Options:
  --version  Print version and exit.
  --help     Show this message and exit.
Commands:
  preflight            Run read-only launch compatibility checks before a benchmark.
  analyze              Analyze an existing result directory without launching benchmarks.
  emit-bundle          Emit a deployment bundle from a router verdict.
  validate-completed   Validate completed runs before any publishability or operator claim.
  request-profile      Profile per-request TTFT, TPOT, E2E latency, and failures.
  collect-metrics      Collect normalized engine and GPU metric timelines for live evidence.
  ingest-agentx        Convert AgentX result CSV outputs into canonical InferGuard schemas.
  agentx-ingest        Convert AgentX result CSV outputs into canonical InferGuard schemas.
  launch-engine        Launch or validate a vLLM, SGLang, LMCache, or Dynamo-SGLang engine.
  diagnose-bottleneck  Diagnose one completed job into a bottleneck verdict.
  classify-failures    Classify failed job evidence into operator-actionable failure classes.
  report-completed     Build a refusal-gated operator recommendation from completed evidence.
  compute-cost         Compute cost-per-useful-task and safe concurrency from run evidence.
  find-cliffs          Find capacity cliffs across completed sweep evidence.
  simulate-gpu         Generate synthetic GPU/Slurm artifacts for local bundle smoke testing.
  serve-mimic          Serve a tiny fake OpenAI-compatible endpoint for synthetic smoke tests.
  disagg               Disaggregated serving diagnostics.
  bench                OpenAI-compatible endpoint benchmarks.
  profile              Live endpoint profiler for existing /metrics traffic.
  agent                Agent trace harness commands.
  daemon               Local harness daemon sidecar.
  telemetry            Local-only telemetry consent and payload audit commands.
  workload             Pre-flight workload fingerprinting.
  router               Rule-based execution-path routing.
inferguard preflight
Usage: inferguard preflight [OPTIONS]
Run read-only launch compatibility checks before a benchmark.
Options:
  --model TEXT                  Model family or HF id for compatibility checks. [default: deepseek-ai/DeepSeek-V4-Pro]
  --engine TEXT                 Engine hint: vllm, sglang, dynamo, lmcache, llm-d, or auto. [default: vllm]
  --kv-offloading-backend TEXT  KV offload backend, e.g. native when OFFLOADING=cpu.
  --disable-hybrid-kv-cache-manager --no-disable-hybrid-kv-cache-manager
                                Whether the serving launch disables the hybrid KV cache manager. [default: no-disable-hybrid-kv-cache-manager]
  --config PATH                 Optional config.json/run config containing topology/preflight fields.
  --detect-tokenizer-mismatch   Probe client/server tokenizer-count drift before rollout.
  --endpoint TEXT               Optional OpenAI-compatible /v1/chat/completions endpoint for tokenizer probe.
  --sample-text TEXT            Known text sent for tokenizer-mismatch probing. [default: Hello world This is a test of tokenization.]
  --client-tokenizer TEXT       Client tokenizer label/version used for preflight evidence. [default: inferguard-estimator]
  --server-tokenizer TEXT       Optional server tokenizer label/version used for preflight evidence.
  --client-token-count INTEGER  Optional explicit client token count for tokenizer probe/testing.
  --json                        Emit machine-readable JSON.
  --help                        Show this message and exit.
inferguard analyze
Usage: inferguard analyze [OPTIONS] RESULTS_DIR
Analyze an existing result directory without launching benchmarks.
Arguments:
  * results_dir PATH  Directory containing benchmark artifacts. [required]
Options:
    --output-dir PATH                     Destination for generated reports.
    --format TEXT                         Output format: json, md, or both. [default: both]
    --fail-on TEXT                        Exit threshold: never, warning, or critical. [default: critical]
    --strict --best-effort                Fail on missing required artifacts. [default: best-effort]
    --timeline-glob TEXT                  Discovery pattern for timeline JSONL files. [default: **/inferguard_timeline.jsonl]
    --cost-per-gpu-hour FLOAT             GPU-hour cost for cost-per-task accounting.
    --gpus INTEGER                        GPU count for cost-per-task accounting.
    --operator-brief --no-operator-brief  Emit operator_brief.{json,md}; defaults on when --gpus is provided.
    --cost-currency TEXT                  Currency label for cost output. [default: USD]
    --plots                               After report writes, render SVG plots into <output-dir>/plots/.
    --emit-agentx-shape PATH              Write per-cell agg_*.json files in AgentX/InferenceX shape.
    --json                                Also print the generated JSON report to stdout.
    --help                                Show this message and exit.
inferguard emit-bundle
Usage: inferguard emit-bundle [OPTIONS] VERDICT
Emit a deployment bundle from a router verdict.
Arguments:
  * verdict PATH  Router verdict JSON from `inferguard router classify`. [required]
Options:
  * --output PATH  Destination bundle directory. [required]
    --target TEXT  Bundle target. Currently: slurm. [default: slurm]
    --json         Print bundle manifest JSON to stdout.
    --help         Show this message and exit.
inferguard validate-completed
Usage: inferguard validate-completed [OPTIONS]
Validate completed runs before any publishability or operator claim.
Options:
  * --results-root PATH       Run directory to validate. [required]
    --matrix-plan PATH        Override matrix_plan.json location.
    --artifact-contract PATH  Override expected_artifact_contract.json location.
    --output-dir PATH         Output directory for validation artifacts.
    --strict                  Return non-zero unless the run is live_complete.
    --label-overrides PATH    JSON {claim_id: claim_status} for human-reviewed downgrades.
    --json-only               Skip markdown rendering.
    --help                    Show this message and exit.
inferguard request-profile
Usage: inferguard request-profile [OPTIONS]
Profile per-request TTFT, TPOT, E2E latency, and failures.
Options:
  * --output-dir PATH         Output directory for request-profile artifacts. [required]
  * --endpoint TEXT           OpenAI-compatible chat-completions endpoint. [required]
  * --model TEXT              Model name sent in profile requests. [required]
  * --input-jsonl PATH        JSONL request/profile input file. [required]
    --concurrency TEXT        Closed-loop concurrency level.
    --timeout-seconds FLOAT   HTTP timeout per request. [default: 300.0]
    --arrival-mode TEXT       Arrival mode: closed_loop or poisson.
    --rate-rps FLOAT          Poisson arrival rate in requests per second.
    --max-requests INTEGER    Maximum request rows to issue.
    --api-key TEXT            Optional bearer token for the endpoint.
    --stream                  Use streaming chat completions.
    --include-usage           Request OpenAI stream usage when streaming.
    --continuous-usage-stats  Request continuous usage stats when supported.
    --workload-label TEXT     Workload label stamped into artifacts.
    --job-id TEXT             Optional job id stamped into artifacts.
    --seed INTEGER            Deterministic scheduler seed. [default: 0]
    --engine TEXT             Engine label stamped into artifacts.
    --model-profile TEXT      Model architecture/profile label.
    --help                    Show this message and exit.
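The help text pairs --concurrency with closed-loop arrival and --rate-rps with Poisson arrival. A small helper can keep those pairings consistent in scripts. The function below is hypothetical, and whether request-profile itself rejects a mismatched combination is not stated here.

```shell
# Hypothetical helper: emit the arrival flags for request-profile.
# $1 is the arrival mode; $2 is the concurrency (closed_loop) or the
# requests-per-second rate (poisson).
arrival_flags() {
  case "$1" in
    closed_loop) echo "--arrival-mode closed_loop --concurrency $2" ;;
    poisson)     echo "--arrival-mode poisson --rate-rps $2" ;;
    *) echo "unknown arrival mode: $1" >&2; return 1 ;;
  esac
}
```

The output is meant to be word-split into a command line, for example `inferguard request-profile ... $(arrival_flags poisson 2.5)`.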
inferguard collect-metrics
Usage: inferguard collect-metrics [OPTIONS]
Collect normalized engine and GPU metric timelines for live evidence.
Options:
  * --output-dir PATH              Output directory for metrics artifacts. [required]
  * --engine TEXT                  Engine: vllm, sglang, lmcache, or dynamo-sglang. [required]
  * --engine-metrics-url TEXT      Serving-engine Prometheus metrics URL. [required]
  * --dcgm-metrics-url TEXT        DCGM exporter Prometheus metrics URL. [required]
  * --duration-seconds INTEGER     Collection duration in seconds. [required]
    --interval-seconds FLOAT       Engine scrape interval in seconds. [default: 1.0]
    --dcgm-interval-seconds FLOAT  DCGM timestamp window in seconds. [default: 5.0]
    --lmcache-metrics-url TEXT     Optional LMCache metrics URL.
    --label-job-id TEXT            Job id label for normalized metrics.
    --label-engine-version TEXT    Engine version label for normalized metrics.
    --label-hardware TEXT          Hardware label for normalized metrics.
    --keep-raw-samples             Keep raw Prometheus samples alongside normalized timelines.
    --help                         Show this message and exit.
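When sizing a collection run, a rough estimate of how many engine scrapes to expect helps spot a truncated timeline. The sketch below assumes one scrape per --interval-seconds over --duration-seconds and ignores scrape latency. The collector's actual scheduling is not documented here, so treat the result as an upper-bound sanity check. The function name is hypothetical.

```shell
# Hypothetical estimate: engine scrapes = duration / interval, truncated.
# $1 = --duration-seconds, $2 = --interval-seconds
expected_engine_scrapes() {
  awk -v d="$1" -v i="$2" 'BEGIN { printf "%d\n", d / i }'
}
```

For example, `expected_engine_scrapes 60 1.0` prints 60 for a one-minute run at the default interval.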
inferguard ingest-agentx
Usage: inferguard ingest-agentx [OPTIONS]
Convert AgentX result CSV outputs into canonical InferGuard schemas.
Options:
  * --output-dir PATH          Output directory for canonical InferGuard artifacts. [required]
    --agentx-results-dir PATH  AgentX result directory containing metadata and CSV output.
    --agentx-result PATH       Single AgentX detailed result CSV.
    --job-id TEXT              Optional job id stamped into artifacts.
    --engine TEXT              Engine label stamped into artifacts.
    --workload-label TEXT      Workload label stamped into artifacts.
    --model-profile TEXT       Model architecture/profile label.
    --model TEXT               Fallback model/profile label for single CSV ingest.
    --concurrency TEXT         Concurrency label for single CSV ingest.
    --help                     Show this message and exit.
inferguard agentx-ingest
Usage: inferguard agentx-ingest [OPTIONS]
Convert AgentX result CSV outputs into canonical InferGuard schemas.
Options:
  * --output-dir PATH          Output directory for canonical InferGuard artifacts. [required]
    --agentx-results-dir PATH  AgentX result directory containing metadata and CSV output.
    --agentx-result PATH       Single AgentX detailed result CSV.
    --job-id TEXT              Optional job id stamped into artifacts.
    --engine TEXT              Engine label stamped into artifacts.
    --workload-label TEXT      Workload label stamped into artifacts.
    --model-profile TEXT       Model architecture/profile label.
    --model TEXT               Fallback model/profile label for single CSV ingest.
    --concurrency TEXT         Concurrency label for single CSV ingest.
    --help                     Show this message and exit.
inferguard launch-engine
Usage: inferguard launch-engine [OPTIONS]
Launch or validate a vLLM, SGLang, LMCache, or Dynamo-SGLang engine.
Options:
  * --output-dir PATH                      Output directory for launch artifacts. [required]
  * --engine TEXT                          Engine: vllm, sglang, lmcache, or dynamo-sglang. [required]
    --external-launch                      Validate an already-launched endpoint instead of spawning.
    --endpoint-url,--endpoint TEXT         Endpoint URL for external-launch or healthcheck.
    --model-path TEXT                      Model path or id passed to the serving engine.
    --host TEXT                            Engine bind host.
    --port INTEGER                         Engine bind port.
    --tensor-parallel-size INTEGER         Tensor parallel size. [default: 1]
    --pipeline-parallel-size INTEGER       Pipeline parallel size. [default: 1]
    --data-parallel-size INTEGER           Data parallel size. [default: 1]
    --max-model-len INTEGER                Maximum model context length.
    --gpu-memory-utilization FLOAT         vLLM GPU memory utilization. [default: 0.9]
    --mem-fraction-static FLOAT            SGLang static memory fraction. [default: 0.9]
    --enable-prefix-caching                Enable prefix caching when supported.
    --enable-chunked-prefill               Enable chunked prefill when supported.
    --chunked-prefill-size INTEGER         Chunked prefill size.
    --enable-cache-report                  Enable engine cache reporting flags.
    --enable-metrics                       Enable engine metrics flags.
    --kv-cache-dtype TEXT                  KV cache dtype.
    --quantization TEXT                    Quantization mode.
    --hardware TEXT                        Hardware label for launch warnings.
    --kv-transfer-config TEXT              KV transfer configuration JSON/string.
    --healthcheck-timeout-seconds INTEGER  Healthcheck timeout in seconds. [default: 600]
    --healthcheck-prompt TEXT              Healthcheck canary prompt. [default: Hello, are you up?]
    --canary-completion-tokens INTEGER     Healthcheck canary completion tokens. [default: 16]
    --extra-args TEXT                      Extra engine CLI arguments.
    --help                                 Show this message and exit.
inferguard diagnose-bottleneck
Usage: inferguard diagnose-bottleneck [OPTIONS]
Diagnose one completed job into a bottleneck verdict.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --job-dir PATH Completed job directory to diagnose. [required] │
│ --validation-report PATH Optional validation report path. │
│ --rule-config PATH Optional bottleneck rule config. │
│ --output-dir PATH Output directory for diagnosis artifacts. │
│ --strict Return non-zero when evidence is insufficient. │
│ --json-only Skip markdown rendering. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard classify-failures
Usage: inferguard classify-failures [OPTIONS]
Classify failed job evidence into operator-actionable failure classes.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --job-dir PATH Completed or failed job directory to classify. [required] │
│ --regex-config PATH Optional regex classification config. │
│ --max-failures INTEGER Maximum ranked failures to emit. [default: 20] │
│ --output-dir PATH Output directory for classification artifacts. │
│ --json-only Skip markdown rendering. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard report-completed
Usage: inferguard report-completed [OPTIONS]
Build a refusal-gated operator recommendation from completed evidence.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --results-root PATH Completed run root to summarize. [required] │
│ --output-dir PATH Output directory for recommendation artifacts. │
│ --strict Return non-zero when recommendation evidence is insufficient. │
│ --json-only Skip markdown rendering. │
│ --cost-input PATH JSON {"<sku>": <usd_per_gpu_hour>} for cost claims. │
│ --workload-fingerprint PATH Optional WorkloadFingerprint JSON. │
│ --slo PATH Optional SLO JSON. │
│ --useful-task-definition PATH Optional useful-task criteria JSON. │
│ --useful-task-min-tokens INTEGER Minimum completion tokens for a useful task. [default: 1] │
│ --useful-task-slo-ttft-ms FLOAT Useful-task TTFT SLO in milliseconds. │
│ --slo-ttft-ms FLOAT TTFT SLO in milliseconds. │
│ --slo-e2e-ms FLOAT E2E latency SLO in milliseconds. │
│ --slo-success-rate FLOAT Success-rate SLO. [default: 0.95] │
│ --success-rate-floor FLOAT Compatibility alias for --slo-success-rate. [default: 0.95] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard compute-cost
Usage: inferguard compute-cost [OPTIONS]
Compute cost-per-useful-task and safe concurrency from run evidence.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --results-root PATH Completed run root to price. [required] │
│ * --cost-input PATH JSON {"<sku>": <usd_per_gpu_hour>} for cost claims. [required] │
│ --output-dir PATH Output directory for cost artifacts. │
│ --json-only Skip markdown rendering. │
│ --slo PATH Optional SLO JSON. │
│ --useful-task-definition PATH Optional useful-task criteria JSON. │
│ --useful-task-min-tokens INTEGER Minimum completion tokens for a useful task. [default: 1] │
│ --useful-task-slo-ttft-ms FLOAT Useful-task TTFT SLO in milliseconds. │
│ --slo-ttft-ms FLOAT TTFT SLO in milliseconds. │
│ --slo-e2e-ms FLOAT E2E latency SLO in milliseconds. │
│ --slo-success-rate FLOAT Success-rate SLO. [default: 0.95] │
│ --success-rate-floor FLOAT Compatibility alias for --slo-success-rate. [default: 0.95] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
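Operator note: the `--cost-input` file is a flat JSON object mapping a SKU label to a USD-per-GPU-hour price. The sketch below writes such a file and assembles a `compute-cost` command line without running it. The SKU label `h100-sxm`, the price, and the `runs/example` path are illustrative assumptions, not values InferGuard ships; the helper names are hypothetical.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_cost_input(path: Path, prices: dict[str, float]) -> Path:
    """Write a {"<sku>": <usd_per_gpu_hour>} mapping for --cost-input."""
    for sku, price in prices.items():
        # Reject obviously broken entries before they reach the CLI.
        if not isinstance(sku, str) or not sku or price < 0:
            raise ValueError(f"bad cost entry: {sku!r} -> {price!r}")
    path.write_text(json.dumps(prices, indent=2))
    return path


def compute_cost_argv(results_root: str, cost_path: Path) -> list[str]:
    """Build (but do not execute) a compute-cost invocation from documented flags."""
    return [
        "inferguard", "compute-cost",
        "--results-root", results_root,
        "--cost-input", str(cost_path),
        "--slo-ttft-ms", "1000",
        "--json-only",
    ]


# Hypothetical SKU label and price; substitute your provider's real rate.
cost_input = {"h100-sxm": 2.50}
workdir = Path(tempfile.mkdtemp())
cost_path = write_cost_input(workdir / "cost_input.json", cost_input)
argv = compute_cost_argv("runs/example", cost_path)
print(shlex.join(argv))
```

Printing the command rather than executing it keeps the sketch runnable on a machine without InferGuard installed; pass `argv` to `subprocess.run` once the paths point at real evidence.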
inferguard find-cliffs
Usage: inferguard find-cliffs [OPTIONS]
Find capacity cliffs across completed sweep evidence.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --results-root PATH Completed sweep root to analyze. [required] │
│ --output-dir PATH Output directory for capacity cliff artifacts. │
│ --cliffs TEXT Comma-separated capacity cliff subset; default is all. │
│ --ttft-p99-floor-ms FLOAT TTFT p99 floor in milliseconds. [default: 1000.0] │
│ --success-rate-floor FLOAT Minimum acceptable success rate. [default: 0.95] │
│ --strict Return non-zero when any cliff lacks enough evidence. │
│ --json-only Skip markdown rendering. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
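Operator note: one way to read the two floors is as per-level pass/fail thresholds across a concurrency sweep. The sketch below is an assumption about how such a check could work, not InferGuard's actual detection logic: it returns the first concurrency level whose TTFT p99 exceeds `--ttft-p99-floor-ms` or whose success rate falls below `--success-rate-floor`. The record field names and sample numbers are hypothetical.

```python
from typing import Optional


def first_cliff(
    levels: list[dict],
    ttft_p99_floor_ms: float = 1000.0,
    success_rate_floor: float = 0.95,
) -> Optional[int]:
    """Return the lowest concurrency that violates either floor, or None."""
    for level in sorted(levels, key=lambda item: item["concurrency"]):
        too_slow = level["ttft_p99_ms"] > ttft_p99_floor_ms
        too_flaky = level["success_rate"] < success_rate_floor
        if too_slow or too_flaky:
            return level["concurrency"]
    return None


# Hypothetical sweep summary; real sweeps come from the --results-root evidence.
sweep = [
    {"concurrency": 1, "ttft_p99_ms": 200.0, "success_rate": 1.0},
    {"concurrency": 4, "ttft_p99_ms": 450.0, "success_rate": 1.0},
    {"concurrency": 8, "ttft_p99_ms": 1300.0, "success_rate": 0.99},
    {"concurrency": 16, "ttft_p99_ms": 4000.0, "success_rate": 0.90},
]
cliff = first_cliff(sweep)
print(cliff)
```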
inferguard simulate-gpu
Usage: inferguard simulate-gpu [OPTIONS]
Generate synthetic GPU/Slurm artifacts for local bundle smoke testing.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --results-root PATH Run directory where matrix and synthetic GPU artifacts will be written. │
│ --plan PATH Existing matrix_plan.json to simulate. Preserves the legacy gmi_gpu_mimic.py flag. │
│ --gpu-profiles,--gpu-mimic-profile PATH Optional GPU mimic profile catalog JSON. │
│ --provider TEXT Provider profile. Currently only gmi. [default: gmi] │
│ --cluster-profile PATH Optional standalone JSON/YAML cluster profile. │
│ --stage TEXT Matrix stage label. [default: single-node-smoke] │
│ --max-jobs INTEGER Maximum jobs to render into the synthetic matrix. [default: 1] │
│ --hardware TEXT Hardware alias: h100, h200, b200, b300, gb200, or gb300. [default: b200] │
│ --engine TEXT Engine alias: vllm or sglang. [default: vllm] │
│ --model-profile TEXT Model profile alias, e.g. dsv4-pro or deepseek_v4_pro. [default: dsv4-pro] │
│ --workload TEXT Workload alias, e.g. long_context_chat. [default: long_context_chat] │
│ --context-lengths TEXT Comma-separated context lengths. Defaults to 8192. │
│ --concurrency TEXT Comma-separated concurrency levels. Defaults to 1. │
│ --arrival-mode TEXT Arrival mode label. [default: closed_loop] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard serve-mimic
Usage: inferguard serve-mimic [OPTIONS]
Serve a tiny fake OpenAI-compatible endpoint for synthetic smoke tests.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --host TEXT Bind host for the synthetic endpoint. [default: 127.0.0.1] │
│ --port INTEGER Bind port for the synthetic endpoint. [default: 8000] │
│ --model TEXT Model id returned by the OpenAI-compatible endpoint. │
│ --model-profile TEXT Fallback model id/profile label. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard disagg
Usage: inferguard disagg [OPTIONS] COMMAND [ARGS]...
Disaggregated serving diagnostics.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ status Scrape prefill + decode (+ optional transfer) and print findings. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard disagg status
Usage: inferguard disagg status [OPTIONS]
Scrape prefill + decode (+ optional transfer) and print findings.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --prefill TEXT Prefill endpoint base URL. [required] │
│ * --decode TEXT Decode endpoint base URL. [required] │
│ --transfer TEXT Optional transfer-layer metrics URL. │
│ --engine TEXT Engine hint: auto, vllm, sglang, dynamo, llm-d. [default: auto] │
│ --json Emit machine-readable JSON instead of a table. │
│ --timeout FLOAT HTTP timeout per scrape (seconds). [default: 5.0] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard bench
Usage: inferguard bench [OPTIONS] COMMAND [ARGS]...
OpenAI-compatible endpoint benchmarks.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ replay Replay trace JSONL records against a streaming chat-completions endpoint. │
│ upstream Run vLLM/SGLang native benchmark CLIs and normalize their artifacts. │
│ compare Compare two bench run directories for cross-engine parity. │
│ agentx-replay Run AgentX trace replay and convert detailed_results.csv to InferGuard artifacts. │
│ kv-stress Generate synthetic KVCast prompts and infer cache pressure from request shape. │
│ kvcast Run KVCast synthetic cache stress modes. │
│ cold-start Capture first-60s cold-start ramp from endpoint readiness. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard bench replay
Usage: inferguard bench replay [OPTIONS]
Replay trace JSONL records against a streaming chat-completions endpoint.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --endpoint TEXT OpenAI-compatible /v1/chat/completions endpoint. [required] │
│ * --model TEXT Model name sent in chat requests. [required] │
│ * --trace-dir PATH Directory containing InferGuard trace JSONL files. [required] │
│ --concurrency TEXT Comma-separated concurrency levels, e.g. 1,4,8,16,32. [default: 1,4,8,16,32] │
│ --output-dir PATH Directory for run.json/config.json/JSONL/summary/report. [default: inferguard_bench_replay] │
│ --output-tokens INTEGER Fallback max output tokens when trace does not specify expected_output_tokens. [default: 512] │
│ --timeout FLOAT HTTP timeout per request in seconds. [default: 300.0] │
│ --duration-seconds FLOAT Run each concurrency level for this many seconds instead of one finite pass. │
│ --warmup-seconds FLOAT Exclude this many initial seconds per level from summary metrics. [default: 0.0] │
│ --metrics-url TEXT Optional engine metrics URL to scrape during the bench. │
│ --metrics-interval FLOAT Seconds between engine metrics scrapes. [default: 5.0] │
│ --metrics-engine TEXT Engine hint for metrics detection: auto, vllm, sglang, dynamo, llm-d. [default: auto] │
│ --force Allow writing into a non-empty output directory; known artifact files may be overwritten. │
│ --redact-prompts Replace prompt content with <redacted> in requests.jsonl. │
│ --track-cache-lineage Track request-level prefix-cache lineage scaffold. │
│ --idle-active-mix-mode Alternate active request windows with idle windows for S-14 cost economics. │
│ --active-window-seconds FLOAT Active traffic window length for --idle-active-mix-mode. [default: 60.0] │
│ --idle-window-seconds FLOAT Idle traffic window length for --idle-active-mix-mode. [default: 30.0] │
│ --inject-giant-prefill-tokens INTEGER Inject one oversized prefill request; requires --allow-chaos. │
│ --allow-chaos Allow chaos-mode replay injections. │
│ --canary-eval-set TEXT Held-out eval set path or HuggingFace dataset id for canary quality scoring. │
│ --tool-call-schema PATH JSON schema describing expected tool-call response format. │
│ --json Print summary JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
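Operator note: `--concurrency` takes a comma-separated string, and `--idle-active-mix-mode` alternates active and idle windows of the configured lengths. The sketch below shows one plausible reading of both: parsing the level list and laying out alternating windows over a run. It assumes the run starts with an active window and truncates the last window at the run end; InferGuard's real scheduler may differ. The function names are hypothetical.

```python
def parse_levels(text: str) -> list[int]:
    """Parse a comma-separated level string such as "1,4,8,16,32"."""
    levels = [int(part) for part in text.split(",") if part.strip()]
    if not levels or any(level < 1 for level in levels):
        raise ValueError(f"invalid concurrency levels: {text!r}")
    return levels


def mix_windows(
    total_seconds: float,
    active_seconds: float = 60.0,
    idle_seconds: float = 30.0,
) -> list[tuple[str, float, float]]:
    """Return (label, start, end) windows alternating active and idle."""
    if active_seconds <= 0 or idle_seconds <= 0:
        raise ValueError("window lengths must be positive")
    windows = []
    start = 0.0
    label, length = "active", active_seconds
    while start < total_seconds:
        end = min(start + length, total_seconds)  # clip the final window
        windows.append((label, start, end))
        start = end
        if label == "active":
            label, length = "idle", idle_seconds
        else:
            label, length = "active", active_seconds
    return windows


levels = parse_levels("1,4,8,16,32")
windows = mix_windows(180.0)
active_share = sum(e - s for name, s, e in windows if name == "active") / 180.0
print(levels, windows, round(active_share, 3))
```

With the default 60 s active and 30 s idle windows, two thirds of a run that is a whole number of cycles long carries traffic, which is the share to keep in mind when reading cost figures from a mixed run.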
inferguard bench upstream
Usage: inferguard bench upstream [OPTIONS] ENGINE
Run vLLM/SGLang native benchmark CLIs and normalize their artifacts.
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ * engine TEXT Upstream engine to run: vllm or sglang. [required] │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --profile TEXT Profile: vLLM random|sharegpt|prefix-repetition|sonnet; SGLang random. [required] │
│ * --model TEXT Model name passed to the upstream bench. [required] │
│ --endpoint TEXT Engine endpoint base URL, e.g. http://localhost:8000. [default: http://localhost:8000] │
│ --num-prompts INTEGER Number of prompts passed to the upstream bench. [default: 100] │
│ --request-rate FLOAT Optional upstream request-rate limit. │
│ --dataset-path PATH Optional upstream dataset path for dataset-backed profiles. │
│ --output-dir PATH Directory for run/config/requests/metrics/summary artifacts. [default: inferguard_bench_upstream] │
│ --timeout FLOAT Subprocess timeout in seconds. [default: 300.0] │
│ --enable-radix-cache --disable-radix-cache Set SGLANG_ENABLE_RADIX_CACHE=1/0 for SGLang upstream runs. │
│ --force Allow writing into a non-empty output directory. │
│ --json Print summary JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard bench compare
Usage: inferguard bench compare [OPTIONS] RUN_A_DIR RUN_B_DIR
Compare two bench run directories for cross-engine parity.
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ * run_a_dir PATH First InferGuard bench run directory. [required] │
│ * run_b_dir PATH Second InferGuard bench run directory. [required] │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --output-dir PATH Directory for compare.json and compare.md. [default: inferguard_bench_compare] │
│ --label-a TEXT Display label for the first run, e.g. vllm. │
│ --label-b TEXT Display label for the second run, e.g. sglang. │
│ --min-identity-overlap FLOAT Required trace_id+turn_index overlap ratio; must be > this value. [default: 0.5] │
│ --strict-identity Fail instead of warning when trace identity overlap is too low. │
│ --cost-per-gpu-hour FLOAT Optional GPU-hour cost for cost-per-task deltas. │
│ --gpus INTEGER GPU count for cost-per-task deltas. │
│ --blue-green Treat run A as blue/baseline and run B as green/candidate; emit rollout p99 regression findings. │
│ --force Allow overwriting compare artifacts in a non-empty output directory. │
│ --json Print compare JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
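Operator note: `--min-identity-overlap` is a strict bound, so an overlap exactly equal to the threshold does not pass. The sketch below illustrates that comparison on `(trace_id, turn_index)` keys. The choice of denominator (the smaller run's key count) is an assumption made for illustration; the help text does not say which denominator InferGuard uses. The function names and sample records are hypothetical.

```python
def identity_overlap(run_a: list[dict], run_b: list[dict]) -> float:
    """Share of (trace_id, turn_index) keys common to both runs."""
    keys_a = {(row["trace_id"], row["turn_index"]) for row in run_a}
    keys_b = {(row["trace_id"], row["turn_index"]) for row in run_b}
    if not keys_a or not keys_b:
        return 0.0
    # Assumption: normalize by the smaller run so a subset run scores 1.0.
    return len(keys_a & keys_b) / min(len(keys_a), len(keys_b))


def passes_identity_check(overlap: float, minimum: float = 0.5) -> bool:
    """Strict comparison: the overlap must exceed the minimum, not equal it."""
    return overlap > minimum


# Hypothetical request identities from two bench runs.
run_a = [{"trace_id": t, "turn_index": 0} for t in ("a", "b", "c", "d")]
run_b = [{"trace_id": t, "turn_index": 0} for t in ("c", "d", "e", "f")]
overlap = identity_overlap(run_a, run_b)
print(overlap, passes_identity_check(overlap))
```

Here two of four identities match, giving an overlap of 0.5, which fails the default check because the bound is exclusive. With `--strict-identity` that condition fails the compare instead of only warning.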
inferguard bench agentx-replay
Usage: inferguard bench agentx-replay [OPTIONS]
Run AgentX trace replay and convert detailed_results.csv to InferGuard artifacts.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --endpoint TEXT OpenAI-compatible API endpoint base URL. [required] │
│ * --model TEXT Model label for InferGuard artifacts. [required] │
│ * --trace-source TEXT Hugging Face dataset name or local trace directory. [required] │
│ --concurrency INTEGER AgentX concurrent users; used for start-users and max-users. [default: 1] │
│ --duration-seconds INTEGER AgentX replay duration in seconds; warns below 900s/15min. [default: 1800] │
│ --output-dir PATH Directory for InferGuard AgentX replay artifacts. [default: inferguard_bench_agentx_replay] │
│ --tester-path PATH Path to trace_replay_tester.py or a kv-cache-tester checkout. │
│ --allow-network-clone Clone kv-cache-tester into ~/.cache/inferguard/agentx-tester if missing. │
│ --json Print summary JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard bench kv-stress
Usage: inferguard bench kv-stress [OPTIONS]
Generate synthetic KVCast prompts and infer cache pressure from request shape.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --endpoint TEXT OpenAI-compatible /v1/chat/completions endpoint. [required] │
│ * --model TEXT Model name sent in chat requests. [required] │
│ --context-lengths TEXT Comma-separated approximate input token targets. [default: 8192,32768,65536,131072,524288,1048576] │
│ --concurrency TEXT Comma-separated concurrency levels, e.g. 1,4,8,16. [default: 1,4,8,16] │
│ --output-tokens INTEGER Max streamed output tokens per request. [default: 512] │
│ --mode TEXT KVCast mode: cold-pressure, prefix-reuse, mixed-agent, eviction-probe, or fragmentation-probe. [default: cold-pressure] │
│ --requests-per-level INTEGER Synthetic requests generated per context length. [default: 4] │
│ --output-dir PATH Directory for run.json/config.json/JSONL/summary/report. [default: inferguard_bench_kv_stress] │
│ --timeout FLOAT HTTP timeout per request in seconds. [default: 300.0] │
│ --duration-seconds FLOAT Run each concurrency level for this many seconds instead of one finite pass. │
│ --warmup-seconds FLOAT Exclude this many initial seconds per level from summary metrics. [default: 0.0] │
│ --metrics-url TEXT Optional engine metrics URL to scrape during the bench. │
│ --metrics-interval FLOAT Seconds between engine metrics scrapes. [default: 5.0] │
│ --metrics-engine TEXT Engine hint for metrics detection: auto, vllm, sglang, dynamo, llm-d. [default: auto] │
│ --force Allow writing into a non-empty output directory; known artifact files may be overwritten. │
│ --redact-prompts Replace prompt content with <redacted> in requests.jsonl. │
│ --json Print summary JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard bench kvcast
Usage: inferguard bench kvcast [OPTIONS]
Run KVCast synthetic cache stress modes.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --endpoint TEXT OpenAI-compatible /v1/chat/completions endpoint. [required] │
│ * --model TEXT Model name sent in chat requests. [required] │
│ --context-lengths TEXT Comma-separated approximate input token targets. [default: 8192,32768,65536,131072,524288,1048576] │
│ --concurrency TEXT Comma-separated concurrency levels, e.g. 1,4,8,16. [default: 1,4,8,16] │
│ --mode TEXT KVCast mode: cold-pressure, prefix-reuse, mixed-agent, eviction-probe, fragmentation-probe, multi-tenant-storm, or retry-storm. [default: cold-pressure] │
│ --output-tokens INTEGER Max streamed output tokens per request. [default: 512] │
│ --requests-per-level INTEGER Synthetic requests generated per context length. [default: 4] │
│ --output-dir PATH Directory for run.json/config.json/JSONL/summary/report. [default: inferguard_bench_kvcast] │
│ --timeout FLOAT HTTP timeout per request in seconds. [default: 300.0] │
│ --duration-seconds FLOAT Run each concurrency level for this many seconds instead of one finite pass. │
│ --warmup-seconds FLOAT Exclude this many initial seconds per level from summary metrics. [default: 0.0] │
│ --arrival-mode TEXT Arrival scheduler: steady or poisson. [default: steady] │
│ --arrival-rate-rps FLOAT Mean request arrivals per second for --arrival-mode poisson. │
│ --metrics-url TEXT Optional engine metrics URL to scrape during the bench. │
│ --metrics-interval FLOAT Seconds between engine metrics scrapes. [default: 5.0] │
│ --metrics-engine TEXT Engine hint for metrics detection: auto, vllm, sglang, dynamo, llm-d. [default: auto] │
│ --force Allow writing into a non-empty output directory; known artifact files may be overwritten. │
│ --redact-prompts Replace prompt content with <redacted> in requests.jsonl. │
│ --customers INTEGER Customer count for --mode multi-tenant-storm. [default: 1] │
│ --sla-tiers TEXT Comma-separated SLA tier policies, e.g. premium=p99<2s,standard=p99<5s. │
│ --track-cache-lineage Track request-level prefix-cache lineage scaffold. │
│ --burst-multiplier FLOAT Retry-storm burst QPS multiplier over --baseline-rps. [default: 50.0] │
│ --burst-window-seconds FLOAT Retry-storm burst duration in seconds. [default: 30.0] │
│ --baseline-rps FLOAT Retry-storm baseline request rate before/after burst. [default: 4.0] │
│ --inject-crash-after-seconds FLOAT Test-only crash injection delay; requires --allow-chaos. │
│ --allow-chaos Allow test-only crash injection scaffolding. │
│ --json Print summary JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
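Operator note: two `kvcast` inputs are easy to get wrong by hand, the `--sla-tiers` string and the size of a retry-storm burst. The sketch below parses the documented example format `premium=p99<2s,standard=p99<5s` and computes the burst request rate as `--baseline-rps` times `--burst-multiplier`. The grammar accepted here (a percentile name, `<`, a number, and an `s` or `ms` unit) is an assumption inferred from that one example, and the function names are hypothetical; InferGuard's own parser may accept more or less.

```python
import re

# Assumed grammar: <tier>=p<NN><<number><s|ms>, inferred from the help example.
_TIER = re.compile(
    r"^(?P<name>[\w-]+)=(?P<metric>p\d+)<(?P<value>\d+(?:\.\d+)?)(?P<unit>ms|s)$"
)


def parse_sla_tiers(text: str) -> dict[str, tuple[str, float]]:
    """Map tier name -> (percentile label, latency bound in seconds)."""
    tiers = {}
    for part in text.split(","):
        match = _TIER.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognized SLA tier policy: {part!r}")
        scale = 1000.0 if match["unit"] == "ms" else 1.0
        tiers[match["name"]] = (match["metric"], float(match["value"]) / scale)
    return tiers


def retry_storm_load(
    baseline_rps: float = 4.0,
    burst_multiplier: float = 50.0,
    burst_window_seconds: float = 30.0,
) -> tuple[float, float]:
    """Return (burst request rate, expected requests inside the burst window)."""
    burst_rps = baseline_rps * burst_multiplier
    return burst_rps, burst_rps * burst_window_seconds


tiers = parse_sla_tiers("premium=p99<2s,standard=p99<5s")
burst_rps, burst_requests = retry_storm_load()
print(tiers, burst_rps, burst_requests)
```

With the defaults, the burst runs at 200 requests per second and sends about 6000 requests in the 30 s window, so check endpoint rate limits before pointing a retry-storm at shared infrastructure.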
inferguard bench cold-start
Usage: inferguard bench cold-start [OPTIONS]
Capture first-60s cold-start ramp from endpoint readiness.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --endpoint TEXT OpenAI-compatible /v1/chat/completions endpoint. [required] │
│ * --model TEXT Model name sent in chat requests. [required] │
│ --trace-dir PATH Optional InferGuard trace JSONL directory. │
│ --output-dir PATH Directory for cold-start artifacts. [default: inferguard_bench_cold_start] │
│ --capture-seconds FLOAT Cold-start capture window from process spawn/readiness. [default: 60.0] │
│ --context-lengths TEXT Synthetic context lengths when --trace-dir is omitted. [default: 1024] │
│ --concurrency TEXT Comma-separated concurrency levels. [default: 1] │
│ --output-tokens INTEGER Max streamed output tokens per request. [default: 64] │
│ --metrics-url TEXT Optional engine metrics URL to scrape during cold start. │
│ --metrics-interval FLOAT Seconds between engine metrics scrapes. [default: 5.0] │
│ --metrics-engine TEXT Engine hint for metrics detection: auto, vllm, sglang, dynamo, llm-d. [default: auto] │
│ --force Allow writing into a non-empty output directory. │
│ --json Print summary JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard profile
Usage: inferguard profile [OPTIONS] COMMAND [ARGS]...
Live endpoint profiler for existing /metrics traffic.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ live Observe an existing endpoint without generating traffic. │
│ retro Summarize an existing profile/timeline JSONL file. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard profile live
Usage: inferguard profile live [OPTIONS]
Observe an existing endpoint without generating traffic.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --endpoint TEXT Serving endpoint base URL or /metrics URL to observe. [required] │
│ --duration FLOAT Sampling window in seconds. [default: 60.0] │
│ --interval FLOAT Seconds between /metrics scrapes. [default: 2.0] │
│ --engine TEXT Engine hint: auto, vllm, sglang, dynamo, lmcache, llm-d. [default: auto] │
│ --output-dir PATH Directory for profile.jsonl/profile_summary.json/profile.md. [default: inferguard_profile_live] │
│ --format TEXT Streaming output format: table or json. [default: table] │
│ --timeout FLOAT HTTP timeout per metrics scrape (seconds). [default: 5.0] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
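Operator note: `--endpoint` accepts either a base URL or a full `/metrics` URL, and `--duration` with `--interval` bounds how many samples a run can collect. The sketch below shows one way to normalize both URL forms and estimate the sample count. It assumes the first scrape happens at the start of the window and that scrapes repeat every interval until the window closes; the actual normalization and timing inside InferGuard are not documented here, and the function names are hypothetical.

```python
import math
from urllib.parse import urlsplit, urlunsplit


def metrics_url(endpoint: str) -> str:
    """Return a /metrics URL whether given a base URL or a /metrics URL."""
    parts = urlsplit(endpoint)
    path = parts.path.rstrip("/")
    if not path.endswith("/metrics"):
        path = path + "/metrics"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def expected_scrapes(duration: float = 60.0, interval: float = 2.0) -> int:
    """Upper bound on samples: one at t=0, then one per interval while t < duration."""
    if duration <= 0 or interval <= 0:
        raise ValueError("duration and interval must be positive")
    return math.ceil(duration / interval)


print(metrics_url("http://localhost:8000"))
print(metrics_url("http://localhost:8000/metrics/"))
print(expected_scrapes())
```

Slow scrapes reduce the real count: with the default 5 s `--timeout` exceeding the 2 s `--interval`, a struggling endpoint can yield far fewer samples than this bound suggests.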
inferguard profile retro
Usage: inferguard profile retro [OPTIONS] INPUT_PATH
Summarize an existing profile/timeline JSONL file.
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ * input_path PATH Existing profile.jsonl or metrics timeline JSONL file. [required] │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --output-dir PATH Directory for profile_summary.json/profile.md. [default: inferguard_profile_retro] │
│ --json Print summary JSON to stdout. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
inferguard agent
Usage: inferguard agent [OPTIONS] COMMAND [ARGS]...
Agent trace harness commands.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ trace  Wrap a subprocess and emit a local ``agent-trace/v1`` JSONL file.                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard agent trace
Usage: inferguard agent trace [OPTIONS]
Wrap a subprocess and emit a local ``agent-trace/v1`` JSONL file.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --framework     TEXT               Agent framework: langgraph, crewai, autogen, claude_code,     │
│                                    cursor_sdk, raw_openai.                                       │
│                                    [default: raw_openai]                                         │
│ --output-dir    PATH               Directory for agent-trace/v1 JSONL output.                    │
│                                    [default: inferguard_agent_trace]                             │
│ --save-prompts  --no-save-prompts  Write prompt text to prompts-local.jsonl for local            │
│                                    debugging only.                                               │
│                                    [default: no-save-prompts]                                    │
│ --rig-label     TEXT               Optional rig label: h100, h200, b200, gb200, auto.            │
│ --help                             Show this message and exit.                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard daemon
Usage: inferguard daemon [OPTIONS] COMMAND [ARGS]...
Local harness daemon sidecar.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ start   Start the foreground harness daemon sidecar.                                             │
│ stop    Stop the recorded foreground daemon process when possible.                               │
│ status  Print daemon state and a one-shot local snapshot.                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard daemon start
Usage: inferguard daemon start [OPTIONS]
Start the foreground harness daemon sidecar.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --port           INTEGER          Loopback Prometheus metrics port. [default: 9466]              │
│ --host           TEXT             Metrics bind host; cluster leaders default to 0.0.0.0.         │
│ --watch-dir      PATH             Directory containing agent-trace/v1 JSONL files.               │
│ --prometheus     --no-prometheus  Expose loopback /metrics endpoint. [default: prometheus]       │
│ --leader                          Run as a cluster fan-in leader and merge follower ranks.       │
│ --follower       TEXT             Run as a cluster follower and POST snapshots to LEADER_URL.    │
│ --cluster-token  PATH             Path to operator-generated cluster bearer token.               │
│ --help                            Show this message and exit.                                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard daemon stop
Usage: inferguard daemon stop [OPTIONS]
Stop the recorded foreground daemon process when possible.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --port        INTEGER          Expected daemon port. [default: 9466]                             │
│ --watch-dir   PATH             Expected watch directory.                                         │
│ --prometheus  --no-prometheus  Expected Prometheus state. [default: prometheus]                  │
│ --help                         Show this message and exit.                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard daemon status
Usage: inferguard daemon status [OPTIONS]
Print daemon state and a one-shot local snapshot.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --port        INTEGER          Daemon port to report. [default: 9466]                            │
│ --watch-dir   PATH             Optionally load trace files before reporting status.              │
│ --prometheus  --no-prometheus  Prometheus endpoint expectation. [default: prometheus]            │
│ --help                         Show this message and exit.                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
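Operator note: one possible local lifecycle, sketched with only the flags listed above. It assumes traces were written to `inferguard_agent_trace` (the documented `agent trace` default) and that `start`, being a foreground process, is backgrounded with `&` or run in a second terminal. The `run` helper, `DRY_RUN`, and the one-second wait are illustrative assumptions, not inferguard behavior.

```shell
# Illustrative helper (not part of inferguard): print each command and
# execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

PORT=9466                          # documented default port
WATCH_DIR=inferguard_agent_trace   # documented default output dir of `agent trace`

# `start` stays in the foreground, so background it here.
run inferguard daemon start --port "$PORT" --watch-dir "$WATCH_DIR" &
sleep 1   # crude startup wait; an assumption, tune for your host

run inferguard daemon status --port "$PORT" --watch-dir "$WATCH_DIR"
run inferguard daemon stop --port "$PORT" --watch-dir "$WATCH_DIR"
```

Passing the same `--port` and `--watch-dir` to `stop` matches the "expected" values its help text describes.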
inferguard telemetry
Usage: inferguard telemetry [OPTIONS] COMMAND [ARGS]...
Local-only telemetry consent and payload audit commands.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ status          Show local telemetry state without contacting the network.                       │
│ enable          Enable local telemetry spooling after explicit consent.                          │
│ disable         Disable telemetry, delete the consent token, and clear local state.              │
│ log             Show recent local telemetry events and pending payload files.                    │
│ verify-payload  Render the exact local-only telemetry payload that would be uploaded.            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard telemetry status
Usage: inferguard telemetry status [OPTIONS]
Show local telemetry state without contacting the network.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard telemetry enable
Usage: inferguard telemetry enable [OPTIONS]
Enable local telemetry spooling after explicit consent.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --consent-token  TEXT  Consent token issued out-of-band by Touchdown. [required]              │
│    --help                 Show this message and exit.                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard telemetry disable
Usage: inferguard telemetry disable [OPTIONS]
Disable telemetry, delete the consent token, and clear local state.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard telemetry log
Usage: inferguard telemetry log [OPTIONS]
Show recent local telemetry events and pending payload files.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --limit  INTEGER  Maximum recent events to show. [default: 50]                                   │
│ --help            Show this message and exit.                                                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard telemetry verify-payload
Usage: inferguard telemetry verify-payload [OPTIONS] PATH
Render the exact local-only telemetry payload that would be uploaded.
╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
│ *  path  PATH  Payload-pending JSON file or directory. [required]                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
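Operator note: a sketch of an audit-first telemetry flow using the subcommands above. The token value and the payload path are hypothetical placeholders (the token is issued out-of-band, and this page does not state where pending payloads are spooled). The `run` helper and `DRY_RUN` are illustrative, not part of inferguard.

```shell
# Illustrative helper (not part of inferguard): print each command and
# execute it only when DRY_RUN=0. Note that it echoes every argument,
# including the consent token.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

# Hypothetical placeholders; substitute your real values.
CONSENT_TOKEN="${CONSENT_TOKEN:-REPLACE_WITH_TOKEN}"
PAYLOAD_PATH="${PAYLOAD_PATH:-path/to/payload-pending}"

run inferguard telemetry status
run inferguard telemetry enable --consent-token "$CONSENT_TOKEN"
run inferguard telemetry log --limit 10
run inferguard telemetry verify-payload "$PAYLOAD_PATH"
run inferguard telemetry disable
```

Running `verify-payload` before anything leaves the machine lets you inspect the payload that would be uploaded.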
inferguard workload
Usage: inferguard workload [OPTIONS] COMMAND [ARGS]...
Pre-flight workload fingerprinting.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ analyze  Generate a pre-flight workload fingerprint without launching benchmarks.                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard workload analyze
Usage: inferguard workload analyze [OPTIONS] LOG_DIR
Generate a pre-flight workload fingerprint without launching benchmarks.
╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
│ *  log_dir  PATH  Directory containing OpenAI-style JSONL logs. [required]                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --format               TEXT  Input format. Currently: openai-jsonl. [default: openai-jsonl]      │
│ --emit                 PATH  Write workload fingerprint JSON.                                    │
│ --emit-md              PATH  Write human-readable workload report markdown.                      │
│ --privacy-class        TEXT  public, private, or regulated. [default: public]                    │
│ --latency-sensitivity  TEXT  tight, loose, or batch. [default: loose]                            │
│ --json                       Print fingerprint JSON to stdout.                                   │
│ --help                       Show this message and exit.                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
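Operator note: a sketch of fingerprinting a private, latency-sensitive workload and writing both report formats. The log directory and output filenames are hypothetical, and the `run` helper and `DRY_RUN` are illustrative conveniences, not part of inferguard.

```shell
# Illustrative helper (not part of inferguard): print each command and
# execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

LOG_DIR=logs/openai   # hypothetical directory of OpenAI-style JSONL logs

run inferguard workload analyze "$LOG_DIR" \
  --format openai-jsonl \
  --privacy-class private \
  --latency-sensitivity tight \
  --emit workload_fingerprint.json \
  --emit-md workload_fingerprint.md
```

The JSON written by `--emit` is the kind of file that `router classify --workload-fingerprint` expects, according to that option's help text.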
inferguard router
Usage: inferguard router [OPTIONS] COMMAND [ARGS]...
Rule-based execution-path routing.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help  Show this message and exit.                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ classify  Classify bottlenecks and rank execution paths from run artifacts.                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
inferguard router classify
Usage: inferguard router classify [OPTIONS] RUN_DIR
Classify bottlenecks and rank execution paths from run artifacts.
╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
│ *  run_dir  PATH  Directory containing InferGuard or AgentX artifacts. [required]                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --workload-fingerprint  PATH  Fingerprint JSON from `inferguard workload analyze`.               │
│ --slo                   TEXT  Comma-separated SLOs, e.g. p95_ttft_ms=1000,error_rate_max=0.01.   │
│ --hardware-fleet        TEXT  Comma-separated hardware labels, e.g. h200,b200,gb200.             │
│ --emit                  PATH  Write router verdict JSON.                                         │
│ --emit-md               PATH  Write router verdict markdown.                                     │
│ --json                        Print verdict JSON to stdout.                                      │
│ --help                        Show this message and exit.                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
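Operator note: a sketch of a classification run that reuses the SLO and hardware examples from the option help. The run directory and file names are hypothetical; the fingerprint file is assumed to come from an earlier `inferguard workload analyze --emit`. The `run` helper and `DRY_RUN` are illustrative, not part of inferguard.

```shell
# Illustrative helper (not part of inferguard): print each command and
# execute it only when DRY_RUN=0.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" = "1" ] || "$@"; }

RUN_DIR=runs/example                    # hypothetical artifact directory
FINGERPRINT=workload_fingerprint.json   # hypothetical `workload analyze --emit` output

run inferguard router classify "$RUN_DIR" \
  --workload-fingerprint "$FINGERPRINT" \
  --slo p95_ttft_ms=1000,error_rate_max=0.01 \
  --hardware-fleet h200,b200,gb200 \
  --emit router_verdict.json \
  --emit-md router_verdict.md
```

Both `--slo` and `--hardware-fleet` take comma-separated values with no spaces, as in the help text's examples.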