
Admin runtime controls

Rolling-window metrics snapshot

Machine-readable metrics snapshot — same data the admin dashboard’s HTMX fragment renders.

Metrics are categorised per capability (asr / llm / tts) and computed over a rolling time window (default 60 minutes). Each snapshot includes p50/p95/p99 latency, error rates, GPU-slot wait distribution, and a tail of the last 50 samples per category for spot-checks. The full schema is the MetricsSnapshot Pydantic model in aistack.api._schemas.

In-process samples are lost on restart; for cross-restart trend analysis, use the JSONL access log under AISTACK_OBS_LOG_DIR.

Successful response.

POST /admin/api/observability/clear-payload


Delete every captured request/response payload

Wipe every payload-capture file under the configured payload directory.

Useful between bench runs or when the on-disk usage approaches the configured AISTACK_OBS_PAYLOAD_MAX_GB cap and you want to reclaim immediately rather than wait for the size sweep. Returns the re-rendered observability HTMX fragment.
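
As a rough sketch, a bench script could trigger this wipe with the standard library alone. The base URL below is hypothetical, the empty request body is an assumption (no parameters are listed for this endpoint), and any authentication the admin API requires is not shown.

```python
import urllib.request

# Hypothetical base URL; substitute wherever the admin API is served.
BASE_URL = "http://localhost:8000"


def build_clear_payload_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST that wipes captured payloads."""
    url = base_url.rstrip("/") + "/admin/api/observability/clear-payload"
    # Assumption: the endpoint takes no parameters, so the body is empty.
    return urllib.request.Request(url, data=b"", method="POST")


def clear_payloads(base_url: str = BASE_URL) -> str:
    """Send the request and return the re-rendered HTMX fragment as text."""
    with urllib.request.urlopen(build_clear_payload_request(base_url)) as resp:
        return resp.read().decode("utf-8")
```

Calling `clear_payloads()` between bench runs returns the fragment markup; a script that only cares about success can ignore the body and rely on `urlopen` raising on an HTTP error status.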

Flip an observability toggle (live)

Flip one observability toggle in process memory.

The change is live-only: a restart restores the env-var defaults (AISTACK_OBS_METRICS_ENABLED, AISTACK_OBS_ACCESS_LOG_ENABLED, AISTACK_OBS_PAYLOAD_ENABLED). Returns the re-rendered observability HTMX fragment so the admin dashboard swaps it in place without a separate fetch.

Content type: multipart/form-data

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | yes | Which toggle to flip. One of ‘metrics’ / ‘access_log’ / ‘payload’. |
| value | string | yes | Truthy (‘1’, ‘on’, ‘true’, ‘yes’, ‘y’) turns it on; anything else turns it off. |
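
The truthiness rule for `value` can be illustrated with a short sketch. This is not the server's actual implementation: `parse_toggle` is a hypothetical helper, and trimming whitespace and ignoring case are assumptions that the field description does not confirm.

```python
# Values documented as truthy; everything else turns the toggle off.
TRUTHY = frozenset({"1", "on", "true", "yes", "y"})
# Toggle names documented for the `key` field.
VALID_KEYS = frozenset({"metrics", "access_log", "payload"})


def parse_toggle(key: str, value: str) -> tuple[str, bool]:
    """Validate the toggle name and map the form value to a boolean."""
    if key not in VALID_KEYS:
        raise ValueError(f"unknown toggle: {key!r}")
    # Assumption: surrounding whitespace and letter case are ignored.
    return key, value.strip().lower() in TRUTHY
```

Under this reading, `value=off`, `value=0`, and an empty string all switch the toggle off, since none of them appears in the truthy set.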

Drop loaded ASR weights from cache

Drop every ASR weight currently resident in the model cache.

Useful between bench runs to free VRAM without restarting uvicorn. In-flight requests keep their own reference to the loaded model and finish normally; only the cache slot is released, so the next call triggers a cold load.

TTS and LLM cache entries are untouched.

Returns the re-rendered cache fragment when called from HTMX (so the admin UI swaps it in place), or JSON {evicted, remaining} otherwise (so scripts and bench runners can read the count directly).
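
The two response modes might be handled along these lines. HTMX marks the requests it issues with an `HX-Request: true` header; that the server keys off this header is an assumption, as is treating `evicted` and `remaining` as integer counts. Both helper names are hypothetical.

```python
import json
from typing import Mapping


def wants_fragment(headers: Mapping[str, str]) -> bool:
    """Return True when the request looks like it came from HTMX."""
    # Header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "hx-request":
            return value.strip().lower() == "true"
    return False


def read_eviction_result(body: str) -> tuple[int, int]:
    """Extract (evicted, remaining) from the JSON form of the response."""
    data = json.loads(body)
    # Assumption: both fields are integer counts.
    return data["evicted"], data["remaining"]
```

A bench runner that omits the `HX-Request` header would get the JSON form and could assert, for example, that `remaining` is zero before starting a cold-load measurement.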


Response shape for GET /admin/api/metrics.

Built by aistack.observability.metrics.snapshot(). The shape is stable across /v1: adding new categories or new top-level keys is allowed, but renaming or removing them requires a version bump.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| uptime_sec | number | yes | Process uptime in seconds. |
| window_sec | integer | yes | Rolling window duration the percentiles are computed over. |
| categories | object (string → MetricsCategorySnapshot) | yes | Per-capability metrics. Keys: ‘asr’, ‘llm’, ‘tts’ (only those that received traffic since startup). |
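
A consumer of this shape should expect missing categories and tolerate new ones, given the stability rule above. The sketch below reads only the three top-level fields listed in the table; `describe_snapshot` is a hypothetical helper, the sample payload is made up, and the per-category objects are left empty because the fields of MetricsCategorySnapshot are not shown here.

```python
from typing import Any, Mapping

# Category keys documented for the snapshot.
KNOWN_CATEGORIES = ("asr", "llm", "tts")


def describe_snapshot(snapshot: Mapping[str, Any]) -> dict[str, Any]:
    """Summarise the top-level fields of a metrics snapshot."""
    categories = snapshot["categories"]
    return {
        "uptime_min": snapshot["uptime_sec"] / 60,
        "window_min": snapshot["window_sec"] / 60,
        # Categories appear only once they have received traffic.
        "active": [c for c in KNOWN_CATEGORIES if c in categories],
        "idle": [c for c in KNOWN_CATEGORIES if c not in categories],
        # New categories may be added within /v1, so collect rather than reject.
        "unknown": sorted(k for k in categories if k not in KNOWN_CATEGORIES),
    }


# Made-up sample payload, for illustration only.
sample = {
    "uptime_sec": 5400.0,
    "window_sec": 3600,
    "categories": {"asr": {}, "llm": {}},
}
summary = describe_snapshot(sample)
```

Here `summary["idle"]` lists `tts`, which in a real deployment would mean no TTS traffic has arrived since startup rather than that the capability is unavailable.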