Baseline decode demo for the frozen LLM runtime. Run this after training via `run10.sh`, `speedrun.sh`, or `run1000.sh` so the `~/.cache/nanochat` directory contains checkpoints. Refer to Training & Operations if you need to re-create the environment or rerun the training scripts.
- Command (CLI): `python -m scripts.chat_cli -p "Why is the sky blue?"`
- Command (web UI): `python -m scripts.chat_web`
- Batch eval: `python -m scripts.chat_eval -- -i sft`
- Prompt source: CLI accepts inline prompts via `-p` or reads from stdin interactively; eval pulls standard datasets baked into nanochat.
- Telemetry: CLI prints responses and logs structured events through nanochat's report generator + WANDB instrumentation.
- Weights & Biases: export `WANDB_RUN=<name>` or `MEGACONTEXT_ENABLE_WANDB=1` before training so later CLI/eval runs attach to the same project.
- Maintenance: refresh this note when runtime flags, configs, or expected outputs change.
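The checkpoint-directory convention above can be sketched as a small launcher. This is a minimal illustration, not part of nanochat: `resolve_base_dir` and `launch_chat` are hypothetical helper names, and the only facts taken from these notes are the `~/.cache/nanochat` default, the `NANOCHAT_BASE_DIR` override, and the `scripts.chat_cli -p` invocation.

```python
import os
import subprocess
from pathlib import Path


def resolve_base_dir(env: dict) -> Path:
    """Honor NANOCHAT_BASE_DIR if set, else fall back to ~/.cache/nanochat."""
    override = env.get("NANOCHAT_BASE_DIR")
    return Path(override) if override else Path.home() / ".cache" / "nanochat"


def launch_chat(prompt: str) -> None:
    """Launch the chat CLI only if base checkpoints are actually present."""
    base = resolve_base_dir(os.environ)
    if not (base / "base").exists():
        raise SystemExit(f"No checkpoints under {base}/base -- re-run run10.sh first")
    subprocess.run(["python", "-m", "scripts.chat_cli", "-p", prompt], check=True)
```

Guarding the launch this way surfaces the "cannot find checkpoints" failure mode described under Troubleshooting before the CLI starts loading.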
Expected Output
- CLI prints the continuation to stdout (prefixed with `>>>`). Example: `>>> Why is the sky blue? MegaContext: Rayleigh scattering ... [generated tokens]`
- `report/report.md` is refreshed with the latest chat samples plus per-phase metrics (Base Loss, Mid Eval, Chat Eval, Samples).
- If WANDB is enabled, a run named `<WANDB_RUN>-chat` (or similar) appears with latency + allocator charts.
- The CLI prints the checkpoint + tokenizer it loaded (e.g., `Loaded base checkpoint: ~/.cache/nanochat/base/...`). Capture this in hand-off notes so other contributors can reproduce.
If any of these artifacts are missing, see the troubleshooting section below.
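A quick programmatic check for the report artifact can save a manual scan. The sketch below is an assumption, not nanochat code: `missing_sections` is a hypothetical helper, and the only facts taken from these notes are the `report/report.md` path and the four per-phase section names.

```python
from pathlib import Path

# Per-phase sections these notes say report/report.md should contain.
EXPECTED_SECTIONS = ["Base Loss", "Mid Eval", "Chat Eval", "Samples"]


def missing_sections(report_path: Path) -> list:
    """Return the expected section names absent from the report file."""
    if not report_path.exists():
        return list(EXPECTED_SECTIONS)
    text = report_path.read_text()
    return [s for s in EXPECTED_SECTIONS if s not in text]
```

An empty return means the report was refreshed with all expected phases; a non-empty one points at the earlier-stage failures discussed under Troubleshooting.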
Sample log excerpt
```json
{
  "prompt_tokens": 512,
  "working_budget": 2048,
  "swap_rate": 0.12,
  "residency": 0.95,
  "latency_ms": 153.7,
  "notes": "JT cycle 42, configs/SampleText_TinyGPT2.yaml"
}
```

Troubleshooting
- `scripts.chat_cli` cannot find checkpoints: confirm `~/.cache/nanochat/base` (or the `NANOCHAT_BASE_DIR` override) contains recent artifacts. Use `ls $NANOCHAT_BASE_DIR/*` to verify. Re-run `run10.sh` if empty.
- WANDB authentication errors: set `WANDB_API_KEY` before launching the CLI/eval scripts or specify `WANDB_MODE=offline`. Runs fall back to DummyWandb if the key is missing.
- Model/tokenizer mismatch: ensure CLI output shows the same tokenizer hash that was trained in the run (`tok_train_step=...`). If not, rebuild via `python -m scripts.tok_train` and rerun `scripts.tok_eval`.
- Latent mismatches across runs: confirm the CLI uses the same `--max_seq_len` and `--device_batch_size` defaults as training. Override via flags if needed.
- Unstable generations / empty output: inspect `report/report.md` for swap spikes; retrain LensNet/mid stages if residency drops below 80%. Also re-run `python -m scripts.chat_eval -- -i sft` to ensure eval metrics are healthy.
- No `report/report.md` update: rerun `python -m nanochat.report generate` after the CLI session. Missing sections often indicate earlier training stage failures; check logs under `$NANOCHAT_BASE_DIR/report`.
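The residency check above can be automated against the structured log records. This is a minimal sketch under stated assumptions: `telemetry_warnings` is a hypothetical helper, the field names come from the sample log excerpt, the 80% residency floor comes from the troubleshooting note, and the swap-rate threshold is purely illustrative.

```python
import json


def telemetry_warnings(record: dict) -> list:
    """Flag telemetry records that suggest retraining or swap spikes."""
    warnings = []
    residency = record.get("residency", 1.0)
    if residency < 0.80:  # floor stated in the troubleshooting notes
        warnings.append(
            f"residency {residency:.2f} below 0.80 -- consider retraining LensNet/mid stages"
        )
    swap_rate = record.get("swap_rate", 0.0)
    if swap_rate > 0.50:  # illustrative threshold, not from the notes
        warnings.append(f"swap spike: {swap_rate:.2f}")
    return warnings


sample = json.loads(
    '{"prompt_tokens": 512, "working_budget": 2048, "swap_rate": 0.12,'
    ' "residency": 0.95, "latency_ms": 153.7}'
)
print(telemetry_warnings(sample))  # -> [] (sample is healthy)
```

Running this over recent log records gives a quick pass/fail signal before digging into `report/report.md` by hand.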
Escalate persistent runtime issues by attaching the failing log + config to the relevant PRD or Migration Status entry.
Notes
- Training & Operations outlines shared logging conventions and acceptance criteria for runtime demos, tied to MegaContext End-to-End Training checkpoints produced by the nanochat scripts.
- Hook the nanochat chat/eval commands into MegaPrediction Training’s gist-first inference path once the shared readout head lands.