MegaContext System Invariants
This document defines the fundamental invariants that must hold throughout MegaContext’s operation. These constraints ensure correctness and predictability, and they preserve the substitutability property that allows gists to replace tokens seamlessly.
Core Invariants
1. Budget Invariant
Definition:
sum(entry_costs in Working Context) ≤ W_max
What it means:
- The total token-equivalent cost of all entries in the Working Context must never exceed the budget W_max
- LOD0 blocks cost 32 tokens each
- LOD1 and LOD2 gists cost 1 token each
Why it matters:
- Ensures constant GPU memory usage regardless of total history size
- Prevents out-of-memory errors during inference
- Makes performance predictable and scalable
Enforcement:
- Focus Allocator checks budget before every expand operation
- Every expand must be balanced by corresponding collapses
- System rejects operations that would violate the budget
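The pre-expand check described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Entry` dataclass, `total_cost`, and `can_expand` are hypothetical names, and the sketch only covers expanding an LOD1 gist into its LOD0 block.

```python
from dataclasses import dataclass

K = 32        # tokens per LOD0 block (from the cost rules above)
W_MAX = 8192  # example budget; the real value is a deployment choice


@dataclass(frozen=True)
class Entry:
    """Hypothetical Working Context entry, not the project's real class."""
    level: int  # 0 = raw LOD0 block, 1 or 2 = gist
    start: int  # first token covered (inclusive)
    end: int    # one past the last token covered (exclusive)

    @property
    def cost(self) -> int:
        # LOD0 blocks cost K tokens; LOD1 and LOD2 gists cost 1 token each.
        return K if self.level == 0 else 1


def total_cost(entries: list[Entry]) -> int:
    """Token-equivalent cost of the whole Working Context."""
    return sum(e.cost for e in entries)


def can_expand(entries: list[Entry], index: int, w_max: int = W_MAX) -> bool:
    """Return True if expanding the LOD1 gist at `index` into its LOD0
    block would keep the Working Context within budget."""
    entry = entries[index]
    if entry.level != 1:
        raise ValueError("this sketch only handles LOD1 -> LOD0 expansion")
    new_cost = total_cost(entries) - entry.cost + K
    return new_cost <= w_max
```

When `can_expand` returns False, an allocator following the rules above would first have to collapse other entries to free budget, or reject the expand.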
Example:
# Valid working context
entry_costs = [
    32,  # LOD0 block
    32,  # LOD0 block
    1,   # LOD1 gist
    1,   # LOD2 gist
]
total_cost = sum(entry_costs)  # 32 + 32 + 1 + 1 = 66 tokens
W_max = 8192
assert total_cost <= W_max  # ✓ Valid
2. Contiguity Invariant
Definition:
entry[i].end_token == entry[i+1].start_token (for all consecutive entries)
What it means:
- Entries in the Working Context must tile the timeline without gaps or overlaps
- Every token position in the covered range appears in exactly one entry
- Entries appear in chronological order (older → newer)
Why it matters:
- Ensures the base model sees a coherent narrative flow
- Maintains RoPE positional encoding consistency
- Prevents discontinuous jumps that would confuse the model
- Allows the model to understand temporal relationships
Enforcement:
- Focus Allocator only performs block-aligned swaps
- Expand/collapse operations preserve temporal adjacency
- Tree assembly checks for contiguity before materializing Working Context
Visual Example:
✓ Valid (contiguous):
[LOD0: 0-32] [LOD1: 32-64] [LOD0: 64-96] [LOD2: 96-1120]
└───────┘└────────┘└────────┘└──────────┘
No gaps, perfect adjacency
✗ Invalid (gap):
[LOD0: 0-32] [LOD1: 32-64] [LOD0: 96-128]
└───────┘└────────┘ └────────┘
↑ GAP (tokens 64-96 missing)
✗ Invalid (overlap):
[LOD0: 0-32] [LOD1: 28-60] [LOD0: 60-92]
└───────┘└────────┘└────────┘
↑ OVERLAP (tokens 28-32 appear twice)
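The three cases drawn above can be told apart mechanically. The sketch below is illustrative only: `find_contiguity_violations` is a hypothetical helper, and it represents each entry as a half-open `(start, end)` tuple.

```python
def find_contiguity_violations(spans):
    """Classify each boundary between consecutive (start, end) spans.

    `spans` is a list of half-open (start, end) token ranges in Working
    Context order. Returns a list of (index, kind, tokens) tuples, where
    `kind` is "gap" or "overlap" and `tokens` is the affected range.
    """
    violations = []
    for i in range(len(spans) - 1):
        prev_end = spans[i][1]
        next_start = spans[i + 1][0]
        if next_start > prev_end:
            violations.append((i, "gap", (prev_end, next_start)))
        elif next_start < prev_end:
            violations.append((i, "overlap", (next_start, prev_end)))
    return violations


valid = [(0, 32), (32, 64), (64, 96), (96, 1120)]
gapped = [(0, 32), (32, 64), (96, 128)]
overlapping = [(0, 32), (28, 60), (60, 92)]

print(find_contiguity_violations(valid))        # []
print(find_contiguity_violations(gapped))       # [(1, 'gap', (64, 96))]
print(find_contiguity_violations(overlapping))  # [(0, 'overlap', (28, 32))]
```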
3. Block Alignment Invariant
Definition:
entry.start_token % K == 0
entry.end_token % K == 0
What it means:
- All entry boundaries must align with K-token block boundaries
- No entry can start or end mid-block (e.g., at token 17 or token 50)
- LOD0 blocks are exactly 32 tokens
- LOD1 gists represent exactly 32 tokens (one LOD0 block)
- LOD2 gists represent exactly 1,024 tokens (32 LOD0 blocks)
Why it matters:
- Matches GistNet compression granularity (32→1)
- Enables clean expand/collapse operations without partial blocks
- Simplifies MegaContext Tree storage (deterministic offsets)
- Ensures gists always represent complete, meaningful units
Enforcement:
- MegaContext Tree ingests tokens in 32-token batches
- Focus Allocator only swaps complete blocks
- GistNet only generates gists for full 32-token spans
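One way to guarantee that only complete blocks reach the tree is to buffer incoming tokens until a full block is available. The sketch below assumes a hypothetical `BlockBuffer` class; how MegaContext actually stages partial blocks is not specified in this document.

```python
K = 32  # block size from the invariant above


class BlockBuffer:
    """Hypothetical ingest buffer that emits only full K-token blocks."""

    def __init__(self, k: int = K):
        self.k = k
        self.pending: list[int] = []  # tokens not yet forming a full block
        self.next_start = 0           # start offset of the next block

    def ingest(self, tokens: list[int]) -> list[tuple[int, int, list[int]]]:
        """Add tokens and return completed blocks as (start, end, tokens)."""
        self.pending.extend(tokens)
        blocks = []
        while len(self.pending) >= self.k:
            block = self.pending[:self.k]
            self.pending = self.pending[self.k:]
            start = self.next_start
            blocks.append((start, start + self.k, block))
            self.next_start += self.k
        return blocks
```

Because every emitted block starts at a multiple of `K` and spans exactly `K` tokens, both alignment conditions hold by construction. Leftover tokens stay in `pending` until later input completes the block.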
Example:
# Valid entries
L0_block_1 = Entry(start=0, end=32) # ✓ Aligned (0 % 32 == 0, 32 % 32 == 0)
L0_block_2 = Entry(start=32, end=64) # ✓ Aligned
L1_gist = Entry(start=64, end=96) # ✓ Aligned (represents 32 tokens)
L2_gist = Entry(start=96, end=1120) # ✓ Aligned (represents 1024 tokens)
# Invalid entries
bad_entry = Entry(start=17, end=49) # ✗ Misaligned (17 % 32 != 0)
4. Level Consistency Invariant
Definition:
For an entry at level ∈ {0, 1, 2} covering span [s, e): the span size (e - s) must be the legal size for that level
What it means:
- LOD0 entries must cover exactly 32 tokens
- LOD1 entries must cover exactly 32 tokens (compressing one LOD0 block)
- LOD2 entries must cover exactly 1,024 tokens (compressing 32 LOD0 blocks)
- Cannot have an LOD1 gist representing 64 tokens or an LOD2 gist representing 512 tokens
Why it matters:
- Ensures gists accurately represent their source spans
- Maintains the hierarchical 32-ary tree structure
- Allows Focus Allocator to predict the effect of expand/collapse operations
- Simplifies MegaContext Tree navigation
Enforcement:
- GistNet only produces gists from 32-child spans
- MegaContext Tree stores gists at fixed levels corresponding to span size
- Focus Allocator checks level legality before operations
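The level-legality check can be expressed as a lookup from level to required span length. This is a sketch under the span sizes stated above; `LEGAL_SPAN` and `is_level_consistent` are hypothetical names.

```python
K = 32  # block size and tree fan-out from the invariants above

# Required span length in tokens for each level, per the rules listed above.
LEGAL_SPAN = {
    0: K,      # LOD0: one raw block of 32 tokens
    1: K,      # LOD1: gist of one 32-token LOD0 block
    2: K * K,  # LOD2: gist of 32 LOD0 blocks = 1,024 tokens
}


def is_level_consistent(level: int, start: int, end: int) -> bool:
    """Return True if the half-open span [start, end) is legal for `level`."""
    expected = LEGAL_SPAN.get(level)
    return expected is not None and (end - start) == expected
```

Unknown levels are treated as illegal rather than raising, so the function can be used directly as a filter or inside an assertion.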
Example:
# Valid level assignments
entry_1 = Entry(level=0, start=0, end=32) # ✓ LOD0 covers 32 tokens
entry_2 = Entry(level=1, start=32, end=64) # ✓ LOD1 covers 32 tokens
entry_3 = Entry(level=2, start=64, end=1088) # ✓ LOD2 covers 1024 tokens
# Invalid level assignments
bad_1 = Entry(level=1, start=0, end=64) # ✗ LOD1 can't cover 64 tokens
bad_2 = Entry(level=2, start=0, end=512) # ✗ LOD2 can't cover 512 tokens
5. RoPE Invariant
Definition:
For gists: position_id = start_token + (end_token - start_token) / 2 (equals start_token + K / 2 for a K-token LOD1 span)
For LOD0 blocks: position_ids = [start_token, start_token+1, ..., end_token-1]
What it means:
- Gists are positioned at the central token index of their span for RoPE
- LOD0 tokens use their actual sequential positions
- This ensures RoPE phase information remains consistent when swapping LODs
Why it matters:
- RoPE encodes positional information as sinusoidal phase rotations
- Misaligned positions would break relative position relationships
- Using central position for gists minimizes phase error
- Preserves the base model’s ability to attend correctly across different LODs
Enforcement:
- Working Context assembly applies RoPE position IDs during materialization
- Gists inherit the central position of their span from the MegaContext Tree
- Focus Allocator preserves absolute token positions during swaps
Example:
# LOD1 gist representing tokens [64, 96)
gist_position = 64 + 32 // 2  # = 80, the central position
# LOD0 block representing tokens [64, 96)
token_positions = list(range(64, 96))  # [64, 65, ..., 95], sequential positions
# When swapping LOD1→LOD0 or LOD0→LOD1 for span [64, 96):
# - RoPE still sees positions centered around 80
# - Relative distances to other spans remain consistent
Why Central Position?
- Minimizes maximum distance from gist to any token it represents
- For a 32-token span [0, 32), central position 16 is at most 16 tokens from any original token
- Edge positions (0 or 31) would be up to 31 tokens away, increasing RoPE phase error
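The distance argument above can be checked numerically. The helpers below are hypothetical and only illustrate the arithmetic. Computing the center from the span length, rather than from K, is an assumption made so the same function also covers 1,024-token LOD2 spans; for a 32-token span it gives start + 16.

```python
def gist_position(start: int, end: int) -> int:
    """Central token index of the half-open span [start, end)."""
    return start + (end - start) // 2


def lod0_positions(start: int, end: int) -> list[int]:
    """Sequential position IDs for the raw tokens in [start, end)."""
    return list(range(start, end))


def max_distance(position: int, start: int, end: int) -> int:
    """Largest distance from `position` to any token index in [start, end)."""
    return max(position - start, (end - 1) - position)


print(gist_position(64, 96))                      # 80
print(max_distance(gist_position(0, 32), 0, 32))  # 16 (central position)
print(max_distance(0, 0, 32))                     # 31 (left edge)
print(max_distance(31, 0, 32))                    # 31 (right edge)
```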
Derivative Invariants
These invariants follow from the core invariants above:
6. Tree-Context Consistency
Definition: Every entry in the Working Context corresponds to a node in the MegaContext Tree at the appropriate level.
Implications:
- Cannot have an LOD0 entry for a span that doesn’t exist in LOD0.ctx
- Cannot have an LOD1 gist that wasn’t generated by GistNet
- Expand operations require children to exist in the tree
- Collapse operations require parent gist to exist in the tree
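A rough sketch of these implications, assuming the tree can be queried as a set of `(level, start, end)` tuples. That representation and both function names are hypothetical stand-ins for whatever lookup the MegaContext Tree actually provides.

```python
K = 32  # fan-out from the invariants above


def check_tree_consistency(working_context, tree_nodes):
    """Raise if any Working Context entry has no matching tree node.

    Both arguments hold (level, start, end) tuples; `tree_nodes` is a set
    standing in for a lookup into the MegaContext Tree.
    """
    for entry in working_context:
        if entry not in tree_nodes:
            raise LookupError(f"entry {entry} has no node in the tree")


def can_expand_lod2(entry, tree_nodes):
    """An LOD2 gist can only be expanded if all K LOD1 children exist."""
    level, start, end = entry
    return level == 2 and all(
        (1, s, s + K) in tree_nodes for s in range(start, end, K)
    )
```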
7. Monotonic Timeline
Definition:
entry[i].start_token < entry[i+1].start_token (for all i)
Implications:
- Working Context entries always proceed forward in time
- No time-travel or out-of-order entries
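- Follows from Contiguity Invariant + Block Alignment Invariant: every entry spans at least one full K-token block, so entry[i+1].start_token = entry[i].end_token > entry[i].start_token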
8. No Partial Swaps
Definition: An expand or collapse must swap a parent gist and all 32 of its children atomically.
Implications:
- Cannot expand half of an LOD1 gist (e.g., only 16 of its 32 tokens)
- Cannot collapse just 10 of the 32 tokens in an LOD0 block into an incomplete LOD1 gist
- Follows from Block Alignment Invariant + Level Consistency Invariant
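An all-or-nothing swap can be sketched for the LOD2 ↔ LOD1 case, where one parent gist corresponds to exactly 32 child gists. Entries are `(level, start, end)` tuples and both functions are hypothetical; budget checks and tree lookups are deliberately left out.

```python
K = 32  # children per parent, from the invariants above


def expand_lod2(entries, index):
    """Return a new list with the LOD2 gist at `index` replaced by all K of
    its LOD1 children. The input list is never modified, so a failure
    leaves the caller's Working Context untouched."""
    level, start, end = entries[index]
    if level != 2 or end - start != K * K:
        raise ValueError("entry is not a full LOD2 gist")
    children = [(1, s, s + K) for s in range(start, end, K)]
    return entries[:index] + children + entries[index + 1:]


def collapse_to_lod2(entries, index):
    """Replace the K consecutive LOD1 gists starting at `index` with their
    LOD2 parent, or raise if the full set of siblings is not present."""
    siblings = entries[index:index + K]
    if len(siblings) != K:
        raise ValueError("need all K LOD1 siblings to collapse")
    start = siblings[0][1]
    expected = [(1, start + i * K, start + (i + 1) * K) for i in range(K)]
    if siblings != expected:
        raise ValueError("siblings must be contiguous LOD1 gists")
    return entries[:index] + [(2, start, start + K * K)] + entries[index + K:]
```

Building a new list instead of mutating in place is one simple way to get atomicity: the caller either receives a fully swapped context or an exception, never a half-applied state.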
Invariant Violations & Recovery
Detection
Budget Violation:
if sum(entry.cost for entry in working_context) > W_max:
    raise BudgetViolationError("Working context exceeds W_max")
Contiguity Violation:
for i in range(len(working_context) - 1):
    if working_context[i].end != working_context[i+1].start:
        raise ContiguityViolationError(f"Gap or overlap between entry {i} and {i+1}")
Block Alignment Violation:
for entry in working_context:
    if entry.start % K != 0 or entry.end % K != 0:
        raise AlignmentViolationError(f"Entry {entry} not K-aligned")
Recovery Strategies
If an invariant is violated:
- Rollback: Revert to previous valid Working Context state
- Recompute: Rebuild Working Context from MegaContext Tree
- Logging: Record violation for debugging and telemetry
- Graceful degradation: Fall back to simpler focus policy (e.g., recency-only)
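The rollback and logging strategies can be combined into a validate-then-commit wrapper. This is a sketch under stated assumptions: `InvariantViolation`, `apply_with_rollback`, and the logger name are hypothetical, and the recompute and graceful-degradation strategies are not shown.

```python
import logging

logger = logging.getLogger("megacontext.invariants")  # hypothetical name


class InvariantViolation(Exception):
    """Hypothetical base class for invariant violation errors."""


def apply_with_rollback(working_context, operation, validate):
    """Apply `operation` to a copy of the Working Context and keep the
    result only if `validate` accepts it; otherwise log the violation and
    return the previous state.

    `operation` maps a list of entries to a new list of entries.
    `validate` raises InvariantViolation when an invariant is broken.
    """
    candidate = operation(list(working_context))  # work on a copy
    try:
        validate(candidate)
    except InvariantViolation as exc:
        logger.warning("invariant violated, rolling back: %s", exc)
        return working_context  # rollback: previous valid state
    return candidate
```

Validating a candidate before committing it means the live Working Context never holds an invalid state, so rollback reduces to discarding the candidate.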
Testing Invariants
See tests/test_invariants.py for comprehensive invariant checks originally run during the POC milestone (still enforced under MegaContext End-to-End Training):
- test_budget_invariant() - Verifies budget never exceeds W_max
- test_contiguity_invariant() - Checks for gaps and overlaps
- test_block_alignment() - Validates K-alignment
- test_level_consistency() - Ensures LODs match span sizes
- test_rope_positions() - Validates RoPE position assignments
Summary
These invariants are the foundation of MegaContext’s correctness:
- Budget Invariant → Constant memory usage
- Contiguity Invariant → Coherent narrative flow
- Block Alignment Invariant → Clean LOD swaps
- Level Consistency Invariant → Hierarchical structure integrity
- RoPE Invariant → Position encoding consistency
By maintaining these invariants, MegaContext ensures that gists can seamlessly substitute for tokens without breaking the frozen base model’s expectations.
See Architecture Details for how these invariants relate to the overall system design, and Focus Allocator for how they guide operational decisions.