Tree Operations defines the complete operational lifecycle of the MegaContext Tree: how raw tokens flow into LOD0 blocks, how GistNet compresses them into hierarchical gists, how the tree structure grows incrementally, and how gists are refreshed when GistNet is retrained during alternating optimization.
- Scope: Token ingest buffering, LOD0/LOD1/LOD2 gist generation, tree index updates, gist refresh procedures
- Key invariant: Block alignment maintained throughout all operations—always 32-token boundaries
- Interfaces: Consumes tokens from input stream, invokes GistNet for compression, updates tree index
- Performance: Incremental O(log₃₂ N) updates per block; gist refresh is O(N) but rare
- POC constraints: Synchronous ingest, fixed GistNet checkpoint, no background workers
Details
Overview: Operational Flow
Tree Operations encompasses four primary procedures:
- Token Ingest: Buffering and writing raw tokens to LOD0 storage
- Gist Generation: Running GistNet when a complete group of 32 siblings forms at any level
- Tree Structure Updates: Maintaining parent/child pointers and metadata index
- Gist Refresh: Recomputing gists with new GistNet checkpoints during training
All operations maintain the block alignment invariants—node boundaries always fall on 32-token multiples, ensuring the Contiguity Invariant holds when the Focus Allocator swaps levels of detail.
Token Ingest Pipeline
Buffering Strategy
The MegaContext Tree grows in discrete 32-token blocks. To handle arbitrary-length input streams:
- Accumulation buffer: Maintain an `l0_buffer` that collects incoming tokens
- Block threshold: When the buffer reaches ≥ 32 tokens, extract a complete 32-token block
- Partial residue: Remaining tokens (< 32) stay buffered until the next ingest call
- No partial writes: Never write incomplete blocks; this preserves deterministic offsets
Example:
- Ingest 50 tokens → Write 32 to LOD0, buffer 18
- Ingest 20 more → Write 32 (18+14), buffer 6
- Ingest 100 more → Write 96 (3 complete blocks), buffer 10
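The worked example above can be reproduced with a minimal sketch of the buffering rule (a standalone helper for illustration, not the actual tree API):

```python
BLOCK = 32

def ingest(buffer, tokens):
    """Append tokens, pop complete 32-token blocks, keep the residue buffered."""
    buffer.extend(tokens)
    blocks = []
    while len(buffer) >= BLOCK:
        blocks.append(buffer[:BLOCK])
        del buffer[:BLOCK]
    return blocks

buf = []
assert len(ingest(buf, range(50))) == 1 and len(buf) == 18   # 50 → write 32, buffer 18
assert len(ingest(buf, range(20))) == 1 and len(buf) == 6    # 18+20 → write 32, buffer 6
assert len(ingest(buf, range(100))) == 3 and len(buf) == 10  # 6+100 → write 96, buffer 10
```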
LOD0 Block Storage
When a complete 32-token block is ready:
- Allocate span ID: Generate a unique `span_id` for this LOD0 node
- Compute offset: Determine the byte position in the `LOD0.ctx` file (`block_index × 32 × sizeof(token_id)`)
- Write payload: Append 32 token IDs to `LOD0.ctx` (or overwrite at the computed offset if preallocated)
- Create metadata: Add a node entry to the in-memory tree index with:
  - `level = 0`
  - `start_token`, `end_token` (absolute positions in the global sequence)
  - `parent_id` (computed from block index ÷ 32)
  - `data_offset` (byte position in file)
- Update sibling tracker: Mark this block as ready in the parent's child list
Storage layout: See Storage Format for binary encoding details.
Hierarchical Gist Generation
LOD1 Gist Creation
When 32 consecutive LOD0 blocks complete:
- Check parent full: Detect when an LOD1 parent has all 32 LOD0 children ready
- Load LOD0 embeddings: Retrieve token embeddings for the 32×32 = 1,024 token span
- Run GistNet: Invoke `gistnet.compress(l0_embeddings, level=1)` to produce a single LOD1 gist vector
- Write LOD1 gist: Append the gist embedding to the `LOD1.ctx` file
- Create LOD1 metadata: Add a node to the tree index with 32 `child_ids` pointing to the LOD0 children
- Version tracking: Record which GistNet checkpoint (`gist_version`) generated this gist
Key insight: GistNet operates on raw embeddings at each level—LOD0 embeddings for LOD1 gists, LOD1 gist embeddings for LOD2 gists.
LOD2 Gist Creation
When 32 consecutive LOD1 gists complete:
- Check LOD2 parent full: Detect when 32 LOD1 siblings are ready
- Load LOD1 gist embeddings: Retrieve the 32 LOD1 gist vectors (spanning 32 × 1,024 = 32,768 LOD0 tokens)
- Run GistNet: Invoke `gistnet.compress(l1_gists, level=2)` to produce a single LOD2 gist
- Write LOD2 gist: Append to the `LOD2.ctx` file
- Create LOD2 metadata: Add a node with 32 `child_ids` pointing to the LOD1 gists
- Recursive propagation: If 32 LOD2 gists complete, continue to LOD3 (future)
Compression hierarchy: Each LOD2 gist represents 32 LOD1 gists, each covering 32 LOD0 blocks, for 32 × 1,024 = 32,768 raw tokens.
Propagation Rules
After writing any level-L node:
```python
if node.siblings_complete():  # All 32 children of parent exist
    parent_gist = gistnet.compress(node.siblings(), level=node.level + 1)
    parent_node = tree.add_gist(parent_gist, level=node.level + 1)
    # Recursively check parent's parent
    propagate_upward(parent_node)
```

Lazy evaluation: Gist generation happens only when all 32 children are ready; no partial gists are ever written.
Ingest Pseudocode
Complete ingest procedure with recursive gist generation:
```python
def ingest_tokens(tree, tokens):
    """
    Incrementally ingest tokens into the MegaContext Tree.

    Args:
        tree: MegaContextTree instance
        tokens: List of new token IDs to append

    Returns:
        Number of complete LOD0 blocks written
    """
    tree.l0_buffer.extend(tokens)
    blocks_written = 0
    while len(tree.l0_buffer) >= 32:
        # Extract complete 32-token block
        block = tree.l0_buffer[:32]
        tree.l0_buffer = tree.l0_buffer[32:]
        # Write LOD0 block to tree
        l0_node = tree.add_l0_block(block)
        blocks_written += 1
        # Check if LOD1 parent is now complete
        if l0_node.parent_full():  # 32 LOD0 siblings ready
            l1_gist = gistnet.compress(l0_node.siblings())
            l1_node = tree.add_l1_gist(l1_gist)
            # Check if LOD2 grandparent is now complete
            if l1_node.parent_full():  # 32 LOD1 siblings ready
                l2_gist = gistnet.compress(l1_node.siblings())
                tree.add_l2_gist(l2_gist)
                # Continue recursively for LOD3+ (future)
    return blocks_written
```
```python
def add_l0_block(tree, block):
    """
    Append a complete 32-token LOD0 block to the tree.

    Args:
        block: List of exactly 32 token IDs

    Returns:
        LOD0 node metadata object
    """
    assert len(block) == 32, "LOD0 blocks must be exactly 32 tokens"
    # Allocate unique span ID from the monotonic counter
    span_id = tree.next_span_id
    tree.next_span_id += 1
    # Compute absolute token range
    start_token = tree.total_l0_tokens
    end_token = start_token + 32
    tree.total_l0_tokens += 32
    # Compute parent span ID (deterministic from index)
    block_index = start_token // 32
    parent_id = compute_l1_parent_id(block_index)
    # Write to storage
    offset = tree.l0_storage.append(block)  # Returns byte offset
    # Create metadata node
    node = TreeNode(
        span_id=span_id,
        level=0,
        start_token=start_token,
        end_token=end_token,
        parent_id=parent_id,
        child_ids=None,     # LOD0 nodes are leaves
        data_offset=offset,
        timestamp=now(),
        gist_version=None,  # LOD0 stores raw tokens, not gists
    )
    tree.nodes[span_id] = node
    tree.register_child(parent_id, span_id)
    return node
```
```python
def add_l1_gist(tree, gist_embedding):
    """
    Append an LOD1 gist node to the tree.

    Args:
        gist_embedding: GistNet output vector (d_model dimensions)

    Returns:
        LOD1 node metadata object
    """
    span_id = tree.next_span_id
    tree.next_span_id += 1
    # LOD1 gists represent 32 LOD0 blocks = 1,024 tokens
    l1_index = tree.total_l1_gists
    start_token = l1_index * 1024
    end_token = start_token + 1024
    tree.total_l1_gists += 1
    # Compute parent (LOD2) and children (32 LOD0 blocks)
    parent_id = compute_l2_parent_id(l1_index)
    child_ids = [tree.find_l0_child(start_token + i * 32) for i in range(32)]
    # Write gist embedding to LOD1 storage
    offset = tree.l1_storage.append(gist_embedding)
    # Create metadata
    node = TreeNode(
        span_id=span_id,
        level=1,
        start_token=start_token,
        end_token=end_token,
        parent_id=parent_id,
        child_ids=child_ids,
        data_offset=offset,
        timestamp=now(),
        gist_version=tree.gistnet_checkpoint_id,
    )
    tree.nodes[span_id] = node
    # Update parent pointers in children
    for child_id in child_ids:
        tree.nodes[child_id].parent_id = span_id
    tree.register_child(parent_id, span_id)
    return node
```

Notes:

- Similar logic applies to `add_l2_gist()` and higher levels
- `parent_full()` checks whether all 32 siblings exist in the tree index
- `siblings()` retrieves embeddings for the 32 sibling nodes
- Block alignment is enforced by the fixed 32-token groups
Gist Refresh Operations
When to Refresh
Gist refresh occurs during alternating optimization when GistNet is retrained:
- Training phase: GistNet updates on batches sampled from tree (gist reconstruction + ΔNLL objectives)
- Save checkpoint: A new `gistnet_v{N}.pt` checkpoint is written to disk
- Refresh decision: The system can optionally refresh existing gists with the new checkpoint
- Propagation: Refreshed LOD1 gists trigger LOD2 refresh; LOD2 gists trigger LOD3 refresh (future)
POC constraint: Gist refresh is not implemented in POC—all gists frozen to initial GistNet checkpoint. Refresh becomes critical during full training loops.
Refresh Procedure
When a new GistNet checkpoint is loaded:
```python
def refresh_gists(tree, new_gistnet_checkpoint):
    """
    Recompute all gist nodes using a newly trained GistNet.

    Args:
        new_gistnet_checkpoint: Path to new GistNet weights

    Returns:
        Statistics: nodes refreshed per level
    """
    gistnet.load_weights(new_gistnet_checkpoint)
    checkpoint_id = extract_version(new_gistnet_checkpoint)
    stats = {level: 0 for level in [1, 2, 3]}  # Track refreshed nodes

    # Refresh LOD1 gists first (bottom-up propagation)
    for l1_node in tree.get_level_nodes(level=1):
        # Retrieve 32 LOD0 children
        l0_children = [tree.nodes[cid] for cid in l1_node.child_ids]
        l0_embeddings = [tree.load_embedding(node) for node in l0_children]
        # Recompute gist with new GistNet
        new_gist = gistnet.compress(l0_embeddings, level=1)
        # Overwrite in place (same offset in LOD1.ctx)
        tree.l1_storage.write_at(l1_node.data_offset, new_gist)
        # Update metadata
        l1_node.gist_version = checkpoint_id
        l1_node.timestamp = now()  # Optional: track refresh time
        stats[1] += 1

    # Refresh LOD2 gists (now that LOD1 gists are updated)
    for l2_node in tree.get_level_nodes(level=2):
        l1_children = [tree.nodes[cid] for cid in l2_node.child_ids]
        l1_gists = [tree.load_embedding(node) for node in l1_children]
        new_gist = gistnet.compress(l1_gists, level=2)
        tree.l2_storage.write_at(l2_node.data_offset, new_gist)
        l2_node.gist_version = checkpoint_id
        stats[2] += 1

    # Continue for LOD3+ if present
    return stats
```

Incremental Refresh (Future)
For efficiency, refresh only affected subtrees:
- Track dirty nodes: Mark LOD0 spans that changed since last refresh
- Propagate dirty flags: Mark LOD1 parents of dirty LOD0 blocks, LOD2 grandparents, etc.
- Selective recomputation: Refresh only dirty gists, skip clean subtrees
- Versioning: The tree can contain mixed `gist_version` values during a partial refresh
Use case: When MegaCuration updates a subset of the tree (e.g., file edits in Cognitive Core), avoid refreshing entire tree.
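A minimal sketch of the dirty-flag propagation described above (hypothetical helper names; assumes the deterministic parent arithmetic used elsewhere in this page):

```python
BRANCH = 32  # children per parent at every level

def dirty_ancestors(dirty_l0_blocks, max_level=2):
    """Given dirty LOD0 block indices, return the set of (level, index)
    gist nodes that must be recomputed, bottom-up."""
    dirty = set()
    indices = set(dirty_l0_blocks)
    for level in range(1, max_level + 1):
        indices = {i // BRANCH for i in indices}  # parent index at next level
        dirty.update((level, i) for i in indices)
    return dirty

# Editing two LOD0 blocks under the same LOD1 parent dirties one LOD1 gist
# and one LOD2 gist: 2 recomputations instead of a full-tree refresh.
assert dirty_ancestors([3, 17]) == {(1, 0), (2, 0)}
```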
Version Metadata
Each gist node tracks which GistNet checkpoint generated it:
| Field | Type | Purpose |
|---|---|---|
| `gist_version` | uint16 | Checkpoint ID (e.g., 0 = initial, 1 = after first retrain) |
| `timestamp` | uint64 | Unix timestamp of last gist computation |
Backward compatibility: Older gists remain valid—Working Context can mix gist versions. LensNet may learn to discount stale gists (low gist_version) during focus scoring.
Tree Index Updates
Metadata Synchronization
After each write operation (LOD0 block or LOD1/LOD2 gist):
- Add node to index: Insert a metadata entry in the `tree.nodes[span_id]` dictionary
- Update parent's children list: Append `span_id` to the parent's `child_ids` array
- Update child's parent pointer: Set the `parent_id` field in the newly written node
- Increment counters: Update `total_l0_tokens`, `total_l1_gists`, etc.
Index structure (POC):
```python
class MegaContextTree:
    nodes: Dict[span_id, TreeNode]  # Metadata index
    l0_storage: BinaryFile          # LOD0.ctx file handle
    l1_storage: BinaryFile          # LOD1.ctx file handle
    l2_storage: BinaryFile          # LOD2.ctx file handle
    l0_buffer: List[token_id]       # Partial block accumulator
    total_l0_tokens: int            # Global token count
    total_l1_gists: int             # Number of LOD1 nodes
    total_l2_gists: int             # Number of LOD2 nodes
    next_span_id: int               # Monotonic span ID allocator
    gistnet_checkpoint_id: int      # Current GistNet version
```

Deterministic Span IDs
Span IDs can be computed deterministically from level + index:
```python
def compute_span_id(level, index_at_level):
    """
    Generate deterministic span ID for hierarchical tree.

    Args:
        level: 0 (LOD0), 1 (LOD1), 2 (LOD2), etc.
        index_at_level: Sequential index within that level

    Returns:
        Unique span ID (uint64)
    """
    # Encode level in high bits, index in low bits
    return (level << 56) | index_at_level
```

Advantage: Given a token position, the containing LOD0/LOD1/LOD2 node can be computed without index lookups.
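The encoding pairs naturally with position lookup. A small self-contained sketch (the `decode_span_id` inverse is a hypothetical helper added for illustration):

```python
def compute_span_id(level, index_at_level):
    """Level in the high 8 bits, sequential index in the low 56 bits."""
    return (level << 56) | index_at_level

def decode_span_id(span_id):
    """Recover (level, index_at_level) from a packed span ID."""
    return span_id >> 56, span_id & ((1 << 56) - 1)

def span_id_at_position(token_pos, level):
    """Span ID of the level-L node covering an absolute token position.
    Each level-L node covers 32**(level+1) tokens (32, 1024, 32768, ...)."""
    return compute_span_id(level, token_pos // 32 ** (level + 1))

# Token 40,000 falls in LOD0 block 1250, LOD1 gist 39, LOD2 gist 1.
assert decode_span_id(span_id_at_position(40_000, 0)) == (0, 1250)
assert decode_span_id(span_id_at_position(40_000, 1)) == (1, 39)
assert decode_span_id(span_id_at_position(40_000, 2)) == (2, 1)
```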
Offset Calculation
Because all blocks are 32-aligned, byte offsets are deterministic:
```python
def l0_offset(block_index, token_size=4):
    """Byte offset in LOD0.ctx for block_index."""
    return block_index * 32 * token_size  # 4 bytes/token = 128 bytes/block

def l1_offset(gist_index, embedding_size=4096 * 4):
    """Byte offset in LOD1.ctx for gist_index."""
    return gist_index * embedding_size  # 4 bytes/float × 4096 dims
```

Implication: `.ctx` files can be memory-mapped and addresses computed directly; no external B-tree or index file is needed. See Storage Format.
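A quick numeric check of the offset formulas (restated here so the snippet runs standalone):

```python
def l0_offset(block_index, token_size=4):
    """Byte offset in LOD0.ctx: 32 tokens × 4 bytes = 128 bytes per block."""
    return block_index * 32 * token_size

def l1_offset(gist_index, embedding_size=4096 * 4):
    """Byte offset in LOD1.ctx: one 16 KiB gist embedding per node."""
    return gist_index * embedding_size

assert l0_offset(0) == 0
assert l0_offset(5) == 640      # 5 blocks × 128 bytes
assert l1_offset(3) == 49_152   # 3 gists × 16 KiB
```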
API Interface
The MegaContext Tree exposes these operations to other modules:
Public Methods
```python
class MegaContextTree:
    # Ingest API
    def ingest_tokens(tokens: List[int]) -> int:
        """Append tokens to tree; returns number of complete blocks written."""

    def flush_buffer() -> None:
        """Force write partial buffer (< 32 tokens) with padding; used at EOD."""

    # Query API
    def get_node(span_id: int) -> TreeNode:
        """Retrieve metadata for a node by span ID."""

    def get_embedding(span_id: int) -> np.ndarray:
        """Load embedding (token or gist) from storage."""

    def get_span_at_position(token_pos: int, level: int) -> TreeNode:
        """Find which LOD0/LOD1/LOD2 node contains this absolute token position."""

    def get_level_nodes(level: int, start_pos: int, end_pos: int) -> List[TreeNode]:
        """Retrieve all nodes at a level covering a token range."""

    # Update API
    def refresh_gists(gistnet_checkpoint: str) -> Dict[int, int]:
        """Recompute all gists with new GistNet; returns stats per level."""

    def refresh_span(span_id: int) -> None:
        """Recompute a single gist node (incremental refresh)."""

    # Telemetry API
    def mark_accessed(span_id: int) -> None:
        """Increment access_count for this span (for MegaCuration)."""

    def get_statistics() -> Dict[str, Any]:
        """Return tree size, depth, gist version distribution, etc."""
```

Integration Points
- Runtime Loop: Calls `ingest_tokens()` after each model forward pass
- Working Context assembly: Calls `get_embedding()` to load LOD0/LOD1/LOD2 content
- LensNet: Calls `get_level_nodes(level=1)` to fetch tail LOD1 gists for conditioning
- Focus Allocator: Calls `get_span_at_position()` to locate nodes for expand/collapse
- Training & Operations: Calls `refresh_gists()` after GistNet retraining
Performance Characteristics
Time Complexity
| Operation | Complexity | Notes |
|---|---|---|
| `ingest_tokens(B)` | O(B/32 × log₃₂ N) | Amortized O(1) per token; propagates up to log₃₂ N levels |
| `get_node(span_id)` | O(1) | Hash table lookup in memory index |
| `get_embedding(span_id)` | O(1) | Direct offset calculation + file read |
| `get_span_at_position(pos, level)` | O(1) | Deterministic: index = pos // 32^(level+1) |
| `refresh_gists()` | O(N) | Must recompute every gist node; parallelizable |
Key insight: Incremental ingest is O(log N) per block, but amortizes to O(1) per token because most ingests don’t trigger gist generation.
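The amortization claim can be checked by counting how many gist computations a stream of N tokens actually triggers (LOD1 and LOD2 only, matching the POC; branching factor 32):

```python
def gist_ops(n_tokens, levels=(1, 2)):
    """Number of gist computations triggered by ingesting n_tokens:
    one per completed 32**(level+1)-token span at each level."""
    return sum(n_tokens // 32 ** (level + 1) for level in levels)

# 1M tokens trigger 976 LOD1 gists + 30 LOD2 gists, i.e. about one gist
# computation per thousand tokens; LOD0 block writes dominate ingest cost.
assert gist_ops(1_000_000) == 976 + 30
```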
Space Complexity
| Component | Size | Scaling |
|---|---|---|
| LOD0 storage (`LOD0.ctx`) | N × 4 bytes | Linear in total tokens |
| LOD1 storage (`LOD1.ctx`) | N/1024 × 16 KB | ~16 bytes per token (one 4096-dim fp32 gist per 1,024 tokens) |
| LOD2 storage (`LOD2.ctx`) | N/32768 × 16 KB | ~0.5 bytes per token |
| Metadata index | ~N/32 × 128 bytes | ~4 bytes per token (POC in-memory) |

Total: ~25 bytes per token for the full tree (LOD0 + LOD1 + LOD2 + metadata), dominated by the LOD1 gist embeddings. For 1M tokens: ~25 MB.
Disk I/O (Future)
With memory-mapped files:
- Ingest writes: Sequential append, ~128 bytes per LOD0 block + ~16 KB per LOD1 gist
- Random reads: O(1) seeks via deterministic offsets
- Refresh writes: Random overwrites at existing offsets (in-place update)
See Storage Format for mmap strategies.
POC Simplifications
The proof-of-concept omits several production features:
| Feature | POC Status | Production Plan |
|---|---|---|
| LOD3+ levels | Not implemented | Add when N > 32K tokens (~3 levels sufficient for 1M tokens) |
| Gist refresh | Disabled | Implement during alternating optimization phase |
| Async ingest | Synchronous | Background worker threads for gist generation |
| Disk storage | RAM-only | Memory-mapped .ctx files in Phase 2 |
| Incremental refresh | N/A | Dirty-tracking for subtree refresh |
| Partial block padding | Not handled | Pad final buffer with special token at EOD |
See POC Scope for complete constraints and POC Architecture for module boundaries.
Error Handling & Edge Cases
Incomplete Blocks
Problem: Input ends mid-block (e.g., 17 tokens in buffer at end-of-document).
Solutions:
- Pad with special token: Append 15 copies of the `<PAD>` token to complete the block
- Leave buffered: Keep the partial block in memory until the next ingest (session resumes)
- Flush API: Expose `flush_buffer(pad_token)` for explicit handling
POC approach: Leave buffered—assume continuous interaction. No explicit EOD handling.
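A sketch of the padding path, which is not implemented in the POC (the pad token ID here is a placeholder for illustration):

```python
PAD = -1  # placeholder pad token ID, for illustration only

def flush_buffer(buffer, pad_token=PAD):
    """Pad a partial buffer (< 32 tokens) up to one complete block at EOD."""
    if not buffer:
        return None
    assert len(buffer) < 32
    block = buffer + [pad_token] * (32 - len(buffer))
    buffer.clear()
    return block

buf = list(range(17))
block = flush_buffer(buf)
assert len(block) == 32 and block.count(PAD) == 15 and buf == []
```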
GistNet Errors
Problem: GistNet forward pass fails (e.g., OOM, NaN outputs).
Mitigation:
- Checkpoint validation: Test GistNet on sample inputs before refresh
- Rollback: Keep old gist version if new computation fails
- Graceful degradation: Skip failed gist, continue with stale version
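The rollback and graceful-degradation mitigations can be sketched as a wrapper around the compression call (hypothetical helper, not the real refresh API):

```python
import math

def safe_refresh(old_gist, children, compress):
    """Return (gist, refreshed): the new gist if computation succeeds and
    produces finite values, otherwise the stale gist unchanged."""
    try:
        new_gist = compress(children)
    except Exception:
        return old_gist, False  # rollback: keep the old gist version
    if any(math.isnan(x) for x in new_gist):
        return old_gist, False  # NaN guard: skip, continue with stale gist
    return new_gist, True

# A NaN output falls back to the stale gist instead of corrupting the tree.
gist, refreshed = safe_refresh([0.1], None, lambda c: [float("nan")])
assert gist == [0.1] and not refreshed
```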
Concurrent Modifications
Problem: Multiple threads ingesting tokens or refreshing gists simultaneously.
Mitigation:
- Ingest lock: Single writer for LOD0 buffer and tree index
- Read-only refresh: Gist refresh doesn’t add nodes, only updates embeddings
- Versioned reads: Working Context snapshots span IDs before reading embeddings
POC simplification: Single-threaded—no concurrency in demo runs.
Summary
Tree Operations defines the operational heart of the MegaContext Tree: the incremental ingest pipeline that buffers tokens into 32-aligned blocks, the hierarchical GistNet compression that produces LOD1/LOD2 gists, the metadata updates that maintain tree structure, and the gist refresh procedures that keep the tree synchronized with retrained GistNet checkpoints. These operations maintain the block alignment invariants that enable seamless Working Context LOD transitions. All tree growth is incremental (O(log N) per block) and deterministic (offsets computed without indexes), making the tree both efficient and debuggable. During the POC, operations are synchronous and gist refresh is disabled—full alternating optimization with live refresh is a post-POC enhancement.
Related Pages
Parent/Overview Pages
- MegaContext Tree – Parent page: hierarchical tree structure, compression hierarchy, and core concepts
- POC Architecture – System-wide architecture showing how Tree Operations fits into the overall design
- Architecture Details – Complete system architecture and component interactions
Sibling Detail Pages
- Node Metadata – Metadata schema for tree nodes (span IDs, parent/child relationships, offsets)
- Storage Format – Binary file layouts, deterministic offsets, and memory-mapped I/O strategies
- GistNet Architecture Details – Detailed architecture of the compression model used in operations
Related System Components
- GistNet – The compression model that generates LOD1/LOD2 gists during hierarchical operations
- GistNet Training – How GistNet is trained and produces gists for tree operations
- Working Context – Primary consumer of tree content; uses operations to load spans
- Focus Allocator – Triggers span expansion/collapse operations using tree queries
- LensNet – Uses tree operations to fetch tail LOD1 gists for conditioning
Implementation Guides
- POC Implementation – Practical implementation details and constraints for the proof-of-concept
- Training & Operations – Alternating optimization framework including gist refresh procedures
- MegaContext End-to-End Training – How tree operations integrate with the training loop
Related Concepts
- Contiguity Invariant – Block alignment maintained by all tree operations
- LOD – Level of detail hierarchy managed by operations
- Gist Embedding – The compressed representations generated during operations
- substitutability – Core principle ensuring gists can replace tokens
- ΔNLL – Metric used during gist refresh to validate quality