Tree Operations defines the complete operational lifecycle of the MegaContext Tree: how raw tokens flow into LOD0 blocks, how GistNet compresses them into hierarchical gists, how the tree structure grows incrementally, and how gists are refreshed when GistNet is retrained during alternating optimization.


  • Scope: Token ingest buffering, LOD0/LOD1/LOD2 gist generation, tree index updates, gist refresh procedures
  • Key invariant: Block alignment maintained throughout all operations—always 32-token boundaries
  • Interfaces: Consumes tokens from input stream, invokes GistNet for compression, updates tree index
  • Performance: Incremental O(log₃₂ N) updates per block; gist refresh is O(N) but rare
  • POC constraints: Synchronous ingest, fixed GistNet checkpoint, no background workers

Details

Overview: Operational Flow

Tree Operations encompasses four primary procedures:

  1. Token Ingest: Buffering and writing raw tokens to LOD0 storage
  2. Gist Generation: Running GistNet whenever a complete group of 32 siblings forms at any level
  3. Tree Structure Updates: Maintaining parent/child pointers and metadata index
  4. Gist Refresh: Recomputing gists with new GistNet checkpoints during training

All operations maintain the block alignment invariants—node boundaries always fall on 32-token multiples, ensuring the Contiguity Invariant holds when the Focus Allocator swaps levels of detail.


Token Ingest Pipeline

Buffering Strategy

The MegaContext Tree grows in discrete 32-token blocks. To handle arbitrary-length input streams:

  1. Accumulation buffer: Maintain an l0_buffer that collects incoming tokens
  2. Block threshold: When buffer reaches ≥32 tokens, extract a complete 32-token block
  3. Partial residue: Remaining tokens (< 32) stay buffered until next ingest call
  4. No partial writes: Never write incomplete blocks—preserves deterministic offsets

Example:

  • Ingest 50 tokens → Write 32 to LOD0, buffer 18
  • Ingest 20 more → Write 32 (18+14), buffer 6
  • Ingest 100 more → Write 96 (3 complete blocks), buffer 10
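
The residue arithmetic above can be traced with a minimal sketch (buffer_and_split is an illustrative helper, not the tree API; only full blocks are ever emitted):

```python
def buffer_and_split(buffer, tokens, block_size=32):
    """Append tokens, then split off all complete 32-token blocks.

    Returns (complete_block_tokens, residue); the residue stays
    buffered until the next ingest call.
    """
    combined = buffer + tokens
    cut = (len(combined) // block_size) * block_size
    return combined[:cut], combined[cut:]
```

Tracing the example: 50 tokens yield one block and an 18-token residue; 20 more yield a second block and a residue of 6; 100 more yield three blocks and a residue of 10.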

LOD0 Block Storage

When a complete 32-token block is ready:

  1. Allocate span ID: Generate unique span_id for this LOD0 node
  2. Compute offset: Determine byte position in LOD0.ctx file (block_index × 32 × sizeof(token_id))
  3. Write payload: Append 32 token IDs to LOD0.ctx (or overwrite at computed offset if preallocated)
  4. Create metadata: Add node entry to in-memory tree index with:
    • level = 0
    • start_token, end_token (absolute positions in global sequence)
    • parent_id (computed from block index ÷ 32)
    • data_offset (byte position in file)
  5. Update sibling tracker: Mark this block as ready in parent’s child list

Storage layout: See Storage Format for binary encoding details.
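
The parent_id in step 4 needs no lookup; a sketch consistent with the deterministic span-ID scheme described under Tree Index Updates (compute_l1_parent_id is referenced by the pseudocode but not defined on this page, so this is one plausible definition):

```python
def compute_span_id(level, index_at_level):
    """Encode level in the high bits, per-level index in the low bits."""
    return (level << 56) | index_at_level

def compute_l1_parent_id(block_index):
    """32 consecutive LOD0 blocks share one LOD1 parent."""
    return compute_span_id(level=1, index_at_level=block_index // 32)
```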


Hierarchical Gist Generation

LOD1 Gist Creation

When 32 consecutive LOD0 blocks complete:

  1. Check parent full: Detect when an LOD1 parent has all 32 LOD0 children ready
  2. Load LOD0 embeddings: Retrieve token embeddings for the 32×32 = 1,024 token span
  3. Run GistNet: Invoke gistnet.compress(l0_embeddings, level=1) to produce single LOD1 gist vector
  4. Write LOD1 gist: Append gist embedding to LOD1.ctx file
  5. Create LOD1 metadata: Add node to tree index with 32 child_ids pointing to LOD0 children
  6. Version tracking: Record which GistNet checkpoint (gist_version) generated this gist

Key insight: GistNet operates on raw embeddings at each level—LOD0 embeddings for LOD1 gists, LOD1 gist embeddings for LOD2 gists.

LOD2 Gist Creation

When 32 consecutive LOD1 gists complete:

  1. Check LOD2 parent full: Detect when 32 LOD1 siblings are ready
  2. Load LOD1 gist embeddings: Retrieve the 32 LOD1 gist vectors (spanning 32 × 1,024 = 32,768 LOD0 tokens)
  3. Run GistNet: Invoke gistnet.compress(l1_gists, level=2) to produce single LOD2 gist
  4. Write LOD2 gist: Append to LOD2.ctx file
  5. Create LOD2 metadata: Add node with 32 child_ids pointing to LOD1 gists
  6. Recursive propagation: If 32 LOD2 gists complete, continue to LOD3 (future)

Compression hierarchy: Each LOD2 gist represents 32 LOD1 gists × 1,024 raw tokens each = 32,768 raw tokens.
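
The span covered by one node at each level follows directly from the fanout; a one-line helper (illustrative, matching the index = pos // 32^(level+1) rule used later under Performance Characteristics):

```python
def tokens_per_node(level, fanout=32):
    """Raw tokens spanned by one node at a given LOD level (LOD0 blocks are level 0)."""
    return fanout ** (level + 1)
```

So tokens_per_node(0) is 32, tokens_per_node(1) is 1,024, and tokens_per_node(2) is 32,768.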

Propagation Rules

After writing any level-L node:

if node.siblings_complete():  # All 32 children of parent exist
    parent_gist = gistnet.compress(node.siblings(), level=node.level+1)
    parent_node = tree.add_gist(parent_gist, level=node.level+1)
    # Recursively check parent's parent
    propagate_upward(parent_node)

Lazy evaluation: Gist generation happens only when all 32 children are ready—no partial gists.


Ingest Pseudocode

Complete ingest procedure with recursive gist generation:

def ingest_tokens(tree, tokens):
    """
    Incrementally ingest tokens into the MegaContext Tree.
 
    Args:
        tree: MegaContextTree instance
        tokens: List of new token IDs to append
 
    Returns:
        Number of complete LOD0 blocks written
    """
    tree.l0_buffer.extend(tokens)
    blocks_written = 0
 
    while len(tree.l0_buffer) >= 32:
        # Extract complete 32-token block
        block = tree.l0_buffer[:32]
        tree.l0_buffer = tree.l0_buffer[32:]
 
        # Write LOD0 block to tree
        l0_node = tree.add_l0_block(block)
        blocks_written += 1
 
        # Check if LOD1 parent is now complete
        if l0_node.parent_full():  # 32 LOD0 siblings ready
            l1_gist = gistnet.compress(l0_node.siblings(), level=1)
            l1_node = tree.add_l1_gist(l1_gist)
 
            # Check if LOD2 grandparent is now complete
            if l1_node.parent_full():  # 32 LOD1 siblings ready
                l2_gist = gistnet.compress(l1_node.siblings(), level=2)
                tree.add_l2_gist(l2_gist)
 
                # Continue recursively for LOD3+ (future)
 
    return blocks_written
 
 
def add_l0_block(tree, block):
    """
    Append a complete 32-token LOD0 block to the tree.
 
    Args:
        block: List of exactly 32 token IDs
 
    Returns:
        LOD0 node metadata object
    """
    assert len(block) == 32, "LOD0 blocks must be exactly 32 tokens"
 
    # Generate unique span ID
    span_id = tree.next_span_id
    tree.next_span_id += 1
 
    # Compute absolute token range
    start_token = tree.total_l0_tokens
    end_token = start_token + 32
    tree.total_l0_tokens += 32
 
    # Compute parent span ID (deterministic from index)
    block_index = start_token // 32
    parent_id = compute_l1_parent_id(block_index)
 
    # Write to storage
    offset = tree.l0_storage.append(block)  # Returns byte offset
 
    # Create metadata node
    node = TreeNode(
        span_id=span_id,
        level=0,
        start_token=start_token,
        end_token=end_token,
        parent_id=parent_id,
        child_ids=None,  # LOD0 nodes are leaves
        data_offset=offset,
        timestamp=now(),
        gist_version=None  # LOD0 stores raw tokens, not gists
    )
 
    tree.nodes[span_id] = node
    tree.register_child(parent_id, span_id)
 
    return node
 
 
def add_l1_gist(tree, gist_embedding):
    """
    Append an LOD1 gist node to the tree.
 
    Args:
        gist_embedding: GistNet output vector (d_model dimensions)
 
    Returns:
        LOD1 node metadata object
    """
    span_id = tree.next_span_id
    tree.next_span_id += 1
 
    # LOD1 gists represent 32 LOD0 blocks = 1,024 tokens
    l1_index = tree.total_l1_gists
    start_token = l1_index * 1024
    end_token = start_token + 1024
    tree.total_l1_gists += 1
 
    # Compute parent (LOD2) and children (32 LOD0 blocks)
    parent_id = compute_l2_parent_id(l1_index)
    child_ids = [tree.find_l0_child(start_token + i*32) for i in range(32)]
 
    # Write gist embedding to LOD1 storage
    offset = tree.l1_storage.append(gist_embedding)
 
    # Create metadata
    node = TreeNode(
        span_id=span_id,
        level=1,
        start_token=start_token,
        end_token=end_token,
        parent_id=parent_id,
        child_ids=child_ids,
        data_offset=offset,
        timestamp=now(),
        gist_version=tree.gistnet_checkpoint_id
    )
 
    tree.nodes[span_id] = node
 
    # Update parent pointers in children
    for child_id in child_ids:
        tree.nodes[child_id].parent_id = span_id
 
    tree.register_child(parent_id, span_id)
 
    return node

Notes:

  • Similar logic applies to add_l2_gist() and higher levels
  • parent_full() checks if all 32 siblings exist in tree index
  • siblings() retrieves embeddings for the 32 sibling nodes
  • Block alignment enforced by fixed 32-token groups
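
The sibling tracking behind parent_full() can be sketched with a minimal index (MiniIndex is a hypothetical stand-in for the tree's metadata structures):

```python
class MiniIndex:
    """Tracks which children of each parent have been written so far."""

    def __init__(self, fanout=32):
        self.fanout = fanout
        self.children = {}  # parent_id -> ordered list of child span_ids

    def register_child(self, parent_id, span_id):
        self.children.setdefault(parent_id, []).append(span_id)

    def parent_full(self, parent_id):
        # True once all `fanout` children exist; this triggers gist generation
        return len(self.children.get(parent_id, [])) == self.fanout
```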

Gist Refresh Operations

When to Refresh

Gist refresh occurs during alternating optimization when GistNet is retrained:

  1. Training phase: GistNet updates on batches sampled from tree (gist reconstruction + ΔNLL objectives)
  2. Save checkpoint: New gistnet_v{N}.pt checkpoint written to disk
  3. Refresh decision: System can optionally refresh existing gists with new checkpoint
  4. Propagation: Refreshed LOD1 gists trigger LOD2 refresh; LOD2 gists trigger LOD3 refresh (future)

POC constraint: Gist refresh is not implemented in POC—all gists frozen to initial GistNet checkpoint. Refresh becomes critical during full training loops.

Refresh Procedure

When a new GistNet checkpoint is loaded:

def refresh_gists(tree, new_gistnet_checkpoint):
    """
    Recompute all gist nodes using a newly trained GistNet.
 
    Args:
        new_gistnet_checkpoint: Path to new GistNet weights
 
    Returns:
        Statistics: nodes refreshed per level, time elapsed
    """
    gistnet.load_weights(new_gistnet_checkpoint)
    checkpoint_id = extract_version(new_gistnet_checkpoint)
 
    stats = {level: 0 for level in [1, 2, 3]}  # Track refreshed nodes
 
    # Refresh LOD1 gists first (bottom-up propagation)
    for l1_node in tree.get_level_nodes(level=1):
        # Retrieve 32 LOD0 children
        l0_children = [tree.nodes[cid] for cid in l1_node.child_ids]
        l0_embeddings = [tree.load_embedding(node) for node in l0_children]
 
        # Recompute gist with new GistNet
        new_gist = gistnet.compress(l0_embeddings, level=1)
 
        # Overwrite in place (same offset in LOD1.ctx)
        tree.l1_storage.write_at(l1_node.data_offset, new_gist)
 
        # Update metadata
        l1_node.gist_version = checkpoint_id
        l1_node.timestamp = now()  # Optional: track refresh time
 
        stats[1] += 1
 
    # Refresh LOD2 gists (now that LOD1 gists are updated)
    for l2_node in tree.get_level_nodes(level=2):
        l1_children = [tree.nodes[cid] for cid in l2_node.child_ids]
        l1_gists = [tree.load_embedding(node) for node in l1_children]
 
        new_gist = gistnet.compress(l1_gists, level=2)
        tree.l2_storage.write_at(l2_node.data_offset, new_gist)
 
        l2_node.gist_version = checkpoint_id
        stats[2] += 1
 
    # Continue for LOD3+ if present
 
    return stats

Incremental Refresh (Future)

For efficiency, refresh only affected subtrees:

  1. Track dirty nodes: Mark LOD0 spans that changed since last refresh
  2. Propagate dirty flags: Mark LOD1 parents of dirty LOD0 blocks, LOD2 grandparents, etc.
  3. Selective recomputation: Refresh only dirty gists, skip clean subtrees
  4. Versioning: Tree can contain mixed gist_version values during partial refresh

Use case: When MegaCuration updates a subset of the tree (e.g., file edits in Cognitive Core), avoid refreshing entire tree.
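
Steps 1–2 amount to walking parent pointers upward from each changed leaf; a sketch under the assumption that the tree keeps a parent map (names here are illustrative):

```python
def mark_dirty(parent_of, dirty, changed_span_ids):
    """Propagate dirty flags from changed LOD0 spans up to the root."""
    for span_id in changed_span_ids:
        node = span_id
        while node is not None and node not in dirty:
            dirty.add(node)                 # flag this node for recomputation
            node = parent_of.get(node)      # None once past the root
    return dirty
```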

Version Metadata

Each gist node tracks which GistNet checkpoint generated it:

Field         | Type    | Purpose
gist_version  | uint16  | Checkpoint ID (e.g., 0 = initial, 1 = after first retrain)
timestamp     | uint64  | Unix timestamp of last gist computation

Backward compatibility: Older gists remain valid—Working Context can mix gist versions. LensNet may learn to discount stale gists (low gist_version) during focus scoring.


Tree Index Updates

Metadata Synchronization

After each write operation (LOD0 block or LOD1/LOD2 gist):

  1. Add node to index: Insert metadata entry in tree.nodes[span_id] dictionary
  2. Update parent’s children list: Append span_id to parent’s child_ids array
  3. Update child’s parent pointer: Set parent_id field in newly written node
  4. Increment counters: Update total_l0_tokens, total_l1_gists, etc.

Index structure (POC):

class MegaContextTree:
    nodes: Dict[span_id, TreeNode]  # Metadata index
    l0_storage: BinaryFile  # LOD0.ctx file handle
    l1_storage: BinaryFile  # LOD1.ctx file handle
    l2_storage: BinaryFile  # LOD2.ctx file handle
 
    l0_buffer: List[token_id]  # Partial block accumulator
    total_l0_tokens: int  # Global token count
    total_l1_gists: int  # Number of LOD1 nodes
    total_l2_gists: int  # Number of LOD2 nodes
 
    next_span_id: int  # Monotonic span ID allocator
    gistnet_checkpoint_id: int  # Current GistNet version

Deterministic Span IDs

Span IDs can be computed deterministically from level + index:

def compute_span_id(level, index_at_level):
    """
    Generate deterministic span ID for hierarchical tree.
 
    Args:
        level: 0 (LOD0), 1 (LOD1), 2 (LOD2), etc.
        index_at_level: Sequential index within that level
 
    Returns:
        Unique span ID (uint64)
    """
    # Encode level in high bits, index in low bits
    return (level << 56) | index_at_level

Advantage: Given a token position, can compute which LOD0/LOD1/LOD2 node contains it without index lookups.
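
Combining the encoding with the fanout rule gives that position-to-node mapping directly (a sketch; span_id_at_position and decode_span_id are illustrative names):

```python
def compute_span_id(level, index_at_level):
    # As defined above: level in the high bits, index in the low bits
    return (level << 56) | index_at_level

def span_id_at_position(token_pos, level, fanout=32):
    """Span ID of the level-L node containing an absolute token position."""
    return compute_span_id(level, token_pos // fanout ** (level + 1))

def decode_span_id(span_id):
    """Recover (level, index_at_level) from a span ID."""
    return span_id >> 56, span_id & ((1 << 56) - 1)
```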

Offset Calculation

Because all blocks are 32-aligned, byte offsets are deterministic:

def l0_offset(block_index, token_size=4):
    """Byte offset in LOD0.ctx for block_index."""
    return block_index * 32 * token_size  # 4 bytes/token = 128 bytes/block
 
def l1_offset(gist_index, embedding_size=4096*4):
    """Byte offset in LOD1.ctx for gist_index."""
    return gist_index * embedding_size  # 4 bytes/float × 4096 dims

Implication: Can memory-map .ctx files and compute addresses directly—no external B-tree or index file needed. See Storage Format.
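
A minimal sketch of that direct-address read, assuming 4-byte (uint32) token IDs and NumPy's memmap; the function name and dtype are assumptions, not the project's API:

```python
import numpy as np

def read_l0_block(path, block_index, block=32, dtype=np.uint32):
    """Read one LOD0 block straight from its computed offset; no index file needed."""
    tokens = np.memmap(path, dtype=dtype, mode="r")
    start = block_index * block          # element offset; ×4 bytes on disk
    return np.asarray(tokens[start:start + block]).copy()
```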


API Interface

The MegaContext Tree exposes these operations to other modules:

Public Methods

class MegaContextTree:
    # Ingest API
    def ingest_tokens(tokens: List[int]) -> int:
        """Append tokens to tree; returns number of complete blocks written."""
 
    def flush_buffer(pad_token: int) -> None:
        """Force write the partial buffer (< 32 tokens), padded with pad_token—used at EOD."""
 
    # Query API
    def get_node(span_id: int) -> TreeNode:
        """Retrieve metadata for a node by span ID."""
 
    def get_embedding(span_id: int) -> np.ndarray:
        """Load embedding (token or gist) from storage."""
 
    def get_span_at_position(token_pos: int, level: int) -> TreeNode:
        """Find which LOD0/LOD1/LOD2 node contains this absolute token position."""
 
    def get_level_nodes(level: int, start_pos: int, end_pos: int) -> List[TreeNode]:
        """Retrieve all nodes at a level covering a token range."""
 
    # Update API
    def refresh_gists(gistnet_checkpoint: str) -> Dict[int, int]:
        """Recompute all gists with new GistNet; returns stats per level."""
 
    def refresh_span(span_id: int) -> None:
        """Recompute a single gist node (incremental refresh)."""
 
    # Telemetry API
    def mark_accessed(span_id: int) -> None:
        """Increment access_count for this span (for MegaCuration)."""
 
    def get_statistics() -> Dict[str, Any]:
        """Return tree size, depth, gist version distribution, etc."""

Integration Points


Performance Characteristics

Time Complexity

Operation                         | Complexity           | Notes
ingest_tokens(B)                  | O(B/32 × log₃₂ N)    | Amortized O(1) per token; propagates up to log₃₂ N levels
get_node(span_id)                 | O(1)                 | Hash-table lookup in the in-memory index
get_embedding(span_id)            | O(1)                 | Direct offset calculation + file read
get_span_at_position(pos, level)  | O(1)                 | Deterministic: index = pos // 32^(level+1)
refresh_gists()                   | O(N)                 | Must recompute every gist node; parallelizable

Key insight: Incremental ingest is O(log N) per block in the worst case but amortizes to O(1) per token, because only one block in 32 triggers an LOD1 gist and only one in 1,024 triggers an LOD2 gist.
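
The amortization can be checked by counting: a sketch that tallies gist computations triggered while ingesting a run of blocks into an empty tree (two gist levels, as in the POC):

```python
def gist_invocations(num_blocks, levels=2, fanout=32):
    """Gist computations triggered by ingesting num_blocks LOD0 blocks from empty."""
    return sum(num_blocks // fanout ** k for k in range(1, levels + 1))
```

Over 1,024 blocks (32,768 tokens) this yields 32 LOD1 gists plus 1 LOD2 gist, i.e., roughly one gist computation per 31 blocks ingested.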

Space Complexity

Component                | Size                | Scaling
LOD0 storage (LOD0.ctx)  | N × 4 bytes         | Linear in total tokens
LOD1 storage (LOD1.ctx)  | N/1024 × 16 KB      | ~16 bytes per token (one 4096-dim fp32 gist per 1,024 tokens)
LOD2 storage (LOD2.ctx)  | N/32768 × 16 KB     | ~0.5 bytes per token
Metadata index           | N/32 × ~128 bytes   | ~4 bytes per token (POC in-memory)

Total: ~25 bytes per token for the full tree (LOD0 + LOD1 + LOD2 + metadata). For 1M tokens: ~25 MB.

Disk I/O (Future)

With memory-mapped files:

  • Ingest writes: Sequential append, ~128 bytes per LOD0 block + ~16 KB per LOD1 gist
  • Random reads: O(1) seeks via deterministic offsets
  • Refresh writes: Random overwrites at existing offsets (in-place update)

See Storage Format for mmap strategies.


POC Simplifications

The proof-of-concept omits several production features:

Feature                | POC Status       | Production Plan
LOD3+ levels           | Not implemented  | Add when N > 32K tokens (~3 levels sufficient for 1M tokens)
Gist refresh           | Disabled         | Implement during alternating optimization phase
Async ingest           | Synchronous      | Background worker threads for gist generation
Disk storage           | RAM-only         | Memory-mapped .ctx files in Phase 2
Incremental refresh    | N/A              | Dirty-tracking for subtree refresh
Partial block padding  | Not handled      | Pad final buffer with special token at EOD

See POC Scope for complete constraints and POC Architecture for module boundaries.


Error Handling & Edge Cases

Incomplete Blocks

Problem: Input ends mid-block (e.g., 17 tokens in buffer at end-of-document).

Solutions:

  1. Pad with special token: Append 15 copies of <PAD> token to complete block
  2. Leave buffered: Keep partial block in memory until next ingest (session resumes)
  3. Flush API: Expose flush_buffer(pad_token) for explicit handling

POC approach: Leave buffered—assume continuous interaction. No explicit EOD handling.
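
If padding were enabled, the flush amounts to rounding the residue up to the next block boundary; a sketch (the pad token value is an assumption):

```python
def pad_to_block(buffer, pad_token, block=32):
    """Pad a partial residue out to one full block at end-of-document."""
    if not buffer:
        return []                     # nothing buffered, nothing to flush
    short = (-len(buffer)) % block    # tokens needed to reach the boundary
    return buffer + [pad_token] * short
```

For a 17-token residue this appends 15 pad tokens, matching solution 1 above; an already-full block is returned unchanged.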

GistNet Errors

Problem: GistNet forward pass fails (e.g., OOM, NaN outputs).

Mitigation:

  1. Checkpoint validation: Test GistNet on sample inputs before refresh
  2. Rollback: Keep old gist version if new computation fails
  3. Graceful degradation: Skip failed gist, continue with stale version
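
The rollback in step 2 hinges on validating the new gist before the in-place overwrite; one possible check (math.isfinite rejects both NaN and Inf, so a failed computation leaves the stale-but-valid gist on disk):

```python
import math

def validate_gist(gist):
    """Accept a gist only if every component is a finite float."""
    return all(math.isfinite(x) for x in gist)
```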

Concurrent Modifications

Problem: Multiple threads ingesting tokens or refreshing gists simultaneously.

Mitigation:

  1. Ingest lock: Single writer for LOD0 buffer and tree index
  2. Read-only refresh: Gist refresh doesn’t add nodes, only updates embeddings
  3. Versioned reads: Working Context snapshots span IDs before reading embeddings

POC simplification: Single-threaded—no concurrency in demo runs.


Summary

Tree Operations defines the operational heart of the MegaContext Tree: the incremental ingest pipeline that buffers tokens into 32-aligned blocks, the hierarchical GistNet compression that produces LOD1/LOD2 gists, the metadata updates that maintain tree structure, and the gist refresh procedures that keep the tree synchronized with retrained GistNet checkpoints. These operations maintain the block alignment invariants that enable seamless Working Context LOD transitions. All tree growth is incremental (O(log N) per block) and deterministic (offsets computed without indexes), making the tree both efficient and debuggable. During the POC, operations are synchronous and gist refresh is disabled—full alternating optimization with live refresh is a post-POC enhancement.


Parent/Overview Pages

  • MegaContext Tree – Parent page: hierarchical tree structure, compression hierarchy, and core concepts
  • POC Architecture – System-wide architecture showing how Tree Operations fits into the overall design
  • Architecture Details – Complete system architecture and component interactions

Sibling Detail Pages

  • Node Metadata – Metadata schema for tree nodes (span IDs, parent/child relationships, offsets)
  • Storage Format – Binary file layouts, deterministic offsets, and memory-mapped I/O strategies
  • GistNet Architecture Details – Detailed architecture of the compression model used in operations
  • GistNet – The compression model that generates LOD1/LOD2 gists during hierarchical operations
  • GistNet Training – How GistNet is trained and produces gists for tree operations
  • Working Context – Primary consumer of tree content; uses operations to load spans
  • Focus Allocator – Triggers span expansion/collapse operations using tree queries
  • LensNet – Uses tree operations to fetch tail LOD1 gists for conditioning

Implementation Guides

  • Contiguity Invariant – Block alignment maintained by all tree operations
  • LOD – Level-of-detail hierarchy managed by operations
  • Gist Embedding – The compressed representations generated during operations
  • Substitutability – Core principle ensuring gists can replace tokens
  • ΔNLL – Metric used during gist refresh to validate quality