Fractal Glyph Tape
Agent Memory OS: Dense, fractal, cross-lingual phrase memory.
Intelligent memory retrieval for AI agents. Fractal Glyph Tape (FGT) clusters phrases, assigns them glyph codes, and uses foveated memory to deliver the right context at the right time—achieving +46.7pp accuracy gain at a 256-token budget on synthetic multi-turn dialogs.
• "Can you send me that file?" (en)
• "Mind emailing the document?" (en)
• "你能发给我那个文件吗?" (zh)
• "¿Puedes enviarme ese archivo?" (es)
What's in this repo?
A complete research prototype for phrase-level semantic compression and cross-lingual LLMs
Agent Memory Service
Production-ready REST API for intelligent memory retrieval. Foveated allocation delivers +46.7pp accuracy gain at a 256-token budget on synthetic multi-turn dialogs.
Semantic Compression
Smaller corpora and logs with reconstructable meaning. 50-70% compression on our test corpora while preserving semantic content.
Effective Context Extension
More usable signal per token under fixed context windows. Fit 2.5-4x more semantic content in the same token budget on our internal benchmarks.
Cross-Lingual Bridging
Shared glyph IDs for phrase families spanning multiple languages. 90-95% cross-lingual precision on EN↔ES↔ZH retrieval experiments.
All metrics are from internal experiments; see README and docs/PHASE-5-RESULTS.md for setup and limitations.
Why It Matters
Three core capabilities that transform how LLMs handle language
Intelligent Memory Retrieval
- Foveated allocation strategy: 30% early context, 30% relevant, 40% recent
- Delivers the right memories at the right time for agent decision-making
- +46.7pp accuracy improvement over naive truncation under tight budgets
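The 30/30/40 split can be sketched as a simple budget allocator. This is a minimal illustration of the idea, not the service's actual implementation; the function and the (text, token count) memory format are invented for the example:

```python
def allocate_budget(early, relevant, recent, token_budget=256):
    """Split a token budget 30/30/40 across early, relevant, and recent
    memories, then greedily fill each slice. Hypothetical sketch; the
    real service may allocate and rank differently."""
    slices = [(early, 0.30), (relevant, 0.30), (recent, 0.40)]
    context = []
    for memories, share in slices:
        budget = int(token_budget * share)
        used = 0
        for text, n_tokens in memories:  # (text, token count) pairs
            if used + n_tokens > budget:
                break
            context.append(text)
            used += n_tokens
    return context

ctx = allocate_budget(
    early=[("user goal: ship v1", 5)],
    relevant=[("file was sent Tuesday", 5)],
    recent=[("ok, resend it", 4)],
    token_budget=30,
)
```

The point of the fixed shares is that early goals and relevant facts are never crowded out by recent turns, which is where naive truncation fails.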
Semantic Compression
- Replace repeated patterns with short glyph codes
- Store one shared phrase-family table instead of millions of near-duplicates
- 50-70% compression while preserving semantic content
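A toy version of the substitution looks like this. The two-entry phrase table and the `⟨G1⟩`-style codes are placeholders for illustration, not FGT's real codebook or encoding:

```python
# Toy phrase-family table: glyph code -> canonical phrase.
# Codes and entries are invented for this sketch.
PHRASE_TABLE = {
    "⟨G1⟩": "can you send me that file",
    "⟨G2⟩": "thanks in advance",
}
REVERSE = {phrase: glyph for glyph, phrase in PHRASE_TABLE.items()}

def compress(text: str) -> str:
    """Replace known phrases with their short glyph codes."""
    for phrase, glyph in REVERSE.items():
        text = text.replace(phrase, glyph)
    return text

def expand(text: str) -> str:
    """Inverse: expand glyph codes back into full phrases."""
    for glyph, phrase in PHRASE_TABLE.items():
        text = text.replace(glyph, phrase)
    return text

msg = "hey, can you send me that file? thanks in advance!"
packed = compress(msg)
```

The table is stored once and shared across the corpus, which is where the savings come from: each near-duplicate phrase costs only a short code.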
Cross-Lingual by Design
- English, Spanish, Chinese, and other languages sharing the same intent cluster together
- Glyph IDs act as language-agnostic anchors for retrieval and analysis
- 90-95% precision across language pairs
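Using the four example phrases from the top of this page, a phrase family with a shared glyph ID might look like the sketch below. The `fam-0042` ID and the data layout are invented; only the idea of one ID anchoring all languages comes from FGT:

```python
# One phrase family spanning languages, keyed by a shared glyph ID.
# The ID and structure are illustrative, not FGT's on-disk format.
FAMILY = {
    "glyph_id": "fam-0042",
    "members": [
        ("en", "Can you send me that file?"),
        ("en", "Mind emailing the document?"),
        ("zh", "你能发给我那个文件吗?"),
        ("es", "¿Puedes enviarme ese archivo?"),
    ],
}

# Reverse index: surface phrase -> language-agnostic glyph ID.
INDEX = {text: FAMILY["glyph_id"] for _, text in FAMILY["members"]}

def cross_lingual_matches(phrase: str) -> list[str]:
    """Return every family member that shares the query's glyph ID."""
    gid = INDEX.get(phrase)
    if gid is None:
        return []
    return [text for _, text in FAMILY["members"] if INDEX[text] == gid]

hits = cross_lingual_matches("¿Puedes enviarme ese archivo?")
```

Retrieval over glyph IDs is what makes a Spanish query surface English and Chinese phrasings of the same intent.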
How It Works
Three steps to a navigable phrase memory
Cluster
We embed and cluster phrases into phrase families, keeping examples, statistics, and language labels.
Glyph & Fractal
Each family gets a glyph code and a coordinate on a fractal tape—a recursive triangular map of phrase space.
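One way to read "a recursive triangular map" is Sierpiński-style addressing: a base-3 digit string repeatedly picks one of three corner sub-triangles, so each family's address determines a point in the unit triangle. This construction is an assumption for illustration, not necessarily the repo's actual layout:

```python
def triangle_coord(digits):
    """Map a base-3 address (list of digits 0-2) to a point in a
    triangle by recursing into one corner sub-triangle per digit.
    Assumed Sierpinski-style scheme; FGT's map may differ."""
    corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
    x, y = 0.0, 0.0
    scale = 0.5
    for d in digits:
        cx, cy = corners[d]
        x += cx * scale
        y += cy * scale
        scale *= 0.5
    return x, y
```

Longer addresses land in smaller sub-triangles, so nearby coordinates correspond to families that share address prefixes.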
Integrate
A hybrid tokenizer and LLM adapter let existing models consume glyph-coded text and learn to expand glyphs into natural language.
Full Pipeline
Quickstart: Agent Memory API
# Start the memory service
python -m src.memory.service
# Write to agent memory
curl -X POST http://localhost:8000/api/memory/write \
-H "Content-Type: application/json" \
-d '{"agent_id": "my-agent", "turn": {...}}'
# Read with foveated retrieval
curl -X POST http://localhost:8000/api/memory/read \
-H "Content-Type: application/json" \
-d '{"agent_id": "my-agent", "token_budget": 256}'
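The same two request bodies can be built from Python. The payload shapes are taken from the curl examples above; the turn's inner fields are elided there, so the ones below are assumptions:

```python
BASE = "http://localhost:8000"  # service from `python -m src.memory.service`

def write_payload(agent_id: str, turn: dict) -> dict:
    """Body for POST /api/memory/write. The turn's inner fields are
    assumed; the curl example leaves them elided."""
    return {"agent_id": agent_id, "turn": turn}

def read_payload(agent_id: str, token_budget: int) -> dict:
    """Body for POST /api/memory/read."""
    return {"agent_id": agent_id, "token_budget": token_budget}

body = read_payload("my-agent", 256)
# With the service running, send it with any HTTP client, e.g.:
#   requests.post(f"{BASE}/api/memory/read", json=body)
```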
# Try the Memory Console
# Visit http://localhost:3000/memory-console

Build Your Own Tape
# 1) Create environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# 2) Build a demo tape
python scripts/run_full_build.py --config configs/demo.yaml
# 3) Try the CLI
echo "Can you send me that file?" | fgt encode
echo "谷阜" | fgt decode
# 4) Launch the visualizer
uvicorn fgt.viz.app:app --reload

For Researchers and Builders
If you care about:
- Tokenization and representation learning
- Semantic compression and storage
- Cross-lingual alignment
- Long-context LLMs
…then FGT is designed to be picked apart, extended, and argued with.
Python implementation
Full codebase available on GitHub. Non-commercial use only. See LICENSE for details.
45+ docs with specs, math, and experiment protocols
Complete technical documentation, from vision to implementation
Ready-made scripts for experiments
Reproducible evaluation suite for compression, context, and retrieval
FGT is research software
We invite feedback, experiments, and extensions. If you're working on tokenization, compression, or cross-lingual LLMs, this is for you.