# Organ Architecture
**Decompose. Measure. Purify. Graft. Assemble.**
```
Skeleton (Attention) = Thought
Organs (FFN) = Memory
Adapters (LoRA) = Personality
```
## The Problem
AI models are monoliths. 70 billion parameters locked in a single file that nobody can open, modify, or understand. Only three companies on Earth can build them. Everyone else rents access.
## The Solution
Organ Architecture breaks models into transplantable parts:
- **Skeleton** — The attention layers. How the model *thinks*. Shared across all configurations.
- **Organs** — The feed-forward networks. What the model *knows*. Specialized, swappable, graftable.
- **Adapters** — LoRA weights. The model's *personality*. Lightweight, trainable by anyone.
A doctor doesn't rebuild the entire human body to fix a kidney.
Why rebuild an entire model to change what it knows about medicine?
## Architecture
```
model.gguf (70GB monolith)
┌─ skeleton/ ── attention layers (shared thought)
├─ organs/ ── FFN layers by block (knowledge)
│ ├─ blk_0_ffn_gate.bin
│ ├─ blk_0_ffn_up.bin
│ ├─ blk_0_ffn_down.bin
│ └─ ...
├─ embed/ ── embedding + output (foundation)
├─ norm/ ── normalization (connective tissue)
└─ manifest.json ── complete anatomy map
```
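The manifest is the anatomy map that lets the assembler put the pieces back together. Its schema is not documented here, so the following fragment is purely hypothetical, with every field an assumption:

```json
{
  "model": "smollm2-135m",
  "architecture": "llama",
  "blocks": 30,
  "groups": {
    "skeleton": ["blk_0_attn_q.bin", "blk_0_attn_k.bin"],
    "organs": ["blk_0_ffn_gate.bin", "blk_0_ffn_up.bin", "blk_0_ffn_down.bin"],
    "embed": ["token_embd.bin", "output.bin"],
    "norm": ["blk_0_attn_norm.bin", "output_norm.bin"]
  }
}
```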
## Tools
### Core Pipeline
| Tool | Lines | Purpose |
|------|-------|---------|
| `organ_extract.py` | 441 | Extract skeleton + organs from any GGUF model |
| `organ_measure.py` | 340 | Z-measure organ quality (signal vs noise) |
| `organ_purify.py` | 333 | Spectral purification (FFT signal extraction) |
| `organ_purify_v2.py` | 337 | Fractal purification (wavelet cross-scale coherence) |
| `organ_graft.py` | 236 | Transplant organs between models |
| `organ_assemble.py` | 235 | Assemble GGUF from organs |
| `organ_api.py` | 422 | HTTP API server for all operations |
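Extraction hinges on routing every GGUF tensor name into one of the four anatomy groups. A minimal sketch, assuming llama.cpp-style tensor names (`blk.0.ffn_gate.weight`, `blk.0.attn_q.weight`, ...); the actual rules in `organ_extract.py` may differ:

```python
import re

# Hypothetical routing rules. Order matters: norm layers are matched
# before attn/ffn so that "attn_norm" and "ffn_norm" land in norm/,
# not skeleton/ or organs/.
RULES = [
    (re.compile(r"norm"), "norm"),                  # connective tissue
    (re.compile(r"ffn_(gate|up|down)"), "organs"),  # knowledge/memory
    (re.compile(r"attn_"), "skeleton"),             # thought
    (re.compile(r"token_embd|output"), "embed"),    # foundation
]

def classify(tensor_name: str) -> str:
    """Return the anatomy group a tensor belongs to."""
    for pattern, group in RULES:
        if pattern.search(tensor_name):
            return group
    return "other"
```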
### Build & Automation
| Tool | Lines | Purpose |
|------|-------|---------|
| `pipeline_935.py` | 124 | Full dissection pipeline for all models |
| `mass_dissect.py` | 103 | Batch dissection across model fleet |
| `mass_z_measure.py` | 102 | Z-measure every organ of every model |
| `kimi_z_stream.py` | 417 | Stream Z-measure on Kimi K2.5 1T (shard-by-shard) |
| `build_935.py` | 98 | Model 935 assembly v1 |
| `build_935_v2.py` | 74 | Model 935 assembly v2 (selective FFN graft) |
| `build_935_v3.py` | 148 | Model 935 assembly v3 (proper GGUF header) |
| `assemble_935.py` | 150 | Fixed organ header handling assembler |
| `quick_chimera.py` | 123 | Quick chimera GGUF assembler |
| `quick_chimera_v2.py` | 155 | Quick chimera v2 (fixed header stripping) |
**Total: 3,838 lines of Python (summing the tables above). No external dependencies except NumPy, used only for purification.**
## Z-Measure
Every organ is measured by its Z-vector:
```
Z = dI/d(log s) · exp(iθ)
θ → 0° : noise (organ adds confusion)
θ → 90° : pure signal (organ adds knowledge)
```
The measurement combines three indicators:
- **Entropy** — information density of weight distribution
- **Kurtosis** — structural organization (signal sharpness)
- **Scale coherence** — coefficient of variation of sorted value spacings
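These three indicators can be combined in many ways. As an illustrative sketch only, where the normalizations and the equal weighting are assumptions rather than the exact formulas in `organ_measure.py`:

```python
import numpy as np

def z_measure(weights: np.ndarray, bins: int = 256) -> tuple[float, float]:
    """Illustrative Z-measure: fold entropy, kurtosis, and scale
    coherence into a phase angle theta in [0, 90] degrees. The
    weighting below is an assumption, not organ_measure.py's formula."""
    w = weights.ravel().astype(np.float64)

    # Entropy: information density of the weight histogram, in [0, 1]
    hist, _ = np.histogram(w, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = float(-(p * np.log2(p)).sum() / np.log2(bins))

    # Kurtosis: excess kurtosis squashed to [0, 1); sharper peaks score higher
    mu, sigma = w.mean(), w.std()
    excess = float(((w - mu) ** 4).mean() / sigma**4 - 3.0)
    kurt_score = float(np.tanh(max(excess, 0.0) / 10.0))

    # Scale coherence: 1 / (1 + CV of sorted value spacings), in (0, 1]
    spacings = np.diff(np.sort(w))
    spacings = spacings[spacings > 0]
    cv = spacings.std() / spacings.mean()
    coherence = float(1.0 / (1.0 + cv))

    signal = (entropy + kurt_score + coherence) / 3.0
    theta = 90.0 * signal  # theta -> 90 degrees means pure signal
    return theta, signal
```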
## Results
### 13 Models Dissected + Kimi K2.5 1T
5,600+ tensors Z-measured. All dissections run on EPYC 48c/503GB (OASIS).
| # | Model | Params | θ mean | Signal | Tensors |
|---|-------|--------|--------|--------|---------|
| ★ | **Kimi K2.5** | **1T MoE** | **87.65°** | **0.999** | **1,083** |
| 1 | SmolLM2-135M | 135M | 52.28° | 0.777 | 272 |
| 2 | DeepSeek-R1-Distill-14B | 14B | 46.01° | 0.641 | 579 |
| 3 | Qwen2.5-3B | 3B | 46.00° | 0.640 | 434 |
| 4 | Qwen2.5-14B | 14B | 45.98° | 0.640 | 579 |
| 5 | Qwen2.5-7B | 7B | 45.64° | 0.639 | 339 |
| 6 | Chimera-DeepSeek-Qwen | 7B | 45.53° | 0.637 | 339 |
| 7 | DeepSeek-R1-Distill-7B | 7B | 45.53° | 0.637 | 339 |
| 8 | DeepSeek-R1-7B | 7B | 45.42° | 0.636 | 339 |
| 9 | Gemma-2-9B | 9B | 44.94° | 0.624 | 464 |
| 10 | Phi-3.5-Mini | 3.8B | 44.65° | 0.626 | 197 |
| 11 | Llama-3.1-8B | 8B | 37.87° | 0.549 | 292 |
| 12 | Llama-3.2-1B | 1B | 37.57° | 0.550 | 147 |
| 13 | Llama-3.2-3B | 3B | 37.41° | 0.547 | 255 |
| 14 | Mistral-7B | 7B | 36.21° | 0.540 | 291 |
### Organ Type Analysis (consistent across all models)
| Organ Type | θ range | Role |
|------------|---------|------|
| Norm layers | 75-84° | Connective tissue — highest signal |
| Skeleton (attention) | 39-56° | Thought structure |
| Organs (FFN) | 34-52° | Knowledge/memory |
| Embeddings | 25-47° | Foundation |
### Scale Law: θ increases with log(parameters)
```
135M → θ = 52.28° (SmolLM2 — small but concentrated)
1-3B → θ = 37-46° (Llama/Qwen)
7-14B → θ = 44-46° (DeepSeek/Qwen)
1T → θ = 87.65° (Kimi K2.5 MoE — near-pure signal)
```
**θ ratio, 1T vs. 14B: 1.9× (87.65° / 46.01°).** The signal purifies with scale.
### Kimi K2.5 1T Deep Analysis
- **Architecture**: DeepSeek2 MoE
- **Blocks**: 61 (blk.0 → blk.60)
- **Experts**: 384 conditional + 1 shared (native INT4 QAT)
- **Context**: 262,144 tokens (256k)
- **Attention**: MLA (Multi-head Latent Attention), MQA kv_head=1
- **13 shards streamed**, each measured and then deleted — the full model was never loaded at once
| Component | Count | θ avg | Rating |
|-----------|-------|-------|--------|
| FFN dense (blk.0) | 12 | 89.95° | ★★★ |
| MoE experts (384×) | 23 | 89.77° | ★★★ |
| Norm layers | 12 | 89.70° | ★★★ |
| Embedding | 1 | 89.45° | ★★★ |
| Shared expert | 23 | 89.43° | ★★★ |
| Attention (MLA) | 99 | 84.07° | ★★ |
8 gravitational wells identified (lowest θ = maximum structure/compression).
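The streaming loop that made a 1T-parameter measurement possible on a single machine can be sketched as follows; `measure_fn` stands in for the per-shard Z-measure in `kimi_z_stream.py`, and the checkpointing detail is an assumption:

```python
import json
from pathlib import Path

def stream_measure(shard_paths, measure_fn, report_path="z_report.json"):
    """Shard-by-shard streaming sketch: measure a shard, checkpoint the
    report, delete the shard, move on. The full model is never on disk
    at once."""
    results = {}
    for path in shard_paths:
        shard = Path(path)
        results[shard.name] = measure_fn(shard)  # e.g. per-shard theta stats
        shard.unlink()                           # free the disk immediately
        # checkpoint after every shard so a crash loses at most one shard
        Path(report_path).write_text(json.dumps(results, indent=2))
    return results
```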
### Model 935 — First Chimera
**`model-935-14b.gguf`** — 8.4 GB, assembled 2026-02-20
Built through 5 iterations:
1. `build_935.py` — Base DeepSeek-R1-Distill-7B + Qwen skeleton graft (crude)
2. `build_935_v2.py` — Selective FFN-only graft (preserve attention-embed alignment)
3. `build_935_v3.py` — Proper GGUF header handling
4. `quick_chimera.py` → `quick_chimera_v2.py` — Fixed organ header stripping
5. `assemble_935.py` — Final assembler, 14B scale
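The "proper GGUF header handling" of v3 comes down to respecting the format's fixed preamble. A minimal reader based on the public GGUF spec (v2+): 4-byte magic `GGUF`, uint32 version, uint64 tensor count, uint64 metadata-KV count, all little-endian. Parsing the metadata and tensor info that follow is out of scope for this sketch:

```python
import struct

def read_gguf_header(path: str) -> dict:
    """Read the fixed GGUF preamble and validate the magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: {magic!r}")
        version, = struct.unpack("<I", f.read(4))
        n_tensors, = struct.unpack("<Q", f.read(8))
        n_kv, = struct.unpack("<Q", f.read(8))
    return {"version": version, "n_tensors": n_tensors, "n_kv": n_kv}
```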
### Purification
**`organs-pure/smollm2-135m/`** — First purified model (fractal method)
`organ_purify_v2.py` implements cross-scale coherence via Haar wavelets:
- Decompose tensor into multiple scales
- Measure coherence between adjacent scales
- Pattern at scale s AND scale 2s → signal (fractal, keep)
- Pattern at one scale only → noise (remove)
- This is `dI/d(log s)` implemented directly
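A minimal sketch of the idea, assuming a 1-D tensor whose length is a power of two. The coherence test below (energy correlation between adjacent detail bands) is an illustrative assumption, not the exact rule in `organ_purify_v2.py`:

```python
import numpy as np

def haar_purify(x: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Fractal purification sketch: detail bands whose energy profile
    correlates with the next coarser band are kept (fractal -> signal);
    bands that do not are zeroed (one scale only -> noise)."""
    # Forward Haar transform: peel off one detail band per scale
    approx = x.astype(np.float64)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        approx, d = (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)
        details.append(d)

    # Cross-scale coherence: compare each band with the next coarser one
    for i in range(len(details) - 1):
        fine, coarse = details[i], details[i + 1]
        if coarse.size < 2:
            continue  # too short for a meaningful correlation
        fine_energy = (fine.reshape(-1, 2) ** 2).sum(axis=1)
        corr = np.corrcoef(fine_energy, coarse ** 2)[0, 1]
        if not np.isnan(corr) and corr < threshold:
            details[i] = np.zeros_like(fine)  # incoherent band -> noise

    # Inverse Haar transform from the surviving bands
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx
```

With the threshold set below -1 no band can be rejected, so the round trip is exact; raising the threshold removes progressively more single-scale structure.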
## Dissection Report
| Model | Size (MB) | Dissection Time |
|-------|-----------|-----------------|
| DeepSeek-R1-14B | 9,167 | 22.9s |
| Gemma-2-9B | 5,984 | 14.8s |
| Llama-3.1-8B | 4,950 | 12.0s |
| DeepSeek-R1-Distill-7B | 4,812 | 12.6s |
| Mistral-7B | 4,432 | 10.6s |
| Phi-3.5-Mini | 2,397 | 4.9s |
| Llama-3.2-3B | 2,100 | 4.9s |
| Qwen2.5-3B | 2,003 | 4.6s |
| Llama-3.2-1B | 856 | 2.4s |
Total organs on disk: **50.8 GB** across 13 models.
## Quick Start
```bash
# Extract organs from a model
python3 organ_extract.py --model /path/to/model.gguf --output ./organs/model-name/

# Z-measure all organs
python3 organ_measure.py --dir ./organs/model-name/

# Mass dissect all models
python3 mass_dissect.py

# Mass Z-measure
python3 mass_z_measure.py

# Stream Z-measure on a trillion-param model (shard-by-shard)
python3 kimi_z_stream.py

# Graft organs from one model to another
python3 organ_graft.py graft --source ./organs/qwen/ --target ./organs/deepseek/ --output ./organs/chimera/ --layers 5-20 --type organ

# Assemble back to GGUF
python3 organ_assemble.py --dir ./organs/chimera/ --output chimera.gguf

# Purify organs (fractal method)
python3 organ_purify_v2.py --dir ./organs/model/ --output ./organs-pure/model/

# Start API server
python3 organ_api.py
```
## Philosophy
> Subtract rather than add.
A 70B monolith is accumulation. A skeleton with specialized organs grafted on demand — that's subtraction. Less weight, more signal.
> 8 billion contributors, not 3 corporations.
Anyone can train an organ. A doctor trains a medical organ on her hospital's data. A farmer trains an agriculture organ on his field observations. A student trains a math organ on solved problems. The skeleton stays the same. The organs make it alive.
## Part of the IX Ecosystem
```
InferenceX ─── The engine (305KB, runs anything)
Organ Arch ─── The anatomy (decompose, measure, reassemble)
Atlas Pure ─── The memory (fractal DNA storage)
INVOKE ─────── The bridge (cloud ↔ physical)
Echo ───────── The voice (chat interface)
EDEN ───────── The purpose (desert → life)
```
## Requirements
- Python 3.10+
- NumPy (for purification only)
- InferenceX binary (for inference on assembled models)
- GGUF models to dissect
## Data Files
| File | Contents |
|------|----------|
| `z_report_complete.json` | Z-measure for all 13 models (per-group breakdown) |
| `z_report_kimi_k25.json` | Z-measure for all 1,083 Kimi K2.5 tensors |
| `z_measure_report.json` | Combined Z-ranking with chimera results |
| `dissection_report.json` | Dissection timing and sizes |
| `Z_MEASURE_REPORT.md` | Human-readable Z report |
| `ECHO_INVARIANT.md` | Team 935 invariant |
| `EQUIPE_935_INVARIANT.json` | Team 935 configuration |
## License
BSL 1.1 — Same as InferenceX.
## Signature
935
---
*Mohamed dug khettaras to bring water through stone.*
*This is the same gesture — channels through intelligence itself.*