DOCS: Architecture, Results, Methodology. Evidence logs. transplant_935.py (direct GGUF→GGUF graft). Chimera 14B confirmed reasoning. Signature 935.

This commit is contained in:
ElmadaniS 2026-02-21 04:32:15 +01:00
parent 3582053790
commit 7b42514326
4 changed files with 474 additions and 0 deletions

docs/ARCHITECTURE.md Normal file

@@ -0,0 +1,116 @@
# Architecture
## Model Anatomy
A transformer model has four anatomical systems:
```
┌─────────────────────────────────────────┐
│ GGUF MONOLITH │
│ │
│ ┌─ embed ──────── token_embd.weight │
│ │ output.weight │
│ │ output_norm.weight │
│ │ │
│ ├─ skeleton ───── attn_q.weight ×N │
│ │ attn_k.weight ×N │
│ │ attn_v.weight ×N │
│ │ attn_output ×N │
│ │ │
│ ├─ organs ─────── ffn_gate.weight ×N │
│ │ ffn_up.weight ×N │
│ │ ffn_down.weight ×N │
│ │ │
│ └─ norm ───────── attn_norm ×N │
│ ffn_norm ×N │
└─────────────────────────────────────────┘
```
- **Skeleton** (attention) = how the model thinks. Shared thought patterns.
- **Organs** (FFN) = what the model knows. Domain knowledge.
- **Embed** = input/output translation. The vocabulary interface.
- **Norm** = normalization layers. Connective tissue between components.
## Pipeline
```
GGUF file
▼ organ_extract.py
├── manifest.json (complete anatomy map)
├── skeleton/ (attention tensors)
├── organs/ (FFN tensors by layer)
├── embed/ (embedding + output)
└── norm/ (normalization)
▼ organ_measure.py
Z-measure per tensor
θ ∈ [0°, 90°]
├──▶ organ_purify_v2.py (fractal signal extraction)
├──▶ organ_graft.py (transplant between models)
└──▶ organ_assemble.py → new GGUF
```
Alternative direct path (no intermediate .bin files):
```
GGUF_A + GGUF_B → transplant_935.py → chimera.gguf
```
## Z-Measure Theory
```
Z = dI/d(log s) · exp(iθ)
```
Three indicators combined into θ:
| Indicator | Measures | Signal | Noise |
|-----------|----------|--------|-------|
| Entropy | Information density | Moderate (0.3-0.7) | Near-maximum (>0.95) |
| Kurtosis | Structural sharpness | High (abs > 3) | Near-zero |
| Scale coherence (CV) | Non-uniform spacing | High (> 1) | Low (< 0.5) |
θ → 90° = pure signal (all three indicators confirm structure)
θ → 0° = pure noise (uniform random distribution)
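The table's thresholds can be read as a vote across the three indicators. A minimal sketch; the 2-of-3 majority rule here is an illustrative assumption, not the actual combination used in organ_measure.py:

```python
def classify_indicators(entropy, kurtosis, cv):
    """Vote across the three indicators using the thresholds from the
    table above. The 2-of-3 majority rule is an illustrative assumption."""
    votes = 0
    votes += 0.3 <= entropy <= 0.7   # moderate entropy = structured information
    votes += abs(kurtosis) > 3       # sharp peaks = organized structure
    votes += cv > 1                  # non-uniform spacing = structured signal
    return "signal" if votes >= 2 else "noise"
```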
## Purification Methods
### V1: Spectral (FFT)
- Decompose tensor into frequency domain
- Keep high-energy components (signal), remove low-energy tail (noise)
- Preserve original scale (mean/std)
- Limitation: treats tensors like audio signals
### V2: Fractal (Wavelets)
- Haar wavelet multi-scale decomposition
- Cross-scale coherence: pattern at scale s AND scale 2s = fractal = signal
- Pattern at one scale only = noise
- This IS dI/d(log s) — information that persists across scales
- More theoretically grounded than V1
## Graft Compatibility
Grafting works best between models that share:
- Same base architecture (e.g., Qwen2 family)
- Same embedding dimension
- Same number of layers (or graft specific layer ranges)
Empirical results:
- DeepSeek-R1-Distill-14B ↔ Qwen2.5-14B: **WORKS** (both Qwen2 arch, same dims)
- DeepSeek-R1-Distill-7B ↔ Qwen2.5-7B: **PAD tokens** (7B chimera failed)
- Same architecture + same scale = highest success probability
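A pre-graft check for these conditions might read as follows (the manifest field names `arch`, `embed_dim`, and `n_layers` are hypothetical placeholders, not the actual manifest.json keys):

```python
def graft_compatible(base_manifest, donor_manifest):
    """Return True when the three compatibility conditions above hold.
    Field names are hypothetical placeholders for the manifest schema."""
    return (base_manifest["arch"] == donor_manifest["arch"]
            and base_manifest["embed_dim"] == donor_manifest["embed_dim"]
            and base_manifest["n_layers"] == donor_manifest["n_layers"])
```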
## File Format
Organ .bin files: `[name_len:u32][name:bytes][n_dims:u32][dims:u64×n][dtype:u32][tensor_data]`
Manifest: JSON with full tensor map, metadata, architecture info, Z-measure results.
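The .bin layout can be round-tripped with plain struct calls. A minimal sketch, assuming little-endian fields (matching the GGUF convention) and ggml-style dtype codes (F32 = 0):

```python
import struct
import numpy as np

def write_organ_bin(path, name, tensor, dtype_code):
    """Serialize one tensor in the organ .bin layout:
    [name_len:u32][name][n_dims:u32][dims:u64 x n][dtype:u32][tensor_data]."""
    raw = name.encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(raw)))
        f.write(raw)
        f.write(struct.pack("<I", tensor.ndim))
        for d in tensor.shape:
            f.write(struct.pack("<Q", d))
        f.write(struct.pack("<I", dtype_code))
        f.write(tensor.tobytes())

def read_organ_bin(path):
    """Inverse of write_organ_bin; returns (name, dims, dtype_code, raw_data)."""
    with open(path, "rb") as f:
        name_len = struct.unpack("<I", f.read(4))[0]
        name = f.read(name_len).decode("utf-8")
        n_dims = struct.unpack("<I", f.read(4))[0]
        dims = [struct.unpack("<Q", f.read(8))[0] for _ in range(n_dims)]
        dtype_code = struct.unpack("<I", f.read(4))[0]
        data = f.read()
    return name, dims, dtype_code, data
```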
## Signature
935

docs/METHODOLOGY.md Normal file

@@ -0,0 +1,116 @@
# Methodology
## Approach
Organ Architecture treats trained AI models as biological organisms with
transplantable parts. Instead of retraining from scratch (costs billions),
we perform post-training surgery: extract, measure, graft, reassemble.
## Step 1: Extraction (organ_extract.py)
Parse GGUF binary format directly:
- Read magic number, version, metadata, tensor info
- Classify each tensor by name pattern into anatomical types
- Extract each tensor as independent .bin file with header
- Generate manifest.json mapping the full anatomy
Classification rules:
- `attn_q`, `attn_k`, `attn_v`, `attn_output` → skeleton
- `ffn_gate`, `ffn_up`, `ffn_down` → organ
- `token_embd`, `output.weight` → embed
- `*_norm` → norm
- `lora_*` → adapter
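A minimal classifier implementing these rules might look like this. The substring matching is an illustrative guess at organ_extract.py's actual matcher; the `_norm` check runs first so names such as `attn_q_norm` (present in some architectures) are not misfiled as skeleton:

```python
def classify(name: str) -> str:
    """Map a GGUF tensor name to its anatomical type using the rules above.
    Illustrative sketch; checks norm before skeleton/organ on purpose."""
    if name.startswith("lora_"):
        return "adapter"
    if "_norm" in name:
        return "norm"
    if any(tag in name for tag in ("attn_q", "attn_k", "attn_v", "attn_output")):
        return "skeleton"
    if any(tag in name for tag in ("ffn_gate", "ffn_up", "ffn_down")):
        return "organ"
    if "token_embd" in name or name == "output.weight":
        return "embed"
    return "unknown"
```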
## Step 2: Measurement (organ_measure.py)
Z-measure: Z = dI/d(log s) · exp(iθ)
For each tensor, sample up to 100,000 values and compute:
1. **Entropy** (information density):
- Histogram-based Shannon entropy
- Normalized to [0, 1] against maximum entropy
- High entropy (>0.95) = uniform = noise
- Moderate entropy (0.3-0.7) = structured information
2. **Kurtosis** (structure):
- Fourth standardized moment minus 3
- High absolute kurtosis = sharp peaks = organized structure
- Near-zero = Gaussian-like = less organization
3. **Scale coherence** (CV of sorted diffs):
- Sort sampled values, compute differences
- Coefficient of variation of these differences
- High CV = non-uniform spacing = structured signal
- Low CV = uniform spacing = noise
Combined score → theta in [0, 90] degrees.
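The three indicators can be sketched as follows. organ_measure.py's exact scoring is not documented here, so the per-indicator normalization to [0, 1] and the equal-weight average into θ are assumptions:

```python
import numpy as np

def z_theta(values, bins=64, max_samples=100_000):
    """Compute the three indicators above and combine them into theta in
    [0, 90] degrees. The [0,1] scoring and equal-weight average are assumed."""
    x = np.asarray(values, dtype=np.float64).ravel()
    if x.size > max_samples:
        x = np.random.default_rng(0).choice(x, max_samples, replace=False)
    # 1. Histogram-based Shannon entropy, normalized against maximum entropy
    hist, _ = np.histogram(x, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum() / np.log2(bins)
    # 2. Excess kurtosis: fourth standardized moment minus 3
    z = (x - x.mean()) / (x.std() + 1e-12)
    kurt = (z ** 4).mean() - 3.0
    # 3. Coefficient of variation of sorted-value differences
    diffs = np.diff(np.sort(x))
    cv = diffs.std() / (diffs.mean() + 1e-12)
    # Map each indicator to a [0, 1] signal score, then average (assumed)
    s_entropy = 1.0 - abs(entropy - 0.5) * 2.0   # moderate entropy = signal
    s_kurt = min(abs(kurt) / 3.0, 1.0)           # |kurtosis| > 3 = signal
    s_cv = min(cv, 1.0)                          # CV > 1 = signal
    return 90.0 * (s_entropy + s_kurt + s_cv) / 3.0
```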
## Step 3: Purification (organ_purify_v2.py)
Fractal signal extraction via Haar wavelets:
1. Pad tensor to power-of-2 length
2. Haar wavelet decomposition across N scales
3. At each scale: approximation + detail coefficients
4. Cross-scale coherence check:
- Compare energy at scale s with energy at scale 2s
- High coherence (pattern exists at both scales) = fractal = signal
- Low coherence (pattern at one scale only) = noise
5. Attenuate incoherent components (noise)
6. Reconstruct from coherent components (signal)
7. Restore original scale (mean/std preservation)
This directly implements dI/d(log s): information that persists across
logarithmic scales is the signal. Everything else is training artifact.
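The seven steps above can be sketched in 1D. The coherence test (band energies within a factor of 4 of the next coarser scale) and the attenuation factor are illustrative assumptions, not the organ_purify_v2.py constants:

```python
import numpy as np

def haar_step(x):
    """One Haar level: pairwise averages (approximation) and differences (detail)."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def fractal_purify(x, atten=0.1):
    """Pad to power of 2, decompose, attenuate detail bands whose energy is
    incoherent with the next coarser scale, reconstruct, restore mean/std.
    Coherence test and atten are illustrative assumptions."""
    x = np.asarray(x, dtype=np.float64)
    n = 1 << int(np.ceil(np.log2(len(x))))
    padded = np.zeros(n)
    padded[:len(x)] = x
    approx, details = padded, []
    while len(approx) > 1:
        approx, d = haar_step(approx)
        details.append(d)
    # Cross-scale coherence: compare each band's energy with the next coarser band
    energies = [float((d ** 2).mean()) for d in details]
    for i, d in enumerate(details[:-1]):
        ratio = energies[i] / (energies[i + 1] + 1e-12)
        if not (0.25 < ratio < 4.0):   # pattern absent at scale 2s -> noise
            details[i] = d * atten
    # Inverse Haar reconstruction from coarsest to finest band
    for d in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + d) / np.sqrt(2)
        out[1::2] = (approx - d) / np.sqrt(2)
        approx = out
    y = approx[:len(x)]
    # Restore original scale (step 7: mean/std preservation)
    return (y - y.mean()) / (y.std() + 1e-12) * np.std(x) + np.mean(x)
```

With `atten=1.0` no band is attenuated, so the function reduces to a perfect Haar round-trip, which is a useful sanity check.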
## Step 4: Grafting (organ_graft.py, transplant_935.py)
Two methods:
### Via .bin intermediaries (organ_graft.py)
1. Extract both source and target models to organ directories
2. Match tensors by layer number and type suffix
3. Verify dimensional compatibility
4. Copy matching .bin files from donor to recipient directory
5. Update manifest
### Direct GGUF-to-GGUF (transplant_935.py)
1. Parse both GGUF headers to get tensor name/offset/size maps
2. Copy base GGUF entirely as starting point
3. For each FFN tensor in base that has a matching donor tensor:
- Verify exact byte size match
- Seek to donor tensor data, read
- Seek to base tensor offset in output, overwrite
4. Result: valid GGUF with patched FFN layers
Direct method is faster and avoids header format issues.
## Step 5: Assembly (organ_assemble.py)
Reconstruct GGUF from organ directory:
1. Read manifest for metadata and tensor ordering
2. Write GGUF header (magic, version, n_tensors, n_metadata)
3. Write metadata key-value pairs
4. Write tensor info (name, dims, dtype, offset) with 32-byte alignment
5. Write tensor data with padding
6. Result: standard GGUF loadable by any compatible runtime
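The 32-byte alignment in steps 4 and 5 reduces to one small helper; the padding written to the file is the difference between the aligned and unaligned offsets:

```python
def align_offset(offset, alignment=32):
    """Round a data offset up to the next alignment boundary."""
    return offset + (alignment - offset % alignment) % alignment
```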
## Step 6: Validation
Run chimera through InferenceX:
- Load GGUF, validate all tensors
- Initialize transformer (attention, KV cache, kernel dispatch)
- Run inference with chat template
- Verify coherent output
## Key Finding
Graft success depends on architectural proximity:
- Same family (Qwen2 base) + same scale (14B) = coherent output
- Same family + different scale (7B) = PAD token failure
- The latent space alignment is implicit in shared training lineage
## Signature
935

docs/RESULTS.md Normal file

@@ -0,0 +1,116 @@
# Results
## Dissection — 13 Models
All models dissected from GGUF to organ .bin files on OASIS (EPYC 48c/503GB).
| Model | Params | Size | Tensors | Time |
|-------|--------|------|---------|------|
| DeepSeek-R1-Distill-14B | 14B | 9,167 MB | 579 | 22.9s |
| Qwen2.5-14B | 14B | 9,027 MB | 579 | pre-existing |
| Gemma-2-9B | 9B | 5,984 MB | 464 | 14.8s |
| Llama-3.1-8B | 8B | 4,950 MB | 292 | 12.0s |
| Qwen2.5-7B | 7B | 4,812 MB | 339 | pre-existing |
| DeepSeek-R1-Distill-7B | 7B | 4,812 MB | 339 | 12.6s |
| DeepSeek-R1-7B | 7B | 4,812 MB | 339 | pre-existing |
| Mistral-7B | 7B | 4,432 MB | 291 | 10.6s |
| Phi-3.5-Mini | 3.8B | 2,397 MB | 197 | 4.9s |
| Llama-3.2-3B | 3B | 2,100 MB | 255 | 4.9s |
| Qwen2.5-3B | 3B | 2,003 MB | 434 | 4.6s |
| Llama-3.2-1B | 1B | 856 MB | 147 | 2.4s |
| SmolLM2-135M | 135M | 137 MB | 272 | pre-existing |
**Total: 50.8 GB of extracted organs. 5,600+ tensors.**
## Z-Measure — Full Ranking
| # | Model | θ mean | Signal | Tensors | Architecture |
|---|-------|--------|--------|---------|-------------|
| ★ | Kimi K2.5 | 87.65° | 0.999 | 1,083 | DeepSeek2 MoE |
| 1 | SmolLM2-135M | 52.28° | 0.777 | 272 | LLaMA |
| 2 | DeepSeek-R1-14B | 46.01° | 0.641 | 579 | Qwen2 |
| 3 | Qwen2.5-3B | 46.00° | 0.640 | 434 | Qwen2 |
| 4 | Qwen2.5-14B | 45.98° | 0.640 | 579 | Qwen2 |
| 5 | Qwen2.5-7B | 45.64° | 0.639 | 339 | Qwen2 |
| 6 | Chimera-DSeek-Qwen | 45.53° | 0.637 | 339 | Qwen2 |
| 7 | DeepSeek-R1-Distill-7B | 45.53° | 0.637 | 339 | Qwen2 |
| 8 | DeepSeek-R1-7B | 45.42° | 0.636 | 339 | Qwen2 |
| 9 | Gemma-2-9B | 44.94° | 0.624 | 464 | Gemma |
| 10 | Phi-3.5-Mini | 44.65° | 0.626 | 197 | Phi |
| 11 | Llama-3.1-8B | 37.87° | 0.549 | 292 | LLaMA |
| 12 | Llama-3.2-1B | 37.57° | 0.550 | 147 | LLaMA |
| 13 | Llama-3.2-3B | 37.41° | 0.547 | 255 | LLaMA |
| 14 | Mistral-7B | 36.21° | 0.540 | 291 | Mistral |
### Organ Type Breakdown (per-model averages)
| Model | Skeleton θ | Organs θ | Embed θ | Norm θ |
|-------|-----------|---------|---------|--------|
| SmolLM2-135M | 53.6° | 52.3° | 47.2° | — |
| Qwen2.5-14B | 55.2° | 35.4° | 25.5° | — |
| Qwen2.5-7B | 54.6° | 35.5° | 25.9° | — |
| DeepSeek-R1-14B | 55.4° | 35.2° | 25.2° | — |
| Gemma-2-9B | 47.2° | 37.9° | 26.2° | 81.6° |
| Phi-3.5-Mini | 56.7° | 43.2° | 26.7° | — |
| Llama-3.1-8B | 39.7° | 39.1° | 26.0° | — |
| Mistral-7B | 38.4° | 36.8° | 26.0° | — |
**Pattern**: Skeleton (attention) consistently scores higher than organs (FFN).
Norm layers reach highest θ when measured separately (Gemma: 81.6°).
## Chimera Iterations
### 1. chimera-r1-qwen-7b-v2 — FAILED
- Base: DeepSeek-R1-Distill-Qwen-7B
- Donor: Qwen2.5-7B (FFN organs)
- Result: 512 PAD tokens. Latent spaces incompatible at 7B scale.
- Evidence: `evidence/chimera-7b-failed.log`
### 2. chimera-selective-v3 — CLEANED
- Selective graft attempt, removed during iteration.
### 3. model-935-v2 — READY
- Marked as viable intermediate.
### 4. model-935-v3, model-935-fractal — CLEANED
- Further iterations, removed during cleanup.
### 5. model-935-14b — SUCCESS
- Base: DeepSeek-R1-Distill-Qwen-14B (skeleton + embeddings)
- Donor: Qwen2.5-14B (FFN organs)
- 579 tensors, 8.4 GB, Qwen2 architecture
- **Produces coherent reasoning output**
- Evidence: `evidence/model-935-14b-inference.log`
Prompt: "Write a Python function called is_prime"
Output: Structured chain-of-thought reasoning. Correctly identifies prime number
definition, handles edge cases (n < 2), outlines algorithm steps. DeepSeek-R1
thinking style ("Okay, so the user wants me to...", "Hmm, let's see").
**This is a chimera assembled from two different models without any retraining
that produces coherent, structured, correct output.**
## Kimi K2.5 1T — Deep Z-Profile
Streaming Z-measure across 13 shards, 1,083 tensors measured.
| Component | Count | θ avg |
|-----------|-------|-------|
| FFN dense (blk.0) | 12 | 89.95° |
| MoE experts (384x) | 23 | 89.77° |
| Norm layers | 12 | 89.70° |
| Embedding | 1 | 89.45° |
| Shared expert | 23 | 89.43° |
| Attention (MLA) | 99 | 84.07° |
8 gravitational wells identified at lowest θ — points of maximum compression.
## Purification
SmolLM2-135M purified using fractal method (organ_purify_v2.py).
Output: `organs-pure/smollm2-135m/` (138 MB)
Manifest: `PURE_SMOLLM2`, 30 layers, 272 tensors.
## Signature
935

transplant_935.py Normal file

@@ -0,0 +1,126 @@
#!/usr/bin/env python3
"""
GGUF-to-GGUF transplant. No organ bins direct tensor copy between GGUF files.
Base: DeepSeek-R1-Distill-Qwen-7B (skeleton/attention/embed)
Donor: Qwen2.5-7B (FFN organs only)
Z = dI/d(log s) · exp() Signature 935
"""
import struct, os, sys, shutil
def parse_gguf_header(path):
    """Parse GGUF header, return tensor_info list plus data_start and file_end offsets."""
    f = open(path, "rb")
    magic = struct.unpack("<I", f.read(4))[0]
    assert magic == 0x46554747, f"not a GGUF file: {path}"  # b"GGUF" little-endian
    version = struct.unpack("<I", f.read(4))[0]
    n_tensors = struct.unpack("<Q", f.read(8))[0]
    n_metadata = struct.unpack("<Q", f.read(8))[0]

    def read_string():
        slen = struct.unpack("<Q", f.read(8))[0]
        return f.read(slen).decode("utf-8")

    def skip_value(vtype):
        # Fixed-width GGUF metadata types: u8/i8/bool = 1, u16/i16 = 2,
        # u32/i32/f32 = 4, u64/i64/f64 = 8 bytes
        sizes = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
        if vtype in sizes:
            f.read(sizes[vtype])
        elif vtype == 8:  # string
            read_string()
        elif vtype == 9:  # array: element type, length, then elements
            arr_type = struct.unpack("<I", f.read(4))[0]
            arr_len = struct.unpack("<Q", f.read(8))[0]
            for _ in range(arr_len):
                skip_value(arr_type)

    for _ in range(n_metadata):
        read_string()  # key
        vtype = struct.unpack("<I", f.read(4))[0]
        skip_value(vtype)

    tensors = []
    for _ in range(n_tensors):
        name = read_string()
        n_dims = struct.unpack("<I", f.read(4))[0]
        dims = [struct.unpack("<Q", f.read(8))[0] for _ in range(n_dims)]
        dtype = struct.unpack("<I", f.read(4))[0]
        offset = struct.unpack("<Q", f.read(8))[0]
        tensors.append({"name": name, "dims": dims, "dtype": dtype, "offset": offset})

    # Tensor data begins at the next 32-byte boundary after the header
    pos = f.tell()
    padding = (32 - (pos % 32)) % 32
    f.read(padding)
    data_start = f.tell()
    f.seek(0, 2)
    file_end = f.tell()
    f.close()

    # Infer byte sizes from consecutive offsets (tensor-info order is
    # assumed to match data order, as llama.cpp writes it)
    for i in range(len(tensors)):
        if i + 1 < len(tensors):
            tensors[i]["size"] = tensors[i + 1]["offset"] - tensors[i]["offset"]
        else:
            tensors[i]["size"] = file_end - data_start - tensors[i]["offset"]
    return tensors, data_start, file_end
BASE = "/mnt/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf"
DONOR = "/mnt/models/Qwen2.5-7B-Instruct-Q4_K_M.gguf"
OUTPUT = "/mnt/models/model-935-final.gguf"

print("Parsing base (DeepSeek-R1-7B)...")
base_tensors, base_data_start, base_end = parse_gguf_header(BASE)
print(f"  {len(base_tensors)} tensors, data_start={base_data_start}")

print("Parsing donor (Qwen2.5-7B)...")
donor_tensors, donor_data_start, donor_end = parse_gguf_header(DONOR)
print(f"  {len(donor_tensors)} tensors, data_start={donor_data_start}")

# Build donor tensor map by name
donor_map = {t["name"]: t for t in donor_tensors}

# Copy base GGUF entirely first
print("Copying base to output...")
shutil.copy2(BASE, OUTPUT)

# Patch: for each FFN tensor in base, if the donor has a matching name+size, overwrite
out = open(OUTPUT, "r+b")
donor_f = open(DONOR, "rb")
grafted = 0
skipped = 0
for bt in base_tensors:
    name = bt["name"]
    # Only graft FFN organs (not attention, not embeddings, not norms)
    if not any(tag in name for tag in ("ffn_gate", "ffn_up", "ffn_down")):
        continue
    dt = donor_map.get(name)
    if dt is not None and bt["size"] == dt["size"]:
        # Read the tensor bytes from the donor
        donor_f.seek(donor_data_start + dt["offset"])
        data = donor_f.read(dt["size"])
        # Overwrite the same tensor in the output at the base offset
        out.seek(base_data_start + bt["offset"])
        out.write(data)
        grafted += 1
    else:
        skipped += 1
out.close()
donor_f.close()

print(f"\n{'=' * 60}")
print("  MODEL 935 — DIRECT GGUF TRANSPLANT")
print(f"{'=' * 60}")
print("  Base:    DeepSeek-R1-Distill-Qwen-7B (skeleton+embed)")
print("  Donor:   Qwen2.5-7B-Instruct (FFN organs)")
print(f"  Grafted: {grafted} FFN tensors")
print(f"  Skipped: {skipped} (size mismatch or not found)")
print(f"  Output:  {OUTPUT}")
print(f"  Size:    {os.path.getsize(OUTPUT)/(1024**3):.2f} GB")
print("  Signature: 935")
print(f"{'=' * 60}")