diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..bea6243 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,116 @@ +# Architecture + +## Model Anatomy + +A transformer model has four anatomical systems: + +``` +┌─────────────────────────────────────────┐ +│ GGUF MONOLITH │ +│ │ +│ ┌─ embed ──────── token_embd.weight │ +│ │ output.weight │ +│ │ output_norm.weight │ +│ │ │ +│ ├─ skeleton ───── attn_q.weight ×N │ +│ │ attn_k.weight ×N │ +│ │ attn_v.weight ×N │ +│ │ attn_output ×N │ +│ │ │ +│ ├─ organs ─────── ffn_gate.weight ×N │ +│ │ ffn_up.weight ×N │ +│ │ ffn_down.weight ×N │ +│ │ │ +│ └─ norm ───────── attn_norm ×N │ +│ ffn_norm ×N │ +└─────────────────────────────────────────┘ +``` + +**Skeleton** (attention) = how the model thinks. Shared thought patterns. +**Organs** (FFN) = what the model knows. Domain knowledge. +**Embed** = input/output translation. The vocabulary interface. +**Norm** = normalization layers. Connective tissue between components. 
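The four-way split above is driven purely by tensor name patterns. As a minimal sketch (the function name and exact pattern matching here are illustrative assumptions, not the actual `organ_extract.py` code), classification can look like this:

```python
# Hedged sketch: map a GGUF tensor name onto the four anatomical systems
# described above. Patterns follow the anatomy table in this document;
# this is NOT the real organ_extract.py implementation.
def classify_tensor(name: str) -> str:
    """Return the anatomical system for a GGUF tensor name."""
    if any(p in name for p in ("attn_q", "attn_k", "attn_v", "attn_output")):
        return "skeleton"  # attention: how the model thinks
    if any(p in name for p in ("ffn_gate", "ffn_up", "ffn_down")):
        return "organ"     # FFN: what the model knows
    if name.startswith(("token_embd", "output")):
        return "embed"     # vocabulary interface (incl. output_norm, per diagram)
    if "norm" in name:
        return "norm"      # connective tissue
    return "other"

print(classify_tensor("blk.0.attn_q.weight"))  # skeleton
print(classify_tensor("blk.3.ffn_up.weight"))  # organ
```

Note the ordering matters: `attn_output.weight` must hit the skeleton branch before the `output*` embed check, and `attn_norm`/`ffn_norm` fall through to the norm branch.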
+ +## Pipeline + +``` +GGUF file + │ + ▼ organ_extract.py + │ + ├── manifest.json (complete anatomy map) + ├── skeleton/ (attention tensors) + ├── organs/ (FFN tensors by layer) + ├── embed/ (embedding + output) + └── norm/ (normalization) + │ + ▼ organ_measure.py + │ + Z-measure per tensor + θ ∈ [0°, 90°] + │ + ├──▶ organ_purify_v2.py (fractal signal extraction) + │ + ├──▶ organ_graft.py (transplant between models) + │ + └──▶ organ_assemble.py → new GGUF +``` + +Alternative direct path (no intermediate .bin files): + +``` +GGUF_A + GGUF_B → transplant_935.py → chimera.gguf +``` + +## Z-Measure Theory + +``` +Z = dI/d(log s) · exp(iθ) +``` + +Three indicators combined into θ: + +| Indicator | Measures | Signal | Noise | +|-----------|----------|--------|-------| +| Entropy | Information density | Moderate (0.3-0.7) | Near-maximum (>0.95) | +| Kurtosis | Structural sharpness | High (abs > 3) | Near-zero | +| Scale coherence (CV) | Non-uniform spacing | High (> 1) | Low (< 0.5) | + +θ → 90° = pure signal (all three indicators confirm structure) +θ → 0° = pure noise (uniform random distribution) + +## Purification Methods + +### V1: Spectral (FFT) +- Decompose tensor into frequency domain +- Keep high-energy components (signal), remove low-energy tail (noise) +- Preserve original scale (mean/std) +- Limitation: treats tensors like audio signals + +### V2: Fractal (Wavelets) +- Haar wavelet multi-scale decomposition +- Cross-scale coherence: pattern at scale s AND scale 2s = fractal = signal +- Pattern at one scale only = noise +- This IS dI/d(log s) — information that persists across scales +- More theoretically grounded than V1 + +## Graft Compatibility + +Grafting works best between models that share: +- Same base architecture (e.g., Qwen2 family) +- Same embedding dimension +- Same number of layers (or graft specific layer ranges) + +Empirical results: +- DeepSeek-R1-Distill-14B ↔ Qwen2.5-14B: **WORKS** (both Qwen2 arch, same dims) +- DeepSeek-R1-Distill-7B ↔ 
Qwen2.5-7B: **PAD tokens** (7B chimera failed) +- Same architecture + same scale = highest success probability + +## File Format + +Organ .bin files: `[name_len:u32][name:bytes][n_dims:u32][dims:u64×n][dtype:u32][tensor_data]` +Manifest: JSON with full tensor map, metadata, architecture info, Z-measure results. + +## Signature + +935 diff --git a/docs/METHODOLOGY.md b/docs/METHODOLOGY.md new file mode 100644 index 0000000..b69fe42 --- /dev/null +++ b/docs/METHODOLOGY.md @@ -0,0 +1,116 @@ +# Methodology + +## Approach + +Organ Architecture treats trained AI models as biological organisms with +transplantable parts. Instead of retraining from scratch (costs billions), +we perform post-training surgery: extract, measure, graft, reassemble. + +## Step 1: Extraction (organ_extract.py) + +Parse GGUF binary format directly: +- Read magic number, version, metadata, tensor info +- Classify each tensor by name pattern into anatomical types +- Extract each tensor as independent .bin file with header +- Generate manifest.json mapping the full anatomy + +Classification rules: +- `attn_q`, `attn_k`, `attn_v`, `attn_output` → skeleton +- `ffn_gate`, `ffn_up`, `ffn_down` → organ +- `token_embd`, `output.weight` → embed +- `*_norm` → norm +- `lora_*` → adapter + +## Step 2: Measurement (organ_measure.py) + +Z-measure: Z = dI/d(log s) * exp(i*theta) + +For each tensor, sample up to 100,000 values and compute: + +1. **Entropy** (information density): + - Histogram-based Shannon entropy + - Normalized to [0, 1] against maximum entropy + - High entropy (>0.95) = uniform = noise + - Moderate entropy (0.3-0.7) = structured information + +2. **Kurtosis** (structure): + - Fourth standardized moment minus 3 + - High absolute kurtosis = sharp peaks = organized structure + - Near-zero = Gaussian-like = less organization + +3. 
**Scale coherence** (CV of sorted diffs): + - Sort sampled values, compute differences + - Coefficient of variation of these differences + - High CV = non-uniform spacing = structured signal + - Low CV = uniform spacing = noise + +Combined score → theta in [0, 90] degrees. + +## Step 3: Purification (organ_purify_v2.py) + +Fractal signal extraction via Haar wavelets: + +1. Pad tensor to power-of-2 length +2. Haar wavelet decomposition across N scales +3. At each scale: approximation + detail coefficients +4. Cross-scale coherence check: + - Compare energy at scale s with energy at scale 2s + - High coherence (pattern exists at both scales) = fractal = signal + - Low coherence (pattern at one scale only) = noise +5. Attenuate incoherent components (noise) +6. Reconstruct from coherent components (signal) +7. Restore original scale (mean/std preservation) + +This directly implements dI/d(log s): information that persists across +logarithmic scales is the signal. Everything else is training artifact. + +## Step 4: Grafting (organ_graft.py, transplant_935.py) + +Two methods: + +### Via .bin intermediaries (organ_graft.py) +1. Extract both source and target models to organ directories +2. Match tensors by layer number and type suffix +3. Verify dimensional compatibility +4. Copy matching .bin files from donor to recipient directory +5. Update manifest + +### Direct GGUF-to-GGUF (transplant_935.py) +1. Parse both GGUF headers to get tensor name/offset/size maps +2. Copy base GGUF entirely as starting point +3. For each FFN tensor in base that has a matching donor tensor: + - Verify exact byte size match + - Seek to donor tensor data, read + - Seek to base tensor offset in output, overwrite +4. Result: valid GGUF with patched FFN layers + +Direct method is faster and avoids header format issues. + +## Step 5: Assembly (organ_assemble.py) + +Reconstruct GGUF from organ directory: +1. Read manifest for metadata and tensor ordering +2. 
Write GGUF header (magic, version, n_tensors, n_metadata) +3. Write metadata key-value pairs +4. Write tensor info (name, dims, dtype, offset) with 32-byte alignment +5. Write tensor data with padding +6. Result: standard GGUF loadable by any compatible runtime + +## Step 6: Validation + +Run chimera through InferenceX: +- Load GGUF, validate all tensors +- Initialize transformer (attention, KV cache, kernel dispatch) +- Run inference with chat template +- Verify coherent output + +## Key Finding + +Graft success depends on architectural proximity: +- Same family (Qwen2 base) + same scale (14B) = coherent output +- Same family + different scale (7B) = PAD token failure +- The latent space alignment is implicit in shared training lineage + +## Signature + +935 diff --git a/docs/RESULTS.md b/docs/RESULTS.md new file mode 100644 index 0000000..9d6dd48 --- /dev/null +++ b/docs/RESULTS.md @@ -0,0 +1,116 @@ +# Results + +## Dissection — 13 Models + +All models dissected from GGUF to organ .bin files on OASIS (EPYC 48c/503GB). + +| Model | Params | Size | Tensors | Time | +|-------|--------|------|---------|------| +| DeepSeek-R1-Distill-14B | 14B | 9,167 MB | 579 | 22.9s | +| Qwen2.5-14B | 14B | 9,027 MB | 579 | pre-existing | +| Gemma-2-9B | 9B | 5,984 MB | 464 | 14.8s | +| Llama-3.1-8B | 8B | 4,950 MB | 292 | 12.0s | +| Qwen2.5-7B | 7B | 4,812 MB | 339 | pre-existing | +| DeepSeek-R1-Distill-7B | 7B | 4,812 MB | 339 | 12.6s | +| DeepSeek-R1-7B | 7B | 4,812 MB | 339 | pre-existing | +| Mistral-7B | 7B | 4,432 MB | 291 | 10.6s | +| Phi-3.5-Mini | 3.8B | 2,397 MB | 197 | 4.9s | +| Llama-3.2-3B | 3B | 2,100 MB | 255 | 4.9s | +| Qwen2.5-3B | 3B | 2,003 MB | 434 | 4.6s | +| Llama-3.2-1B | 1B | 856 MB | 147 | 2.4s | +| SmolLM2-135M | 135M | 137 MB | 272 | pre-existing | + +**Total: 50.8 GB of extracted organs. 
5,600+ tensors.** + +## Z-Measure — Full Ranking + +| # | Model | θ mean | Signal | Tensors | Architecture | +|---|-------|--------|--------|---------|-------------| +| ★ | Kimi K2.5 | 87.65° | 0.999 | 1,083 | DeepSeek2 MoE | +| 1 | SmolLM2-135M | 52.28° | 0.777 | 272 | LLaMA | +| 2 | DeepSeek-R1-14B | 46.01° | 0.641 | 579 | Qwen2 | +| 3 | Qwen2.5-3B | 46.00° | 0.640 | 434 | Qwen2 | +| 4 | Qwen2.5-14B | 45.98° | 0.640 | 579 | Qwen2 | +| 5 | Qwen2.5-7B | 45.64° | 0.639 | 339 | Qwen2 | +| 6 | Chimera-DSeek-Qwen | 45.53° | 0.637 | 339 | Qwen2 | +| 7 | DeepSeek-R1-Distill-7B | 45.53° | 0.637 | 339 | Qwen2 | +| 8 | DeepSeek-R1-7B | 45.42° | 0.636 | 339 | Qwen2 | +| 9 | Gemma-2-9B | 44.94° | 0.624 | 464 | Gemma | +| 10 | Phi-3.5-Mini | 44.65° | 0.626 | 197 | Phi | +| 11 | Llama-3.1-8B | 37.87° | 0.549 | 292 | LLaMA | +| 12 | Llama-3.2-1B | 37.57° | 0.550 | 147 | LLaMA | +| 13 | Llama-3.2-3B | 37.41° | 0.547 | 255 | LLaMA | +| 14 | Mistral-7B | 36.21° | 0.540 | 291 | Mistral | + +### Organ Type Breakdown (per-model averages) + +| Model | Skeleton θ | Organs θ | Embed θ | Norm θ | +|-------|-----------|---------|---------|--------| +| SmolLM2-135M | 53.6° | 52.3° | 47.2° | — | +| Qwen2.5-14B | 55.2° | 35.4° | 25.5° | — | +| Qwen2.5-7B | 54.6° | 35.5° | 25.9° | — | +| DeepSeek-R1-14B | 55.4° | 35.2° | 25.2° | — | +| Gemma-2-9B | 47.2° | 37.9° | 26.2° | 81.6° | +| Phi-3.5-Mini | 56.7° | 43.2° | 26.7° | — | +| Llama-3.1-8B | 39.7° | 39.1° | 26.0° | — | +| Mistral-7B | 38.4° | 36.8° | 26.0° | — | + +**Pattern**: Skeleton (attention) consistently scores higher than organs (FFN). +Norm layers reach highest θ when measured separately (Gemma: 81.6°). + +## Chimera Iterations + +### 1. chimera-r1-qwen-7b-v2 — FAILED +- Base: DeepSeek-R1-Distill-Qwen-7B +- Donor: Qwen2.5-7B (FFN organs) +- Result: 512 PAD tokens. Latent spaces incompatible at 7B scale. +- Evidence: `evidence/chimera-7b-failed.log` + +### 2. 
chimera-selective-v3 — CLEANED +- Selective graft attempt, removed during iteration. + +### 3. model-935-v2 — READY +- Marked as viable intermediate. + +### 4. model-935-v3, model-935-fractal — CLEANED +- Further iterations, removed during cleanup. + +### 5. model-935-14b — SUCCESS +- Base: DeepSeek-R1-Distill-Qwen-14B (skeleton + embeddings) +- Donor: Qwen2.5-14B (FFN organs) +- 579 tensors, 8.4 GB, Qwen2 architecture +- **Produces coherent reasoning output** +- Evidence: `evidence/model-935-14b-inference.log` + +Prompt: "Write a Python function called is_prime" +Output: Structured chain-of-thought reasoning. Correctly identifies prime number +definition, handles edge cases (n < 2), outlines algorithm steps. DeepSeek-R1 +thinking style ("Okay, so the user wants me to...", "Hmm, let's see"). + +**This is a chimera assembled from two different models without any retraining +that produces coherent, structured, correct output.** + +## Kimi K2.5 1T — Deep Z-Profile + +Streaming Z-measure across 13 shards, 1,083 tensors measured. + +| Component | Count | θ avg | +|-----------|-------|-------| +| FFN dense (blk.0) | 12 | 89.95° | +| MoE experts (384x) | 23 | 89.77° | +| Norm layers | 12 | 89.70° | +| Embedding | 1 | 89.45° | +| Shared expert | 23 | 89.43° | +| Attention (MLA) | 99 | 84.07° | + +8 gravitational wells identified at lowest θ — points of maximum compression. + +## Purification + +SmolLM2-135M purified using fractal method (organ_purify_v2.py). +Output: `organs-pure/smollm2-135m/` (138 MB) +Manifest: `PURE_SMOLLM2`, 30 layers, 272 tensors. + +## Signature + +935 diff --git a/transplant_935.py b/transplant_935.py new file mode 100644 index 0000000..353218a --- /dev/null +++ b/transplant_935.py @@ -0,0 +1,126 @@ +#!/usr/bin/env python3 +""" +GGUF-to-GGUF transplant. No organ bins — direct tensor copy between GGUF files. 
+Base: DeepSeek-R1-Distill-Qwen-7B (skeleton/attention/embed) +Donor: Qwen2.5-7B (FFN organs only) +Z = dI/d(log s) · exp(iθ) — Signature 935 +""" +import struct, os, sys, shutil + +def parse_gguf_header(path): + """Parse GGUF header, return tensor_info list and data_start offset.""" + f = open(path, "rb") + magic = struct.unpack("