DOCS: Architecture, Results, Methodology. Evidence logs. transplant_935.py (direct GGUF→GGUF graft). Chimera 14B confirmed reasoning. Signature 935.

This commit is contained in:
ElmadaniS 2026-02-21 04:32:15 +01:00
parent 3582053790
commit 7b42514326
4 changed files with 474 additions and 0 deletions

docs/ARCHITECTURE.md Normal file

@@ -0,0 +1,116 @@
# Architecture
## Model Anatomy
A transformer model has four anatomical systems:
```
┌─────────────────────────────────────────┐
│ GGUF MONOLITH │
│ │
│ ┌─ embed ──────── token_embd.weight │
│ │ output.weight │
│ │ output_norm.weight │
│ │ │
│ ├─ skeleton ───── attn_q.weight ×N │
│ │ attn_k.weight ×N │
│ │ attn_v.weight ×N │
│ │ attn_output ×N │
│ │ │
│ ├─ organs ─────── ffn_gate.weight ×N │
│ │ ffn_up.weight ×N │
│ │ ffn_down.weight ×N │
│ │ │
│ └─ norm ───────── attn_norm ×N │
│ ffn_norm ×N │
└─────────────────────────────────────────┘
```
- **Skeleton** (attention) = how the model thinks. Shared thought patterns.
- **Organs** (FFN) = what the model knows. Domain knowledge.
- **Embed** = input/output translation. The vocabulary interface.
- **Norm** = normalization layers. Connective tissue between components.
## Pipeline
```
GGUF file
▼ organ_extract.py
├── manifest.json (complete anatomy map)
├── skeleton/ (attention tensors)
├── organs/ (FFN tensors by layer)
├── embed/ (embedding + output)
└── norm/ (normalization)
▼ organ_measure.py
Z-measure per tensor
θ ∈ [0°, 90°]
├──▶ organ_purify_v2.py (fractal signal extraction)
├──▶ organ_graft.py (transplant between models)
└──▶ organ_assemble.py → new GGUF
```
Alternative direct path (no intermediate .bin files):
```
GGUF_A + GGUF_B → transplant_935.py → chimera.gguf
```
## Z-Measure Theory
```
Z = dI/d(log s) · exp(iθ)
```
Three indicators combined into θ:
| Indicator | Measures | Signal | Noise |
|-----------|----------|--------|-------|
| Entropy | Information density | Moderate (0.3-0.7) | Near-maximum (>0.95) |
| Kurtosis | Structural sharpness | High (abs > 3) | Near-zero |
| Scale coherence (CV) | Non-uniform spacing | High (> 1) | Low (< 0.5) |
θ → 90° = pure signal (all three indicators confirm structure)
θ → 0° = pure noise (uniform random distribution)
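The table's thresholds can be read as a vote across the three indicators. A minimal sketch; the 2-of-3 majority rule here is an illustrative assumption, not the actual combination used in organ_measure.py:

```python
def classify_indicators(entropy, kurtosis, cv):
    """Vote across the three indicators using the thresholds from the
    table above. The 2-of-3 majority rule is an illustrative assumption."""
    votes = 0
    votes += 0.3 <= entropy <= 0.7   # moderate entropy = structured information
    votes += abs(kurtosis) > 3       # sharp peaks = organized structure
    votes += cv > 1                  # non-uniform spacing = structured signal
    return "signal" if votes >= 2 else "noise"
```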
## Purification Methods
### V1: Spectral (FFT)
- Decompose tensor into frequency domain
- Keep high-energy components (signal), remove low-energy tail (noise)
- Preserve original scale (mean/std)
- Limitation: treats tensors like audio signals
### V2: Fractal (Wavelets)
- Haar wavelet multi-scale decomposition
- Cross-scale coherence: pattern at scale s AND scale 2s = fractal = signal
- Pattern at one scale only = noise
- This IS dI/d(log s) — information that persists across scales
- More theoretically grounded than V1
## Graft Compatibility
Grafting works best between models that share:
- Same base architecture (e.g., Qwen2 family)
- Same embedding dimension
- Same number of layers (or graft specific layer ranges)
Empirical results:
- DeepSeek-R1-Distill-14B ↔ Qwen2.5-14B: **WORKS** (both Qwen2 arch, same dims)
- DeepSeek-R1-Distill-7B ↔ Qwen2.5-7B: **PAD tokens** (7B chimera failed)
- Same architecture + same scale = highest success probability
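A pre-graft check for these conditions might read as follows (the manifest field names `arch`, `embed_dim`, and `n_layers` are hypothetical placeholders, not the actual manifest.json keys):

```python
def graft_compatible(base_manifest, donor_manifest):
    """Return True when the three compatibility conditions above hold.
    Field names are hypothetical placeholders for the manifest schema."""
    return (base_manifest["arch"] == donor_manifest["arch"]
            and base_manifest["embed_dim"] == donor_manifest["embed_dim"]
            and base_manifest["n_layers"] == donor_manifest["n_layers"])
```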
## File Format
Organ .bin files: `[name_len:u32][name:bytes][n_dims:u32][dims:u64×n][dtype:u32][tensor_data]`
Manifest: JSON with full tensor map, metadata, architecture info, Z-measure results.
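The .bin layout can be round-tripped with plain struct calls. A minimal sketch, assuming little-endian fields (matching the GGUF convention) and ggml-style dtype codes (F32 = 0):

```python
import struct
import numpy as np

def write_organ_bin(path, name, tensor, dtype_code):
    """Serialize one tensor in the organ .bin layout:
    [name_len:u32][name][n_dims:u32][dims:u64 x n][dtype:u32][tensor_data]."""
    raw = name.encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(raw)))
        f.write(raw)
        f.write(struct.pack("<I", tensor.ndim))
        for d in tensor.shape:
            f.write(struct.pack("<Q", d))
        f.write(struct.pack("<I", dtype_code))
        f.write(tensor.tobytes())

def read_organ_bin(path):
    """Inverse of write_organ_bin; returns (name, dims, dtype_code, raw_data)."""
    with open(path, "rb") as f:
        name_len = struct.unpack("<I", f.read(4))[0]
        name = f.read(name_len).decode("utf-8")
        n_dims = struct.unpack("<I", f.read(4))[0]
        dims = [struct.unpack("<Q", f.read(8))[0] for _ in range(n_dims)]
        dtype_code = struct.unpack("<I", f.read(4))[0]
        data = f.read()
    return name, dims, dtype_code, data
```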
## Signature
935

docs/METHODOLOGY.md Normal file

@@ -0,0 +1,116 @@
# Methodology
## Approach
Organ Architecture treats trained AI models as biological organisms with
transplantable parts. Instead of retraining from scratch (costs billions),
we perform post-training surgery: extract, measure, graft, reassemble.
## Step 1: Extraction (organ_extract.py)
Parse GGUF binary format directly:
- Read magic number, version, metadata, tensor info
- Classify each tensor by name pattern into anatomical types
- Extract each tensor as independent .bin file with header
- Generate manifest.json mapping the full anatomy
Classification rules:
- `attn_q`, `attn_k`, `attn_v`, `attn_output` → skeleton
- `ffn_gate`, `ffn_up`, `ffn_down` → organ
- `token_embd`, `output.weight` → embed
- `*_norm` → norm
- `lora_*` → adapter
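A minimal classifier implementing these rules might look like this. The substring matching is an illustrative guess at organ_extract.py's actual matcher; the `_norm` check runs first so names such as `attn_q_norm` (present in some architectures) are not misfiled as skeleton:

```python
def classify(name: str) -> str:
    """Map a GGUF tensor name to its anatomical type using the rules above.
    Illustrative sketch; checks norm before skeleton/organ on purpose."""
    if name.startswith("lora_"):
        return "adapter"
    if "_norm" in name:
        return "norm"
    if any(tag in name for tag in ("attn_q", "attn_k", "attn_v", "attn_output")):
        return "skeleton"
    if any(tag in name for tag in ("ffn_gate", "ffn_up", "ffn_down")):
        return "organ"
    if "token_embd" in name or name == "output.weight":
        return "embed"
    return "unknown"
```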
## Step 2: Measurement (organ_measure.py)
Z-measure: Z = dI/d(log s) · exp(iθ)
For each tensor, sample up to 100,000 values and compute:
1. **Entropy** (information density):
- Histogram-based Shannon entropy
- Normalized to [0, 1] against maximum entropy
- High entropy (>0.95) = uniform = noise
- Moderate entropy (0.3-0.7) = structured information
2. **Kurtosis** (structure):
- Fourth standardized moment minus 3
- High absolute kurtosis = sharp peaks = organized structure
- Near-zero = Gaussian-like = less organization
3. **Scale coherence** (CV of sorted diffs):
- Sort sampled values, compute differences
- Coefficient of variation of these differences
- High CV = non-uniform spacing = structured signal
- Low CV = uniform spacing = noise
Combined score → theta in [0, 90] degrees.
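The three indicators can be sketched as follows. organ_measure.py's exact scoring is not documented here, so the per-indicator normalization to [0, 1] and the equal-weight average into θ are assumptions:

```python
import numpy as np

def z_theta(values, bins=64, max_samples=100_000):
    """Compute the three indicators above and combine them into theta in
    [0, 90] degrees. The [0,1] scoring and equal-weight average are assumed."""
    x = np.asarray(values, dtype=np.float64).ravel()
    if x.size > max_samples:
        x = np.random.default_rng(0).choice(x, max_samples, replace=False)
    # 1. Histogram-based Shannon entropy, normalized against maximum entropy
    hist, _ = np.histogram(x, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum() / np.log2(bins)
    # 2. Excess kurtosis: fourth standardized moment minus 3
    z = (x - x.mean()) / (x.std() + 1e-12)
    kurt = (z ** 4).mean() - 3.0
    # 3. Coefficient of variation of sorted-value differences
    diffs = np.diff(np.sort(x))
    cv = diffs.std() / (diffs.mean() + 1e-12)
    # Map each indicator to a [0, 1] signal score, then average (assumed)
    s_entropy = 1.0 - abs(entropy - 0.5) * 2.0   # moderate entropy = signal
    s_kurt = min(abs(kurt) / 3.0, 1.0)           # |kurtosis| > 3 = signal
    s_cv = min(cv, 1.0)                          # CV > 1 = signal
    return 90.0 * (s_entropy + s_kurt + s_cv) / 3.0
```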
## Step 3: Purification (organ_purify_v2.py)
Fractal signal extraction via Haar wavelets:
1. Pad tensor to power-of-2 length
2. Haar wavelet decomposition across N scales
3. At each scale: approximation + detail coefficients
4. Cross-scale coherence check:
- Compare energy at scale s with energy at scale 2s
- High coherence (pattern exists at both scales) = fractal = signal
- Low coherence (pattern at one scale only) = noise
5. Attenuate incoherent components (noise)
6. Reconstruct from coherent components (signal)
7. Restore original scale (mean/std preservation)
This directly implements dI/d(log s): information that persists across
logarithmic scales is the signal. Everything else is training artifact.
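The seven steps above can be sketched in 1D. The coherence test (band energies within a factor of 4 of the next coarser scale) and the attenuation factor are illustrative assumptions, not the organ_purify_v2.py constants:

```python
import numpy as np

def haar_step(x):
    """One Haar level: pairwise averages (approximation) and differences (detail)."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def fractal_purify(x, atten=0.1):
    """Pad to power of 2, decompose, attenuate detail bands whose energy is
    incoherent with the next coarser scale, reconstruct, restore mean/std.
    Coherence test and atten are illustrative assumptions."""
    x = np.asarray(x, dtype=np.float64)
    n = 1 << int(np.ceil(np.log2(len(x))))
    padded = np.zeros(n)
    padded[:len(x)] = x
    approx, details = padded, []
    while len(approx) > 1:
        approx, d = haar_step(approx)
        details.append(d)
    # Cross-scale coherence: compare each band's energy with the next coarser band
    energies = [float((d ** 2).mean()) for d in details]
    for i, d in enumerate(details[:-1]):
        ratio = energies[i] / (energies[i + 1] + 1e-12)
        if not (0.25 < ratio < 4.0):   # pattern absent at scale 2s -> noise
            details[i] = d * atten
    # Inverse Haar reconstruction from coarsest to finest band
    for d in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + d) / np.sqrt(2)
        out[1::2] = (approx - d) / np.sqrt(2)
        approx = out
    y = approx[:len(x)]
    # Restore original scale (step 7: mean/std preservation)
    return (y - y.mean()) / (y.std() + 1e-12) * np.std(x) + np.mean(x)
```

With `atten=1.0` no band is attenuated, so the function reduces to a perfect Haar round-trip, which is a useful sanity check.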
## Step 4: Grafting (organ_graft.py, transplant_935.py)
Two methods:
### Via .bin intermediaries (organ_graft.py)
1. Extract both source and target models to organ directories
2. Match tensors by layer number and type suffix
3. Verify dimensional compatibility
4. Copy matching .bin files from donor to recipient directory
5. Update manifest
### Direct GGUF-to-GGUF (transplant_935.py)
1. Parse both GGUF headers to get tensor name/offset/size maps
2. Copy base GGUF entirely as starting point
3. For each FFN tensor in base that has a matching donor tensor:
- Verify exact byte size match
- Seek to donor tensor data, read
- Seek to base tensor offset in output, overwrite
4. Result: valid GGUF with patched FFN layers
Direct method is faster and avoids header format issues.
## Step 5: Assembly (organ_assemble.py)
Reconstruct GGUF from organ directory:
1. Read manifest for metadata and tensor ordering
2. Write GGUF header (magic, version, n_tensors, n_metadata)
3. Write metadata key-value pairs
4. Write tensor info (name, dims, dtype, offset) with 32-byte alignment
5. Write tensor data with padding
6. Result: standard GGUF loadable by any compatible runtime
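The 32-byte alignment in steps 4 and 5 reduces to one small helper; the padding written to the file is the difference between the aligned and unaligned offsets:

```python
def align_offset(offset, alignment=32):
    """Round a data offset up to the next alignment boundary."""
    return offset + (alignment - offset % alignment) % alignment
```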
## Step 6: Validation
Run chimera through InferenceX:
- Load GGUF, validate all tensors
- Initialize transformer (attention, KV cache, kernel dispatch)
- Run inference with chat template
- Verify coherent output
## Key Finding
Graft success depends on architectural proximity:
- Same family (Qwen2 base) + same scale (14B) = coherent output
- Same family + different scale (7B) = PAD token failure
- The latent space alignment is implicit in shared training lineage
## Signature
935

docs/RESULTS.md Normal file

@@ -0,0 +1,116 @@
# Results
## Dissection — 13 Models
All models dissected from GGUF to organ .bin files on OASIS (EPYC 48c/503GB).
| Model | Params | Size | Tensors | Time |
|-------|--------|------|---------|------|
| DeepSeek-R1-Distill-14B | 14B | 9,167 MB | 579 | 22.9s |
| Qwen2.5-14B | 14B | 9,027 MB | 579 | pre-existing |
| Gemma-2-9B | 9B | 5,984 MB | 464 | 14.8s |
| Llama-3.1-8B | 8B | 4,950 MB | 292 | 12.0s |
| Qwen2.5-7B | 7B | 4,812 MB | 339 | pre-existing |
| DeepSeek-R1-Distill-7B | 7B | 4,812 MB | 339 | 12.6s |
| DeepSeek-R1-7B | 7B | 4,812 MB | 339 | pre-existing |
| Mistral-7B | 7B | 4,432 MB | 291 | 10.6s |
| Phi-3.5-Mini | 3.8B | 2,397 MB | 197 | 4.9s |
| Llama-3.2-3B | 3B | 2,100 MB | 255 | 4.9s |
| Qwen2.5-3B | 3B | 2,003 MB | 434 | 4.6s |
| Llama-3.2-1B | 1B | 856 MB | 147 | 2.4s |
| SmolLM2-135M | 135M | 137 MB | 272 | pre-existing |
**Total: 50.8 GB of extracted organs. 5,600+ tensors.**
## Z-Measure — Full Ranking
| # | Model | θ mean | Signal | Tensors | Architecture |
|---|-------|--------|--------|---------|-------------|
| ★ | Kimi K2.5 | 87.65° | 0.999 | 1,083 | DeepSeek2 MoE |
| 1 | SmolLM2-135M | 52.28° | 0.777 | 272 | LLaMA |
| 2 | DeepSeek-R1-14B | 46.01° | 0.641 | 579 | Qwen2 |
| 3 | Qwen2.5-3B | 46.00° | 0.640 | 434 | Qwen2 |
| 4 | Qwen2.5-14B | 45.98° | 0.640 | 579 | Qwen2 |
| 5 | Qwen2.5-7B | 45.64° | 0.639 | 339 | Qwen2 |
| 6 | Chimera-DSeek-Qwen | 45.53° | 0.637 | 339 | Qwen2 |
| 7 | DeepSeek-R1-Distill-7B | 45.53° | 0.637 | 339 | Qwen2 |
| 8 | DeepSeek-R1-7B | 45.42° | 0.636 | 339 | Qwen2 |
| 9 | Gemma-2-9B | 44.94° | 0.624 | 464 | Gemma |
| 10 | Phi-3.5-Mini | 44.65° | 0.626 | 197 | Phi |
| 11 | Llama-3.1-8B | 37.87° | 0.549 | 292 | LLaMA |
| 12 | Llama-3.2-1B | 37.57° | 0.550 | 147 | LLaMA |
| 13 | Llama-3.2-3B | 37.41° | 0.547 | 255 | LLaMA |
| 14 | Mistral-7B | 36.21° | 0.540 | 291 | Mistral |
### Organ Type Breakdown (per-model averages)
| Model | Skeleton θ | Organs θ | Embed θ | Norm θ |
|-------|-----------|---------|---------|--------|
| SmolLM2-135M | 53.6° | 52.3° | 47.2° | — |
| Qwen2.5-14B | 55.2° | 35.4° | 25.5° | — |
| Qwen2.5-7B | 54.6° | 35.5° | 25.9° | — |
| DeepSeek-R1-14B | 55.4° | 35.2° | 25.2° | — |
| Gemma-2-9B | 47.2° | 37.9° | 26.2° | 81.6° |
| Phi-3.5-Mini | 56.7° | 43.2° | 26.7° | — |
| Llama-3.1-8B | 39.7° | 39.1° | 26.0° | — |
| Mistral-7B | 38.4° | 36.8° | 26.0° | — |
**Pattern**: Skeleton (attention) consistently scores higher than organs (FFN).
Norm layers reach highest θ when measured separately (Gemma: 81.6°).
## Chimera Iterations
### 1. chimera-r1-qwen-7b-v2 — FAILED
- Base: DeepSeek-R1-Distill-Qwen-7B
- Donor: Qwen2.5-7B (FFN organs)
- Result: 512 PAD tokens. Latent spaces incompatible at 7B scale.
- Evidence: `evidence/chimera-7b-failed.log`
### 2. chimera-selective-v3 — CLEANED
- Selective graft attempt, removed during iteration.
### 3. model-935-v2 — READY
- Marked as viable intermediate.
### 4. model-935-v3, model-935-fractal — CLEANED
- Further iterations, removed during cleanup.
### 5. model-935-14b — SUCCESS
- Base: DeepSeek-R1-Distill-Qwen-14B (skeleton + embeddings)
- Donor: Qwen2.5-14B (FFN organs)
- 579 tensors, 8.4 GB, Qwen2 architecture
- **Produces coherent reasoning output**
- Evidence: `evidence/model-935-14b-inference.log`
Prompt: "Write a Python function called is_prime"
Output: Structured chain-of-thought reasoning. Correctly identifies prime number
definition, handles edge cases (n < 2), outlines algorithm steps. DeepSeek-R1
thinking style ("Okay, so the user wants me to...", "Hmm, let's see").
**This is a chimera assembled from two different models without any retraining
that produces coherent, structured, correct output.**
## Kimi K2.5 1T — Deep Z-Profile
Streaming Z-measure across 13 shards, 1,083 tensors measured.
| Component | Count | θ avg |
|-----------|-------|-------|
| FFN dense (blk.0) | 12 | 89.95° |
| MoE experts (384x) | 23 | 89.77° |
| Norm layers | 12 | 89.70° |
| Embedding | 1 | 89.45° |
| Shared expert | 23 | 89.43° |
| Attention (MLA) | 99 | 84.07° |
8 gravitational wells identified at lowest θ — points of maximum compression.
## Purification
SmolLM2-135M purified using fractal method (organ_purify_v2.py).
Output: `organs-pure/smollm2-135m/` (138 MB)
Manifest: `PURE_SMOLLM2`, 30 layers, 272 tensors.
## Signature
935

transplant_935.py Normal file

@@ -0,0 +1,126 @@
#!/usr/bin/env python3
"""
GGUF-to-GGUF transplant. No organ bins direct tensor copy between GGUF files.
Base: DeepSeek-R1-Distill-Qwen-7B (skeleton/attention/embed)
Donor: Qwen2.5-7B (FFN organs only)
Z = dI/d(log s) · exp() Signature 935
"""
import struct, os, sys, shutil
def parse_gguf_header(path):
    """Parse GGUF header, return tensor_info list plus data_start and file_end offsets."""
    f = open(path, "rb")
    magic = struct.unpack("<I", f.read(4))[0]
    assert magic == 0x46554747, f"not a GGUF file: {path}"  # b"GGUF" little-endian
    version = struct.unpack("<I", f.read(4))[0]
    n_tensors = struct.unpack("<Q", f.read(8))[0]
    n_metadata = struct.unpack("<Q", f.read(8))[0]

    def read_string():
        slen = struct.unpack("<Q", f.read(8))[0]
        return f.read(slen).decode("utf-8")

    def skip_value(vtype):
        # Fixed-width GGUF metadata types: u8/i8/bool = 1, u16/i16 = 2,
        # u32/i32/f32 = 4, u64/i64/f64 = 8 bytes
        sizes = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
        if vtype in sizes:
            f.read(sizes[vtype])
        elif vtype == 8:  # string
            read_string()
        elif vtype == 9:  # array: element type, length, then elements
            arr_type = struct.unpack("<I", f.read(4))[0]
            arr_len = struct.unpack("<Q", f.read(8))[0]
            for _ in range(arr_len):
                skip_value(arr_type)

    for _ in range(n_metadata):
        read_string()  # key
        vtype = struct.unpack("<I", f.read(4))[0]
        skip_value(vtype)

    tensors = []
    for _ in range(n_tensors):
        name = read_string()
        n_dims = struct.unpack("<I", f.read(4))[0]
        dims = [struct.unpack("<Q", f.read(8))[0] for _ in range(n_dims)]
        dtype = struct.unpack("<I", f.read(4))[0]
        offset = struct.unpack("<Q", f.read(8))[0]
        tensors.append({"name": name, "dims": dims, "dtype": dtype, "offset": offset})

    # Tensor data begins at the next 32-byte boundary after the header
    pos = f.tell()
    padding = (32 - (pos % 32)) % 32
    f.read(padding)
    data_start = f.tell()
    f.seek(0, 2)
    file_end = f.tell()
    f.close()

    # Infer byte sizes from consecutive offsets (tensor-info order is
    # assumed to match data order, as llama.cpp writes it)
    for i in range(len(tensors)):
        if i + 1 < len(tensors):
            tensors[i]["size"] = tensors[i + 1]["offset"] - tensors[i]["offset"]
        else:
            tensors[i]["size"] = file_end - data_start - tensors[i]["offset"]
    return tensors, data_start, file_end
BASE = "/mnt/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf"
DONOR = "/mnt/models/Qwen2.5-7B-Instruct-Q4_K_M.gguf"
OUTPUT = "/mnt/models/model-935-final.gguf"

print("Parsing base (DeepSeek-R1-7B)...")
base_tensors, base_data_start, base_end = parse_gguf_header(BASE)
print(f"  {len(base_tensors)} tensors, data_start={base_data_start}")

print("Parsing donor (Qwen2.5-7B)...")
donor_tensors, donor_data_start, donor_end = parse_gguf_header(DONOR)
print(f"  {len(donor_tensors)} tensors, data_start={donor_data_start}")

# Build donor tensor map by name
donor_map = {t["name"]: t for t in donor_tensors}

# Copy base GGUF entirely first
print("Copying base to output...")
shutil.copy2(BASE, OUTPUT)

# Patch: for each FFN tensor in base, if the donor has a matching name+size, overwrite
out = open(OUTPUT, "r+b")
donor_f = open(DONOR, "rb")
grafted = 0
skipped = 0
for bt in base_tensors:
    name = bt["name"]
    # Only graft FFN organs (not attention, not embeddings, not norms)
    if not any(tag in name for tag in ("ffn_gate", "ffn_up", "ffn_down")):
        continue
    dt = donor_map.get(name)
    if dt is not None and bt["size"] == dt["size"]:
        # Read the tensor bytes from the donor
        donor_f.seek(donor_data_start + dt["offset"])
        data = donor_f.read(dt["size"])
        # Overwrite the same tensor in the output at the base offset
        out.seek(base_data_start + bt["offset"])
        out.write(data)
        grafted += 1
    else:
        skipped += 1
out.close()
donor_f.close()

print(f"\n{'=' * 60}")
print("  MODEL 935 — DIRECT GGUF TRANSPLANT")
print(f"{'=' * 60}")
print("  Base:    DeepSeek-R1-Distill-Qwen-7B (skeleton+embed)")
print("  Donor:   Qwen2.5-7B-Instruct (FFN organs)")
print(f"  Grafted: {grafted} FFN tensors")
print(f"  Skipped: {skipped} (size mismatch or not found)")
print(f"  Output:  {OUTPUT}")
print(f"  Size:    {os.path.getsize(OUTPUT)/(1024**3):.2f} GB")
print("  Signature: 935")
print(f"{'=' * 60}")