# Results
## Dissection — 13 Models
All 13 models were dissected from GGUF into organ `.bin` files on OASIS (EPYC, 48 cores / 503 GB RAM).

| Model | Params | Size | Tensors | Time |
|-------|--------|------|---------|------|
| DeepSeek-R1-Distill-14B | 14B | 9,167 MB | 579 | 22.9s |
| Qwen2.5-14B | 14B | 9,027 MB | 579 | pre-existing |
| Gemma-2-9B | 9B | 5,984 MB | 464 | 14.8s |
| Llama-3.1-8B | 8B | 4,950 MB | 292 | 12.0s |
| Qwen2.5-7B | 7B | 4,812 MB | 339 | pre-existing |
| DeepSeek-R1-Distill-7B | 7B | 4,812 MB | 339 | 12.6s |
| DeepSeek-R1-7B | 7B | 4,812 MB | 339 | pre-existing |
| Mistral-7B | 7B | 4,432 MB | 291 | 10.6s |
| Phi-3.5-Mini | 3.8B | 2,397 MB | 197 | 4.9s |
| Llama-3.2-3B | 3B | 2,100 MB | 255 | 4.9s |
| Qwen2.5-3B | 3B | 2,003 MB | 434 | 4.6s |
| Llama-3.2-1B | 1B | 856 MB | 147 | 2.4s |
| SmolLM2-135M | 135M | 137 MB | 272 | pre-existing |

**Total: 50.8 GB of extracted organs. 5,600+ tensors.**
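The dissection script itself is not reproduced here, but its core step can be sketched as writing each named tensor out as a raw `.bin` organ file. Everything below (the `dissect` helper, the toy tensor names) is illustrative, not the actual tool:

```python
import struct
import tempfile
from pathlib import Path

def dissect(tensors: dict[str, list[float]], out_dir: str) -> list[Path]:
    """Write each named tensor to its own raw little-endian float32 .bin file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, values in tensors.items():
        # One organ file per tensor; dots in GGUF names become underscores.
        path = out / (name.replace(".", "_") + ".bin")
        path.write_bytes(struct.pack(f"<{len(values)}f", *values))
        written.append(path)
    return written

# Toy example: two "organs" from a fictitious two-tensor model.
demo = {
    "blk.0.attn_q.weight": [0.1, 0.2, 0.3],
    "blk.0.ffn_up.weight": [1.0, -1.0],
}
paths = dissect(demo, tempfile.mkdtemp())
```

The real pipeline streams tensors out of GGUF shards rather than a toy dict, but the write-out step has this shape.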
## Z-Measure — Full Ranking

| # | Model | θ mean | Signal | Tensors | Architecture |
|---|-------|--------|--------|---------|--------------|
| ★ | Kimi K2.5 | 87.65° | 0.999 | 1,083 | DeepSeek2 MoE |
| 1 | SmolLM2-135M | 52.28° | 0.777 | 272 | LLaMA |
| 2 | DeepSeek-R1-14B | 46.01° | 0.641 | 579 | Qwen2 |
| 3 | Qwen2.5-3B | 46.00° | 0.640 | 434 | Qwen2 |
| 4 | Qwen2.5-14B | 45.98° | 0.640 | 579 | Qwen2 |
| 5 | Qwen2.5-7B | 45.64° | 0.639 | 339 | Qwen2 |
| 6 | Chimera-DSeek-Qwen | 45.53° | 0.637 | 339 | Qwen2 |
| 7 | DeepSeek-R1-Distill-7B | 45.53° | 0.637 | 339 | Qwen2 |
| 8 | DeepSeek-R1-7B | 45.42° | 0.636 | 339 | Qwen2 |
| 9 | Gemma-2-9B | 44.94° | 0.624 | 464 | Gemma |
| 10 | Phi-3.5-Mini | 44.65° | 0.626 | 197 | Phi |
| 11 | Llama-3.1-8B | 37.87° | 0.549 | 292 | LLaMA |
| 12 | Llama-3.2-1B | 37.57° | 0.550 | 147 | LLaMA |
| 13 | Llama-3.2-3B | 37.41° | 0.547 | 255 | LLaMA |
| 14 | Mistral-7B | 36.21° | 0.540 | 291 | Mistral |
### Organ Type Breakdown (per-model averages)

| Model | Skeleton θ | Organs θ | Embed θ | Norm θ |
|-------|------------|----------|---------|--------|
| SmolLM2-135M | 53.6° | 52.3° | 47.2° | — |
| Qwen2.5-14B | 55.2° | 35.4° | 25.5° | — |
| Qwen2.5-7B | 54.6° | 35.5° | 25.9° | — |
| DeepSeek-R1-14B | 55.4° | 35.2° | 25.2° | — |
| Gemma-2-9B | 47.2° | 37.9° | 26.2° | 81.6° |
| Phi-3.5-Mini | 56.7° | 43.2° | 26.7° | — |
| Llama-3.1-8B | 39.7° | 39.1° | 26.0° | — |
| Mistral-7B | 38.4° | 36.8° | 26.0° | — |
**Pattern**: the skeleton (attention) consistently scores higher θ than the organs (FFN), while norm layers reach the highest θ when measured separately (Gemma: 81.6°).

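A per-organ breakdown like the table above can be produced by classifying tensor names into organ types and averaging θ per group. The name patterns below are assumptions (typical GGUF tensor names), not the classifier actually used:

```python
# Hypothetical organ-type patterns; the real classifier may differ.
ORGAN_TYPES = {
    "skeleton": ("attn_q", "attn_k", "attn_v", "attn_output"),
    "organs": ("ffn_up", "ffn_down", "ffn_gate"),
    "embed": ("token_embd", "output"),
    "norm": ("norm",),
}

def classify(name: str) -> str:
    """Map a GGUF tensor name to an organ type (first matching pattern wins)."""
    for organ, keys in ORGAN_TYPES.items():
        if any(k in name for k in keys):
            return organ
    return "other"

def breakdown(thetas: dict[str, float]) -> dict[str, float]:
    """Average θ per organ type over a {tensor_name: theta} mapping."""
    groups: dict[str, list[float]] = {}
    for name, theta in thetas.items():
        groups.setdefault(classify(name), []).append(theta)
    return {g: sum(v) / len(v) for g, v in groups.items()}
```

Note the check order matters: `attn_norm` must fall through the skeleton patterns before matching `norm`.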
## Chimera Iterations
### 1. chimera-r1-qwen-7b-v2 — FAILED

- Base: DeepSeek-R1-Distill-Qwen-7B
- Donor: Qwen2.5-7B (FFN organs)
- Result: output degenerated into 512 PAD tokens; latent spaces incompatible at 7B scale
- Evidence: `evidence/chimera-7b-failed.log`

### 2. chimera-selective-v3 — CLEANED

- Selective graft attempt, removed during iteration.

### 3. model-935-v2 — READY

- Marked as a viable intermediate.

### 4. model-935-v3, model-935-fractal — CLEANED

- Further iterations, removed during cleanup.

### 5. model-935-14b — SUCCESS

- Base: DeepSeek-R1-Distill-Qwen-14B (skeleton + embeddings)
- Donor: Qwen2.5-14B (FFN organs)
- 579 tensors, 8.4 GB, Qwen2 architecture
- **Produces coherent reasoning output**
- Evidence: `evidence/model-935-14b-inference.log`

Prompt: "Write a Python function called is_prime"

Output: structured chain-of-thought reasoning. The model correctly states the prime-number definition, handles the edge cases (n < 2), and outlines the algorithm steps, all in the DeepSeek-R1 thinking style ("Okay, so the user wants me to...", "Hmm, let's see").

**This is a chimera assembled from two different models, without any retraining, that produces coherent, structured, correct output.**

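The assembly recipe above (base skeleton and embeddings, donor FFN organs) amounts to a keyed tensor merge. The sketch below is a stand-in for the actual script; the `graft` name and the FFN key patterns are assumptions, and it presumes both models share tensor names and shapes (true for the two Qwen2-architecture models used here):

```python
def graft(base: dict[str, bytes], donor: dict[str, bytes],
          donor_keys: tuple[str, ...] = ("ffn_up", "ffn_down", "ffn_gate")) -> dict[str, bytes]:
    """Keep the base's skeleton and embeddings; take FFN organs from the donor."""
    chimera = dict(base)
    for name, tensor in donor.items():
        if any(k in name for k in donor_keys):
            if name not in base:
                # Architectures must line up tensor-for-tensor.
                raise KeyError(f"donor tensor {name} has no slot in base")
            chimera[name] = tensor
    return chimera
```

Tensor payloads are shown as opaque `bytes` for brevity; the real merge would carry dtype and shape metadata alongside each organ.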
## Kimi K2.5 1T — Deep Z-Profile

Streaming Z-measure across 13 shards; 1,083 tensors measured.

| Component | Count | θ avg |
|-----------|-------|-------|
| FFN dense (blk.0) | 12 | 89.95° |
| MoE experts (384x) | 23 | 89.77° |
| Norm layers | 12 | 89.70° |
| Embedding | 1 | 89.45° |
| Shared expert | 23 | 89.43° |
| Attention (MLA) | 99 | 84.07° |

8 gravitational wells identified at lowest θ — points of maximum compression.
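The streaming profiler is not shown here; a minimal sketch of one pass over per-shard records, keeping running per-component averages plus the k lowest-θ tensors (the "wells"), might look like this (the `profile` name and the record shape are assumptions):

```python
import heapq
from collections import defaultdict

def profile(stream, k: int = 8):
    """One pass over (component, name, theta) records from all shards.

    Returns per-component average θ and the k lowest-θ tensors,
    without ever holding more than k well candidates in memory.
    """
    sums = defaultdict(lambda: [0.0, 0])
    wells = []  # max-heap via negated θ, capped at k entries
    for component, name, theta in stream:
        acc = sums[component]
        acc[0] += theta
        acc[1] += 1
        heapq.heappush(wells, (-theta, name))
        if len(wells) > k:
            heapq.heappop(wells)  # evict the highest θ seen so far
    averages = {c: s / n for c, (s, n) in sums.items()}
    return averages, sorted((-t, n) for t, n in wells)
```

Because only k candidates are retained, this works even when the 1T model's shards are far too large to load at once.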
## Purification

SmolLM2-135M was purified using the fractal method (`organ_purify_v2.py`).

- Output: `organs-pure/smollm2-135m/` (138 MB)
- Manifest: `PURE_SMOLLM2`, 30 layers, 272 tensors

## Signature
935