
# Salka Elmadani

Building Inference-X — better output from the same model.

Universal AI inference engine. Fused computation, adaptive precision, surgical expert loading. 305 KB, 19 backends, zero dependencies. Built in Morocco for the world.

## What I build

| Project | What it does |
| --- | --- |
| Inference-X | Universal inference engine: 305 KB binary, 19 hardware backends, 23 quantization formats, fused dequant+dot kernels, Shannon-entropy adaptive precision. Same model, cleaner signal. |
| ATLAS | Open mathematical framework for neural network analysis. |
| Organ Architecture | Neural network surgery: extracting, measuring, and grafting components between AI models to create functional chimeras. |

## How it works

The same model produces higher-fidelity output through Inference-X because the computation path is cleaner: fused kernels eliminate intermediate buffers, adaptive precision allocates depth where it matters, and surgical expert loading keeps only active parameters in memory.
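To make the first claim concrete, here is a minimal sketch of what a fused dequant+dot kernel looks like in spirit. The block layout (32 weights per block, 4-bit nibbles, one shared scale, implicit zero point of 8) is an illustrative assumption, not Inference-X's actual format:

```rust
/// One block of 32 weights quantized to 4 bits with a shared scale.
/// Layout is hypothetical, for illustration only.
struct QBlock {
    scale: f32,
    /// 32 weights packed two per byte; each nibble is an unsigned
    /// 4-bit value with an implicit zero point of 8.
    packed: [u8; 16],
}

/// Dequantize and accumulate in one pass: each weight is unpacked,
/// scaled, multiplied by its activation, and added to the running
/// sum. No f32 weight buffer is ever materialized.
fn fused_dequant_dot(blocks: &[QBlock], activations: &[f32]) -> f32 {
    debug_assert_eq!(activations.len(), blocks.len() * 32);
    let mut acc = 0.0f32;
    for (bi, block) in blocks.iter().enumerate() {
        for (wi, &byte) in block.packed.iter().enumerate() {
            let lo = (byte & 0x0F) as i32 - 8;
            let hi = (byte >> 4) as i32 - 8;
            let base = bi * 32 + wi * 2;
            acc += block.scale * lo as f32 * activations[base];
            acc += block.scale * hi as f32 * activations[base + 1];
        }
    }
    acc
}
```

The point of the fusion is that the quantized weights stay packed in cache and the expanded f32 values live only in registers; the separate dequantize-then-matmul path would write and re-read a full-width intermediate buffer.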

A smaller model running through a clean engine can outperform a larger model running through a noisy one.
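The "adaptive precision allocates depth where it matters" step can be sketched similarly. The following is a hypothetical illustration, assuming a 256-bin histogram estimate of a tensor's Shannon entropy and made-up bit-width thresholds; the actual Inference-X policy is not shown here:

```rust
/// Estimate the Shannon entropy (in bits) of a tensor's value
/// distribution via a 256-bin histogram. Bin count is an assumption.
fn shannon_entropy_bits(values: &[f32]) -> f32 {
    let (min, max) = values
        .iter()
        .fold((f32::MAX, f32::MIN), |(lo, hi), &v| (lo.min(v), hi.max(v)));
    let range = (max - min).max(f32::EPSILON);
    let mut hist = [0usize; 256];
    for &v in values {
        let bin = (((v - min) / range) * 255.0) as usize;
        hist[bin.min(255)] += 1;
    }
    let n = values.len() as f32;
    hist.iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f32 / n;
            -p * p.log2()
        })
        .sum()
}

/// Map entropy to a quantization width: flat, low-entropy tensors
/// compress aggressively; information-dense ones keep more bits.
/// Thresholds are illustrative, not the engine's real cutoffs.
fn pick_bits(entropy: f32) -> u8 {
    match entropy {
        e if e < 3.0 => 2,
        e if e < 5.0 => 4,
        e if e < 6.5 => 8,
        _ => 16,
    }
}
```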

## Philosophy

The best inference engine is the one you do not notice. You should hear the model, not the framework.

inference-x.com · Documentation · Source Code · Elmadani.SALKA@proton.me


Morocco · @ElmadaniSa13111 on X · Support the project