# IX Web — Web Interface for Inference-X

IX Web is a self-contained web chat interface for Inference-X. It lets you talk to any AI model running on your own hardware, with a model selector, hardware stats, and an OpenAI-compatible API.

**Zero dependencies.** Pure Python stdlib + one HTML file. No npm, no Node.js, no frameworks.

## Quickstart

```bash
# 1. Build Inference-X (from repo root)
make

# 2. Download a model
./ix download qwen-2.5-3b

# 3. Start IX Web
python3 web/ix_server.py
```

Open http://localhost:9090 — that's it. You have your own AI.

## What you get

- **Chat interface** at `/` — dark theme, model selector, typing indicator, markdown rendering
- **OpenAI-compatible API** at `/v1/chat/completions` — drop-in replacement for any OpenAI client
- **Model list** at `/v1/models` — all detected GGUF models with sizes
- **Hardware stats** at `/health` — CPU, RAM, core count
- **Hot-swap models** — switch between models from the dropdown, no restart needed

## Architecture

```
Browser → ix_server.py (port 9090) → inference-x binary → .gguf model
```

IX Web spawns the IX binary per request. The model loads, generates, and exits. This means:

- **Any silicon** — the protocol routes to your hardware
- **No persistent memory** — each request is independent
- **Any model size** — from 135M to 1T parameters, if you have the RAM

## Options

```
python3 web/ix_server.py --help

--port 8080                  # Custom port (default: 9090)
--host 127.0.0.1             # Bind to localhost only
--ix /path/to/inference-x    # Custom binary path
--models /path/to/models     # Custom model directory (repeatable)
```

## Model auto-detection

IX Web scans these directories for `.gguf` files:

1. `./models/` (repo root)
2. `~/.cache/inference-x/models/`
3. `~/models/`
4. Any path passed via `--models`

## API usage

IX Web is OpenAI-compatible.
Use any client:

```python
import requests

r = requests.post("http://localhost:9090/v1/chat/completions", json={
    "model": "qwen-2.5-3b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 256
})
print(r.json()["choices"][0]["message"]["content"])
```

```bash
curl http://localhost:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"Hi"}]}'
```

## Files

```
web/
├── ix_server.py   # HTTP server (Python, 0 dependencies)
├── chat.html      # Chat interface (single HTML file)
└── README.md      # This file
```

## License

BSL-1.1 — same as Inference-X. Free for all use under $1M revenue.
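## Appendix: sketches

The per-request spawn model described under Architecture can be sketched as follows. This is an illustrative stand-in, not the actual `ix_server.py` internals: the `run_inference` name and the `--model`/`--max-tokens` flags are assumptions, and the real inference-x CLI may use different options.

```python
import subprocess

def run_inference(binary, model_path, prompt, max_tokens=256):
    """Spawn an inference binary for a single request and return its stdout.

    NOTE: hypothetical sketch — flag names are assumed, not taken from
    the real inference-x CLI. The process loads the model, generates,
    and exits, so no state persists between requests.
    """
    result = subprocess.run(
        [binary, "--model", model_path, "--max-tokens", str(max_tokens)],
        input=prompt,         # prompt arrives on stdin
        capture_output=True,  # collect generated text from stdout
        text=True,
        check=True,           # raise if the binary exits non-zero
    )
    return result.stdout.strip()
```

Because each request is an independent process, a crash or OOM in one generation cannot corrupt the server, at the cost of paying the model-load time on every call.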
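The model auto-detection described above amounts to globbing each candidate directory for `.gguf` files, in the listed priority order. A minimal sketch — the `discover_models` name is illustrative, not the server's actual API:

```python
from pathlib import Path

def discover_models(extra_dirs=()):
    """Return {model_name: path} for every .gguf file found.

    Search order mirrors the README: repo-local ./models, the
    inference-x cache, ~/models, then any --models paths.
    Earlier directories take priority for duplicate names.
    """
    search_dirs = [
        Path("models"),
        Path.home() / ".cache" / "inference-x" / "models",
        Path.home() / "models",
        *map(Path, extra_dirs),
    ]
    found = {}
    for d in search_dirs:
        if not d.is_dir():
            continue  # missing directories are simply skipped
        for gguf in sorted(d.glob("*.gguf")):
            found.setdefault(gguf.stem, gguf)  # first hit wins
    return found
```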
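The API example above uses `requests`; if you want the client side dependency-free too, the same call works with only the standard library. The `chat` helper below is a sketch assuming the server is running on the default port and returns the OpenAI-style `choices[0].message.content` shape the README advertises:

```python
import json
import urllib.request

def chat(prompt, model="auto", host="http://localhost:9090"):
    """POST one chat turn to IX Web using only the Python stdlib."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode()
    req = urllib.request.Request(
        host + "/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses nest the text under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```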