# IX Tools

**Utilities, scripts, and integrations for the Inference-X ecosystem**

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

[**inference-x.com**](https://inference-x.com) · [**Community**](https://git.inference-x.com/inference-x-community)
---

## Contents

### Model Management

- `model-fetch` — Download GGUF models from Hugging Face with integrity verification
- `model-convert` — Convert safetensors/PyTorch checkpoints to GGUF (wraps the llama.cpp converter)
- `model-bench` — Benchmark a model: tokens/s, time to first token (TTFT), memory usage

### Server Utilities

- `ix-proxy` — Nginx config generator for multi-model Inference-X deployments
- `ix-systemd` — systemd unit file generator for background inference services
- `ix-docker` — Minimal Dockerfile (~5 MB) for containerized inference

### Integration Scripts

- `openai-proxy` — HTTP adapter that lets the OpenAI SDK talk to the Inference-X API
- `langchain-adapter` — LangChain provider for Inference-X
- `ollama-bridge` — Drop-in Ollama API compatibility layer

### Monitoring

- `ix-metrics` — Prometheus metrics exporter (requests/s, latency, GPU utilization)
- `ix-dashboard` — Simple HTML dashboard for monitoring IX instances
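To make the `ix-proxy` idea concrete, here is a minimal sketch of what a multi-model nginx config generator could produce: one upstream per model, routed by URL prefix. The function name, port mapping, and emitted config are illustrative assumptions, not the actual `ix-proxy` output.

```python
# Hypothetical sketch of an ix-proxy-style nginx config generator.
# Model names and ports below are illustrative, not real ix-proxy output.

def nginx_config(models: dict[str, int]) -> str:
    """Render an nginx config mapping /<model>/ paths to local backends."""
    # One upstream block per model, each pointing at a local backend port.
    upstreams = "\n".join(
        f"upstream {name} {{ server 127.0.0.1:{port}; }}"
        for name, port in models.items()
    )
    # One location block per model, stripping the /<model>/ prefix via proxy_pass.
    locations = "\n".join(
        f"    location /{name}/ {{ proxy_pass http://{name}/; }}"
        for name in models
    )
    return f"{upstreams}\nserver {{\n    listen 80;\n{locations}\n}}\n"

print(nginx_config({"mistral": 8080, "qwen": 8081}))
```

The prefix-routing layout keeps each model behind a stable URL (`/mistral/`, `/qwen/`) while nginx handles connection pooling; the real tool may choose a different routing scheme.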
## Usage

```bash
git clone https://git.inference-x.com/inference-x/ix-tools
cd ix-tools

# Benchmark a model
./model-bench mistral-7b-v0.1.Q4_K_M.gguf --tokens 100

# Set up as a system service
sudo ./ix-systemd install mistral-7b-v0.1.Q4_K_M.gguf --port 8080
```

---

*Part of the [Inference-X ecosystem](https://inference-x.com)*
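For a sense of what `ix-metrics` exposes, the Prometheus text exposition format is a simple line-oriented protocol. The sketch below renders a stats dict in that format; the metric names (`ix_requests_per_second`, `ix_gpu_utilization`) are hypothetical placeholders, and the real exporter's metric names and types may differ.

```python
# Hypothetical sketch of the Prometheus text exposition format an
# ix-metrics-style exporter would serve at /metrics. Metric names
# here are assumptions for illustration only.

def render_metrics(stats: dict[str, float]) -> str:
    """Render a flat stats dict as Prometheus text format (as gauges)."""
    lines = []
    for name, value in sorted(stats.items()):
        lines.append(f"# TYPE {name} gauge")   # type hint line for Prometheus
        lines.append(f"{name} {value}")        # sample line: name, then value
    return "\n".join(lines) + "\n"

print(render_metrics({"ix_requests_per_second": 12.5, "ix_gpu_utilization": 0.87}))
```

A real exporter would serve this string over HTTP with the `text/plain; version=0.0.4` content type so a Prometheus server can scrape it.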