<div align="center">

# IX Tools

**Utilities, scripts and integrations for the Inference-X ecosystem**

[**inference-x.com**](https://inference-x.com) · [**Community**](https://git.inference-x.com/inference-x-community)

[License](LICENSE)

</div>

---
A collection of tools for working with GGUF models, benchmarking hardware, managing Inference-X deployments, and integrating with common workflows.

## Tools
| Tool | Description |
|---|---|
| `ix-bench` | Hardware benchmark — measures tokens/s across backends |
| `ix-convert` | Convert models to GGUF from SafeTensors or PyTorch |
| `ix-pull` | Download GGUF models from Hugging Face |
| `ix-serve` | Production-ready IX server wrapper with auth and logging |
| `ix-proxy` | Load balancer across multiple IX instances |
| `ix-monitor` | Dashboard — GPU usage, tokens/s, active connections |
| `ix-chat` | Terminal chat UI with history and markdown rendering |
| `ix-embed` | Batch embedding generation tool |
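The load-balancing row above can be pictured with a plain nginx equivalent: several `ix-serve` instances behind one upstream. This is only a conceptual sketch; the ports and balancing policy are illustrative assumptions, and `ix-proxy`'s real configuration is its own.

```nginx
# Conceptual nginx equivalent of balancing across two local IX instances.
# Ports and policy here are assumptions, not ix-proxy output.
upstream ix_backends {
    least_conn;              # prefer the least-busy instance
    server 127.0.0.1:8080;
    server 127.0.0.1:8081;
}

server {
    listen 8000;
    location / {
        proxy_pass http://ix_backends;
    }
}
```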
### Server Utilities

- `ix-systemd` — systemd unit file generator for background inference services
- `ix-docker` — Minimal Dockerfile (~5 MB) for containerized inference
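For background services, the unit that `ix-systemd` generates should look broadly like the sketch below. This is an illustration only: the paths, service name, and option values are assumptions, not the generator's actual output.

```ini
# Hypothetical shape of a generated unit (e.g. /etc/systemd/system/ix-serve.service).
# All paths and values here are assumed, not taken from ix-systemd itself.
[Unit]
Description=Inference-X server
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/ix-serve --model /var/lib/ix/models/model.gguf --port 8080
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Once a unit like this is installed, `systemctl enable --now ix-serve` starts the server and keeps it running across reboots.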
### Integration Scripts

- `openai-proxy` — HTTP adapter: OpenAI SDK → Inference-X API
- `langchain-adapter` — LangChain provider for Inference-X
- `ollama-bridge` — Drop-in Ollama API compatibility layer
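Because `openai-proxy` exposes the OpenAI wire format, clients talk to it exactly as they would to api.openai.com, only with a different base URL. A minimal sketch; the port and path below are assumptions, not documented defaults.

```shell
# Hypothetical proxy endpoint; substitute whatever openai-proxy actually binds.
IX_PROXY_URL="http://localhost:8081/v1/chat/completions"

# An OpenAI-style chat request body, unchanged from what an OpenAI SDK would send:
body='{"model":"local","messages":[{"role":"user","content":"Hello"}]}'

# With the proxy running, you would POST it like so:
#   curl -s "$IX_PROXY_URL" -H "Content-Type: application/json" -d "$body"
echo "POST $IX_PROXY_URL"
```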
### Monitoring

- `ix-metrics` — Prometheus metrics exporter (requests/s, latency, GPU util)
- `ix-dashboard` — Simple HTML dashboard for monitoring IX instances
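Since `ix-metrics` is a Prometheus exporter, a standard scrape job picks it up. A sketch of such a job follows; the port and metrics path are assumptions, so check the exporter's actual defaults before using it.

```yaml
# prometheus.yml fragment — port and path are assumed, not documented defaults.
scrape_configs:
  - job_name: "ix-metrics"
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:9105"]
```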
## Usage

```bash
git clone https://git.inference-x.com/inference-x/ix-tools
cd ix-tools && make

# Benchmark your hardware
./ix-bench --all-backends

# Download a model
./ix-pull qwen2.5-7b-instruct-q4_k_m.gguf

# Start production server
./ix-serve --model model.gguf --port 8080 --workers 4

# Monitor running instances
./ix-monitor --host localhost:8080

# Set up as a system service
sudo ./ix-systemd install mistral-7b-v0.1.Q4_K_M.gguf --port 8080
```
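The tokens/s figures these tools report are simple ratios: tokens generated divided by wall-clock decode time. A toy illustration of the arithmetic, with made-up numbers rather than real benchmark output:

```shell
# Made-up measurements — not output from any IX tool.
tokens=96            # tokens generated
decode_seconds=3     # wall-clock seconds spent decoding

# Shell integer division; real benchmarks report fractional precision.
echo "$(( tokens / decode_seconds )) tokens/s"
```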
## Hardware Benchmark Reference (Feb 2026)

Run `./ix-bench` to measure your actual hardware.
Community results: [git.inference-x.com/inference-x-community/ix-scout](https://git.inference-x.com/inference-x-community/ix-scout)

---
[inference-x.com](https://inference-x.com) · MIT License

*Part of the [Inference-X ecosystem](https://inference-x.com)*