Better output from the same model. Fused computation, adaptive precision, surgical expert loading. 305 KB, 19 backends, zero dependencies. https://inference-x.com
## What does this PR do?

<!-- Brief description -->

## Type of change
- [ ] Bug fix
- [ ] New feature
- [ ] Performance improvement
- [ ] New backend / hardware support
- [ ] Documentation
- [ ] Other

## Testing
- [ ] `make` succeeds
- [ ] Tested with at least one model
- [ ] Benchmarked (if performance-related)

## Hardware tested on
<!-- List hardware you tested on -->