Orchid 1.0: A Reproducible Recipe for Aligned Ternary-Weight Language Models on Consumer Hardware
Description
We present Orchid 1.0, a 2-billion-parameter ternary-weight language model aligned through a three-stage LoRA pipeline (reasoning SFT, identity-and-knowledge SFT, and Odds-Ratio Preference Optimization) on a single RTX 3050 laptop with 4 GB of VRAM. We document each design decision, memory-management technique, and recovery procedure that made the training feasible on this hardware.
We then describe and resolve the ternary merge problem — the destructive interaction between LoRA deltas and ternary weight quantization — which motivated the construction of ternative.cpp, a purpose-built C++ inference engine that loads a base I2_S GGUF and a separate LoRA adapter GGUF and merges them at full precision at load time. Ternative.cpp supports CPU (AVX2, OpenMP) and GPU (CUDA 12.6) execution with an OpenAI-compatible HTTP server.
We evaluate Orchid 1.0 on four standard benchmarks: ARC-Challenge 56.0% (+6.1 pp over the BitNet base), HellaSwag 52.0%, WinoGrande 74.0%, and MMLU 38.6%.
All artifacts are openly available:
- Model: https://huggingface.co/MicheRomChis/orchid-1.0
- Inference engine: https://github.com/michelangeloromerochisco/ternative
- Training code: https://github.com/michelangeloromerochisco/orchid-1.0
Files
orchid-1-0-technical-paper.pdf
Files
(516.2 kB)
| Name | Size | Download all |
|---|---|---|
|
md5:9e07d969ac08b0fbe0d33e7040d10754
|
516.2 kB | Preview Download |
Additional details
Related works
- Is supplemented by
- Software: https://github.com/michelangeloromerochisco/ternative (URL)
- Software: https://github.com/michelangeloromerochisco/orchid-1.0 (URL)
- Dataset: https://huggingface.co/MicheRomChis/orchid-1.0 (URL)
Software
- Repository URL
- https://github.com/michelangeloromerochisco/ternative