vLLM (7)
- Running MiniMax-M2.1 Locally with Claude Code on Dual RTX Pro 6000
- A Guide to Installing and Running the Best Models on a Dual RTX Pro 6000 Rig with vLLM
- Injecting Knowledge into LLMs via Fine-Tuning
- Getting Started with Running LLMs Locally
- Speeding Up Local LLM Inference 2x with Speculative Decoding
- Open Weights, Borrowed GPUs
- Harnessing GPT-OSS Built-in Tools