vLLM (7)
- Running MiniMax-M2.1 Locally with Claude Code on Dual RTX Pro 6000
- A Guide to Installing and Running the Best Models on a Dual RTX Pro 6000 Rig with vLLM
- Injecting Knowledge into LLMs via Fine-Tuning
- Getting Started with Running LLMs Locally
- Speeding Up Local LLM Inference 2x with Speculative Decoding
- Open Weights, Borrowed GPUs
- Harnessing GPT-OSS Built-in Tools