The age of hyper-personalized software
Why I run local LLMs to power a multimodal event crawler
Run Claude Code with your own local MiniMax-M2.1 model using vLLM's native Anthropic API endpoint support.
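A minimal sketch of that wiring, assuming vLLM's Anthropic-compatible /v1/messages route and a MiniMaxAI/MiniMax-M2.1 model path (both the path and the flags are illustrative assumptions, not the article's exact command):

```bash
# Serve the model across both GPUs. Model path and flags are assumptions
# for illustration, not copied from the article.
vllm serve MiniMaxAI/MiniMax-M2.1 \
  --tensor-parallel-size 2 \
  --served-model-name minimax-m2.1

# Point Claude Code at the local server; recent vLLM builds expose an
# Anthropic-compatible /v1/messages route alongside the OpenAI-style ones.
export ANTHROPIC_BASE_URL=http://localhost:8000
export ANTHROPIC_AUTH_TOKEN=local-dummy-key   # any non-empty value for a local server
export ANTHROPIC_MODEL=minimax-m2.1
claude
```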
Step-by-step vLLM stable/nightly install on Ubuntu 24.04 for a dual RTX Pro 6000 setup (2×96 GB), a model download workflow, and a fix for tp=2 hangs (IOMMU). Includes tested serve commands for Devstral 123B, GLM-4.5/4.6V, Qwen3 235B, MiniMax-M2, and gpt-oss-120b.
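The IOMMU fix usually comes down to a kernel boot parameter. A hedged sketch for Ubuntu 24.04 with GRUB; whether iommu=pt (passthrough) or disabling the IOMMU outright is the right move depends on the platform, so treat this as an assumption rather than the article's verbatim fix:

```bash
# Symptom: vllm serve with --tensor-parallel-size 2 hangs at NCCL init,
# because peer-to-peer DMA between the two GPUs is blocked by the IOMMU.
# One common fix: put the IOMMU into passthrough mode at boot.
sudo sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="/&iommu=pt /' /etc/default/grub
sudo update-grub
sudo reboot

# After rebooting, a tp=2 launch along these lines should come up cleanly
# (model and flags illustrative, not the article's tested command):
vllm serve MiniMaxAI/MiniMax-M2 --tensor-parallel-size 2
```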
A practical guide to injecting new knowledge into LLMs through fine-tuning, using Q&A pairs generated from documentation.
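The core data artifact in that workflow is a file of Q&A pairs in chat format. A minimal sketch of what two training records might look like; the field names follow the common OpenAI-style chat schema, which is an assumption here, not necessarily the article's exact layout:

```bash
# Build a tiny chat-format JSONL training file from Q&A pairs derived
# from documentation (schema is an assumed OpenAI-style chat format).
cat > train.jsonl <<'EOF'
{"messages": [{"role": "user", "content": "What does --tensor-parallel-size do?"}, {"role": "assistant", "content": "It shards the model's weights across that many GPUs so one model can span multiple cards."}]}
{"messages": [{"role": "user", "content": "Which engine serves GGUF files?"}, {"role": "assistant", "content": "llama.cpp, via its llama-server frontend."}]}
EOF
```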
Today marks three years since ChatGPT launched. In this short article I reflect on how far LLMs have come in that time, from getting early access to GPT-4 to now running open models that surpass it, and I share two graphs: one illustrating the progress in open-weight models, the other the increasingly close race between OpenAI, Google, and Anthropic (with Google currently in the lead).
A guide to running large language models locally: hardware options, inference engines (vLLM, SGLang, llama.cpp), quantization techniques, and user interfaces.
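As a taste of one path through that stack, here is a hedged example: a 4-bit quantized GGUF model served by llama.cpp's llama-server (the model file is a placeholder, not one from the guide):

```bash
# Serve a 4-bit quantized GGUF with llama.cpp's llama-server:
# -c sets the context window, -ngl 99 offloads all layers to the GPU.
# The model path is a placeholder for illustration.
llama-server -m ./models/qwen3-32b-q4_k_m.gguf -c 8192 -ngl 99 --port 8080
# Web UI at http://localhost:8080, OpenAI-compatible API at /v1/chat/completions.
```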
From fine-tunes to founder stacks, the center of gravity is moving east.
How a small draft model can speed up LLM inference by 1.82× without sacrificing quality: benchmarking Qwen3-32B with speculative decoding.
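A hedged sketch of the setup: the draft model proposes a few tokens per step and the target model verifies them in a single forward pass, so accepted tokens always match what the target would have sampled. The flag syntax follows vLLM's --speculative-config JSON form, which has changed across versions, and the draft model here is an assumption rather than the article's benchmarked pair:

```bash
# Target model Qwen3-32B with a small draft model proposing 5 tokens per step.
# Quality is preserved because every draft token is verified against the
# target model's own distribution before being accepted.
vllm serve Qwen/Qwen3-32B \
  --speculative-config '{"model": "Qwen/Qwen3-0.6B", "num_speculative_tokens": 5}'
```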
A practical guide to renting GPUs for running open-weight LLMs with control, privacy, and flexibility.
Learn how to set up vLLM with GPT-OSS's built-in tools and expose them through LibreChat.
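A hedged sketch of the serving side, based on vLLM's public gpt-oss recipe at the time of writing; the --tool-server flag and the LibreChat wiring are assumptions from those docs, not verified against the article:

```bash
# Serve gpt-oss-120b with its built-in tools enabled via vLLM's demo tool
# server (--tool-server demo is taken from vLLM's gpt-oss recipe and may
# have changed; check the current docs).
vllm serve openai/gpt-oss-120b --tool-server demo
# LibreChat then connects to it as a custom OpenAI-compatible endpoint at
# http://localhost:8000/v1, configured in librechat.yaml.
```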