Self-hosted AI
Local models, real workflows. Inference orchestration, retrieval, and agent runtimes built to live on my own metal — not rented tokens, not vendor lock-in.
I build self-hosted AI, run a fleet of machines that thinks like one, and ship product end-to-end — from kernel to checkout.
Most of what I work on lives between the operating system and the product surface — the layer that decides whether anything else works. Here’s the shape of it.
Two dozen machines that behave like one. Doctrine, syncing, and shared memory across hardware that runs from a closet, not a region.
Full-stack delivery on a tight schedule — e-commerce, internal tools, collector workflows. Whatever the business actually needs, one IT leader can ship.
Long-running systems that act on intent, not just instructions. Memory, identity, and guardrails — designed so the next session can pick up where the last one stopped.
Real telemetry on real workloads. Not dashboards for show — signals that actually wake somebody up when the right thing breaks.
Backups, isolation, secrets routing, and the unglamorous work of keeping anything serious in production. The reason the rest of the list still runs.
Two dozen pieces of hardware — workstations, servers, laptops, GPU boxes — running a shared doctrine. One operator, one source of truth, and just enough automation that the machines stay in agreement when I’m not looking.
Nothing on this list is novel. The interesting part is that one person keeps it all running together — cohesively, in production, without a platform team.
Real pages, not anchor-scroll sections — each one a stop in the operating portfolio.
I take on a small number of consulting engagements when the fit is right — infrastructure design, AI ops, or the kind of full-stack delivery that needs a senior pair of hands.