Projects.

2026

CAD Bench
CAD Benchmark Build123D Python

An executable benchmark for language-model CAD agents, with 17 Build123D tasks, difficulty-weighted scoring, model harnesses, and a public results leaderboard.

2026

EDA Bench
EDA Benchmark KiCad PCB

An execution-based EDA benchmark for agents that reconstruct KiCad PCB projects, combining frozen task packs with ngspice-driven I/O simulation graders.

2026

Residual Controllers
Research Reasoning Control

A research project on residual controllers for reasoning switches, exploring how controllers can steer when reasoning systems change strategies.

Dec 2025

Muon
Electron TypeScript Bun

An infinite-canvas web browser built with Electron and TypeScript.

June 2025

PromptMaxx
Python Textual

A tool that automates gathering context and applying changes when pair programming with online LLMs.