Research

Toward AI for science

I'm a Staff AI Engineer becoming a researcher. Day to day I build retrieval and entity-resolution systems over messy real-world data; on my own time I'm working through the mathematics, interpretability, and systems I think AI for science will demand. This page is the work in progress, not a finished résumé.

What I work on

Three threads run through my current work. The first grounds the rest; the other two are where I'm deliberately pushing into research.

Thread 01

Representation learning & retrieval

At Propelus I apply ML to medical licensing, verification, and fraud detection: retrieval, entity resolution, and data matching over structured, sparse, and noisy data. I built a domain-ontology induction pipeline (TnT-LLM-style, orchestrated with LangGraph), and I fine-tune embedding models and run two-stage retrieval: a bi-encoder for recall, then a cross-encoder reranker for precision. The open question I keep returning to is how embedding architecture and retrieval design trade off when the data is genuinely sparse and dirty. Two ideas anchor the work. First, retrieval beats classification when the target space keeps shifting, since new or merged categories don't break a retriever the way they break a classifier. Second, fine-tuned embeddings break the chicken-and-egg problem between a clean ontology and good retrieval, letting each bootstrap the other.
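A toy sketch of the two-stage pattern described above, not the production system. Both scorers are stand-ins chosen so the example runs on the standard library alone: `embed` (character trigrams) plays the role of a bi-encoder that scores query and document independently, and `cross_score` (token overlap) plays the role of a cross-encoder that looks at the pair jointly. The corpus strings and every function name here are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in bi-encoder: unit-normalized bag of character trigrams."""
    padded = " " + text.lower() + " "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {gram: v / norm for gram, v in counts.items()}


def cosine(a, b):
    """Dot product of two unit-normalized sparse vectors."""
    return sum(weight * b.get(gram, 0.0) for gram, weight in a.items())


def cross_score(query, doc):
    """Stand-in cross-encoder: Jaccard overlap of the two token sets."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if (q | d) else 0.0


def two_stage_search(query, corpus, doc_vecs, recall_k=3, final_k=1):
    q_vec = embed(query)
    # Stage 1 (recall): cheap similarity against precomputed document vectors.
    candidates = sorted(
        range(len(corpus)),
        key=lambda i: cosine(q_vec, doc_vecs[i]),
        reverse=True,
    )[:recall_k]
    # Stage 2 (precision): costlier pairwise scoring, only on the shortlist.
    reranked = sorted(
        candidates, key=lambda i: cross_score(query, corpus[i]), reverse=True
    )
    return [corpus[i] for i in reranked[:final_k]]


# Hypothetical corpus; document vectors are computed once, offline.
corpus = [
    "registered nurse license california",
    "registered nurse license texas",
    "licensed practical nurse california",
    "physician assistant license new york",
    "pharmacy technician certificate texas",
]
doc_vecs = [embed(doc) for doc in corpus]

# A noisy query: misspelled and with the words in a different order.
print(two_stage_search("registred nurse california license", corpus, doc_vecs))
```

The design point is the cost split: the first stage touches every document but only through precomputed vectors, while the second stage is allowed to be expensive because it only sees `recall_k` candidates.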

Thread 02

Mechanistic interpretability

This is the direction I most want to push into next. The plan: use sparse autoencoders (SAEs) to decompose a fine-tuned encoder's embedding space into interpretable features, turning ontology induction into something reproducible and auditable rather than opaque, with runtime feature steering to handle categories that split, merge, or emerge without retraining. I haven't built it yet; right now I'm grounding myself in the foundational interpretability literature (Elhage, Bricken, Park, Turner). It's where my retrieval work and my interest in interpretability converge.
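For orientation, here is a minimal sketch of what a sparse autoencoder computes, assuming the common formulation from the dictionary-learning literature: ReLU features over a centered input, a linear decoder, and a reconstruction loss with an L1 sparsity penalty. It is a generic illustration rather than the planned system; the hand-set weights, the dimensions, and the `l1_coeff` value are all made up, and nothing here is trained.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Return (features, reconstruction) for one input vector x."""
    # Center the input by subtracting the decoder bias.
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    # Encoder: affine map followed by ReLU, so features are non-negative.
    pre_acts = [a + b for a, b in zip(matvec(w_enc, centered), b_enc)]
    features = [max(0.0, a) for a in pre_acts]
    # Decoder: linear combination of feature directions plus bias.
    recon = [a + b for a, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff=1e-3):
    """Squared reconstruction error plus an L1 penalty on the features."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon))
    l1 = sum(abs(f) for f in features)
    return mse + l1_coeff * l1


# Overcomplete toy dictionary: 2 input dimensions, 4 features,
# one feature per signed axis direction (hand-set, not learned).
w_enc = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b_enc = [0.0, 0.0, 0.0, 0.0]
w_dec = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
b_dec = [0.0, 0.0]

x = [0.5, -2.0]
features, recon = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
print(features, recon, sae_loss(x, features, recon))
```

In this toy setup only two of the four features fire for the input, which is the property the sparsity penalty is meant to encourage: each input is explained by a few active dictionary directions that can then be inspected one at a time.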

Thread 03

GPU / CUDA as tooling

To understand how models actually compute, I'm writing a transformer inference engine in CUDA from scratch. It's a learning instrument as much as an artifact: working at the kernel level forces an honest account of memory, parallelism, and where the arithmetic really happens. That kind of understanding is hard to fake when you only ever work at the framework layer.

What I'm working toward

North star: AI for science, with sequence and protein modeling as the entry point. I don't claim expertise here yet - I'm building the foundations deliberately and in the open.

Mathematical foundations

Proof-based math to make the rest rigorous rather than hand-wavy: linear algebra via Axler, real analysis via Abbott, and geometric deep learning to connect structure and symmetry to learning.

Scientific grounding

A self-directed physics, then chemistry, then biology curriculum - the path toward modeling the systems science actually cares about, rather than treating them as black-box benchmarks.

The entry point

Sequence and protein modeling is where representation learning, interpretability, and scientific domain knowledge all meet. It's the concrete problem I'm orienting the study around.

Background

13+ years in systems and software, including 3+ at Meta. I bring an engineer's bias toward working artifacts and measurable results into research questions that usually get answered by intuition.

Writing

Notes and longer pieces as the work develops. More on the blog.

The Fundamental Questions of the AI Revolution

Nov 1, 2024

How we might avoid unreasoned hype and hate equally, by understanding reality

Papers grounding my work

The literature I keep returning to - the lineage from the foundations of computation and neural networks through the modern architectures and interpretability work my current threads build on. A reading map, not a list I've conquered.

A Logical Calculus of the Ideas Immanent in Nervous Activity

On Computable Numbers

Computing Machinery and Intelligence

The Perceptron: A Probabilistic Model

Learning Representations by Back-Propagating Errors

Handwritten Digit Recognition with a Back-Propagation Network

A Neural Probabilistic Language Model

ImageNet Classification with Deep CNNs

Neural Machine Translation by Jointly Learning to Align and Translate

Generative Adversarial Networks

Deep Residual Learning for Image Recognition

Attention Is All You Need

BERT: Pre-training of Deep Bidirectional Transformers

Scaling Laws for Neural Language Models

Language Models are Few-Shot Learners

Learning Transferable Visual Models From Natural Language

An Image is Worth 16x16 Words

Training Language Models to Follow Instructions

PaLM: Scaling Language Modeling with Pathways

GPT-4 Technical Report

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Introducing GPT-5

gpt-oss-120b & gpt-oss-20b Model Card

SAM 2: Segment Anything in Images and Videos

Data Shapley in One Training Run