Contributing to OpenFold 3: A Primer
What this is
If you're new to OpenFold 3, as I am, the obvious question is where you can realistically be useful. This post is my attempt at an answer, written for two readers at once: someone deciding whether and where to jump in, and me, trying to pick a single lane I can go deep on instead of scattering effort across the whole thing.
A caveat up front. This is not a replacement for the repo's own contributor guide, which lives at docs/source/contribution.md and is published on Read the Docs. That document covers the mechanics. This post is the layer on top of it: where, given finite time and a fresh pair of eyes, the work actually is. In the last post I said it's necessary to understand the codebase before you contribute. Consider this the payoff of that promise.
The mechanics
The loop is short, which is the first encouraging thing. Fork the repo and clone it, then stand up the environment with pixi the same way the setup post walks through. The openfold3-cuda12 environment installs the package editable with the dev extras already baked in, so there's no separate pip install step:
pixi install -e openfold3-cuda12
Run setup_openfold once (it downloads the model parameters and builds the local Chemical Component Dictionary), and confirm the suite passes before you touch anything:
pixi run -e openfold3-cuda12 setup_openfold
pixi run -e openfold3-cuda12 pytest openfold3/tests
That pixi run -e openfold3-cuda12 prefix gets old fast. As I mention in the setup post, I aliased it to ofrun (alias --save ofrun "pixi run -e openfold3-cuda12" in fish, or the plain alias ofrun="..." form in bash/zsh), and I'll use that from here on.
Before you open a PR, format and lint:
ofrun ruff format
ofrun ruff check --fix
Two ruff choices are worth remembering, both of which I covered in the deep dive: an 88-character line length, and relative imports banned outright, so everything is imported by full path. The PR template asks for five things: Summary, Changes, Related Issues, Testing, and Other Notes. The Testing field is not decorative. The guide explicitly suggests turning whatever examples you used to convince yourself the change works into actual test cases, which is good advice in general and close to mandatory here.
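To make the "turn your examples into tests" advice concrete, here is a minimal sketch of what that looks like. The helper chunk_bounds is hypothetical and does not come from the OpenFold 3 codebase; it stands in for whatever function your change touches. The point is only the shape: each check you ran by hand becomes a named test function that pytest can collect.

```python
# Hypothetical helper, standing in for whatever function your PR changes.
# It is NOT part of OpenFold 3; it only gives the tests something to exercise.
def chunk_bounds(length: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split range(length) into consecutive (start, end) half-open chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, length))
        for start in range(0, length, chunk_size)
    ]


# The ad-hoc checks you ran in a REPL, written down as test cases.
def test_last_chunk_is_truncated():
    assert chunk_bounds(10, 4) == [(0, 4), (4, 8), (8, 10)]


def test_exact_multiple_has_no_short_chunk():
    assert chunk_bounds(8, 4) == [(0, 4), (4, 8)]


def test_empty_input_gives_no_chunks():
    assert chunk_bounds(0, 4) == []


def test_non_positive_chunk_size_is_rejected():
    for bad in (0, -2):
        try:
            chunk_bounds(10, bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for chunk_size={bad}")
```

Saved in a file whose name starts with test_, functions like these are picked up by the same pytest command shown earlier, and they give the Testing field of the PR template something specific to point at.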
If that still sounds intimidating, it shouldn't. The two PRs I've landed so far were both single-file changes: a broken Slack invite link and a flaky test fix. Starting small is not a consolation prize; it's the recommended path.
The rules of the house
OpenFold has an explicit, recently written policy on AI-assisted contributions, and given how I'm writing this series, I'm exactly the person it's talking to.
The short version is three rules. First, every contribution, AI-assisted or not, is the human contributor's responsibility; you have to fully understand and stand behind anything you submit. Second, issues and pull requests should be written by humans, because they're the first line of communication with the core team, and that dialogue is the whole point. (The only exceptions are translation and generating a failing test case, and you're asked to disclose even those.) Third, "good first issues" are reserved for new human developers, and agentic contributions on them will be closed, because those issues exist as a learning gateway and not as tickets to be cleared.
The reasoning behind all of it is that this is a small core team with finite review time, and they'd rather have a few high-signal contributions than a flood of generated ones. That seems reasonable to me, so here's where I land. These tools are great for understanding the code and for drafting my own thinking, but the issue, the PR description, and the judgment behind them are mine. I opened both of my PRs myself, and I'm not pointing anything automated at the newcomer issues. If you lean on these tools too, that feels like the line to hold.
The lanes
The nice thing about a project this organized is that I don't have to invent the map. The repo's own issue labels basically are the map, so here are the lanes I see, each tied to a real label and a live issue so you can confirm it's not hypothetical.
- Performance and systems (my lean). Labels: inference, enhancement. This is the memory wall from the setup post: the pair and triangle tensors that scale with the square of the sequence length, the chunk_size and offload levers that fight back, and the OOM ceiling on consumer cards. A live example is issue #225, an inference OOM in get_token_frame_atoms. Suits people with a systems or CUDA background.
- Kernels and hardware portability. Labels: inference, bug. There's a validate_rocm.py in entry_points, and an open AMD bug (#177) where prep_cutlass breaks DeepSpeed's Evoformer attention on ROCm. This is for people who like being close to the metal.
- Data pipeline. Label: data preprocessing. MSA generation, templates, the CCD, input formats. Issue #172 (MSA features not allowed at the chain level) is a live one. High leverage and less glamorous, which is often where the real work hides.
- Model and science. Labels: model, science. Confidence heads, the diffusion module, modality parity across RNA, DNA, and ligands. Issue #247 (a possible mismatch in nucleotide PAE frame atom order) is the flavor here. Suits people with an ML or structural-biology background who want to touch the model itself.
- Testing, CI, and reproducibility. This is the one that bit me: snapshot tests that disagree across GPUs, conda and pixi parity, determinism. My flaky-test PR lived here. Underrated impact, and a good fit for infra-minded people.
- Docs and developer experience. Labels: documentation, Installation. Setup friction, confusing error messages, missing examples. Issue #149 (a DataLoader worker dying on a fresh install) is the kind of thing that's both a real fix and the gentlest possible on-ramp. My first PR lived here too.
One observation from skimming the open issues: they skew heavily toward bug and enhancement, then inference, with model, training, and config nearly empty. The honest read is that the demand right now is for fixing and hardening, not green-field model work. Worth knowing before you go hunting for a glamorous problem.
What the history actually shows
I wanted to know where newcomers actually succeed, so I read through the recently merged PRs, and the history tells the story better than I could.
The clearest example is a contributor going by GMNGeoffrey, who has quietly become the owner of the chunk-size and tuning lane. Across a string of PRs (making chunking for the Triton kernels match cuEquivariance, promoting tune_chunk_size to a top-level config field, avoiding retesting non-viable chunk sizes, forcing chunk sizes to powers of two, and more) it's one person, one lane, building real trust by going deep rather than wide. It also happens to be the lane I'm eyeing, which is either encouraging or a warning depending on how you look at it.
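For readers who have not seen this kind of tuning before, here is a rough sketch of the two ideas named in that list: restricting candidates to powers of two, and remembering sizes that already failed so they are not tried again. This is my own illustration, not the repo's implementation. The names pick_chunk_size, fits, and known_bad are hypothetical, and fits is a stand-in for "run the kernel at this chunk size and report whether it succeeded".

```python
from typing import Callable, Optional


def power_of_two_candidates(max_chunk: int) -> list[int]:
    """Powers of two from the largest one <= max_chunk down to 1."""
    if max_chunk < 1:
        raise ValueError("max_chunk must be positive")
    size = 1 << (max_chunk.bit_length() - 1)
    candidates = []
    while size >= 1:
        candidates.append(size)
        size //= 2
    return candidates


def pick_chunk_size(
    max_chunk: int,
    fits: Callable[[int], bool],
    known_bad: set[int],
) -> Optional[int]:
    """Return the largest power-of-two chunk size for which fits() is True.

    Sizes already in known_bad are skipped without calling fits() again,
    and newly failing sizes are added to known_bad for later calls.
    """
    for size in power_of_two_candidates(max_chunk):
        if size in known_bad:
            continue  # already known not to work; do not retest
        if fits(size):
            return size
        known_bad.add(size)
    return None


if __name__ == "__main__":
    calls = []

    def fits(size: int) -> bool:
        # Toy stand-in for "did this chunk size run without going OOM?"
        calls.append(size)
        return size <= 100

    bad: set[int] = set()
    print(pick_chunk_size(1000, fits, bad), sorted(bad), calls)
    calls.clear()
    print(pick_chunk_size(1000, fits, bad), calls)
```

On the second call the three failing sizes are skipped and only one probe is made, which is the whole benefit of caching non-viable sizes when each probe is an expensive GPU run.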
And it's not just one person. Community contributors have landed conda packaging fixes (sdvillal), documentation and a segfault fix (etowahadams), corrected CIF ligand output (ryanhulke), and small performance cleanups like removing redundant contiguous calls (borisfom). The contributor list is genuinely not just the consortium.
There's a catch, though: as I write this there is essentially one open "good first issue." One. So the realistic path for a newcomer is not to wait for a labeled gateway issue to appear. It's to find a small, real bug or a performance nit and fix it cleanly. Which, I'll admit, is exactly what my two PRs were, except I backed into that strategy by accident rather than planning it.
How to choose a lane
So how do you actually pick? The four things I'm weighing:
- Leverage. Does fixing this help a lot of people, or just me?
- Fit. Does it match what I'm already good at?
- Maintainer pull. Is anyone actually asking for it? The contributions welcome label is a useful signal, and at the moment only two issues carry it (#58 and #40), which tells you where the door is most clearly open.
- Scope. Can a first PR here be small and complete, or does being useful require a month of context first?
And one piece of advice I'm mostly giving to myself: talk to the maintainers before sinking weeks into something. The Slack and the issue tracker exist for exactly this, and it's a lot cheaper to ask "would you take this?" than to find out at review time that the answer was no.
Where I am leaning, and why
The technical case
I'm planting the flag on performance and systems, specifically memory on consumer GPUs. The reasoning is concrete. The memory wall is the thing standing between a 24GB card and real-world-size inputs, and it isn't abstract: issue #225 is a live inference OOM, and that's the kind of specific, bounded first target I want rather than a vague intention to "make it faster."
From there the threads are visible. Inference currently defaults to full fp32 (precision: "32-true" in the trainer args), which on a memory-bound card is a real lever sitting untouched. The chunking and offload machinery is already there to build on. And none of this requires me to understand the diffusion math to be useful, which matters a lot when you're starting.
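A back-of-the-envelope calculation shows why precision is a lever. The sketch below sizes a single pair-shaped activation of shape [n_tokens, n_tokens, channels]. The channel width of 128 is my assumption, borrowed from the AlphaFold family of models rather than read out of the OpenFold 3 config, and pair_tensor_gib is a hypothetical helper. Real peak memory is some multiple of this, since many such tensors are alive at once.

```python
# Bytes per element for the two precisions being compared.
BYTES_PER_ELEMENT = {"fp32": 4, "bf16": 2}


def pair_tensor_gib(n_tokens: int, channels: int = 128, dtype: str = "fp32") -> float:
    """Size in GiB of ONE [n_tokens, n_tokens, channels] activation.

    channels=128 is an assumed pair width, not a value taken from the repo.
    """
    n_bytes = n_tokens * n_tokens * channels * BYTES_PER_ELEMENT[dtype]
    return n_bytes / 2**30


if __name__ == "__main__":
    for n in (512, 1024, 2048):
        fp32 = pair_tensor_gib(n, dtype="fp32")
        bf16 = pair_tensor_gib(n, dtype="bf16")
        print(f"{n:>5} tokens: fp32 {fp32:.3f} GiB, bf16 {bf16:.3f} GiB")
```

Under these assumptions, doubling the token count quadruples the footprint, and one such tensor at 2048 tokens is 2 GiB in fp32 versus 1 GiB at half precision. Whether lower precision is numerically safe for a given module is a separate question this arithmetic does not answer.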
Why this lane, even though it isnāt the point
I want to be honest about the motivation here, because it would be easy to misread the systems focus as the thing I'm passionate about. It isn't. I have a systems and CUDA background and I genuinely don't mind the work: chasing down where the bytes went and fighting a memory budget is satisfying enough to do well. But it is not what actually pulls me to this project.
What pulls me is the biology itself, the ML fundamentals underneath these models, and the chance to have some real impact on humanity by accelerating the work of actual bench biologists and everyone downstream who builds on this software. Protein structure prediction is one of the few corners of computing where the output runs more or less straight into disease research and drug discovery, and that is the part I care about. The CUDA work is a means to that end, not the end itself.
So I'm picking the systems lane because it is the highest-leverage place I can be useful right now given what I already know, not because it is the dream. Being genuinely useful on a real project beats waiting until I feel "qualified" to touch the model or the science directly. In an earlier post on motivation and meaning I landed on the idea that the work worth doing is both high-leverage and pointed at something you actually care about. The memory wall is the leverage. The science it unblocks is the part that matters, and the part I'm steering toward.
Plug in
If any of this resonates, the repo is on GitHub, the issues page is the best place to see what's live, and there's an OpenFold Slack (the invite link works now, since fixing it was my first PR). If you're eyeing the same memory-and-performance lane I am, say hi. I'd rather figure this out alongside someone than alone.
A structure I actually ran
The closing structure for the series is electron transfer flavoprotein subunit α (ETFA). It is a fitting one to end on, because it is a collector: it sits in the mitochondrial matrix and gathers electrons from a dozen different flavoprotein dehydrogenases (the enzymes of fatty acid and amino acid breakdown) and funnels them into the respiratory chain. A lot of separate metabolic streams converge on it before their energy ever reaches the proton gradient. That felt like a reasonable metaphor for a post about finding a lane: many possible entry points, one place to focus them.
This is the cleanest prediction of the four, almost entirely high-confidence blue across both of its domains, with only a short low-confidence tail. Superposed against AlphaFold's model of the same sequence (UniProt P13804), it matches to 0.54 Å backbone RMSD over 311 of 312 residues. Across the whole series the same result keeps showing up: where OpenFold 3 is confident, it lands within about half an ångström of AlphaFold, which is the kind of reproducibility that makes the open, commercially usable version genuinely useful rather than merely interesting. (As noted throughout, the public database is AlphaFold 2; there is no public bulk AlphaFold 3 download, but for monomers like these it is a fair reference.)
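For anyone unfamiliar with the metric, RMSD is the square root of the mean squared distance between paired atoms. The sketch below computes only that final number and assumes the two coordinate sets are already superposed and matched atom for atom. The superposition step itself (done by the structure viewer in my case) is not shown, and the function name and toy coordinates are illustrative rather than taken from either prediction.

```python
import math
from typing import Sequence

Coord = Sequence[float]  # one atom position as (x, y, z) in ångströms


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two paired coordinate sets.

    Assumes the structures are already superposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal length")
    total_sq = 0.0
    for a, b in zip(coords_a, coords_b):
        total_sq += sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.sqrt(total_sq / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 0.5 Å along x.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    predicted = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0), (3.5, 0.0, 0.0)]
    print(f"{rmsd(reference, predicted):.2f}")
```

Because the deviations are squared before averaging, a few badly placed residues can dominate the figure, which is why the number of residues included in the fit is reported alongside it.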