Research preview · bioinformatics performance layer

An optimization layer for bioinformatics tools and scientific computing.

AutoZyme finds slow, memory-hungry paths inside widely used research software, uses autonomous search to produce candidate optimizations, and gates every public result on scientific output concordance. The goal is simple: make community tools faster, without asking scientists to change their workflows.

Preprint coming soon · arXiv:TBD
First public releases: SeuratTurbo (R) · ScanpyTurbo (Python) · 40+ bioinformatics targets in the active pipeline

By the numbers

Aggregated across all tracked benchmarks
Benchmarks: 98
Active tool targets: 40+
Datasets: 5
Methods: 18
Median speedup: 8.4×
Peak speedup: 4,363×

How AutoZyme finds and compounds speedups

Agent discovery and community requests converge into one verification pipeline
Agent-discovered

AutoZyme searches for bottlenecks

The AutoZyme agent scans public bioinformatics and scientific-computing ecosystems for slow, memory-heavy, or widely used methods worth optimizing.

Community-requested

Researchers nominate pain points

Users nominate packages that run too slowly, run out of memory (OOM), or cost repeated waiting time across analyses. Votes and the reproducibility of the reported problem help prioritize the queue.

01

Baseline and gates

We freeze representative inputs, upstream baselines, output concordance metrics, and acceptance thresholds.

02

Iterate and verify

AutoZyme generates candidate changes, benchmarks them, rejects divergent outputs, and keeps only reproducible speedups.

03

Release and package

Accepted optimizations are published as drop-in AutoZyme packages or upstream-ready patches with reproducible benchmark evidence.

04

Return to Lab

The optimized result becomes the new public baseline in AutoZyme Lab, where contributors can try to push it further.

Lab is not only a submission portal. It is the public continuation point: every verified release can become a new frozen challenge baseline for the community to improve again.
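
The gate-then-verify steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AutoZyme's actual implementation: the `Gate` class, the `evaluate` function, the threshold values, and the timing numbers are all hypothetical, and "reproducible" is approximated by requiring every candidate run to beat every baseline run.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)  # frozen: thresholds cannot be edited once the gate is set
class Gate:
    """Hypothetical frozen acceptance gate for one method on one dataset."""
    method: str
    dataset: str
    min_concordance: float  # placeholder threshold, not AutoZyme's real value
    min_speedup: float      # placeholder threshold, not AutoZyme's real value


def evaluate(gate, baseline_s, candidate_s, concordance):
    """Return (accepted, speedup) for one candidate optimization.

    baseline_s / candidate_s: wall-clock seconds from repeated runs.
    concordance: agreement score in [0, 1] against the upstream output.
    """
    speedup = median(baseline_s) / median(candidate_s)
    concordant = concordance >= gate.min_concordance
    fast_enough = speedup >= gate.min_speedup
    # Stand-in for reproducibility: every candidate run beats every baseline run.
    reproducible = max(candidate_s) < min(baseline_s)
    return concordant and fast_enough and reproducible, speedup


gate = Gate("sc_neighbors", "pbmc68k", min_concordance=0.99, min_speedup=1.5)
baseline = [32.1, 31.8, 32.4]  # made-up repeated timings, in seconds

ok, speedup = evaluate(gate, baseline, [0.9, 1.0, 0.9], concordance=0.999)
divergent, _ = evaluate(gate, baseline, [0.9, 1.0, 0.9], concordance=0.90)
noisy, _ = evaluate(gate, baseline, [0.9, 40.0, 0.9], concordance=0.999)
print(ok, divergent, noisy)  # prints: True False False
```

In this sketch a fast candidate with divergent output is rejected no matter how large its speedup, and so is a candidate whose timings are too unstable to count as reproducible.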

Benchmarks at a glance

Released public subset; broader bioinformatics targets are staged as they pass gates
Figure 1. Per-method wall-clock speedup (left), before/after timings, peak memory, and output concordance for Seurat (panel A) and Scanpy (panel B) across four dataset sizes, at 4 threads on an AMD Ryzen 9 7950X.

Top speedups

Best wall-clock ratio per method
Tool · method · dataset · threads · speedup · before → after
Scanpy · sc_rank_genes · heart_adult · 32 threads · 4,363× · 21.81 min → 300 ms
Seurat · run_pca · pbmc68k · 32 threads · 69.9× · 1.86 min → 1.60 s
Seurat · normalize_data · pbmc68k · 32 threads · 59.5× · 2.38 s → 40 ms
Seurat · find_all_markers · heart_adult · 1 thread · 48.5× · 44.46 min → 54.97 s
Seurat · sctransform · pbmc200k · 1 thread · 47.9× · 86.29 min → 1.80 min
Scanpy · sc_neighbors · pbmc68k · 32 threads · 35.6× · 32.00 s → 900 ms
Scanpy · sc_highly_variable · heart_adult · 32 threads · 34.5× · 6.90 s → 200 ms
Scanpy · sc_normalize · heart_adult · 32 threads · 31.7× · 9.50 s → 300 ms
Scanpy · sc_pca · pbmc200k_glaucoma · 32 threads · 28.7× · 57.40 s → 2.00 s
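
The wall-clock ratio is the baseline time divided by the optimized time. The short check below recomputes it for a few of the entries listed above, using the displayed timings converted to seconds. Because those timings are rounded for display, the recomputed ratios may differ slightly from the published figures. The dictionary layout and the `speedup` helper are illustrative only.

```python
# (before_seconds, after_seconds), copied from the displayed timings above
timings = {
    ("Scanpy", "sc_rank_genes"): (21.81 * 60, 0.300),
    ("Seurat", "run_pca"): (1.86 * 60, 1.60),
    ("Seurat", "find_all_markers"): (44.46 * 60, 54.97),
    ("Scanpy", "sc_neighbors"): (32.00, 0.900),
}


def speedup(before_s, after_s):
    """Wall-clock ratio: how many times faster the optimized run is."""
    return before_s / after_s


ratios = {key: speedup(*pair) for key, pair in timings.items()}
for (tool, method), ratio in ratios.items():
    print(f"{tool:7s} {method:18s} {ratio:8.1f}x")
```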

Install

Drop-in — no API changes to your existing pipelines
R · SeuratTurbo · Seurat v5.x
# Install from GitHub (CRAN release coming)
# install.packages("remotes")   # only if remotes is not already installed
remotes::install_github("ElliotXie/seurat-turbo")

library(Seurat)
library(SeuratTurbo)   # activates patches

# Use Seurat exactly as you normally would —
# NormalizeData / RunPCA / FindClusters / etc.
# are transparently accelerated.
Requires R ≥ 4.0, Seurat ≥ 5.0
Python · ScanpyTurbo · Scanpy v1.11.x
# Install from GitHub (PyPI release coming)
pip install git+https://github.com/ElliotXie/scanpy-turbo.git

import scanpy as sc
import scanpy_turbo   # activates patches

# Use Scanpy exactly as you normally would —
# pp.normalize_total / tl.leiden / etc.
# are transparently accelerated.
Requires Python ≥ 3.10, Scanpy ≥ 1.11
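
To check a speedup on your own data, one option is to time the same call in two sessions, one with the patch package imported and one without. The helper below is a generic sketch that uses only the Python standard library. The `measure` function and the toy workload are hypothetical and are not part of ScanpyTurbo. Note that `tracemalloc` only sees allocations made through Python's memory allocators, so memory used inside native libraries may be under-reported.

```python
import time
import tracemalloc


def measure(fn, *args, repeats=3, **kwargs):
    """Return (best wall-clock seconds, peak traced bytes) for fn(*args, **kwargs)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)

    # Trace memory in a separate run, because tracing slows execution.
    tracemalloc.start()
    try:
        fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return best, peak


def toy_workload(n):
    """Stand-in for a real analysis step."""
    return sum([i * i for i in range(n)])


seconds, peak_bytes = measure(toy_workload, 200_000)
print(f"{seconds:.4f} s, peak {peak_bytes / 1e6:.1f} MB traced")
```

Many Scanpy functions modify their input in place, so when timing a real step it is safer to pass a fresh copy of the data on every call.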

Currently optimized

First public drop-in packages; more tool families are moving through the queue
Seurat · R
SeuratTurbo
Drop-in patches for Seurat v5.x
NormalizeData · FindVariableFeatures · ScaleData · RunPCA · FindNeighbors · FindClusters (Louvain / Leiden) · RunUMAP · FindAllMarkers · SCTransform · IntegrateData (CCA)
ElliotXie/seurat-turbo
Scanpy · Python
ScanpyTurbo
Drop-in patches for Scanpy v1.11.x
normalize_total · highly_variable_genes · scale · pca · neighbors · leiden · umap · rank_genes_groups
ElliotXie/scanpy-turbo

How AutoZyme works

Under the hood, AutoZyme runs an autonomous research loop: candidate optimizations are generated, benchmarked against the upstream baseline on real datasets, and filtered on both speed and output concordance. The same framework applies beyond the first single-cell releases: any bioinformatics or scientific-computing method with slow runtime, out-of-memory failures, or repeated human waiting time can become an AutoZyme target.
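
As one illustration of what an output concordance check could look like for an embedding such as PCA, the sketch below takes the worst-case absolute correlation across matching components. The absolute value matters because a principal component is only defined up to sign, so a flipped axis is still the same result. This metric, and the names `pearson` and `embedding_concordance`, are assumptions for illustration. The source does not state which concordance metrics AutoZyme uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def embedding_concordance(reference, candidate):
    """Worst-case |correlation| across matching components.

    Each argument is a list of components, each a list of per-cell scores.
    """
    return min(abs(pearson(r, c)) for r, c in zip(reference, candidate))


reference = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.2, 0.3]]
flipped = [[-1.0, -2.0, -3.0, -4.0], [0.5, -1.0, 0.2, 0.3]]  # first component sign-flipped
shuffled = [[4.0, 1.0, 3.0, 2.0], [0.5, -1.0, 0.2, 0.3]]     # first component scrambled

print(embedding_concordance(reference, flipped))   # close to 1.0: same result up to sign
print(embedding_concordance(reference, shuffled))  # well below 1.0: divergent output
```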

You can nominate packages on the Suggest & Vote page or contribute directly through AutoZyme Lab. The long-term goal is a shared optimization layer for science: community requests, frozen benchmark gates, hidden validation, and credited speedups that flow back to everyone.

How to cite

@misc{autozyme2026,
  title  = {AutoZyme: Autonomous-Research-Driven Speedups for Scientific Toolkits},
  author = {The AutoZyme Team},
  year   = {2026},
  note   = {Manuscript in preparation}
}