Seurat FindAllMarkers is one of the slower steps in many single-cell genomics workflows. AutoZyme ships a
verified, drop-in patch that is up to 271.4× faster, returning bit-for-bit identical results with no change to how you call it.
Best speedup: 271.4×
Median speedup: 183.8×
Output equivalence: Bit-exact
Best runtime: baseline 24.90 min → optimized 5.50 s
Datasets: 7
Pass rate: 9/10
Benchmark charts
Speedup distribution
Each dot is one finalized dataset/thread run on Windows (log scale).

tms_ss2 (best 271.4×)
ood_large2 · 1 thread · 63.3× speedup · 23.85 min baseline → 23.60 s optimized · memory 38 GB → 12 GB
ood_large2 · 4 threads · 154.3× speedup · 25.33 min baseline → 9.68 s optimized · memory 38 GB → 12 GB
ood_large2 · 32 threads · 271.4× speedup · 24.90 min baseline → 5.50 s optimized · memory 38 GB → 12 GB

heart_adult (best 249.7×)
large · 1 thread · 51.8× speedup · 51.87 min baseline → 1.00 min optimized · memory 65 GB → 29 GB
large · 4 threads · 136.6× speedup · 64.11 min baseline → 22.79 s optimized · memory 71 GB → 29 GB
large · 32 threads · 249.7× speedup · 50.03 min baseline → 12.47 s optimized · memory 74 GB → 29 GB

splitseq_rosenberg (best 222.3×)
ood_large1 · 1 thread · 42.8× speedup · 8.79 min baseline → 12.46 s optimized · memory 18 GB → 7.1 GB
ood_large1 · 4 threads · 115.6× speedup · 8.89 min baseline → 4.61 s optimized · memory 18 GB → 7.1 GB
ood_large1 · 32 threads · 222.3× speedup · 8.90 min baseline → 2.40 s optimized · memory 18 GB → 7.2 GB

gastrulation_pijuansala (best 199.8×)
ood_large3 · 1 thread · 42.1× speedup · 31.82 min baseline → 45.31 s optimized · memory 65 GB → 19 GB
ood_large3 · 4 threads · 107.7× speedup · 41.58 min baseline → 17.74 s optimized · memory 61 GB → 19 GB
ood_large3 · 32 threads · 199.8× speedup · 33.03 min baseline → 9.55 s optimized · memory 64 GB → 19 GB

pbmc200k_glaucoma (best 167.8×)
medium · 1 thread · 33.3× speedup · 12.34 min baseline → 21.46 s optimized · memory 29 GB → 12 GB
medium · 4 threads · 88.3× speedup · 11.81 min baseline → 8.10 s optimized · memory 29 GB → 12 GB
medium · 32 threads · 167.8× speedup · 11.91 min baseline → 4.26 s optimized · memory 29 GB → 12 GB

pbmc68k (best 126.3×)
small · 1 thread · 25.5× speedup · 1.48 min baseline → 3.72 s optimized · memory 6.6 GB → 2.8 GB
small · 4 threads · 69.1× speedup · 1.62 min baseline → 1.37 s optimized · memory 6.6 GB → 2.8 GB
small · 32 threads · 126.3× speedup · 1.61 min baseline → 751 ms optimized · memory 6.6 GB → 2.9 GB
small · thread count n/a · 120.5× speedup · 1.58 min baseline → 788 ms optimized · memory 6.6 GB → 2.8 GB
The public API stays the same; AutoZyme replaces only the supported fast path.
This task targets FindAllMarkers in Seurat. The benchmarked result
passes the declared scientific output gate while reducing CPU runtime on the listed datasets.
Supported scope
The shipped default path (fast_FindAllMarkers_fusion) computes a Wilcoxon rank-sum (normal-approximation, presto-style) marker test per cluster-vs-rest using a single fused RcppParallel C++ kernel. It is taken only when all of these hold (gate at patch.R:466-480):
- zyme/turbo is enabled (default TRUE)
- test.use == 'wilcox'
- slot == 'data'
- features is NULL (all features)
- node is NULL
- latent.vars is NULL
- mean.fxn is NULL
- fc.name is NULL
- only.pos is FALSE
- densify is FALSE
- max.cells.per.ident is Inf
- min.diff.pct == -Inf
- base == 2
- no extra (...) args (length(dots) == 0)
- group.by is NULL or 'ident'

Additional runtime guards fall back to upstream: the data layer must be a single (joined) dgCMatrix (patch.R:487-493), and Idents() must name all cells (patch.R:499-501).

Within that gate, the fast path honors user-supplied values of the args it actually consumes:
- logfc.threshold (default 0.1, used at :570)
- min.pct (default 0.01, used at :569)
- return.thresh (default 1e-2, used at :587)
- min.cells.group (default 3, used at :561 to skip small clusters)
- assay
- base (only base == 2)

Per-cluster small-group skipping matches Seurat behavior (warn and skip rare clusters rather than fall back globally). p-value adjustment is Bonferroni over n.features. The benchmarked call (object + verbose=FALSE, that is, all defaults) satisfies this gate exactly, so the benchmark exercises the supported fast path.
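To make the statistic concrete, here is a minimal base-R sketch of a rank-sum test with the normal approximation for one gene, followed by a Bonferroni adjustment over the number of features. It is illustrative only. The helper name `ranksum_normal_p`, the toy counts, the feature count, and the choice of tie plus continuity correction (the same as base R's `wilcox.test(exact = FALSE, correct = TRUE)`) are assumptions, not a description of the fused C++ kernel.

```r
# Illustrative sketch only: normal-approximation rank-sum test for one gene,
# cluster vs. rest. Hypothetical helper; this is not AutoZyme's kernel.
ranksum_normal_p <- function(x, y) {
  n_x <- length(x)
  n_y <- length(y)
  n <- n_x + n_y
  r <- rank(c(x, y))                                   # mid-ranks for ties
  u <- sum(r[seq_len(n_x)]) - n_x * (n_x + 1) / 2      # U statistic for x
  ties <- table(r)                                     # sizes of tied groups
  sigma <- sqrt((n_x * n_y / 12) *
                ((n + 1) - sum(ties^3 - ties) / (n * (n - 1))))
  z <- u - n_x * n_y / 2                               # centre on the null mean
  z <- (z - sign(z) * 0.5) / sigma                     # continuity correction
  2 * min(pnorm(z), pnorm(z, lower.tail = FALSE))      # two-sided p-value
}

set.seed(1)
in_cluster <- rpois(40, lambda = 3)    # toy expression values, one cluster
rest       <- rpois(160, lambda = 1)   # toy expression values, all other cells

p_raw <- ranksum_normal_p(in_cluster, rest)
p_ref <- wilcox.test(in_cluster, rest, exact = FALSE, correct = TRUE)$p.value

# Bonferroni over the number of tested features (assumed value for illustration).
n_features <- 2000
p_adj <- p.adjust(p_raw, method = "bonferroni", n = n_features)
```

Here `p_raw` should agree with `p_ref` up to floating-point tolerance, and `p_adj` is `p_raw` multiplied by `n_features` and capped at 1.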
Out-of-scope behavior: silent, possibly wrong
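Because out-of-scope behavior is labelled silent, it can help to check a planned call against the documented gate before relying on the result. The sketch below restates the argument-level conditions from the supported scope as plain R checks. `fast_path_blockers` is a hypothetical helper written for this page, not part of AutoZyme or Seurat. It does not inspect the object, so the runtime guards (single joined dgCMatrix, Idents() naming all cells) and the zyme/turbo switch are not covered.

```r
# Hypothetical helper: lists which documented gate conditions a call would violate.
# Argument names and defaults follow the supported-scope text; nothing here
# calls AutoZyme or Seurat.
fast_path_blockers <- function(test.use = "wilcox", slot = "data", features = NULL,
                               node = NULL, latent.vars = NULL, mean.fxn = NULL,
                               fc.name = NULL, only.pos = FALSE, densify = FALSE,
                               max.cells.per.ident = Inf, min.diff.pct = -Inf,
                               base = 2, group.by = NULL,
                               # Arguments the fast path is documented to honor:
                               logfc.threshold = 0.1, min.pct = 0.01,
                               return.thresh = 1e-2, min.cells.group = 3,
                               assay = NULL, verbose = TRUE, ...) {
  checks <- c(
    "test.use must be 'wilcox'"        = identical(test.use, "wilcox"),
    "slot must be 'data'"              = identical(slot, "data"),
    "features must be NULL"            = is.null(features),
    "node must be NULL"                = is.null(node),
    "latent.vars must be NULL"         = is.null(latent.vars),
    "mean.fxn must be NULL"            = is.null(mean.fxn),
    "fc.name must be NULL"             = is.null(fc.name),
    "only.pos must be FALSE"           = isFALSE(only.pos),
    "densify must be FALSE"            = isFALSE(densify),
    "max.cells.per.ident must be Inf"  = isTRUE(max.cells.per.ident == Inf),
    "min.diff.pct must be -Inf"        = isTRUE(min.diff.pct == -Inf),
    "base must be 2"                   = isTRUE(base == 2),
    "group.by must be NULL or 'ident'" = is.null(group.by) || identical(group.by, "ident"),
    "no extra ... arguments"           = length(list(...)) == 0
  )
  names(checks)[!checks]   # empty character vector means no blockers found
}

fast_path_blockers(verbose = FALSE)                      # the benchmarked call
fast_path_blockers(logfc.threshold = 0.25, min.pct = 0.1) # honored arguments
fast_path_blockers(only.pos = TRUE)                      # violates the gate
```

Pass the same named arguments you intend to give FindAllMarkers (without the object). An empty result means the argument-level conditions are met.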
Detailed speedup table (10 runs)

| Dataset | Tier | Platform | Threads | Baseline | Optimized | Speedup | Memory (baseline → optimized) | Concordance | Pass |
|---|---|---|---|---|---|---|---|---|---|
| gastrulation_pijuansala | ood_large3 | Windows | 32 | 33.03 min | 9.55 s | 199.8× | 63.5 → 19.1 GB | n/a | pass |
| heart_adult | large | Windows | 32 | 50.03 min | 12.47 s | 249.7× | 74.4 → 29.1 GB | n/a | pass |
| pbmc200k_glaucoma | medium | Windows | 32 | 11.91 min | 4.26 s | 167.8× | 28.9 → 11.6 GB | n/a | pass |
| pbmc68k | small | Windows | 32 | 1.61 min | 751 ms | 126.3× | 6.6 → 2.9 GB | n/a | pass |
| splitseq_rosenberg | ood_large1 | Windows | 32 | 8.90 min | 2.40 s | 222.3× | 18.0 → 7.2 GB | n/a | pass |
| tms_ss2 | ood_large2 | Windows | 32 | 24.90 min | 5.50 s | 271.4× | 38.0 → 11.7 GB | n/a | pass |
| gastrulation_pijuansala | ood_large3 | macOS | 1 | 7.42 min | 14.11 s | 31.5× | 20.5 → 28.6 GB | n/a | fail |
| pbmc68k_full | medium | macOS | 1 | 2.12 min | 840 ms | 151.6× | 4.8 → 2.4 GB | n/a | pass |
| splitseq_rosenberg | ood_large1 | macOS | 1 | 5.63 min | 2.67 s | 126.8× | 18.7 → 9.6 GB | n/a | pass |
| tms_ss2 | ood_large2 | macOS | 1 | 21.48 min | 5.83 s | 221.3× | 25.8 → 17.4 GB | n/a | pass |
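The headline figures (best speedup, median speedup, dataset count, pass rate) can be recomputed from the ten runs listed in the detailed table. The short R sketch below does so with values transcribed by hand from that table. The variable names are ours, and the assumption that the headline median is taken over exactly these ten runs is inferred from the numbers rather than stated by the page.

```r
# Values transcribed from the detailed speedup table (six Windows rows, then four macOS rows).
dataset <- c("gastrulation_pijuansala", "heart_adult", "pbmc200k_glaucoma",
             "pbmc68k", "splitseq_rosenberg", "tms_ss2",
             "gastrulation_pijuansala", "pbmc68k_full",
             "splitseq_rosenberg", "tms_ss2")
speedup <- c(199.8, 249.7, 167.8, 126.3, 222.3, 271.4,
             31.5, 151.6, 126.8, 221.3)
passed  <- c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
             FALSE, TRUE, TRUE, TRUE)

best_speedup   <- max(speedup)              # largest single-run speedup
median_speedup <- median(speedup)           # middle of the ten runs
n_datasets     <- length(unique(dataset))   # distinct dataset names
pass_rate      <- sprintf("%d/%d", sum(passed), length(passed))

cat(sprintf("best %.1fx, median %.1fx, %d datasets, pass rate %s\n",
            best_speedup, median_speedup, n_datasets, pass_rate))
```

With these inputs the sketch reproduces the summary values shown at the top of the page (271.4×, 183.8×, 7 datasets, 9/10).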
Frequently asked questions
Speeding up Seurat FindAllMarkers
Why is Seurat FindAllMarkers slow?
Seurat FindAllMarkers is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work, which the AutoZyme path replaces with a single fused RcppParallel C++ kernel. In the best benchmark run (tms_ss2, 32 threads, Windows) the original takes 24.90 min where the AutoZyme path takes 5.50 s (271.4× faster).
How do I make Seurat FindAllMarkers faster?
Install AutoZyme and activate the Seurat patch, then keep using Seurat FindAllMarkers exactly as before. For calls within the supported scope, AutoZyme transparently substitutes the faster, output-validated path, up to 271.4× faster on the benchmark datasets, with no pipeline or API changes.
Does the AutoZyme speedup change the Seurat FindAllMarkers output?
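No. The accelerated path returns bit-for-bit identical results to the original Seurat implementation (maximum absolute difference 0), checked by a frozen concordance gate on the benchmark datasets. The detailed table marks 9 of 10 runs as passing; the single-thread macOS run on gastrulation_pijuansala is marked as failing.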
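If you want to confirm equivalence on your own data, one possible approach is to compare the marker tables from a baseline run and a patched run directly. The sketch below uses small made-up data frames as stand-ins for the two results; the values are not benchmark output. `max_abs_diff` is a hypothetical helper, and the column names follow Seurat's usual FindAllMarkers output.

```r
# Hypothetical helper: largest absolute difference across the numeric columns
# of two marker tables with identical shape and column names.
max_abs_diff <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)), identical(names(a), names(b)))
  num_cols <- names(a)[vapply(a, is.numeric, logical(1))]
  max(vapply(num_cols,
             function(col) max(abs(a[[col]] - b[[col]])),
             numeric(1)))
}

# Made-up stand-in for a baseline FindAllMarkers result.
baseline <- data.frame(
  p_val      = c(1e-30, 2e-12),
  avg_log2FC = c(1.8, -0.9),
  pct.1      = c(0.91, 0.12),
  pct.2      = c(0.20, 0.55),
  p_val_adj  = c(2e-27, 4e-9),
  cluster    = factor(c("0", "1")),
  gene       = c("GeneA", "GeneB")
)
optimized <- baseline                  # stand-in for the patched result

same_object <- identical(baseline, optimized)   # strict: values, types, attributes
worst_diff  <- max_abs_diff(baseline, optimized) # 0 when numeric columns are bit-exact

# A tiny perturbation is enough to break both checks.
perturbed <- baseline
perturbed$avg_log2FC[1] <- perturbed$avg_log2FC[1] + 1e-12
```

`identical()` is the stricter of the two checks, since it also compares row order, factor levels, and attributes. The maximum absolute difference is easier to interpret when only the numbers matter.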
How do I install the Seurat speedup?
In R: install the autozyme package, then run library(autozyme) and autozyme::activate("seurat"). The patch applies automatically the next time you call FindAllMarkers.
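As a rough sketch, the activation steps can be wrapped in a small function. The function name `run_patched_markers` and the `obj` argument are placeholders of ours. The page does not say where the autozyme package is installed from, so no install command is shown and the function simply stops if a package is missing. The FindAllMarkers call uses all defaults with verbose = FALSE, which is the call the supported-scope notes describe as benchmarked.

```r
# Hypothetical wrapper around the documented steps. `obj` stands for your own
# clustered Seurat object; nothing runs until you call the function.
run_patched_markers <- function(obj) {
  # Stop early with a clear message if either package is not installed.
  for (pkg in c("Seurat", "autozyme")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("Package '", pkg, "' is not installed", call. = FALSE)
    }
  }
  library(Seurat)
  library(autozyme)
  autozyme::activate("seurat")          # activation call as given in the answer above
  FindAllMarkers(obj, verbose = FALSE)  # all defaults, same call signature as before
}
```

Calling `markers <- run_patched_markers(obj)` should then return the usual marker table. Any argument outside the supported scope is best checked against the scope notes first.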