Speed up Seurat SCTransform

Seurat SCTransform is one of the slower steps in many single-cell genomics workflows. AutoZyme ships a verified, drop-in patch that runs up to 16.7× faster on the benchmark datasets and returns output within a strict tolerance of the original, with no change to how you call it.

Best speedup: 16.7×
Median speedup: 9.16×
Output equivalence: Tolerance
Best runtime: 20.85 min baseline → 1.25 min optimized
Datasets: 6
Pass rate: 7/7

Benchmark charts

Speedup distribution
Each dot is one finalized dataset/thread run on Windows (log scale).
Thread sweep
Speedup across finalized thread counts on Windows (1, 4, 14, and full (32) threads)
Dataset Tier Threads Baseline Optimized Speedup Memory
pbmc200k_glaucoma medium 1 20.85 min 1.92 min 10.8× 119 → 19 GB
pbmc200k_glaucoma medium 4 20.85 min 1.31 min 16.0× 119 → 19 GB
pbmc200k_glaucoma medium 32 20.85 min 1.25 min 16.7× 119 → 19 GB
splitseq_rosenberg ood_large1 1 14.16 min 1.32 min 10.7× 87 → 17 GB
splitseq_rosenberg ood_large1 4 14.16 min 51.00 s 16.7× 87 → 11 GB
splitseq_rosenberg ood_large1 32 14.16 min 51.63 s 16.5× 87 → 11 GB
tms_ss2 ood_large2 1 11.75 min 1.76 min 6.66× 97 → 25 GB
tms_ss2 ood_large2 4 11.75 min 1.28 min 9.16× 97 → 25 GB
tms_ss2 ood_large2 32 11.75 min 1.34 min 8.74× 97 → 25 GB
gastrulation_pijuansala ood_large3 1 14.13 min 2.04 min 6.91× 84 → 38 GB
gastrulation_pijuansala ood_large3 4 14.13 min 1.65 min 8.59× 84 → 38 GB
gastrulation_pijuansala ood_large3 32 14.13 min 1.68 min 8.43× 84 → 38 GB
pbmc68k small 1 4.50 min 46.44 s 5.48× 41 → 4.4 GB
pbmc68k small 4 4.50 min 30.47 s 8.34× 41 → 4.4 GB
pbmc68k small 32 4.26 min 38.14 s 6.67× 41 → 4.4 GB
gastrulation_pijuansala_60k ood_large3 1 5.61 min 1.03 min 4.44× 46 → 14 GB
gastrulation_pijuansala_60k ood_large3 14 4.50 min 1.00 min 4.55× 50 → 14 GB
Memory
Baseline vs optimized peak memory on Windows
Dataset Tier Memory Optimized/baseline Speedup Threads
pbmc200k_glaucoma medium 119 → 19 GB 0.16× 16.7× 32
tms_ss2 ood_large2 97 → 25 GB 0.25× 9.16× 4
splitseq_rosenberg ood_large1 87 → 11 GB 0.13× 16.7× 4
gastrulation_pijuansala ood_large3 84 → 38 GB 0.45× 8.59× 4
gastrulation_pijuansala_60k ood_large3 50 → 14 GB 0.28× 4.55× 14
pbmc68k small 41 → 4.4 GB 0.11× 6.67× 32

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets SCTransform in Seurat. The benchmarked result stays within the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: SCT, sctransform, variance stabilizing transform, regularized negative binomial.

Supported scope

Fast path runs ONLY for the upstream-default v2 SCTransform config on a standard sparse RNA counts assay. The gate .seurat_sct_supported_default_path (patch.R:1526-1560) requires ALL of:

reference.SCT.model=NULL
do.correct.umi=TRUE
ncells numeric, finite, and >0
residual.features=NULL
variable.features.n numeric, finite, and >0
variable.features.rv.th identical to 1.3
vars.to.regress=NULL
latent.data=NULL
do.scale=FALSE
do.center=TRUE
clip.range exactly equal to the default c(-sqrt(ncol/30), sqrt(ncol/30)) (all.equal, patch.R:1515-1524)
vst.flavor identical to 'v2'
conserve.memory=FALSE
return.only.var.genes=TRUE
ZERO extra ... arguments (length(extra_args)==0)

Each of these is also the upstream Seurat 5.4.0 default, so the benchmarked call is squarely inside the fast path. For SCTransform.Seurat, the gate additionally requires a single non-SCT assay name (patch.R:2023).

The fast .default reconstructs Pearson residual mean/variance and corrected UMIs via Rcpp kernels (turbo_csc_to_csr / turbo_stats_correct_sparse / turbo_fused_resid_center_sparse), assuming the y~log_umi model with min_variance='umi_median' (the v2 contract).
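
The conditions above can be sketched as a small base-R check. This is only an illustration: in_fast_path_scope is a hypothetical helper, not AutoZyme's .seurat_sct_supported_default_path, and the ncells=5000 and variable.features.n=3000 defaults are assumed to match upstream Seurat. It does not cover the assay-level requirements (sparse RNA counts, single non-SCT assay name).

```r
# Hypothetical helper mirroring the argument conditions listed in the text.
# It is NOT the actual AutoZyme gate; names and structure are illustrative.
in_fast_path_scope <- function(n_cells,
                               reference.SCT.model = NULL,
                               do.correct.umi = TRUE,
                               ncells = 5000,               # assumed upstream default
                               residual.features = NULL,
                               variable.features.n = 3000,  # assumed upstream default
                               variable.features.rv.th = 1.3,
                               vars.to.regress = NULL,
                               latent.data = NULL,
                               do.scale = FALSE,
                               do.center = TRUE,
                               clip.range = c(-sqrt(n_cells / 30), sqrt(n_cells / 30)),
                               vst.flavor = "v2",
                               conserve.memory = FALSE,
                               return.only.var.genes = TRUE,
                               ...) {
  # TRUE only for a single positive, finite number.
  is_pos_num <- function(x) {
    is.numeric(x) && length(x) == 1 && is.finite(x) && x > 0
  }
  default_clip <- c(-sqrt(n_cells / 30), sqrt(n_cells / 30))

  checks <- c(
    reference.SCT.model     = is.null(reference.SCT.model),
    do.correct.umi          = isTRUE(do.correct.umi),
    ncells                  = is_pos_num(ncells),
    residual.features       = is.null(residual.features),
    variable.features.n     = is_pos_num(variable.features.n),
    variable.features.rv.th = identical(variable.features.rv.th, 1.3),
    vars.to.regress         = is.null(vars.to.regress),
    latent.data             = is.null(latent.data),
    do.scale                = isFALSE(do.scale),
    do.center               = isTRUE(do.center),
    clip.range              = isTRUE(all.equal(clip.range, default_clip)),
    vst.flavor              = identical(vst.flavor, "v2"),
    conserve.memory         = isFALSE(conserve.memory),
    return.only.var.genes   = isTRUE(return.only.var.genes),
    no.extra.args           = length(list(...)) == 0
  )
  # Report both the verdict and which conditions failed.
  list(supported = all(checks), failed = names(checks)[!checks])
}

default_call   <- in_fast_path_scope(n_cells = 68000)
regressed_call <- in_fast_path_scope(n_cells = 68000, vars.to.regress = "percent.mt")
extra_arg_call <- in_fast_path_scope(n_cells = 68000, method = "glmGamPoi")
```

Under these assumptions, default_call$supported is TRUE, while regressed_call and extra_arg_call report vars.to.regress and no.extra.args as the failed conditions. Per the text, such calls would run the upstream implementation instead of the fast path.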

Out-of-scope behavior

Calls outside the supported scope silently fall back to the upstream Seurat implementation.

Detailed speedup table (7 runs)
Dataset Tier Platform Threads Baseline Optimized Speedup Memory Concordance Pass
gastrulation_pijuansala ood_large3 Windows 4 14.13 min 1.65 min 8.59× 83.5 → 37.9 GB pass
gastrulation_pijuansala_60k ood_large3 Windows 14 4.50 min 1.00 min 4.55× 50.3 → 13.8 GB pass
pbmc200k_glaucoma medium Windows 32 20.85 min 1.25 min 16.7× 118.7 → 19.3 GB pass
pbmc68k small Windows 4 4.50 min 30.47 s 8.34× 40.8 → 4.4 GB pass
splitseq_rosenberg ood_large1 Windows 4 14.16 min 51.00 s 16.7× 86.8 → 11.0 GB pass
tms_ss2 ood_large2 Windows 4 11.75 min 1.28 min 9.16× 97.2 → 24.7 GB pass
pbmc68k small macOS 14 2.74 min 17.69 s 9.35× 52.9 → 17.5 GB pass
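
The headline figures can be recomputed from the table. The sketch below copies the seven reported rows into a data frame and assumes the median is taken over all seven runs, which reproduces the 9.16× figure. The column names are illustrative, and memory ratios recomputed from the rounded table values may differ in the last digit from the charted ones.

```r
# Values copied from the detailed speedup table (7 runs).
runs <- data.frame(
  dataset = c("gastrulation_pijuansala", "gastrulation_pijuansala_60k",
              "pbmc200k_glaucoma", "pbmc68k", "splitseq_rosenberg",
              "tms_ss2", "pbmc68k"),
  platform = c(rep("Windows", 6), "macOS"),
  threads = c(4, 14, 32, 4, 4, 4, 14),
  speedup = c(8.59, 4.55, 16.7, 8.34, 16.7, 9.16, 9.35),
  mem_baseline_gb  = c(83.5, 50.3, 118.7, 40.8, 86.8, 97.2, 52.9),
  mem_optimized_gb = c(37.9, 13.8, 19.3, 4.4, 11.0, 24.7, 17.5),
  stringsAsFactors = FALSE
)

best_speedup   <- max(runs$speedup)            # best reported run
median_speedup <- median(runs$speedup)         # middle of the 7 reported runs
n_datasets     <- length(unique(runs$dataset)) # pbmc68k appears on two platforms

# Peak-memory ratio (optimized / baseline) per run, from the rounded table values.
runs$mem_ratio <- round(runs$mem_optimized_gb / runs$mem_baseline_gb, 2)
```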

Frequently asked questions

Speeding up Seurat SCTransform
Why is Seurat SCTransform slow?

Seurat SCTransform is CPU-bound, and the stock implementation in Seurat leaves performance on the table in its core numerical work. On the largest benchmark dataset (pbmc200k_glaucoma, Windows, 32 threads) the original takes 20.85 min where the AutoZyme path takes 1.25 min (16.7× faster).

How do I make Seurat SCTransform faster?

Install AutoZyme and activate the Seurat patch, then keep using Seurat SCTransform exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 16.7× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the Seurat SCTransform output?

Effectively no. The output is tolerance-equivalent: held within a frozen concordance gate (up to about 0.6% drift from the original Seurat result) on every benchmark dataset.
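
To check drift on your own data, one option is to compare the stock and patched outputs directly. The sketch below is an assumption-laden illustration: rel_drift is a hypothetical helper, the metric (largest absolute difference scaled by the largest reference magnitude) is only one possible definition of drift, and the 0.006 threshold is taken from the "about 0.6%" figure above. The actual concordance gate is not specified here and may be defined differently.

```r
# Hypothetical drift metric: largest absolute difference, relative to the
# largest magnitude in the reference. Not AutoZyme's actual gate definition.
rel_drift <- function(reference, candidate) {
  stopifnot(identical(dim(reference), dim(candidate)))
  scale <- max(abs(reference))
  if (scale == 0) {
    return(max(abs(candidate)))  # all-zero reference: report absolute drift
  }
  max(abs(candidate - reference)) / scale
}

# Toy stand-ins for the stock and patched residual matrices.
set.seed(1)
ref  <- matrix(rnorm(20), nrow = 4)
cand <- ref * 1.002                # simulate a 0.2% multiplicative perturbation

drift       <- rel_drift(ref, cand)
within_gate <- drift <= 0.006      # assumed threshold, from "about 0.6%"
```

In practice, ref and cand would be the same slot (for example the scaled residuals) extracted from a stock run and a patched run on identical input.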

How do I install the Seurat speedup?

In R: install the autozyme package, then run library(autozyme) and autozyme::activate("seurat"). The patch applies automatically the next time you call SCTransform.
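
A minimal sketch of the activation step, guarded so the script still runs where autozyme is not installed. The autozyme::activate("seurat") call comes from the answer above. The wrapper activate_autozyme_seurat and the pbmc object are hypothetical, and the sketch uses requireNamespace in place of library(autozyme) only to make the guard possible.

```r
# Hypothetical wrapper: activate the AutoZyme Seurat patch when available.
activate_autozyme_seurat <- function() {
  if (!requireNamespace("autozyme", quietly = TRUE)) {
    message("autozyme is not installed; SCTransform will run the stock Seurat code.")
    return(invisible(FALSE))
  }
  autozyme::activate("seurat")  # call named in the text above
  invisible(TRUE)
}

patched <- activate_autozyme_seurat()

# Downstream code is unchanged. `pbmc` is a hypothetical Seurat object:
# pbmc <- Seurat::SCTransform(pbmc)  # default arguments stay in the supported scope
```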