Speed up Seurat NormalizeData

Seurat NormalizeData is one of the slower steps in many single-cell genomics workflows. AutoZyme ships a verified, drop-in patch that is up to 43.7× faster, returning bit-for-bit identical results with no change to how you call it.

Best speedup: 43.7×
Median speedup: 33.4×
Output equivalence: Bit-exact
Best runtime: baseline 15.00 s, optimized 340 ms
Datasets: 6
Pass rate: 11/11

Benchmark charts

Speedup distribution (Windows, log scale): each dot is one finalized dataset/thread run. Datasets: pbmc200k_glaucoma, splitseq_rosenberg, pbmc68k, heart_adult, tms_ss2, gastrulation_pijuansala.

Thread sweep: speedup across finalized thread counts on Windows (1, 4, and the full 32 threads)
Dataset Tier Threads Baseline Optimized Speedup Memory
pbmc200k_glaucoma medium 1 14.84 s 1.42 s 10.5× 19 → 18 GB
pbmc200k_glaucoma medium 4 14.78 s 670 ms 22.2× 19 → 18 GB
pbmc200k_glaucoma medium 32 15.00 s 340 ms 43.7× 19 → 18 GB
splitseq_rosenberg ood_large1 1 8.11 s 710 ms 11.6× 11 → 11 GB
splitseq_rosenberg ood_large1 4 8.20 s 280 ms 29.3× 11 → 11 GB
splitseq_rosenberg ood_large1 32 8.24 s 190 ms 43.2× 11 → 11 GB
pbmc68k small 1 3.03 s 306 ms 9.71× 4.6 → 4.0 GB
pbmc68k small 4 2.95 s 110 ms 27.1× 3.9 → 3.9 GB
pbmc68k small 32 2.95 s 80 ms 37.4× 3.9 → 3.9 GB
heart_adult large 1 37.39 s 4.11 s 9.42× 50 → 47 GB
heart_adult large 4 39.41 s 1.78 s 21.8× 50 → 47 GB
heart_adult large 32 38.72 s 1.09 s 35.5× 50 → 47 GB
tms_ss2 ood_large2 1 18.41 s 2.51 s 7.33× 21 → 20 GB
tms_ss2 ood_large2 4 18.64 s 920 ms 20.0× 21 → 20 GB
tms_ss2 ood_large2 32 18.13 s 530 ms 34.7× 21 → 20 GB
gastrulation_pijuansala ood_large3 1 30.06 s 3.75 s 8.11× 33 → 33 GB
gastrulation_pijuansala ood_large3 4 30.92 s 1.51 s 20.1× 33 → 33 GB
gastrulation_pijuansala ood_large3 32 30.41 s 910 ms 33.4× 33 → 33 GB

Memory: baseline vs optimized peak memory on Windows
Dataset Tier Memory Optimized/baseline Run shown
heart_adult large 50 → 47 GB 0.95× 35.5× speedup, 32 threads
gastrulation_pijuansala ood_large3 33 → 33 GB 1.00× 33.4× speedup, 32 threads
tms_ss2 ood_large2 21 → 20 GB 0.95× 34.7× speedup, 32 threads
pbmc200k_glaucoma medium 19 → 18 GB 0.96× 43.7× speedup, 32 threads
splitseq_rosenberg ood_large1 11 → 11 GB 0.96× 43.2× speedup, 32 threads
pbmc68k small 4.6 → 4.0 GB 0.87× 9.71× speedup, 1 thread

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets NormalizeData in Seurat. The benchmarked result preserves the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: normalization, LogNormalize, log normalize, normalize_total.

Supported scope

The fast path handles LogNormalize column normalization on a Seurat v5 object whose target assay is an Assay5/StdAssay with a single "counts" layer, producing the default "data" layer.

The fast path is taken only when all of the following hold:
zyme/turbo is TRUE;
normalization.method is exactly "LogNormalize";
scale.factor is a finite numeric scalar (length 1);
margin is a finite numeric scalar equal to 1;
block.size is NULL;
no extra dot arguments are passed (length(...)==0);
assay is NULL or a single non-NA character assay name that exists;
the assay inherits from StdAssay and has layers, cells, and features slots;
Layers(search="counts") returns exactly "counts";
the counts layer is non-null and coercible to dgCMatrix.

Algorithm: for each column, col_sum is the sum of the nonzero entries (which equals the full column total for sparse counts), and each entry is updated as x := fast_log1p(x * scale.factor / col_sum). Columns with col_sum <= 0 are left untouched, matching upstream zero-column behavior. The fast path writes the data layer, adds the "data" column to the cells/features LogMaps, then runs LogSeuratCommand.

Numeric equivalence: results agree with upstream to within ~1e-5. The fast_log1p polynomial approximation has a maximum absolute error of ~1.8e-11, far inside the task's gates of max_rel_err <= 0.01 and data_cor >= 0.999.
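
The per-column arithmetic described above can be sketched in plain R. This is a minimal illustration under stated assumptions, not AutoZyme's implementation: the function name log_normalize_columns is hypothetical, base R log1p stands in for fast_log1p, and the Matrix package (which provides dgCMatrix) is assumed to be installed.

```r
library(Matrix)

# Hypothetical helper (not part of Seurat or AutoZyme): LogNormalize each
# column of a sparse counts matrix, touching only the stored entries.
log_normalize_columns <- function(counts, scale.factor = 1e4) {
  stopifnot(methods::is(counts, "dgCMatrix"),
            is.numeric(scale.factor), length(scale.factor) == 1,
            is.finite(scale.factor))
  # Column index of every stored entry, recovered from the column pointers.
  col_idx <- rep.int(seq_len(ncol(counts)), diff(counts@p))
  # Column totals; for sparse counts this is the sum of the stored entries.
  col_sum <- Matrix::colSums(counts)
  # Entries in columns with col_sum <= 0 are left untouched.
  keep <- col_sum[col_idx] > 0
  counts@x[keep] <- log1p(counts@x[keep] * scale.factor / col_sum[col_idx[keep]])
  counts
}

# Tiny example: column 1 holds 1 and 3, column 2 holds 5, column 3 is empty.
counts <- sparseMatrix(i = c(1, 2, 1), j = c(1, 1, 2), x = c(1, 3, 5),
                       dims = c(2, 3))
normalized <- log_normalize_columns(counts)
print(as.matrix(normalized))
```

Because only the stored entries are rewritten, an all-zero column never enters the division, which is one simple way to get the zero-column behavior described above.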

Out-of-scope behavior

Calls outside the supported scope fall back silently to the upstream Seurat implementation.
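
One way to picture gate-then-fallback dispatch is a wrapper that checks the argument conditions and otherwise calls the original function. The sketch below is illustrative only: can_use_fast_path, normalize_with_fallback, and the fast and upstream stand-in functions are hypothetical names, and only the argument-level checks from the supported scope are shown (the zyme/turbo, assay, and layer checks are omitted).

```r
# Hypothetical gate: TRUE only for the argument combinations in scope.
can_use_fast_path <- function(normalization.method = "LogNormalize",
                              scale.factor = 1e4, margin = 1,
                              block.size = NULL, ...) {
  is_scalar_num <- function(v) is.numeric(v) && length(v) == 1 && is.finite(v)
  identical(normalization.method, "LogNormalize") &&
    is_scalar_num(scale.factor) &&
    is_scalar_num(margin) && margin == 1 &&
    is.null(block.size) &&
    length(list(...)) == 0   # any extra argument sends the call upstream
}

# Hypothetical dispatcher: no message or warning on the fallback branch.
normalize_with_fallback <- function(x, fast, upstream, ...) {
  if (can_use_fast_path(...)) fast(x, ...) else upstream(x, ...)
}

# Stand-in implementations so the dispatch can be observed.
fast <- function(x, ...) "fast path"
upstream <- function(x, ...) "upstream"

in_scope <- normalize_with_fallback(1, fast, upstream)
out_of_scope <- normalize_with_fallback(1, fast, upstream,
                                        normalization.method = "CLR")
print(c(in_scope, out_of_scope))
```

Keeping the gate a pure function of the call arguments makes it easy to test each condition separately.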

Detailed speedup table (11 runs)
Dataset Tier Platform Threads Baseline Optimized Speedup Memory Concordance Pass
gastrulation_pijuansala ood_large3 Windows 32 30.41 s 910 ms 33.4× 33.4 → 33.4 GB pass
heart_adult large Windows 32 38.72 s 1.09 s 35.5× 49.5 → 47.3 GB pass
pbmc200k_glaucoma medium Windows 32 15.00 s 340 ms 43.7× 19.2 → 18.3 GB pass
pbmc68k small Windows 32 2.95 s 80 ms 37.4× 3.9 → 3.9 GB pass
splitseq_rosenberg ood_large1 Windows 32 8.24 s 190 ms 43.2× 11.3 → 10.8 GB pass
tms_ss2 ood_large2 Windows 32 18.13 s 530 ms 34.7× 20.9 → 19.9 GB pass
gastrulation_pijuansala ood_large3 macOS 14 18.16 s 1.49 s 12.2× 26.0 → 17.2 GB pass
pbmc200k_glaucoma medium macOS 1 5.89 s 795 ms 8.30× 16.5 → 10.7 GB pass
pbmc68k small macOS 14 1.56 s 144 ms 9.00× 3.9 → 2.6 GB pass
splitseq_rosenberg ood_large1 macOS 1 3.15 s 360 ms 8.22× 14.2 → 12.0 GB pass
tms_ss2 ood_large2 macOS 1 10.32 s 912 ms 11.5× 19.8 → 18.8 GB pass
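
The headline figures can be cross-checked against the Speedup column of this table. The short R snippet below assumes the 11 listed runs are the full set behind the summary; the variable names are illustrative.

```r
# Speedup column of the table, in row order.
speedup <- c(33.4, 35.5, 43.7, 37.4, 43.2, 34.7,   # Windows, 32 threads
             12.2, 8.30, 9.00, 8.22, 11.5)         # macOS
platform <- rep(c("Windows", "macOS"), times = c(6, 5))

best <- max(speedup)                               # 43.7
overall_median <- median(speedup)                  # 33.4
by_platform <- tapply(speedup, platform, median)   # per-platform medians

print(best)
print(overall_median)
print(by_platform)
```

The overall median sits between the two platforms' medians because the Windows runs use 32 threads while most of the macOS runs use a single thread.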

Frequently asked questions

Speeding up Seurat NormalizeData
Why is Seurat NormalizeData slow?

Seurat NormalizeData is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work. On the pbmc200k_glaucoma benchmark (Windows, 32 threads), the original takes 15.00 s where the AutoZyme path takes 340 ms (43.7× faster).

How do I make Seurat NormalizeData faster?

Install AutoZyme and activate the Seurat patch, then keep using Seurat NormalizeData exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 43.7× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the Seurat NormalizeData output?

No. The accelerated path returns bit-for-bit identical results to the original Seurat implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
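
A check in the spirit of that gate could be written as follows in R. The metric definitions are assumptions: the supported scope names max_rel_err and data_cor but does not define them, so the formulas and the concordance function below are hypothetical stand-ins rather than the actual gate.

```r
# Hypothetical concordance summary between a reference and a candidate result.
concordance <- function(reference, candidate) {
  stopifnot(is.numeric(reference), is.numeric(candidate),
            length(reference) == length(candidate))
  abs_diff <- abs(candidate - reference)
  nz <- reference != 0   # relative error is only defined for nonzero entries
  list(
    max_abs_diff = max(abs_diff),
    max_rel_err  = if (any(nz)) max(abs_diff[nz] / abs(reference[nz])) else 0,
    data_cor     = cor(reference, candidate)
  )
}

# Stand-in values; in practice these would be the nonzero entries of the
# "data" layer from the upstream run and from the accelerated run.
reference <- log1p(c(2500, 7500, 10000, 1250))
candidate <- reference

result <- concordance(reference, candidate)
passes <- result$max_rel_err <= 0.01 && result$data_cor >= 0.999
print(result$max_abs_diff)
print(passes)
```

Identical inputs give a maximum absolute difference of exactly 0, which is the condition the answer above describes.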

How do I install the Seurat speedup?

In R: install the autozyme package, then run library(autozyme) and autozyme::activate("seurat"). The patch applies automatically the next time you call NormalizeData.