Speed up Seurat IntegrateLayers CCA

Seurat IntegrateLayers CCA is one of the slower steps in many single-cell genomics workflows. AutoZyme ships a drop-in patch that is up to 10.6× faster in the Windows benchmarks (up to 30.1× on macOS), returns output within a verified numerical tolerance, and requires no change to how you call it.

Best speedup: 10.6×
Median speedup: 9.41×
Output equivalence: tolerance
Best runtime: baseline 134.81 min, optimized 12.69 min
Datasets: 6
Pass rate: 11/11

Benchmark charts

Speedup distribution
Each dot is one finalized dataset/thread run on Windows.

Thread sweep
Speedup across finalized thread counts on Windows. No finalized multi-thread sweep for this platform.

Memory
Baseline vs optimized peak memory on Windows (1 thread; the ratio is optimized / baseline):
gastrulation_pijuansala_cca_139k (large): 50 GB → 57 GB, 1.13×, 10.6× speedup
tms_ss2_cca_111k (medium): 29 GB → 36 GB, 1.23×, 10.1× speedup
pbmc160k_cite_cca_48k (ood_large3): 7.1 GB → 7.5 GB, 1.05×, 4.79× speedup
pbmc200k_glaucoma_cca_47k (ood_large2): 5.2 GB → 5.8 GB, 1.11×, 4.77× speedup
heart_adult_cca_42k (ood_large1): 4.8 GB → 5.1 GB, 1.07×, 3.33× speedup
ifnb_cca_14k (small): 2.5 GB → 2.7 GB, 1.08×, 2.22× speedup

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets IntegrateLayers · CCA in Seurat. The benchmarked result preserves the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: integration, batch correction, CCA, canonical correlation, IntegrateData, anchors, data integration.

Supported scope

The benchmarked entry point Seurat::IntegrateLayers(method = CCAIntegration) runs stock Seurat and dispatches into the patched method chain.

Supported combination:
method = CCAIntegration with the default CCA pipeline (reduction = "cca", normalization.method = "LogNormalize")
default integration parameters (anchor.features = 2000, dims = 1:30, k.anchor = 5, k.filter, k.score, k.weight = 100, l2.norm = TRUE, scale = TRUE)
a Seurat v5 object with split RNA layers and a "pca" reduction present
Python available (numpy + scipy with PROPACK svds)

On macOS the RunCCA fast path additionally requires the threadpoolctl thread-guard.

The shipped patch restores upstream n.trees = 50 in both FindIntegrationAnchors and FindWeights (the silent n.trees = 10 speed shortcut present in the task-local pipeline/run.R was removed), and CCAIntegration now forwards caller-supplied dims and k.weight to upstream exactly.

Equivalence is a numerical approximation, not bit-exact: RunCCA.default replaces base::svd with scipy svds (solver = propack) on a float32 cross-product. Measured min_cc_cor >= 0.9993 and cc_dim_cor_mean >= 0.9996 across all six datasets at threads = 1, well above the 0.90 task thresholds.
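To make the "numerical approximation, not bit-exact" point concrete, here is a minimal Python sketch of the general idea: a truncated SVD of a float32 cross-product compared against a full float64 SVD, scored by per-component correlation. It is not AutoZyme's code. The helper names (planted_pair, top_k_agreement), the synthetic data with a planted spectrum, and the correlation score are all assumptions made for illustration; the score is only a stand-in for metrics such as min_cc_cor, whose exact definition the page does not give. The sketch also uses SciPy's default ARPACK solver rather than solver="propack", so that it runs on SciPy builds without PROPACK.

```python
import numpy as np
from scipy.sparse.linalg import svds


def planted_pair(n_features, n1, n2, spectrum, seed=0):
    """Build X (features x n1) and Y (features x n2) such that X.T @ Y has
    exactly the given singular values (synthetic stand-in for scaled data)."""
    rng = np.random.default_rng(seed)
    r = len(spectrum)
    # Orthonormal bases from QR factorisations of random Gaussian matrices.
    w, _ = np.linalg.qr(rng.standard_normal((n_features, r)))
    u, _ = np.linalg.qr(rng.standard_normal((n1, r)))
    v, _ = np.linalg.qr(rng.standard_normal((n2, r)))
    root = np.sqrt(np.asarray(spectrum, dtype=float))
    # (w * root) scales column j of w by root[j].
    return (w * root) @ u.T, (w * root) @ v.T


def top_k_agreement(x, y, k):
    """Absolute correlation between the top-k left singular vectors of a
    full float64 SVD and a truncated SVD of the float32 cross-product."""
    cross64 = x.T @ y                     # reference cross-product
    cross32 = cross64.astype(np.float32)  # reduced-precision copy

    # Reference: dense SVD, singular values returned in descending order.
    u_full, _, _ = np.linalg.svd(cross64, full_matrices=False)

    # Approximation: truncated SVD; sort explicitly into descending order.
    u_k, s_k, _ = svds(cross32, k=k)
    u_k = u_k[:, np.argsort(s_k)[::-1]]

    # Singular vectors are defined up to sign, hence the absolute value.
    return np.array([
        abs(np.corrcoef(u_full[:, i], u_k[:, i])[0, 1]) for i in range(k)
    ])


x, y = planted_pair(300, 120, 100, [50, 30, 20, 12, 8, 5, 3, 2])
cors = top_k_agreement(x, y, k=5)
print("min correlation:", cors.min())
print("mean correlation:", cors.mean())
```

The planted spectrum is deliberately well separated, so the two decompositions agree closely. On real data with nearly equal singular values, individual components can rotate into each other, which is one reason a gate would be stated as a correlation threshold rather than exact equality.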

Out-of-scope behavior

Calls outside the supported scope silently fall back to upstream Seurat.

Detailed speedup table (11 runs). Memory is peak memory in GB, baseline → optimized.

Dataset                           Tier        Platform  Threads  Baseline    Optimized  Speedup  Memory (GB)   Concordance/Pass
gastrulation_pijuansala_cca_139k  large       Windows   1        134.81 min  12.69 min  10.6×    50.3 → 56.7   pass
heart_adult_cca_42k               ood_large1  Windows   1        24.97 min   7.51 min   3.33×    4.8 → 5.1     pass
ifnb_cca_14k                      small       Windows   1        1.55 min    41.77 s    2.22×    2.5 → 2.7     pass
pbmc160k_cite_cca_48k             ood_large3  Windows   1        27.10 min   5.67 min   4.79×    7.1 → 7.5     pass
pbmc200k_glaucoma_cca_47k         ood_large2  Windows   1        25.95 min   5.45 min   4.77×    5.2 → 5.8     pass
tms_ss2_cca_111k                  medium      Windows   1        96.42 min   9.56 min   10.1×    29.0 → 35.7   pass
heart_adult_cca_42k               ood_large1  macOS     1        33.00 min   4.67 min   7.07×    6.2 → 8.8     pass
ifnb_cca_14k                      small       macOS     1        1.90 min    12.09 s    9.41×    3.4 → 3.3     pass
pbmc160k_cite_cca_48k             ood_large3  macOS     1        37.86 min   3.25 min   11.7×    11.2 → 13.9   pass
pbmc200k_glaucoma_cca_47k         ood_large2  macOS     1        35.86 min   3.02 min   11.9×    10.0 → 12.6   pass
tms_ss2_cca_111k                  medium      macOS     1        146.18 min  4.86 min   30.1×    21.0 → 20.7   pass
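The headline figures can be re-derived from the table. The Python sketch below recomputes each speedup as baseline / optimized from the reported runtimes and takes the median of the reported speedups. The RUNS list is transcribed from the table; the variable names and the 1% tolerance for rounding in the displayed times are choices made for this example.

```python
from statistics import median

# (dataset, platform, baseline_seconds, optimized_seconds, reported_speedup)
RUNS = [
    ("gastrulation_pijuansala_cca_139k", "Windows", 134.81 * 60, 12.69 * 60, 10.6),
    ("heart_adult_cca_42k", "Windows", 24.97 * 60, 7.51 * 60, 3.33),
    ("ifnb_cca_14k", "Windows", 1.55 * 60, 41.77, 2.22),
    ("pbmc160k_cite_cca_48k", "Windows", 27.10 * 60, 5.67 * 60, 4.79),
    ("pbmc200k_glaucoma_cca_47k", "Windows", 25.95 * 60, 5.45 * 60, 4.77),
    ("tms_ss2_cca_111k", "Windows", 96.42 * 60, 9.56 * 60, 10.1),
    ("heart_adult_cca_42k", "macOS", 33.00 * 60, 4.67 * 60, 7.07),
    ("ifnb_cca_14k", "macOS", 1.90 * 60, 12.09, 9.41),
    ("pbmc160k_cite_cca_48k", "macOS", 37.86 * 60, 3.25 * 60, 11.7),
    ("pbmc200k_glaucoma_cca_47k", "macOS", 35.86 * 60, 3.02 * 60, 11.9),
    ("tms_ss2_cca_111k", "macOS", 146.18 * 60, 4.86 * 60, 30.1),
]

# Speedup is baseline runtime divided by optimized runtime.
recomputed = [base / opt for _, _, base, opt, _ in RUNS]
reported = [r[4] for r in RUNS]

# The displayed runtimes are rounded, so allow a small relative difference.
max_rel_diff = max(abs(a - b) / b for a, b in zip(recomputed, reported))

median_all = median(reported)
median_windows = median(r[4] for r in RUNS if r[1] == "Windows")
median_macos = median(r[4] for r in RUNS if r[1] == "macOS")

print("median over all 11 runs:", median_all)  # 9.41
print("Windows median:", median_windows)
print("macOS median:", median_macos)           # 11.7
print("largest rounding difference:", max_rel_diff)
```

The median over all 11 runs is the 9.41× quoted at the top of the page, while the 10.6× "best speedup" corresponds to the largest Windows run; the largest speedup in the table overall is the 30.1× macOS run.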

Frequently asked questions

Speeding up Seurat IntegrateLayers CCA
Why is Seurat IntegrateLayers CCA slow?

Seurat IntegrateLayers CCA is CPU-bound, and the stock Seurat implementation leaves performance on the table in its core numerical work, for example the singular value decomposition in RunCCA. On the largest benchmark dataset (gastrulation_pijuansala_cca_139k, Windows, 1 thread) the original takes 134.81 min where the AutoZyme path takes 12.69 min (10.6× faster).

How do I make Seurat IntegrateLayers CCA faster?

Install AutoZyme and activate the Seurat patch, then keep calling Seurat IntegrateLayers CCA exactly as before. For calls within the supported scope, AutoZyme transparently substitutes the faster, output-validated path (up to 10.6× faster on the Windows benchmark datasets) with no pipeline or API changes.

Does the AutoZyme speedup change the Seurat IntegrateLayers CCA output?

Effectively no. The output is tolerance-equivalent: held within a frozen concordance gate (up to about 0.6% drift from the original Seurat result) on every benchmark dataset.

How do I install the Seurat speedup?

In R: install the autozyme package, then run library(autozyme) and autozyme::activate("seurat"). The patch applies automatically the next time you call IntegrateLayers CCA.