Speed up BayesSpace

BayesSpace is one of the slower steps in many bulk genomics & enrichment workflows. AutoZyme ships a verified, drop-in patch that is up to 13.1× faster and returns output within a validated, bounded difference of the original, with no change to how you call it.

Best speedup: 13.1×
Median speedup: 8.70×
Output equivalence: Bounded
Best runtime: baseline 8.27 min, optimized 37.87 s
Datasets: 5
Pass rate: 10/10

Benchmark charts

Speedup distribution (log scale): each dot is one finalized dataset/thread run on Windows.

Thread sweep: speedup across finalized thread counts on Windows. No finalized multi-thread sweep for this platform.

Memory: baseline vs optimized peak memory on Windows, 1 thread.
10x_visium_human_lymph_node (ood_large): 1.8 GB → 1.8 GB, optimized / baseline 0.98×, 13.1× speedup
10x_visium_human_breast_cancer_block_a (ood_xlarge): 1.7 GB → 1.7 GB, optimized / baseline 1.00×, 11.9× speedup
10x_visium_mouse_brain_sagittal_posterior (large): 1.5 GB → 1.5 GB, optimized / baseline 1.00×, 10.3× speedup
10x_visium_mouse_brain_sagittal_anterior (medium): 1.5 GB → 1.5 GB, optimized / baseline 0.99×, 8.92× speedup
thrane_melanoma_ST_mel1_rep2 (small): 0.9 GB → 0.8 GB, optimized / baseline 0.90×, 8.41× speedup

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets BayesSpace::spatialCluster. The benchmarked result passes the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: spatial clustering, spatial domains, spatialCluster.

Supported scope

The patch overrides only the BayesSpace internal iterate_t, the Gibbs/MH MCMC inner loop invoked when spatialCluster is called with model="t". Within that path, the fast kernel (fast_iterate_t_impl in src/bayesspace.cpp) reproduces the full upstream t-model math for arbitrary n (spots), d (PC dimensions), q >= 2 (clusters), gamma, nrep, thin, and burn.in, and for any df_j neighbor structure, including empty neighbor lists. It is platform-agnostic: the platform only affects how spatialCluster builds df_j (the neighbor list) before iterate_t runs, and any q, d, gamma, and nrep are honored.

The output is not bit-exact, for two reasons. The BLAS-batched rooti projection reorders floating-point operations. More importantly, the proposal draw was changed from Rcpp::sample to R::unif_rand, which shifts the entire RNG trajectory: the equilibrium distribution is preserved, but individual trajectories diverge.

Correctness is therefore gated statistically, not element-wise, using ARI/NMI (permutation-invariant clustering similarity) with noise_multiplier widened by 5%. Validated tiers span platform in {ST, Visium}, q in 4-12, d in 7-20, and nrep 10000-50000.
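To make the idea of a permutation-invariant gate concrete, here is a minimal Python sketch of the adjusted Rand index (ARI). It is an illustration only, not AutoZyme's validation code: the function name and the toy label vectors are assumptions, and the actual gate thresholds are not reproduced here. It shows why two runs whose cluster IDs are merely renamed still count as identical, which is what makes ARI suitable when the RNG trajectory differs.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two cluster labelings of the same spots.

    Illustrative helper (hypothetical name), written from the standard
    pair-counting definition of ARI.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two spots: nothing to disagree on

    # Pairs of spots that share a cluster in both, in A only, and in B only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every spot in one cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (assumed for illustration): same partition, cluster IDs renamed.
baseline = [0, 0, 1, 1, 2, 2]
relabeled = [2, 2, 0, 0, 1, 1]
# One spot moved to a different cluster.
perturbed = [2, 2, 0, 0, 1, 0]

ari_relabeled = adjusted_rand_index(baseline, relabeled)
ari_perturbed = adjusted_rand_index(baseline, perturbed)
print(ari_relabeled)  # 1.0: renaming clusters does not change the score
print(ari_perturbed)  # strictly between 0 and 1
```

An element-wise comparison of `baseline` and `relabeled` would report every spot as different, while ARI reports a perfect match. That is the reason a statistical gate is used for a sampler whose label assignments and random draws are not reproducible bit for bit.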

Out-of-scope behavior

Calls outside the supported scope fall back silently to the upstream BayesSpace implementation.
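The scope rule and the fallback can be pictured as a small dispatch step. The Python sketch below is a conceptual illustration only: the real patch lives in R/C++ inside BayesSpace, and every function name here is hypothetical. It assumes the scope check is simply model="t" with q >= 2, as described above.

```python
def upstream_iterate(model, q):
    """Stand-in for the unmodified upstream inner loop (hypothetical)."""
    return f"upstream:{model}"


def fast_iterate_t(q):
    """Stand-in for the accelerated t-model kernel (hypothetical)."""
    return "fast:t"


def in_supported_scope(model, q):
    # Assumption: only the t-model path with at least two clusters is patched.
    return model == "t" and q >= 2


def run_inner_loop(model, q):
    """Use the fast kernel when in scope; otherwise fall back without warning."""
    if in_supported_scope(model, q):
        return fast_iterate_t(q)
    return upstream_iterate(model, q)


print(run_inner_loop("t", 7))       # fast:t
print(run_inner_loop("normal", 7))  # upstream:normal
```

Because the fallback is silent, a call that lands outside the supported scope still returns a result, just at upstream speed.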

Detailed speedup table (10 runs)

| Dataset | Tier | Platform | Threads | Baseline | Optimized | Speedup | Memory | Concordance | Pass |
|---|---|---|---|---|---|---|---|---|---|
| 10x_visium_human_breast_cancer_block_a | ood_xlarge | Windows | 1 | 14.05 min | 1.18 min | 11.9× | 1.7 → 1.7 GB | | pass |
| 10x_visium_human_lymph_node | ood_large | Windows | 1 | 8.27 min | 37.87 s | 13.1× | 1.8 → 1.8 GB | | pass |
| 10x_visium_mouse_brain_sagittal_anterior | medium | Windows | 1 | 2.24 min | 15.07 s | 8.92× | 1.5 → 1.5 GB | | pass |
| 10x_visium_mouse_brain_sagittal_posterior | large | Windows | 1 | 5.59 min | 32.63 s | 10.3× | 1.5 → 1.5 GB | | pass |
| thrane_melanoma_ST_mel1_rep2 | small | Windows | 1 | 37.66 s | 4.48 s | 8.41× | 0.9 → 0.8 GB | | pass |
| 10x_visium_human_breast_cancer_block_a | ood_xlarge | macOS | 1 | 12.28 min | 1.32 min | 9.34× | 1.9 → 1.8 GB | | pass |
| 10x_visium_human_lymph_node | ood_large | macOS | 1 | 5.40 min | 38.73 s | 8.37× | 1.9 → 1.9 GB | | pass |
| 10x_visium_mouse_brain_sagittal_anterior | medium | macOS | 1 | 1.75 min | 13.71 s | 7.65× | 1.6 → 1.5 GB | | pass |
| 10x_visium_mouse_brain_sagittal_posterior | large | macOS | 1 | 4.42 min | 31.32 s | 8.48× | 1.7 → 1.6 GB | | pass |
| thrane_melanoma_ST_mel1_rep2 | small | macOS | 1 | 29.23 s | 3.66 s | 7.99× | 1.1 → 0.9 GB | | pass |
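The headline figures can be cross-checked against the ten runs listed above. The short Python sketch below copies the reported speedups from the table and recomputes the best and median values; the variable names are illustrative. It also recomputes the best case from its baseline and optimized runtimes, which matches only to the displayed precision because the published times are rounded.

```python
from statistics import median

# Reported speedups for the 10 runs (5 on Windows, then 5 on macOS).
reported_speedups = [
    11.9, 13.1, 8.92, 10.3, 8.41,   # Windows
    9.34, 8.37, 7.65, 8.48, 7.99,   # macOS
]

best_speedup = max(reported_speedups)
median_speedup = median(reported_speedups)

# Best case recomputed from its runtimes: 8.27 min baseline vs 37.87 s optimized.
best_from_times = (8.27 * 60) / 37.87

print(f"best speedup:      {best_speedup:.1f}x")
print(f"median speedup:    {median_speedup:.2f}x")
print(f"best (from times): {best_from_times:.1f}x")
```

With an even number of runs, the median is the mean of the two middle values (8.48 and 8.92), which gives the 8.70× shown in the summary.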

Frequently asked questions

Speeding up BayesSpace
Why is BayesSpace slow?

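BayesSpace is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work, notably the MCMC inner loop (iterate_t) that spatialCluster runs with model="t". In the best benchmark case (10x_visium_human_lymph_node on Windows, 1 thread), the original takes 8.27 min where the AutoZyme path takes 37.87 s (13.1× faster).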

How do I make BayesSpace faster?

Install AutoZyme and activate the BayesSpace patch, then keep using BayesSpace exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 13.1× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the BayesSpace output?

Differences are small and bounded: concordance-validated to within roughly 1.5 to 5% of the original BayesSpace result on every benchmark dataset, inside a frozen gate.

How do I install the BayesSpace speedup?

In Python: run pip install autozyme, then import autozyme and call autozyme.activate("bayesspace"). The patch applies automatically the next time you call BayesSpace::spatialCluster.
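The activation step can be wrapped so that a script still runs where AutoZyme is not installed. This is a minimal sketch: only `autozyme.activate("bayesspace")` comes from the instructions above, while the wrapper function and its return convention are assumptions added for illustration.

```python
import importlib


def activate_bayesspace_patch():
    """Try to activate the BayesSpace patch; return True if it was activated.

    Hypothetical convenience wrapper. If the autozyme package is not
    installed, the function returns False and BayesSpace keeps running
    the stock code path.
    """
    try:
        autozyme = importlib.import_module("autozyme")
    except ImportError:
        return False
    autozyme.activate("bayesspace")
    return True


activated = activate_bayesspace_patch()
if activated:
    print("BayesSpace patch activated")
else:
    print("autozyme not installed; using stock BayesSpace")
```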