Speed up NicheNet

NicheNet is one of the slower steps in many single-cell genomics workflows. AutoZyme ships a verified, drop-in patch that is up to 1,482× faster, returning bit-for-bit identical results with no change to how you call it.

Best speedup: 1,482×
Median speedup: 1,187×
Output equivalence: bit-exact
Best runtime: 2.08 min baseline, 94 ms optimized
Datasets: 5
Pass rate: 10/10

Benchmark charts

Speedup distribution
Each dot is one finalized dataset/thread run on Windows (log scale). Datasets: lcmv_mouse_full, ifnb_human_x2, ifnb_human_x4, tms_spleen_BvsT_x6, gbm_malig_vs_macro_x10.

Thread sweep
Speedup across finalized thread counts on Windows (1, 4, and full (8) threads)
Dataset Tier Threads Baseline Optimized Speedup Memory
lcmv_mouse_full small 1 2.07 min 630 ms 224.7× 1.0 → 0.8 GB
lcmv_mouse_full small 4 1.81 min 166 ms 645.2× 1.0 → 0.8 GB
lcmv_mouse_full small 8 2.08 min 94 ms 1,482× 1.0 → 0.8 GB
ifnb_human_x2 medium 1 6.08 min 2.06 s 189.5× 1.7 → 1.2 GB
ifnb_human_x2 medium 4 5.02 min 519 ms 575.4× 1.7 → 1.2 GB
ifnb_human_x2 medium 8 5.70 min 285 ms 1,293× 1.7 → 1.2 GB
ifnb_human_x4 large 1 10.91 min 4.06 s 164.2× 2.8 → 1.8 GB
ifnb_human_x4 large 4 11.36 min 1.18 s 577.0× 2.8 → 1.8 GB
ifnb_human_x4 large 8 10.84 min 549 ms 1,250× 2.8 → 1.8 GB
tms_spleen_BvsT_x6 ood_large 1 11.89 min 6.04 s 121.0× 2.9 → 1.9 GB
tms_spleen_BvsT_x6 ood_large 4 13.26 min 1.54 s 520.9× 2.5 → 1.9 GB
tms_spleen_BvsT_x6 ood_large 8 13.78 min 832 ms 972.7× 2.5 → 1.9 GB
gbm_malig_vs_macro_x10 ood_xlarge 1 27.72 min 15.35 s 109.4× 5.1 → 3.6 GB
gbm_malig_vs_macro_x10 ood_xlarge 4 29.51 min 4.13 s 429.2× 5.1 → 3.6 GB
gbm_malig_vs_macro_x10 ood_xlarge 8 27.37 min 2.30 s 737.0× 5.1 → 3.6 GB

Memory
Baseline vs optimized peak memory on Windows
Dataset Tier Memory Optimized / baseline Speedup Threads
gbm_malig_vs_macro_x10 ood_xlarge 5.1 → 3.6 GB 0.71× 429.2× 4
tms_spleen_BvsT_x6 ood_large 2.9 → 1.9 GB 0.63× 121.0× 1
ifnb_human_x4 large 2.8 → 1.8 GB 0.64× 164.2× 1
ifnb_human_x2 medium 1.7 → 1.2 GB 0.68× 575.4× 4
lcmv_mouse_full small 1.0 → 0.8 GB 0.76× 645.2× 4

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets NicheNet in nichenetr. On the listed datasets, the benchmarked result passes the declared scientific output gate while reducing CPU runtime.

Also searched as: cell-cell communication, ligand-receptor, ligand activity, CCC.

Supported scope

The fast path is taken only when zyme=TRUE (default) and single=TRUE (default), i.e. the per-ligand scoring mode that is nichenetr's common, default use case. It correctly computes the four output metrics (auroc, aupr, aupr_corrected, pearson) for any potential_ligands that are all present in colnames(ligand_target_matrix), for any geneset and background with at least one gene matching rownames(ligand_target_matrix), and for a dense base R numeric matrix ligand_target_matrix.

Dispatch is by ligand count. With n_ligands >= 64, the parallel C++ kernel score_ligands_cpp runs with threads = min(getOption('autozyme.threads', 14), n_ligands, physical cores). With n_ligands < 64, a vectorized R fallback runs using .nichenetr_fast_score_selected_metrics, which is bit-exact to the upstream caTools::trapz formulation. Within the C++ kernel, n_pos <= 2048 uses the binary-search LigandScoreBinaryWorker and n_pos > 2048 uses the heap-vector LigandScoreWorker; both produce the same metrics.

The benchmark exercises this exact path: all dev and OOD tiers call with the four data arguments only (single is left at its upstream default of TRUE), so the benchmarked_call is pure upstream defaults except for data. Bit-exact agreement is reported (max_abs_diff_aupr_corrected=0, all correlations=1).
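The scope conditions and dispatch rule above can be sketched in base R. This is an illustration only, not AutoZyme's code: in_supported_scope and choose_scoring_path are hypothetical helpers, the "at least one matching gene" check is applied to geneset and background separately as one reading of the text, and n_pos is taken as a given number because the scope text does not define it.

```r
# Hypothetical helper (not part of autozyme): checks the documented input conditions.
in_supported_scope <- function(geneset, background, ligand_target_matrix,
                               potential_ligands, zyme = TRUE, single = TRUE) {
  isTRUE(zyme) && isTRUE(single) &&
    # dense base R numeric matrix; S4 sparse matrices fail is.matrix()
    is.matrix(ligand_target_matrix) && is.numeric(ligand_target_matrix) &&
    # every candidate ligand must be a column of the matrix
    all(potential_ligands %in% colnames(ligand_target_matrix)) &&
    # assumption: geneset and background each need at least one matching row
    any(geneset %in% rownames(ligand_target_matrix)) &&
    any(background %in% rownames(ligand_target_matrix))
}

# Hypothetical helper: reports which documented path a call would take.
choose_scoring_path <- function(n_ligands, n_pos,
                                physical_cores = parallel::detectCores(logical = FALSE)) {
  if (is.na(physical_cores)) physical_cores <- 1L  # detectCores() may return NA
  if (n_ligands < 64) {
    return(list(path = "vectorized R fallback", worker = NA_character_, threads = NA))
  }
  threads <- min(getOption("autozyme.threads", 14), n_ligands, physical_cores)
  worker <- if (n_pos <= 2048) "LigandScoreBinaryWorker" else "LigandScoreWorker"
  list(path = "score_ligands_cpp", worker = worker, threads = threads)
}

# Toy matrix: 4 target genes (rows) by 3 ligands (columns).
ltm <- matrix(seq_len(12) / 12, nrow = 4,
              dimnames = list(paste0("g", 1:4), paste0("lig", 1:3)))

ok  <- in_supported_scope(c("g1", "g2"), paste0("g", 1:4), ltm, c("lig1", "lig3"))
bad <- in_supported_scope(c("g1", "g2"), paste0("g", 1:4), ltm, c("lig1", "ligX"))

small <- choose_scoring_path(n_ligands = 3, n_pos = 2, physical_cores = 8)
large <- choose_scoring_path(n_ligands = 200, n_pos = 5000, physical_cores = 8)
```

Here ok is TRUE and bad is FALSE because ligX is not a column of the toy matrix. With 3 ligands the sketch selects the R fallback; with 200 ligands and n_pos above 2048 it selects LigandScoreWorker inside score_ligands_cpp.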

Out-of-scope behavior

errors

Detailed speedup table (10 runs)
Dataset Tier Platform Threads Baseline Optimized Speedup Memory Concordance Pass
gbm_malig_vs_macro_x10 ood_xlarge Windows 8 27.37 min 2.30 s 737.0× 5.1 → 3.6 GB pass
ifnb_human_x2 medium Windows 8 5.70 min 285 ms 1,293× 1.7 → 1.2 GB pass
ifnb_human_x4 large Windows 8 10.84 min 549 ms 1,250× 2.8 → 1.8 GB pass
lcmv_mouse_full small Windows 8 2.08 min 94 ms 1,482× 1.0 → 0.8 GB pass
tms_spleen_BvsT_x6 ood_large Windows 8 13.78 min 832 ms 972.7× 2.5 → 1.9 GB pass
gbm_malig_vs_macro_x10 ood_xlarge macOS 4 15.63 min 1.26 s 752.7× 6.5 → 3.7 GB pass
ifnb_human_x2 medium macOS 8 3.23 min 247 ms 773.7× 3.4 → 1.2 GB pass
ifnb_human_x4 large macOS 1 6.34 min 329 ms 1,160× 4.2 → 1.8 GB pass
lcmv_mouse_full small macOS 4 1.14 min 49 ms 1,423× 2.2 → 0.8 GB pass
tms_spleen_BvsT_x6 ood_large macOS 4 6.90 min 343 ms 1,214× 3.8 → 1.9 GB pass
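The headline "Best speedup" and "Median speedup" figures can be reproduced from the Speedup column of this table. A short check in base R, with the ten values copied from the rows above:

```r
# Speedup column of the detailed table, in row order (5 Windows runs, then 5 macOS runs).
speedups <- c(737.0, 1293, 1250, 1482, 972.7,
              752.7, 773.7, 1160, 1423, 1214)

best_speedup   <- max(speedups)     # 1482
median_speedup <- median(speedups)  # mean of the two middle values, 1160 and 1214: 1187
n_runs         <- length(speedups)  # 10
```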

Frequently asked questions

Speeding up NicheNet
Why is NicheNet slow?

NicheNet is CPU-bound, and the stock implementation in nichenetr leaves performance on the table in its core numerical work. In the best benchmarked run (lcmv_mouse_full on Windows with 8 threads), the original takes 2.08 min where the AutoZyme path takes 94 ms (1,482× faster).

How do I make NicheNet faster?

Install AutoZyme and activate the nichenetr patch, then keep using NicheNet exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 1,482× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the NicheNet output?

No. The accelerated path returns bit-for-bit identical results to the original nichenetr implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
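A check of this kind can be sketched in base R. This is not the frozen concordance gate itself, whose implementation is not described here: check_concordance is a hypothetical helper, the metric column names follow the four metrics listed under the supported scope, and the numbers are made-up toy values.

```r
# Hypothetical helper: largest absolute difference per metric between two result tables.
check_concordance <- function(baseline, optimized,
                              metrics = c("auroc", "aupr", "aupr_corrected", "pearson")) {
  stopifnot(identical(dim(baseline), dim(optimized)))
  max_abs_diff <- vapply(
    metrics,
    function(m) max(abs(baseline[[m]] - optimized[[m]])),
    numeric(1)
  )
  list(max_abs_diff = max_abs_diff, bit_exact = all(max_abs_diff == 0))
}

# Toy per-ligand results (illustrative values only).
baseline <- data.frame(
  test_ligand    = c("lig1", "lig2", "lig3"),
  auroc          = c(0.71, 0.64, 0.58),
  aupr           = c(0.30, 0.22, 0.15),
  aupr_corrected = c(0.20, 0.12, 0.05),
  pearson        = c(0.25, 0.18, 0.07)
)

same <- check_concordance(baseline, baseline)

# A copy that differs in one value by 1e-12 is no longer bit-exact.
perturbed <- baseline
perturbed$aupr[1] <- perturbed$aupr[1] + 1e-12
diff <- check_concordance(baseline, perturbed)
```

Comparing with == 0 rather than all.equal() is deliberate: a tolerance-based comparison would accept the perturbed copy, while a bit-exact claim should reject it.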

How do I install the nichenetr speedup?

In R: install the autozyme package, then run library(autozyme) and autozyme::activate("nichenetr"). The patch applies automatically the next time you call NicheNet.
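As a minimal sketch of those steps: the install line is left commented out because the page does not say which repository hosts autozyme, and the autozyme.threads option name is taken from the supported-scope notes. The guard only makes the snippet safe to run on a machine where the package is absent.

```r
# Assumption: the package installs under the name "autozyme"; the hosting
# repository is not stated on this page, so adjust the install step as needed.
# install.packages("autozyme")

has_autozyme <- requireNamespace("autozyme", quietly = TRUE)

if (has_autozyme) {
  library(autozyme)
  autozyme::activate("nichenetr")  # call as given in the answer above

  # Optional: the scope notes read getOption('autozyme.threads', 14), so
  # setting this option should lower the thread count from that default.
  options(autozyme.threads = 8)
}

# After activation, NicheNet calls stay unchanged; in nichenetr the per-ligand
# scoring entry point is predict_ligand_activities().
```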