Speed up CellChat

CellChat is one of the slower steps in many single-cell genomics workflows. AutoZyme ships a verified, drop-in patch that is up to 711.2× faster on the benchmark datasets and returns bit-for-bit identical results, with no change to how you call it.

Best speedup: 711.2×
Median speedup: 64.8×
Output equivalence: bit-exact
Best runtime: baseline 1.59 min, optimized 136 ms
Datasets: 5
Pass rate: 10/10

Benchmark charts

Speedup distribution (Windows, log scale)
Each dot is one finalized dataset/thread run. Datasets: ifnb_2k, pbmc68k_2k, tms_ss2_3k, heart_adult_30k, heart_adult_80k.

Thread sweep (Windows)
Speedup across finalized thread counts.
Dataset Tier Threads Baseline Optimized Speedup Memory
ifnb_2k small 1 1.72 min 537 ms 200.1× 1.0 → 0.8 GB
ifnb_2k small 4 1.59 min 136 ms 711.2× 1.0 → 0.8 GB
ifnb_2k small 8 1.87 min 180 ms 621.0× 1.0 → 0.8 GB
pbmc68k_2k medium 1 4.34 min 1.18 s 221.0× 1.2 → 0.9 GB
pbmc68k_2k medium 4 3.37 min 413 ms 490.1× 1.2 → 0.9 GB
pbmc68k_2k medium 8 4.22 min 400 ms 680.5× 1.2 → 0.9 GB
tms_ss2_3k large 1 10.99 min 5.89 s 112.8× 1.7 → 1.5 GB
tms_ss2_3k large 4 9.77 min 2.34 s 250.7× 1.7 → 1.4 GB
tms_ss2_3k large 8 10.95 min 2.16 s 305.2× 1.7 → 1.4 GB
heart_adult_30k ood_large 1 11.39 min 32.09 s 21.4× 4.6 → 3.9 GB
heart_adult_30k ood_large 4 11.25 min 9.42 s 71.7× 4.6 → 3.9 GB
heart_adult_30k ood_large 8 10.87 min 5.98 s 110.4× 4.6 → 3.9 GB
heart_adult_80k ood_xlarge 1 22.46 min 1.61 min 13.2× 11 → 9.5 GB
heart_adult_80k ood_xlarge 4 21.74 min 28.29 s 44.9× 11 → 9.5 GB
heart_adult_80k ood_xlarge 8 17.30 min 16.76 s 62.0× 11 → 9.5 GB

Memory (Windows)
Baseline vs optimized peak memory.
Dataset Tier Threads Memory Optimized/baseline Speedup
heart_adult_80k ood_xlarge 8 11 → 9.5 GB 0.84× 62.0×
heart_adult_30k ood_large 1 4.6 → 3.9 GB 0.83× 21.4×
tms_ss2_3k large 1 1.7 → 1.5 GB 0.88× 112.8×
pbmc68k_2k medium 4 1.2 → 0.9 GB 0.75× 490.1×
ifnb_2k small 1 1.0 → 0.8 GB 0.84× 200.1×

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets CellChat::computeCommunProb. The benchmarked result passes the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: cell-cell communication, ligand-receptor, CCC, communication probability.

Supported scope

The fast native path handles CellChat objects with datatype="RNA", the only branch the pipeline exercises.

type="triMean" gets the full Rcpp acceleration: cpp_aggregate_triMean for the per-group average, batched cpp_aggregate_triMean_boot for the nboot permutation tensor, cpp_outer_Pnull for the Prob outer product, and cpp_unified_inner for the per-LR, per-bootstrap Hill-product p-values.

The other types (truncatedMean, thresholdedMean, median) are handled correctly but only partly accelerated: the per-group and per-bootstrap aggregation fall back to R-level stats::aggregate (lines 230-234, 323-330), while the outer and inner kernels still run natively.

Both raw.use=TRUE and raw.use=FALSE are supported (data.signaling vs data.smooth, lines 158-162). nboot and seed.use are honored (set.seed(seed.use); replicate(nboot,...), lines 314-315). Both LR.use=NULL and an explicit LR.use are supported (lines 163-175). Kh and n are passed through to the Hill kernels.

For each ligand-receptor pair, the simple-gene and complex-subunit (geometric-mean) ligand/receptor expansion and the coreceptor, agonist, and antagonist cofactors are precomputed once, outside the inner loop.

The prob and pval outputs are reported bit-identical to upstream (pearson 1.0, max_abs_diff 0.0) at the benchmarked config.
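
To make the quantities named above concrete, here is a minimal pure-R sketch of a tri-mean group average, a geometric mean over complex subunits, and a Hill function with parameters Kh and n. It is an illustration only, not the AutoZyme kernels: the function names tri_mean, geo_mean, and hill, the toy expression values, and the second receptor subunit are all hypothetical, and the cofactor terms (coreceptor, agonist, antagonist) are left out.

```r
# Tri-mean of a numeric vector: (Q1 + 2 * median + Q3) / 4.
tri_mean <- function(x, na.rm = TRUE) {
  q <- stats::quantile(x, probs = c(0.25, 0.50, 0.50, 0.75), na.rm = na.rm)
  mean(q)
}

# Geometric mean, used here to combine the subunits of a complex.
geo_mean <- function(x) exp(mean(log(x)))

# Hill function with dissociation constant Kh and Hill coefficient n.
hill <- function(x, Kh = 0.5, n = 1) x^n / (Kh^n + x^n)

# Toy per-cell expression for one ligand gene (sender group) and one
# receptor subunit gene (receiver group). Values are made up.
expr_by_group <- list(
  sender   = c(1, 2, 3, 4, 5),
  receiver = c(0, 0, 2, 4, 4)
)
avg <- vapply(expr_by_group, tri_mean, numeric(1))

ligand   <- avg[["sender"]]
# Hypothetical two-subunit receptor: the second subunit's average is fixed at 8.
receptor <- geo_mean(c(avg[["receiver"]], 8))

# Hill-transformed ligand-receptor product.
p <- hill(ligand * receptor)
print(c(ligand = ligand, receptor = receptor, p = p))
```

With these toy values the ligand average is 3, the receptor average is 4, and the Hill-transformed product is 0.96.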

Out-of-scope behavior

Silent fallback to the upstream CellChat implementation.

Detailed speedup table (10 runs)
Dataset Tier Platform Threads Baseline Optimized Speedup Memory Concordance Pass
heart_adult_30k ood_large Windows 8 10.87 min 5.98 s 110.4× 4.6 → 3.9 GB pass
heart_adult_80k ood_xlarge Windows 8 17.30 min 16.76 s 62.0× 11.3 → 9.5 GB pass
ifnb_2k small Windows 4 1.59 min 136 ms 711.2× 1.0 → 0.8 GB pass
pbmc68k_2k medium Windows 8 4.22 min 400 ms 680.5× 1.2 → 0.9 GB pass
tms_ss2_3k large Windows 8 10.95 min 2.16 s 305.2× 1.7 → 1.4 GB pass
heart_adult_30k ood_large macOS 1 3.58 min 17.15 s 12.7× 8.1 → 4.5 GB pass
heart_adult_80k ood_xlarge macOS 1 8.06 min 49.60 s 9.76× 14.4 → 9.7 GB pass
ifnb_2k small macOS 1 13.89 s 517 ms 26.4× 1.3 → 1.0 GB pass
pbmc68k_2k medium macOS 4 41.61 s 619 ms 67.7× 1.2 → 1.0 GB pass
tms_ss2_3k large macOS 4 1.80 min 3.06 s 35.6× 2.2 → 1.7 GB pass
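
The best and median speedups in the summary can be checked against the table above. A minimal base-R sketch, assuming the rounded speedups exactly as printed in the ten rows (the page does not state how the summary figures were aggregated):

```r
# Speedups as printed in the detailed table, in row order.
speedup <- c(110.4, 62.0, 711.2, 680.5, 305.2,  # Windows rows
             12.7, 9.76, 26.4, 67.7, 35.6)      # macOS rows

best_speedup   <- max(speedup)     # largest speedup across the 10 runs
median_speedup <- median(speedup)  # mean of the two middle values when n = 10

print(c(best = best_speedup, median = median_speedup))
```

The maximum is 711.2 and the median of the printed values is 64.85, consistent with the 711.2× and 64.8× shown in the summary.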

Frequently asked questions

Speeding up CellChat
Why is CellChat slow?

CellChat is CPU-bound, and the stock implementation of CellChat::computeCommunProb leaves performance on the table in its core numerical work. In the best benchmarked case (ifnb_2k, 4 threads, Windows) the original takes 1.59 min where the AutoZyme path takes 136 ms (711.2× faster).

How do I make CellChat faster?

Install AutoZyme and activate the CellChat patch, then keep using CellChat exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 711.2× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the CellChat output?

No. The accelerated path returns bit-for-bit identical results to the original CellChat implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
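
A check of this kind can be sketched in a few lines of base R. This is an illustration under stated assumptions, not AutoZyme's concordance gate: compare_outputs is a hypothetical helper, and the arrays are random stand-ins for real results.

```r
# Compare two numeric arrays of the same shape.
compare_outputs <- function(reference, candidate) {
  stopifnot(identical(dim(reference), dim(candidate)))
  list(
    bit_identical = identical(reference, candidate),
    max_abs_diff  = max(abs(reference - candidate)),
    pearson       = stats::cor(as.vector(reference), as.vector(candidate))
  )
}

# Stand-in for a probability array (random values, made-up dimensions).
set.seed(1)
reference <- array(runif(3 * 3 * 4), dim = c(3, 3, 4))

# An exact copy passes a bit-exact check.
same <- compare_outputs(reference, reference)

# A tiny perturbation fails it, even though the correlation stays near 1.
perturbed <- reference
perturbed[1, 1, 1] <- perturbed[1, 1, 1] + 1e-12
diff <- compare_outputs(reference, perturbed)

print(same$bit_identical)
print(diff$bit_identical)
```

With real results, reference and candidate would be the prob or pval arrays from an upstream run and a patched run (in CellChat these are stored in object@net$prob and object@net$pval). The second case shows why a bit-exact gate is stricter than a correlation threshold.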

How do I install the CellChat speedup?

In R: install the autozyme package, then run library(autozyme) and autozyme::activate("cellchat"). The patch applies automatically the next time you call CellChat::computeCommunProb.
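
Put together, the steps might look like the sketch below. It assumes the autozyme and CellChat packages are already installed and that cellchat is a CellChat object already prepared for computeCommunProb. The wrapper name run_comm_prob is hypothetical, and the argument values are only examples of the parameters mentioned under Supported scope.

```r
# Hypothetical wrapper: activate the patch, then call CellChat as usual.
# Requires the autozyme and CellChat packages to be installed.
run_comm_prob <- function(cellchat, nboot = 100, seed.use = 1L) {
  library(autozyme)               # attach the package, as described above
  autozyme::activate("cellchat")  # patch applies on the next computeCommunProb call
  CellChat::computeCommunProb(
    cellchat,
    type     = "triMean",  # the fully accelerated type
    raw.use  = TRUE,
    nboot    = nboot,
    seed.use = seed.use
  )
}
```

Calling cellchat <- run_comm_prob(cellchat) would then return the updated object, exactly as a direct call to CellChat::computeCommunProb does.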