Speed up clusterProfiler

compareCluster from clusterProfiler is one of the slower steps in many bulk genomics and enrichment workflows. AutoZyme ships a verified, drop-in patch that runs up to 90.8× faster on the benchmark datasets and returns bit-for-bit identical results, with no change to how you call it.

Best speedup: 90.8×
Median speedup: 19.5×
Output equivalence: bit-exact
Best runtime: 37.64 min baseline, 24.88 s optimized
Datasets: 5
Pass rate: 10/10

Benchmark charts

Speedup distribution (log scale)
Each point is one finalized dataset/thread run on Windows. Datasets: tms_ss2, gastrulation_pijuansala, heart_adult, pbmc200k_glaucoma, pbmc68k.

Thread sweep
Speedup across finalized thread counts on Windows
Dataset Tier Threads Baseline Optimized Speedup Memory
tms_ss2 ood_xlarge 1 33.07 min 27.99 s 69.7× 3.6 → 3.5 GB
tms_ss2 ood_xlarge 4 37.64 min 24.88 s 90.8× 3.0 → 3.6 GB
tms_ss2 ood_xlarge 8 32.21 min 28.98 s 69.0× 3.4 → 3.4 GB
gastrulation_pijuansala ood_large 1 9.53 min 21.77 s 26.3× 3.5 → 3.4 GB
gastrulation_pijuansala ood_large 4 9.68 min 21.76 s 26.7× 3.5 → 3.4 GB
gastrulation_pijuansala ood_large 8 10.75 min 22.97 s 27.6× 3.5 → 3.6 GB
heart_adult large 1 7.21 min 18.29 s 23.7× 2.8 → 2.6 GB
heart_adult large 4 7.35 min 18.39 s 24.0× 2.8 → 2.6 GB
heart_adult large 8 8.75 min 20.35 s 25.6× 3.1 → 3.1 GB
pbmc200k_glaucoma medium 1 5.39 min 18.17 s 17.8× 2.8 → 2.6 GB
pbmc200k_glaucoma medium 4 5.44 min 18.92 s 18.7× 3.0 → 3.1 GB
pbmc200k_glaucoma medium 8 8.11 min 22.11 s 20.3× 3.0 → 3.1 GB
pbmc68k small 1 4.43 min 21.62 s 12.7× 3.1 → 2.8 GB
pbmc68k small 4 3.68 min 17.82 s 12.4× 2.8 → 2.6 GB
pbmc68k small 8 4.42 min 21.60 s 13.1× 3.1 → 3.1 GB

Memory
Baseline vs optimized peak memory on Windows
Dataset Tier Threads Baseline Optimized Optimized/baseline
tms_ss2 ood_xlarge 1 3.6 GB 3.5 GB 0.99×
gastrulation_pijuansala ood_large 1 3.5 GB 3.4 GB 0.96×
heart_adult large 8 3.1 GB 3.1 GB 1.00×
pbmc68k small 8 3.1 GB 3.1 GB 1.00×
pbmc200k_glaucoma medium 8 3.0 GB 3.1 GB 1.02×

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This patch targets clusterProfiler::compareCluster. On the listed datasets, the benchmarked runs pass the declared output-concordance gate while reducing CPU runtime.

Supported scope

Correctly accelerates the GO over-representation (ORA) workflow: compareCluster(geneClusters = <named list of character gene-ID vectors>, fun = "enrichGO", OrgDb, keyType, ont in {ALL,BP,CC,MF}, pvalueCutoff/qvalueCutoff/pAdjustMethod/minGSSize/maxGSSize at any value).

Three coordinated overrides:

(1) fast_get_GO_data caches the per-(organism, ont, keytype) PATHID2EXTID/EXTID2PATHID/PATHID2NAME/GO2ONT plus ZYME_ALLEXTID/ZYME_TERM_LENGTHS into the shared .Anno_clusterProfiler_Env, building all 4 ont entries from one mapIds + split.

(2) fast_enricher_internal vectorizes phyper and uses cached extID/term-length when universe is NULL (the compareCluster default). It is bound to BOTH clusterProfiler::enricher_internal and DOSE::enricher_internal, so it also correctly handles any other ORA fun (enrichKEGG/enrichDO/etc.) by falling back to .ALLEXTID_fn/lengths() when the GO cache keys are absent.

(3) fast_compareCluster fans the independent per-cluster fun() calls across cores via .zyme_mclapply (lapply fallback on Windows or threads<=1), applies the exact pvalue<=cutoff & p.adjust<=cutoff then qvalue<=cutoff filtering that upstream get_enriched/as.data.frame applies, and rebuilds the compareClusterResult.

fun may be any character name resolvable in the clusterProfiler namespace or a function object.

universe handling is preserved: character universe intersects (or replaces, under options(enrichment_force_universe=TRUE)); non-character universe is ignored with a message, matching upstream.

Concordance verified at term_jaccard=1, pearson_logp_shared=1, max_abs_logp_diff=0 for the benchmarked GO-ALL/ENTREZID config (human pbmc + mouse OOD tiers).
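To make overrides (2) and (3) more concrete, here is a minimal base-R sketch of the two ideas: one vectorized phyper call covering every term, and a per-cluster fan-out that uses lapply on Windows or at a single thread. It is an illustration under stated assumptions, not AutoZyme's code. The toy term2gene annotation and the names ora_one and fan_out are hypothetical, only BH adjustment is applied, and the qvalue filter is omitted.

```r
library(parallel)  # part of the standard R distribution

# Hypothetical toy annotation: term -> member genes (not real GO terms).
term2gene <- list(
  T1 = c("g1", "g2", "g3", "g4"),
  T2 = c("g3", "g4", "g5", "g6", "g7", "g8"),
  T3 = c("g9", "g10")
)
universe <- unique(unlist(term2gene, use.names = FALSE))  # background genes
term_len <- lengths(term2gene)                            # computed once, reused

# Over-representation test for one cluster, vectorized over all hit terms.
ora_one <- function(genes, cutoff = 0.05) {
  genes <- intersect(genes, universe)
  k <- vapply(term2gene, function(g) sum(g %in% genes), integer(1))
  hit <- k > 0  # in this sketch, only terms hit by the cluster are tested
  # Single phyper call for every term: P(X >= k) equals P(X > k - 1).
  p <- phyper(k[hit] - 1, term_len[hit], length(universe) - term_len[hit],
              length(genes), lower.tail = FALSE)
  res <- data.frame(ID = names(term2gene)[hit], Count = k[hit], pvalue = p,
                    p.adjust = p.adjust(p, method = "BH"),
                    stringsAsFactors = FALSE)
  res[res$pvalue <= cutoff & res$p.adjust <= cutoff, , drop = FALSE]
}

# Fan independent per-cluster calls across cores; lapply where forking is
# unavailable (Windows) or only one thread is requested.
fan_out <- function(x, f, threads = 1L) {
  if (.Platform$OS.type == "windows" || threads <= 1L) {
    lapply(x, f)
  } else {
    mclapply(x, f, mc.cores = threads)
  }
}

clusters <- list(A = c("g1", "g2", "g3"), B = c("g9", "g10"))
results <- fan_out(clusters, ora_one, threads = 2L)
print(results)
```

Because each cluster is tested independently, the fan-out changes only where the work runs, not what is computed, which is why the serial and parallel paths can return the same result.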

Out-of-scope behavior

Calls outside the supported scope fall back silently to the upstream clusterProfiler implementation.
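One way to picture a silent fallback is a thin dispatcher that checks the arguments and routes to either the fast path or the original function. The sketch below is hypothetical: in_scope, make_dispatcher, and the two stub functions are names invented for illustration, and the scope check is a simplified reading of the supported scope above, not AutoZyme's actual test.

```r
# Hypothetical scope check: a named list of character gene-ID vectors, with
# fun given either as a character name or as a function object.
in_scope <- function(geneClusters, fun) {
  is.list(geneClusters) &&
    !is.null(names(geneClusters)) &&
    all(vapply(geneClusters, is.character, logical(1))) &&
    (is.character(fun) || is.function(fun))
}

# Build a wrapper that keeps the same call signature for both paths.
make_dispatcher <- function(fast, upstream) {
  function(geneClusters, fun = "enrichGO", ...) {
    if (in_scope(geneClusters, fun)) {
      fast(geneClusters, fun, ...)
    } else {
      # Silent fallback: no warning or message, the original code just runs.
      upstream(geneClusters, fun, ...)
    }
  }
}

# Stand-in implementations so the sketch runs without clusterProfiler.
fast_stub     <- function(geneClusters, fun, ...) "fast path"
upstream_stub <- function(geneClusters, fun, ...) "upstream"

dispatch <- make_dispatcher(fast_stub, upstream_stub)

in_scope_result     <- dispatch(list(a = c("1", "2"), b = "3"))
out_of_scope_result <- dispatch(Entrez ~ group)  # formula input: not a named list

print(c(in_scope_result, out_of_scope_result))
```

The caller sees one function with one signature either way, which is what lets an unsupported call behave exactly as it would without the patch.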

Detailed speedup table (10 runs)
Dataset Tier Platform Threads Baseline Optimized Speedup Memory Concordance Pass
gastrulation_pijuansala ood_large Windows 8 10.75 min 22.97 s 27.6× 3.5 → 3.6 GB pass
heart_adult large Windows 8 8.75 min 20.35 s 25.6× 3.1 → 3.1 GB pass
pbmc200k_glaucoma medium Windows 8 8.11 min 22.11 s 20.3× 3.0 → 3.1 GB pass
pbmc68k small Windows 8 4.42 min 21.60 s 13.1× 3.1 → 3.1 GB pass
tms_ss2 ood_xlarge Windows 4 37.64 min 24.88 s 90.8× 3.0 → 3.6 GB pass
gastrulation_pijuansala ood_large macOS 4 7.73 min 35.24 s 13.1× 4.5 → 2.6 GB pass
heart_adult large macOS 4 5.75 min 18.43 s 18.7× 4.2 → 2.4 GB pass
pbmc200k_glaucoma medium macOS 4 3.94 min 17.79 s 13.4× 4.1 → 2.4 GB pass
pbmc68k small macOS 1 2.81 min 13.46 s 12.6× 3.9 → 2.4 GB pass
tms_ss2 ood_xlarge macOS 4 24.10 min 26.05 s 54.8× 4.8 → 2.8 GB pass
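The headline figures can be checked against this table with a few lines of R. The sketch below copies the Speedup column as reported rather than recomputing it from the rounded runtimes, then derives the best and median speedup. The variable names are illustrative only.

```r
# Speedup column of the table above, in row order (5 Windows, then 5 macOS).
runs <- data.frame(
  dataset  = rep(c("gastrulation_pijuansala", "heart_adult",
                   "pbmc200k_glaucoma", "pbmc68k", "tms_ss2"), times = 2),
  platform = rep(c("Windows", "macOS"), each = 5),
  speedup  = c(27.6, 25.6, 20.3, 13.1, 90.8,
               13.1, 18.7, 13.4, 12.6, 54.8),
  stringsAsFactors = FALSE
)

best_speedup   <- max(runs$speedup)      # headline "Best speedup"
median_speedup <- median(runs$speedup)   # headline "Median speedup"
best_run       <- runs[which.max(runs$speedup), ]

# For the best run, the runtime ratio can be recomputed directly:
# 37.64 min baseline against 24.88 s optimized.
best_ratio <- (37.64 * 60) / 24.88

# Median speedup per platform.
per_platform <- tapply(runs$speedup, runs$platform, median)

print(best_run)
print(c(best = best_speedup, median = median_speedup))
print(round(best_ratio, 1))
print(per_platform)
```

With an even number of runs, median() averages the two middle values, here 18.7 and 20.3.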

Frequently asked questions

Speeding up clusterProfiler
Why is clusterProfiler slow?

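compareCluster is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work. The AutoZyme patch recovers it by caching GO annotation data, vectorizing the hypergeometric test (phyper), and running the independent per-cluster enrichment calls across cores where the platform allows. In the best benchmarked run (tms_ss2, Windows, 4 threads) the original takes 37.64 min and the AutoZyme path takes 24.88 s (90.8× faster).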

How do I make clusterProfiler faster?

Install AutoZyme and activate the clusterProfiler patch, then keep using clusterProfiler exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 90.8× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the clusterProfiler output?

No. The accelerated path returns bit-for-bit identical results to the original clusterProfiler implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.

How do I install the clusterProfiler speedup?

In R: install the autozyme package, then run library(autozyme) and autozyme::activate("clusterprofiler"). The patch applies automatically the next time you call clusterProfiler::compareCluster.
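A minimal usage sketch, assuming autozyme, clusterProfiler, and org.Hs.eg.db are already installed. The activation call is the one quoted above and the compareCluster arguments follow the supported scope. The gene clusters are small placeholders of human Entrez IDs, so substitute your own, and the guard simply skips the run when a package is missing.

```r
# Placeholder input: a named list of character Entrez gene-ID vectors.
gene_clusters <- list(
  cluster_a = c("7157", "672", "472", "11200"),  # TP53, BRCA1, ATM, CHEK2
  cluster_b = c("1956", "3845", "673", "5594")   # EGFR, KRAS, BRAF, MAPK1
)

# Only run the enrichment when every required package is available.
needed   <- c("autozyme", "clusterProfiler", "org.Hs.eg.db")
have_all <- all(vapply(needed, requireNamespace, logical(1), quietly = TRUE))

if (have_all) {
  library(autozyme)
  autozyme::activate("clusterprofiler")  # activation call as given above

  # Same call as without the patch; nothing AutoZyme-specific appears here.
  res <- clusterProfiler::compareCluster(
    geneClusters = gene_clusters,
    fun          = "enrichGO",
    OrgDb        = org.Hs.eg.db::org.Hs.eg.db,
    keyType      = "ENTREZID",
    ont          = "ALL",
    pvalueCutoff = 0.05
  )
  print(head(as.data.frame(res)))
}
```

Real clusters are usually far larger than these placeholders. Tiny lists may yield few enriched terms, or none.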