Speed up inferCNV (HMM)

inferCNV (HMM) is one of the slower steps in many single-cell genomics workflows. AutoZyme ships a verified, drop-in patch that runs up to 56.9× faster on the Windows benchmarks (up to 76.6× on macOS), returning bit-for-bit identical results on the benchmarked configurations with no change to how you call it.

Best speedup: 56.9× (gbm_neftel_large, Windows)
Median speedup: 43.6× (all 10 runs)
Output equivalence: bit-exact
Best runtime: baseline 9.16 min, optimized 9.66 s
Datasets: 5
Pass rate: 10/10

Benchmark charts

Speedup distribution (log scale): one point per finalized dataset/thread run on Windows, covering gbm_neftel_large, hnscc_puram, brca_wu, melanoma_tirosh, and melanoma_tirosh_tiny.

Thread sweep: no finalized multi-thread sweep is available for Windows.

Memory: baseline vs optimized peak memory on Windows, 1 thread.
brca_wu (ood_xlarge): 27 GB → 7.5 GB, optimized / baseline 0.28×, 50.5× speedup
hnscc_puram (ood_large): 4.1 GB → 2.8 GB, optimized / baseline 0.68×, 55.1× speedup
gbm_neftel_large (large): 3.6 GB → 3.1 GB, optimized / baseline 0.85×, 56.9× speedup
melanoma_tirosh (medium): 2.9 GB → 2.4 GB, optimized / baseline 0.83×, 50.3× speedup
melanoma_tirosh_tiny (small): 1.8 GB → 1.2 GB, optimized / baseline 0.65×, 32.6× speedup

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets the HMM path of inferCNV (infercnv::run with HMM=TRUE). The benchmarked result passes the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: CNV, copy number, copy number variation, copy number alteration.

Supported scope

Correctly accelerates infercnv::run with HMM=TRUE, analysis_mode="samples", HMM_type="i6" (default), single-threaded (num_threads=1), window_length=101, dense or coercible-to-dense expression with no NAs, and tumor-subcluster partition_method="none" (what analysis_mode="samples" implies).

Under HMM=TRUE the streamlined fast_run body is deliberately not used (its guard requires isFALSE(args$HMM)). Instead, a gc-suppressed clone of upstream run() executes with 19 monkey-patched internals (smooth_window/by_chromosome, normalize, log2xplus1/invert_log2, subtract_ref+threshold fusion, center, Viterbi, state-consensus, cell_prob/cnv_prob, hspike trend, define_cnv_gene_regions, run_gibb_sampling, inferCNVBayesNet saveRDS-shim, define_signif_tumor_subclusters).

Verified bit-exact (max_abs_diff=0, pearson=1, hmm_state_agreement=1) across 5 datasets/tiers on macOS and Windows, at fixed set.seed(1234) only. Most matrix kernels are genuine mathematical equivalents; the HMM Gibbs step is an approximation that happened to reproduce identical discrete state calls on the benchmarked data.
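The three concordance figures quoted above (max_abs_diff, pearson, hmm_state_agreement) can be illustrated with a small check. This is a minimal sketch, not AutoZyme's actual gate: the function names, the flat-list inputs, and the tolerance on the Pearson comparison are assumptions made for illustration.

```python
import math

def concordance_metrics(baseline, optimized, baseline_states, optimized_states):
    """Compare two runs: expression values (floats) and HMM state calls (ints).

    Hypothetical helper; inputs are flat lists of equal length and the
    expression values are assumed not to be constant.
    """
    if len(baseline) != len(optimized) or len(baseline_states) != len(optimized_states):
        raise ValueError("baseline and optimized outputs must have the same length")

    # Largest element-wise deviation; 0 means every value compares equal.
    max_abs_diff = max(abs(b - o) for b, o in zip(baseline, optimized))

    # Pearson correlation computed from centered sums.
    n = len(baseline)
    mean_b = sum(baseline) / n
    mean_o = sum(optimized) / n
    cov = sum((b - mean_b) * (o - mean_o) for b, o in zip(baseline, optimized))
    var_b = sum((b - mean_b) * (b - mean_b) for b in baseline)
    var_o = sum((o - mean_o) * (o - mean_o) for o in optimized)
    pearson = cov / math.sqrt(var_b * var_o)

    # Fraction of positions where both runs call the same discrete HMM state.
    same = sum(1 for a, b in zip(baseline_states, optimized_states) if a == b)
    hmm_state_agreement = same / len(baseline_states)

    return {
        "max_abs_diff": max_abs_diff,
        "pearson": pearson,
        "hmm_state_agreement": hmm_state_agreement,
    }

def passes_gate(metrics):
    """Bit-exact gate: zero difference, correlation of 1, full state agreement."""
    return (
        metrics["max_abs_diff"] == 0.0
        # Tolerance guards against floating-point rounding in the correlation itself.
        and math.isclose(metrics["pearson"], 1.0, rel_tol=0.0, abs_tol=1e-12)
        and metrics["hmm_state_agreement"] == 1.0
    )

if __name__ == "__main__":
    values = [0.98, 1.02, 1.10, 0.91]
    states = [3, 3, 4, 2]
    print(passes_gate(concordance_metrics(values, values, states, states)))
```

Because the verification was run at one fixed seed, a check of this kind is most informative when repeated on your own data and seed rather than taken as a general guarantee.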

Out-of-scope behavior

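Silent, possibly wrong: calls outside the supported scope are not rejected and may return incorrect results without an error or warning.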
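Since out-of-scope calls are not flagged, one option is to check your settings against the supported scope before relying on the accelerated path. The sketch below is a hypothetical pre-flight helper, not part of AutoZyme or inferCNV. The dictionary keys mirror the argument names listed under "Supported scope", with `tumor_subcluster_partition_method` being an assumed spelling of the partition-method argument. It does not inspect the expression matrix, so the "dense, no NAs" requirement still needs a separate check.

```python
# Settings listed in the supported scope; anything else is treated as out of scope.
SUPPORTED_SCOPE = {
    "HMM": True,
    "analysis_mode": "samples",
    "HMM_type": "i6",
    "num_threads": 1,
    "window_length": 101,
    "tumor_subcluster_partition_method": "none",  # assumed argument name
}

def out_of_scope(args):
    """Return the settings in `args` that differ from the supported scope.

    A missing key counts as a mismatch (reported as None), so every
    setting has to be stated explicitly rather than left to a default.
    """
    return {
        key: args.get(key)
        for key, wanted in SUPPORTED_SCOPE.items()
        if args.get(key) != wanted
    }

if __name__ == "__main__":
    planned = dict(SUPPORTED_SCOPE, num_threads=4)
    problems = out_of_scope(planned)
    if problems:
        print("Outside the verified scope:", problems)
```

Treating missing keys as mismatches is a deliberately strict choice: it avoids guessing at defaults that the scope statement does not spell out.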

Detailed speedup table (10 runs)
Dataset | Tier | Platform | Threads | Baseline | Optimized | Speedup | Memory (baseline → optimized) | Concordance / Pass
brca_wu | ood_xlarge | Windows | 1 | 83.53 min | 1.65 min | 50.5× | 27.1 → 7.5 GB | pass
gbm_neftel_large | large | Windows | 1 | 9.16 min | 9.66 s | 56.9× | 3.6 → 3.1 GB | pass
hnscc_puram | ood_large | Windows | 1 | 38.09 min | 41.51 s | 55.1× | 4.1 → 2.8 GB | pass
melanoma_tirosh | medium | Windows | 1 | 12.11 min | 14.44 s | 50.3× | 2.9 → 2.4 GB | pass
melanoma_tirosh_tiny | small | Windows | 1 | 3.12 min | 5.74 s | 32.6× | 1.8 → 1.2 GB | pass
brca_wu | ood_xlarge | macOS | 1 | 31.32 min | 1.28 min | 24.5× | 19.4 → 7.6 GB | pass
gbm_neftel_large | large | macOS | 1 | 6.74 min | 5.28 s | 76.6× | 7.5 → 5.9 GB | pass
hnscc_puram | ood_large | macOS | 1 | 11.84 min | 28.49 s | 24.9× | 8.5 → 4.7 GB | pass
melanoma_tirosh | medium | macOS | 1 | 4.42 min | 8.47 s | 31.3× | 5.8 → 4.1 GB | pass
melanoma_tirosh_tiny | small | macOS | 1 | 1.81 min | 2.95 s | 36.9× | 2.9 → 1.7 GB | pass
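The summary figures can be cross-checked against the table by recomputing each speedup as baseline time divided by optimized time. The sketch below transcribes the ten rows; the variable and function names are illustrative. Because the published times are rounded, a recomputed value can differ from the listed speedup in the last digit (brca_wu on Windows comes out at about 50.6× rather than 50.5×).

```python
from statistics import median

# (dataset, platform, baseline_seconds, optimized_seconds), transcribed from the table.
RUNS = [
    ("brca_wu", "Windows", 83.53 * 60, 1.65 * 60),
    ("gbm_neftel_large", "Windows", 9.16 * 60, 9.66),
    ("hnscc_puram", "Windows", 38.09 * 60, 41.51),
    ("melanoma_tirosh", "Windows", 12.11 * 60, 14.44),
    ("melanoma_tirosh_tiny", "Windows", 3.12 * 60, 5.74),
    ("brca_wu", "macOS", 31.32 * 60, 1.28 * 60),
    ("gbm_neftel_large", "macOS", 6.74 * 60, 5.28),
    ("hnscc_puram", "macOS", 11.84 * 60, 28.49),
    ("melanoma_tirosh", "macOS", 4.42 * 60, 8.47),
    ("melanoma_tirosh_tiny", "macOS", 1.81 * 60, 2.95),
]

def speedup(baseline_s, optimized_s):
    """Speedup factor: how many times faster the optimized run is."""
    return baseline_s / optimized_s

speedups = {(d, p): speedup(b, o) for d, p, b, o in RUNS}

best_windows = max(v for (_, p), v in speedups.items() if p == "Windows")
best_overall = max(speedups.values())
median_all = median(speedups.values())

if __name__ == "__main__":
    for (dataset, platform), value in sorted(speedups.items(), key=lambda kv: kv[1]):
        print(f"{dataset:22s} {platform:8s} {value:5.1f}x")
    print(f"best (Windows): {best_windows:.1f}x")
    print(f"best (overall): {best_overall:.1f}x")
    print(f"median (all runs): {median_all:.1f}x")
```

Recomputed this way, the median over all ten runs rounds to 43.6×, the best Windows run rounds to 56.9×, and the largest speedup overall is the macOS gbm_neftel_large run at about 76.6×.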

Frequently asked questions

Speeding up inferCNV (HMM)
Why is inferCNV (HMM) slow?

inferCNV (HMM) is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work. On the gbm_neftel_large benchmark (Windows, 1 thread), the original takes 9.16 min where the AutoZyme path takes 9.66 s (56.9× faster).

How do I make inferCNV (HMM) faster?

Install AutoZyme and activate the inferCNV (HMM) patch, then keep calling inferCNV (HMM) exactly as before. Within the supported scope, AutoZyme transparently substitutes the faster, output-validated path (up to 56.9× faster on the Windows benchmark datasets), with no pipeline or API changes.

Does the AutoZyme speedup change the inferCNV (HMM) output?

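Not on the benchmarked configurations. Within the supported scope and at fixed set.seed(1234), the accelerated path returned bit-for-bit identical results to the original inferCNV (HMM) implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset. The verification covers that seed only, and the HMM Gibbs step is an approximation, so results at other seeds or outside the supported scope are not covered by this check.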

How do I install the inferCNV (HMM) speedup?

In Python: pip install autozyme, then import autozyme and autozyme.activate("infercnvhmm"). The patch applies automatically the next time you call inferCNV (HMM).