Speed up ProDy: up to 180.7× faster, identical output

Benchmark charts

Speedup distribution (log scale; each dot is one finalized dataset/thread run on Windows): 7b0u 180.7×, 6tlj 108.8×, 1aon 81.5×, 5gar 45.3×, 1oel 34.7×.

Thread sweep (speedup across finalized thread counts on Windows): no finalized multi-thread sweep for this platform.

Memory: baseline vs optimized peak memory on Windows.

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets the ANM code in ProDy (prody.dynamics.anm). The benchmarked result passes the declared scientific output gate while reducing CPU runtime on the listed datasets.


Supported scope


The patch replaces two functions, prody.dynamics.anm.ANMBase.buildHessian and prody.dynamics.anm.solveEig; calcANM and calcModes route through them.

buildHessian is general and mathematically exact. It builds the Hessian in float64 unconditionally and supports AtomGroup/Atomic inputs (via _getCoords/getCoords), raw numpy coordinate arrays (via checkCoords), any cutoff > 0 (validated by checkENMParameters), and both scalar gamma and callable Gamma objects. A callable is invoked as gamma(dist2, i, j), matching upstream exactly, including the squared-distance argument. It also restores a real sparse CSR Kirchhoff matrix and attaches the build coordinates to the sparse Hessian instance, so no global state is involved.

solveEig has a deterministic eigsh shift-invert path (sigma=-1e-8, which=LM, tol=1e-10) that handles any sparse M with a finite integer n_modes < dof, for both zeros=False and zeros=True. An internal guard defers to upstream if final_n_modes exceeds the available values.

The LOBPCG accelerated path (the fast path actually measured) is taken only for the standard ANM configuration: M sparse, reverse=False, integer n_modes < dof, zeros=False, expct_n_zeros==6, build coords attached, coords.shape[0]*3==dof, and coords.shape[0] < 12000. Its result is accepted only after a residual-norm gate (max scaled residual <= 1e-3); otherwise it falls through to the deterministic eigsh path.

Benchmark tiers 1OEL/5GAR/1AON/6TLJ (3668–9200 Cα) exercise LOBPCG. The 7B0U OOD tier (13320 Cα) is at or above the 12000 limit, so it uses the eigsh shift-invert path, not LOBPCG.
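To make the scope description more concrete, here is a minimal sketch of the two steps on a toy system. It is not the AutoZyme or ProDy code: build_anm_hessian, lowest_modes, max_scaled_residual and inverse_square_gamma are hypothetical names, the 4 x 4 x 4 lattice is an invented test structure, and the residual metric is only one plausible reading of "max scaled residual". The eigsh settings (sigma=-1e-8, which="LM", tol=1e-10), the gamma(dist2, i, j) calling convention and the 1e-3 gate value are the ones quoted above.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh


def build_anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Assemble a float64 ANM Hessian (3N x 3N) from an (N, 3) coordinate array.

    gamma may be a scalar or a callable invoked as gamma(dist2, i, j),
    where dist2 is the squared distance between atoms i and j.
    A dense buffer is used for brevity; this is only suitable for toy sizes.
    """
    coords = np.asarray(coords, dtype=np.float64)
    n_atoms = coords.shape[0]
    dense = np.zeros((3 * n_atoms, 3 * n_atoms), dtype=np.float64)
    cutoff2 = cutoff * cutoff
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            delta = coords[j] - coords[i]
            dist2 = float(delta @ delta)
            if dist2 > cutoff2:
                continue
            g = gamma(dist2, i, j) if callable(gamma) else gamma
            block = -g * np.outer(delta, delta) / dist2
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            dense[si, sj] = block
            dense[sj, si] = block
            dense[si, si] -= block  # diagonal blocks balance the off-diagonal ones
            dense[sj, sj] -= block
    return sp.csr_matrix(dense)


def lowest_modes(hessian, k):
    """Shift-invert Lanczos just below zero, returning eigenpairs in ascending order."""
    vals, vecs = eigsh(hessian, k=k, sigma=-1e-8, which="LM", tol=1e-10)
    order = np.argsort(vals)
    return vals[order], vecs[:, order]


def max_scaled_residual(hessian, vals, vecs):
    """Assumed gate metric: largest ||H v - lambda v|| divided by the largest |lambda|."""
    residual = hessian @ vecs - vecs * vals
    return np.linalg.norm(residual, axis=0).max() / np.abs(vals).max()


def inverse_square_gamma(dist2, i, j):
    """Example callable gamma: spring constant decays with squared distance."""
    return 1.0 / dist2


# Toy structure: a 4 x 4 x 4 cubic lattice (64 pseudo-atoms, 192 degrees of freedom).
grid = np.arange(4, dtype=np.float64)
coords = np.array([(x, y, z) for x in grid for y in grid for z in grid])

hessian = build_anm_hessian(coords, cutoff=1.8, gamma=inverse_square_gamma)
vals, vecs = lowest_modes(hessian, k=12)
n_zero_modes = int(np.sum(np.abs(vals) < 1e-6))
residual = max_scaled_residual(hessian, vals, vecs)
print("near-zero modes:", n_zero_modes)
print("passes 1e-3 residual gate:", bool(residual <= 1e-3))
```

A rigid three-dimensional network should show six near-zero modes (three translations, three rotations), which is what the expct_n_zeros==6 condition refers to. The tiny negative shift keeps the shifted matrix positive definite, so it can be factorized even though the Hessian itself is singular.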

Out-of-scope behavior

Inputs outside the supported scope silently fall back to the upstream ProDy implementation.
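The routing between the LOBPCG path, the eigsh path and upstream can be sketched as a small decision function. This is an illustration built only from the conditions listed in the supported scope, not the actual dispatch code: SolveRequest and choose_path are hypothetical names, and the sketch assumes that reverse does not affect eigsh eligibility, which the scope text does not state either way.

```python
from dataclasses import dataclass
from typing import Optional

LOBPCG_MAX_ATOMS = 12000  # LOBPCG is described as used only below this atom count


@dataclass
class SolveRequest:
    """Hypothetical summary of one solveEig call."""
    is_sparse: bool
    dof: int
    n_modes: object                 # may be an int, None, or something else
    n_atoms: Optional[int] = None   # None when no build coords are attached
    zeros: bool = False
    reverse: bool = False
    expct_n_zeros: int = 6


def choose_path(req: SolveRequest) -> str:
    """Return 'lobpcg', 'eigsh' or 'upstream' for a request."""
    n_modes_ok = (
        isinstance(req.n_modes, int)
        and not isinstance(req.n_modes, bool)
        and req.n_modes < req.dof
    )
    # Anything that is not a sparse matrix with an integer n_modes < dof
    # is out of scope and goes to the original implementation.
    if not (req.is_sparse and n_modes_ok):
        return "upstream"
    lobpcg_ok = (
        not req.reverse
        and not req.zeros
        and req.expct_n_zeros == 6
        and req.n_atoms is not None          # build coords attached
        and req.n_atoms * 3 == req.dof
        and req.n_atoms < LOBPCG_MAX_ATOMS
    )
    return "lobpcg" if lobpcg_ok else "eigsh"


# Smallest quoted tier size (3668 C-alpha) and the 7B0U tier (13320 C-alpha).
small = SolveRequest(is_sparse=True, dof=3668 * 3, n_modes=20, n_atoms=3668)
xlarge = SolveRequest(is_sparse=True, dof=13320 * 3, n_modes=20, n_atoms=13320)
dense = SolveRequest(is_sparse=False, dof=300, n_modes=20, n_atoms=100)

for name, req in [("small", small), ("xlarge", xlarge), ("dense", dense)]:
    print(name, "->", choose_path(req))
```

In the real patch a LOBPCG result that fails the residual gate would still fall through to eigsh; that second stage is left out here to keep the sketch short.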

Detailed speedup table (10 runs)

Dataset	Tier	Platform	Threads	Baseline	Optimized	Speedup	Memory	Concordance	Pass
`1aon`	large	Windows	1	13.30 min	9.79 s	81.5×	8.8 → 0.4 GB	—	pass
`1oel`	small	Windows	1	1.27 min	2.20 s	34.7×	1.9 → 0.2 GB	—	pass
`5gar`	medium	Windows	1	5.21 min	6.89 s	45.3×	4.8 → 0.3 GB	—	pass
`6tlj`	ood_large	Windows	1	19.87 min	10.96 s	108.8×	11.6 → 0.4 GB	—	pass
`7b0u`	ood_xlarge	Windows	1	60.02 min	19.94 s	180.7×	24.0 → 1.4 GB	—	pass
`1aon`	large	macOS	1	9.46 min	6.37 s	89.1×	8.9 → 1.0 GB	—	pass
`1oel`	small	macOS	1	59.37 s	1.62 s	36.6×	2.1 → 0.5 GB	—	pass
`5gar`	medium	macOS	1	1.86 min	4.52 s	24.6×	5.2 → 0.8 GB	—	pass
`6tlj`	ood_large	macOS	1	6.37 min	7.50 s	51.0×	8.5 → 1.3 GB	—	pass
`7b0u`	ood_xlarge	macOS	1	19.67 min	25.86 s	45.6×	20.4 → 1.7 GB	—	pass
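The Speedup column is the baseline runtime divided by the optimized runtime. The short check below recomputes it from the rounded values shown in the table; ROWS is transcribed from the table, while speedup, mismatches and the 1% tolerance are choices made for this sketch. Small differences from the reported figures are expected because the displayed times are already rounded.

```python
# (dataset, platform, baseline seconds, optimized seconds, reported speedup)
ROWS = [
    ("1aon", "Windows", 13.30 * 60, 9.79, 81.5),
    ("1oel", "Windows", 1.27 * 60, 2.20, 34.7),
    ("5gar", "Windows", 5.21 * 60, 6.89, 45.3),
    ("6tlj", "Windows", 19.87 * 60, 10.96, 108.8),
    ("7b0u", "Windows", 60.02 * 60, 19.94, 180.7),
    ("1aon", "macOS", 9.46 * 60, 6.37, 89.1),
    ("1oel", "macOS", 59.37, 1.62, 36.6),
    ("5gar", "macOS", 1.86 * 60, 4.52, 24.6),
    ("6tlj", "macOS", 6.37 * 60, 7.50, 51.0),
    ("7b0u", "macOS", 19.67 * 60, 25.86, 45.6),
]


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Speedup factor: how many times shorter the optimized run is."""
    return baseline_s / optimized_s


def mismatches(rows, rel_tol=0.01):
    """Return rows whose recomputed speedup is off by more than rel_tol."""
    bad = []
    for dataset, platform, base_s, opt_s, reported in rows:
        computed = speedup(base_s, opt_s)
        if abs(computed - reported) > rel_tol * reported:
            bad.append((dataset, platform, round(computed, 1), reported))
    return bad


for dataset, platform, base_s, opt_s, reported in ROWS:
    print(f"{dataset:5s} {platform:8s} {speedup(base_s, opt_s):7.1f}x (reported {reported}x)")
print("rows outside 1% tolerance:", mismatches(ROWS))
```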

Frequently asked questions

Speeding up ProDy

Why is ProDy slow?

ProDy is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work. On the largest benchmark dataset (7b0u, Windows, 1 thread) the original takes 60.02 min, whereas the AutoZyme path takes 19.94 s (180.7× faster).

How do I make ProDy faster?

Install AutoZyme and activate the ProDy patch, then keep using ProDy exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 180.7× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the ProDy output?

No. The accelerated path returns bit-for-bit identical results to the original ProDy implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.

How do I install the ProDy speedup?

From a shell, run pip install autozyme. Then, in Python, import autozyme and call autozyme.activate("prody"). The patch applies automatically the next time you call ProDy.
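Put together, a session might look like the sketch below. Only autozyme.activate("prody") is taken from the text above; activate_prody_speedup and run_anm are hypothetical helper names, and the cutoff, gamma and n_modes values are example settings chosen to match the standard ANM configuration described in the supported scope. The helper checks that both packages are installed before activating, so the snippet does nothing harmful when they are missing.

```python
# Shell step first (not Python):  pip install autozyme
import importlib.util


def activate_prody_speedup() -> bool:
    """Activate the AutoZyme ProDy patch if both packages are installed."""
    for package in ("autozyme", "prody"):
        if importlib.util.find_spec(package) is None:
            return False
    import autozyme
    autozyme.activate("prody")
    return True


def run_anm(pdb_id: str, n_modes: int = 20):
    """Ordinary ProDy ANM workflow; the calling code is the same with or without the patch."""
    from prody import ANM, parsePDB

    calphas = parsePDB(pdb_id, subset="ca")  # fetches the structure if not cached
    anm = ANM(pdb_id)
    anm.buildHessian(calphas, cutoff=15.0, gamma=1.0)
    anm.calcModes(n_modes=n_modes)           # zeros=False by default
    return anm.getEigvals()


patched = activate_prody_speedup()
print("AutoZyme patch active:", patched)
# Example call (needs ProDy and network access), left commented out:
# eigenvalues = run_anm("1oel")
```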