Benchmark charts
Speedup distribution: each dot is one finalized dataset/thread run on Windows.
Thread sweep: speedup across finalized thread counts on Windows.
Memory: baseline vs optimized peak memory on Windows.
What is accelerated
This task targets the ANM (anisotropic network model) routines in ProDy. The benchmarked result
preserves the declared scientific output gate while reducing CPU runtime on the listed datasets.
Also searched as: protein dynamics, normal mode analysis, NMA, elastic network model.
Supported scope
The patch replaces two functions (prody.dynamics.anm.ANMBase.buildHessian and prody.dynamics.anm.solveEig); calcANM/calcModes route through them.
buildHessian is general and mathematically exact: it builds the Hessian in float64 unconditionally, supports AtomGroup/Atomic inputs (via _getCoords/getCoords) and raw numpy coord arrays (via checkCoords), any cutoff>0 (validated by checkENMParameters), and both scalar gamma and callable Gamma objects (callable invoked as gamma(dist2,i,j), matching upstream exactly, incl. squared-distance arg). It also restores a real sparse CSR Kirchhoff matrix and attaches the build coords to the sparse Hessian instance (no global state).
solveEig has a deterministic eigsh shift-invert path (eigsh sigma=-1e-8, which=LM, tol=1e-10) that handles any sparse M with a finite integer n_modes < dof for both zeros=False and zeros=True, with an internal guard that defers to upstream if final_n_modes exceeds available values.
The LOBPCG accelerated path (the fast path actually measured) is taken only for the standard ANM config: M sparse, reverse=False, integer n_modes < dof, zeros=False, expct_n_zeros==6, build coords attached, coords.shape[0]*3==dof, and coords.shape[0] < 12000; its result is accepted only after a residual-norm gate (max scaled residual <= 1e-3) and otherwise falls through to the deterministic eigsh path.
Benchmark tiers 1OEL/5GAR/1aon/6TLJ (3668–9200 Cα) exercise LOBPCG; the 7B0U OOD tier (13320 Cα) is >=12000 so it uses the eigsh shift-invert path, not LOBPCG.
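The path selection described above can be sketched as a small decision function. This is an illustrative reading of the stated conditions, not the patch's actual code: `SolveRequest`, `initial_path`, and `final_path` are hypothetical names, the `n_modes > 0` check is an assumption, and the `final_n_modes` guard is omitted because its details are not given.

```python
from dataclasses import dataclass
from typing import Optional

LOBPCG_MAX_NODES = 12000      # from the text: coords.shape[0] < 12000
LOBPCG_RESIDUAL_GATE = 1e-3   # from the text: max scaled residual <= 1e-3


@dataclass
class SolveRequest:
    """Hypothetical summary of one solveEig call (not a ProDy type)."""
    is_sparse: bool
    dof: int
    n_modes: Optional[int]
    zeros: bool = False
    reverse: bool = False
    expct_n_zeros: int = 6
    n_build_coords: Optional[int] = None  # None means no build coords attached


def has_valid_n_modes(req: SolveRequest) -> bool:
    # "finite integer n_modes < dof"; requiring n_modes > 0 is an assumption.
    n = req.n_modes
    return isinstance(n, int) and not isinstance(n, bool) and 0 < n < req.dof


def initial_path(req: SolveRequest) -> str:
    """Return the path attempted first: 'lobpcg', 'eigsh', or 'upstream'."""
    if not (req.is_sparse and has_valid_n_modes(req)):
        return "upstream"  # out of scope: silent fallback to upstream
    standard_anm = (
        not req.reverse
        and not req.zeros
        and req.expct_n_zeros == 6
        and req.n_build_coords is not None
        and req.n_build_coords * 3 == req.dof
        and req.n_build_coords < LOBPCG_MAX_NODES
    )
    return "lobpcg" if standard_anm else "eigsh"


def final_path(req: SolveRequest, max_scaled_residual: float) -> str:
    """Apply the residual gate: a rejected LOBPCG result falls through to eigsh."""
    path = initial_path(req)
    if path == "lobpcg" and max_scaled_residual > LOBPCG_RESIDUAL_GATE:
        return "eigsh"
    return path


# Sizes taken from the text; n_modes=20 is an arbitrary illustrative value.
small = SolveRequest(is_sparse=True, dof=3668 * 3, n_modes=20, n_build_coords=3668)
xlarge = SolveRequest(is_sparse=True, dof=13320 * 3, n_modes=20, n_build_coords=13320)
print(initial_path(small), initial_path(xlarge))
```

Under these assumptions, a 3668-node system is routed to LOBPCG first, while the 13320-node 7B0U system goes straight to the eigsh shift-invert path because it is not below the 12000-node limit.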
Out-of-scope behavior
silent fallback to upstream
Detailed speedup table (10 runs)
| Dataset | Tier | Platform | Threads | Baseline | Optimized | Speedup | Memory | Concordance | Pass |
|---|---|---|---|---|---|---|---|---|---|
| 1aon | large | Windows | 1 | 13.30 min | 9.79 s | 81.5× | 8.8 → 0.4 GB | — | pass |
| 1oel | small | Windows | 1 | 1.27 min | 2.20 s | 34.7× | 1.9 → 0.2 GB | — | pass |
| 5gar | medium | Windows | 1 | 5.21 min | 6.89 s | 45.3× | 4.8 → 0.3 GB | — | pass |
| 6tlj | ood_large | Windows | 1 | 19.87 min | 10.96 s | 108.8× | 11.6 → 0.4 GB | — | pass |
| 7b0u | ood_xlarge | Windows | 1 | 60.02 min | 19.94 s | 180.7× | 24.0 → 1.4 GB | — | pass |
| 1aon | large | macOS | 1 | 9.46 min | 6.37 s | 89.1× | 8.9 → 1.0 GB | — | pass |
| 1oel | small | macOS | 1 | 59.37 s | 1.62 s | 36.6× | 2.1 → 0.5 GB | — | pass |
| 5gar | medium | macOS | 1 | 1.86 min | 4.52 s | 24.6× | 5.2 → 0.8 GB | — | pass |
| 6tlj | ood_large | macOS | 1 | 6.37 min | 7.50 s | 51.0× | 8.5 → 1.3 GB | — | pass |
| 7b0u | ood_xlarge | macOS | 1 | 19.67 min | 25.86 s | 45.6× | 20.4 → 1.7 GB | — | pass |
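The Speedup column appears to be baseline time divided by optimized time, with both in the same unit. A minimal sketch of that arithmetic for two Windows rows follows; the helper names are hypothetical, and a recomputed value can differ from the table in the last digit because the table shows rounded times.

```python
SECONDS_PER_MINUTE = 60.0


def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Ratio of baseline to optimized wall time (both in seconds)."""
    if optimized_seconds <= 0:
        raise ValueError("optimized time must be positive")
    return baseline_seconds / optimized_seconds


# (baseline minutes, optimized seconds) copied from the Windows rows of the table.
rows = {
    "1aon": (13.30, 9.79),
    "6tlj": (19.87, 10.96),
}

for name, (baseline_min, optimized_s) in rows.items():
    ratio = speedup(baseline_min * SECONDS_PER_MINUTE, optimized_s)
    print(f"{name}: {ratio:.1f}x")

# Peak memory reduction for 1aon on Windows: 8.8 GB down to 0.4 GB.
memory_ratio = 8.8 / 0.4
print(f"1aon memory: {memory_ratio:.0f}x lower")
```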
Frequently asked questions
Why is ProDy slow?
ANM calculations in ProDy are CPU-bound, and the stock implementation leaves performance on the table in its core numerical work: building the Hessian and solving the eigenproblem. On the largest benchmark dataset (7b0u on Windows, 1 thread), the original takes 60.02 min where the AutoZyme path takes 19.94 s (180.7× faster).
How do I make ProDy faster?
Install AutoZyme and activate the ProDy patch, then keep using ProDy exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 180.7× faster on the benchmark datasets, with no pipeline or API changes.
Does the AutoZyme speedup change the ProDy output?
No. The accelerated path returns bit-for-bit identical results to the original ProDy implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
How do I install the ProDy speedup?
In Python: pip install autozyme, then import autozyme and autozyme.activate("prody"). The patch applies automatically the next time you call ProDy.
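As a sketch of those steps, after running `pip install autozyme` in a shell, activation could be wrapped so that a script still runs when the package is missing. The `autozyme.activate("prody")` call is the one named above; the wrapper function `activate_prody_patch` is a hypothetical helper, not part of AutoZyme.

```python
import importlib.util


def activate_prody_patch() -> bool:
    """Activate the AutoZyme ProDy patch if autozyme is installed.

    Returns True when the patch was activated and False when the package
    is not available, in which case stock ProDy is used unchanged.
    """
    if importlib.util.find_spec("autozyme") is None:
        return False
    import autozyme  # imported lazily so this script runs without it

    autozyme.activate("prody")  # call taken from the install instructions
    return True


patched = activate_prody_patch()
print("AutoZyme ProDy patch active:", patched)
```

Calling this once at the top of a script, before any ProDy work, keeps the rest of the pipeline unchanged whether or not the patch is present.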