Speed up FiPy

FiPy is one of the slower steps in many physics and chemistry workflows. AutoZyme ships a verified, drop-in patch that is up to 34.5× faster on the Windows benchmark runs (up to 62.7× on macOS), returning bit-for-bit identical results with no change to how you call it.

Best speedup: 34.5×
Median speedup: 34.5×
Output equivalence: bit-exact
Best runtime: baseline 32.71 s, optimized 949 ms
Datasets: 5
Pass rate: 9/9

Benchmark charts

[Charts for the Windows platform, 1 thread. Speedup distribution (log scale): each dot is one finalized dataset/thread run, covering heat2d_n80, heat2d_n160, heat2d_aniso_180x360, and heat2d_n240. Thread sweep: no finalized multi-thread sweep for this platform. Memory: baseline vs optimized peak memory, summarized below.]

Peak memory on Windows, optimized / baseline:
heat2d_aniso_180x360 (ood_large): 0.2 GB → 0.2 GB, 0.87×
heat2d_n240 (large): 0.2 GB → 0.2 GB, 0.87×
heat2d_n160 (medium): 0.1 GB → 0.1 GB, 0.85×
heat2d_n80 (small): 0.1 GB → 0.1 GB, 0.95×

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This patch targets FiPy (the Python package fipy). The benchmarked result passes the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: PDE, partial differential equation, finite volume, diffusion equation.

Supported scope

The fast path is correct for the canonical finite-volume (FV) time-stepping pattern it was built for: eqn = TransientTerm() == DiffusionTerm(coeff=const), solved repeatedly with eqn.solve(var=v, dt=dt) on a fixed mesh, with a fixed dt and a constant diffusion coefficient, using the default SciPy LinearLUSolver (solver=None).

Under these conditions the assembled matrix M/dt - L_diff and its LU factor are step-invariant; only the right-hand side (RHS) depends on var.old.value. The patch therefore reuses the cached LU factor and the cached diffusion RHS, computes b = var.old.value*mass_per_dt_scaled + b_diff_scaled, and then solves x = LU.solve(b). The first call for each (eqn, var, dt, state) goes through the upstream path and populates the caches, so numerics are bit-exact on the cold call and agree to floating-point noise (~1e-15) on cache hits.

The cache key (id(self), id(var), float(dt), state_sig) forces a fresh upstream rebuild when dt changes, when the var object is swapped, when attached constraint value/where masks change (including CellVariable.faceConstraints), when explicit boundaryConditions value/faces change, or when a coefficient object is reassigned (term.coeff = new_coeff). state_sig fingerprints these via a content hash or an object id.

Layer 1 (LinearLUSolver._solve_) keys splu by a blake2b hash of the CSR shape/data/indices, so it stays correct even for matrices it has not memoized: on a hash miss it falls back to a fresh splu.
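The factor-once, solve-many idea described above can be illustrated with a small, self-contained sketch. This is not AutoZyme's or FiPy's code: it assumes a 1D uniform grid with Dirichlet values on both ends, uses a pure-Python tridiagonal LU in place of SciPy's splu, and all function names are hypothetical. It shows that the matrix M/dt - L_diff and the boundary RHS are built once, while each step only forms b = old*mass_per_dt + b_diff and back-substitutes.

```python
def build_system(n, dt, D, left, right):
    """Assemble tridiagonal A = M/dt - L_diff and the constant boundary RHS."""
    dx = 1.0 / n
    k = D / dx                      # face conductance between neighbouring cells
    mass_per_dt = dx / dt           # cell volume divided by the time step
    lower = [-k] * (n - 1)
    upper = [-k] * (n - 1)
    diag = [mass_per_dt + 2.0 * k] * n
    # End cells: one interior neighbour plus a Dirichlet face at distance dx/2.
    diag[0] = diag[-1] = mass_per_dt + k + 2.0 * k
    b_diff = [0.0] * n
    b_diff[0] = 2.0 * k * left
    b_diff[-1] = 2.0 * k * right
    return lower, diag, upper, mass_per_dt, b_diff


def lu_factor_tridiag(lower, diag, upper):
    """LU factorization without pivoting (safe here: A is diagonally dominant)."""
    n = len(diag)
    mult = [0.0] * n
    piv = list(diag)
    for i in range(1, n):
        mult[i] = lower[i - 1] / piv[i - 1]
        piv[i] = diag[i] - mult[i] * upper[i - 1]
    return mult, piv, upper


def lu_solve_tridiag(factor, b):
    """Forward and back substitution using a stored factorization."""
    mult, piv, upper = factor
    n = len(piv)
    y = list(b)
    for i in range(1, n):
        y[i] -= mult[i] * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / piv[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - upper[i] * x[i + 1]) / piv[i]
    return x


def run_cached(n, steps, dt, D, left, right):
    """Assemble and factor once; per step only rebuild the RHS and solve."""
    lower, diag, upper, mass_per_dt, b_diff = build_system(n, dt, D, left, right)
    factor = lu_factor_tridiag(lower, diag, upper)
    x = [0.0] * n
    for _ in range(steps):
        b = [old * mass_per_dt + bd for old, bd in zip(x, b_diff)]
        x = lu_solve_tridiag(factor, b)
    return x


def run_refactor_each_step(n, steps, dt, D, left, right):
    """Reference loop that re-assembles and re-factors on every step."""
    x = [0.0] * n
    for _ in range(steps):
        lower, diag, upper, mass_per_dt, b_diff = build_system(n, dt, D, left, right)
        factor = lu_factor_tridiag(lower, diag, upper)
        b = [old * mass_per_dt + bd for old, bd in zip(x, b_diff)]
        x = lu_solve_tridiag(factor, b)
    return x
```

In this toy both loops perform the same floating-point operations in the same order, so their results compare equal; the cached loop simply skips the repeated assembly and factorization. It stops being valid as soon as dt, D, or the boundary values change, which is the situation the cache key above is meant to detect.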

Out-of-scope behavior

Inputs outside the supported scope silently fall back to the upstream FiPy implementation.
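The content-keyed caching and fall-back-on-miss behavior described in the scope notes could look roughly like the sketch below. It is an illustration only: csr_fingerprint, FactorCache, and step_key are hypothetical names, the factorize callback stands in for the real splu call, and hashing the CSR row pointers alongside shape/data/indices is an assumption made here.

```python
import hashlib
from array import array


def csr_fingerprint(shape, data, indices, indptr):
    """Content hash of a CSR matrix (hypothetical helper, not AutoZyme's code)."""
    h = hashlib.blake2b(digest_size=16)
    h.update(repr(tuple(shape)).encode("ascii"))
    h.update(array("d", data).tobytes())      # float64 values
    h.update(array("q", indices).tobytes())   # int64 column indices
    h.update(array("q", indptr).tobytes())    # int64 row pointers (assumed hashed too)
    return h.hexdigest()


class FactorCache:
    """Memoize factorizations by matrix content; a miss triggers a fresh factorization."""

    def __init__(self, factorize):
        self._factorize = factorize   # stand-in for the expensive upstream call
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, shape, data, indices, indptr):
        key = csr_fingerprint(shape, data, indices, indptr)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._factorize(shape, data, indices, indptr)
        return self._store[key]


def step_key(eqn, var, dt, state_sig):
    """Per-step cache key in the shape quoted in the text above."""
    return (id(eqn), id(var), float(dt), state_sig)


# 2x2 matrix [[2, -1], [-1, 2]] in CSR form, with a dummy "factorization".
shape, indices, indptr = (2, 2), [0, 1, 0, 1], [0, 2, 4]
cache = FactorCache(lambda *csr: ("factor-of", csr_fingerprint(*csr)))
first = cache.get(shape, [2.0, -1.0, -1.0, 2.0], indices, indptr)    # miss: factorize
again = cache.get(shape, [2.0, -1.0, -1.0, 2.0], indices, indptr)    # hit: reuse
changed = cache.get(shape, [3.0, -1.0, -1.0, 3.0], indices, indptr)  # miss: new content
```

Keying on content rather than object identity means a matrix that was mutated in place produces a different key and is refactorized, at the cost of hashing the arrays on every call.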

Detailed speedup table (9 runs)

Dataset               Tier        Platform  Threads  Baseline   Optimized  Speedup  Memory (baseline → optimized)  Concordance / Pass
heat2d_aniso_180x360  ood_large   Windows   1        6.18 min   12.82 s    28.9×    0.2 → 0.2 GB                   pass
heat2d_n160           medium      Windows   1        1.98 min   3.94 s     30.1×    0.1 → 0.1 GB                   pass
heat2d_n240           large       Windows   1        5.37 min   11.88 s    27.1×    0.2 → 0.2 GB                   pass
heat2d_n80            small       Windows   1        32.71 s    949 ms     34.5×    0.1 → 0.1 GB                   pass
heat2d_aniso_180x360  ood_large   macOS     1        6.24 min   6.26 s     59.8×    0.9 → 0.3 GB                   pass
heat2d_n160           medium      macOS     1        1.77 min   2.61 s     40.7×    0.4 → 0.1 GB                   pass
heat2d_n240           large       macOS     1        5.38 min   5.26 s     61.4×    0.8 → 0.3 GB                   pass
heat2d_n320_prod      ood_xlarge  macOS     1        21.24 min  20.33 s    62.7×    1.4 → 0.4 GB                   pass
heat2d_n80            small       macOS     1        24.14 s    712 ms     33.9×    0.3 → 0.1 GB                   pass
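As a quick cross-check of how the summary figures relate to the nine runs, the short script below recomputes them from the listed values. It assumes that speedup means baseline time divided by optimized time and that the median is taken over all nine runs; the page does not state either definition explicitly.

```python
from statistics import median

# (dataset, platform, reported speedup) copied from the table of nine runs.
runs = [
    ("heat2d_aniso_180x360", "Windows", 28.9),
    ("heat2d_n160", "Windows", 30.1),
    ("heat2d_n240", "Windows", 27.1),
    ("heat2d_n80", "Windows", 34.5),
    ("heat2d_aniso_180x360", "macOS", 59.8),
    ("heat2d_n160", "macOS", 40.7),
    ("heat2d_n240", "macOS", 61.4),
    ("heat2d_n320_prod", "macOS", 62.7),
    ("heat2d_n80", "macOS", 33.9),
]

# Median over all nine runs.
overall_median = median(speedup for _, _, speedup in runs)

# Best reported speedup per platform.
best_by_platform = {}
for _, platform, speedup in runs:
    best_by_platform[platform] = max(best_by_platform.get(platform, 0.0), speedup)

# Speedup for heat2d_n80 on Windows, from 32.71 s baseline and 949 ms optimized.
n80_windows = 32.71 / 0.949

print(overall_median, best_by_platform, round(n80_windows, 1))
```

Recomputing other rows from the displayed times can differ from the reported speedup in the last digit, because the displayed times are themselves rounded.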

Frequently asked questions

Speeding up FiPy
Why is FiPy slow?

FiPy is CPU-bound. In the time-stepping pattern covered by the patch, the system matrix and its LU factor do not change between steps, so rebuilding them on every solve is avoidable work. On the heat2d_n80 benchmark on Windows, the original takes 32.71 s where the AutoZyme path takes 949 ms (34.5× faster).

How do I make FiPy faster?

Install AutoZyme and activate the fipy patch, then keep using FiPy exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 34.5× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the FiPy output?

No. The accelerated path returns bit-for-bit identical results to the original fipy implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
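The page does not describe how the concordance gate is implemented. As a rough idea of what such a check can measure, the sketch below compares two result vectors both by maximum absolute difference and by raw float64 bytes; the function names are hypothetical and this is not the gate itself.

```python
import struct


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def bit_identical(a, b):
    """True only if both sequences have the same float64 bit patterns."""
    if len(a) != len(b):
        return False
    fmt = f"{len(a)}d"
    return struct.pack(fmt, *a) == struct.pack(fmt, *b)


reference = [0.3, 1.0, 2.5]
same = [0.3, 1.0, 2.5]
noisy = [0.1 + 0.2, 1.0, 2.5]   # 0.1 + 0.2 differs from 0.3 in the last bit
```

The two tests are not interchangeable: a byte comparison distinguishes 0.0 from -0.0, while a maximum absolute difference of 0 does not.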

How do I install the fipy speedup?

Install the package with pip install autozyme. Then, in Python, import autozyme and call autozyme.activate("fipy"). The patch applies automatically the next time you call FiPy.
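Put together, activation followed by the supported time-stepping pattern might look like the sketch below. The autozyme.activate("fipy") call is taken from the instructions above; the helper names, mesh size, boundary values, and step count are illustrative assumptions, and the FiPy part uses the standard Grid2D, CellVariable, TransientTerm, and DiffusionTerm API. The imports are guarded so the file still loads where the packages are not installed.

```python
import importlib.util


def activate_if_available(target="fipy"):
    """Call autozyme.activate(target) if autozyme is installed; return whether it ran."""
    if importlib.util.find_spec("autozyme") is None:
        return False
    import autozyme

    autozyme.activate(target)
    return True


def run_heat2d(n=20, steps=10, dt=1e-3, coeff=1.0):
    """Fixed mesh, fixed dt, constant coefficient, default solver (requires fipy)."""
    from fipy import CellVariable, DiffusionTerm, Grid2D, TransientTerm

    mesh = Grid2D(nx=n, ny=n, dx=1.0 / n, dy=1.0 / n)
    v = CellVariable(mesh=mesh, value=0.0, hasOld=True)
    v.constrain(1.0, where=mesh.facesLeft)    # illustrative boundary values
    v.constrain(0.0, where=mesh.facesRight)

    eqn = TransientTerm() == DiffusionTerm(coeff=coeff)
    for _ in range(steps):
        v.updateOld()                # makes the previous step available as var.old
        eqn.solve(var=v, dt=dt)      # same call with or without the patch
    return v.value.copy()
```

A typical session would call activate_if_available("fipy") once at startup and then run_heat2d() as usual; the solve loop itself does not change.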