Speed up xclim

xclim is one of the slower steps in many earth & atmospheric sciences workflows. AutoZyme ships a verified, drop-in patch that is up to 918.2× faster on the benchmark datasets and returns bit-for-bit identical results there, with no change to how you call it.

Best speedup: 918.2×
Median speedup: 424.9×
Output equivalence: bit-exact
Best runtime: baseline 8.33 min, optimized 544 ms
Datasets: 5
Pass rate: 9/9

Benchmark charts

Speedup distribution (log scale): one dot per finalized dataset/thread run.
Thread sweep: speedup across finalized thread counts. No finalized multi-thread sweep is available for Windows.
Memory: baseline vs. optimized peak memory. On Windows, optimized peak memory is 0.54× to 0.62× of baseline; per-run figures are in the detailed speedup table.

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets the season computation in xclim (the run_length.season / season_length semantics described under Supported scope). The benchmarked result passes the declared scientific output gate while reducing CPU runtime on the listed datasets.

Supported scope

The fast path is a two-phase, per-cell scan compiled with numba. It handles:

op in {">", ">=", "gt", "ge"}
mid_date: a valid "MM-DD" string that exists in the calendar
freq="YS" (annual, year-start)
a DataArray with a "time" dimension on a daily axis. Both noleap and Gregorian/standard calendars work, because year boundaries are derived from times.year diffs and mid offsets from month/day matching.

Within this scope the kernel faithfully reproduces upstream xclim season/season_length semantics:

start = first run of `window` consecutive condition-true days whose run-start index is < mid_date
end = first run of `window` consecutive condition-false days at/after max(start, mid_date)
length = end - start, or (year_end - start) when a start exists but no end is found

Verified against upstream xclim 0.60.0 run_length.season / first_run_before_date / first_run_after_date source: the start upper bound mid+window-1, the run_start<mid acceptance, the search_start=max(beg,mid) end search (beg is the start index), and the size-minus-start fallback all match. Benchmarked tiers hit pct_exact=1.0 / max_abs_diff=0.
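The start/end/length rules above can be sketched in plain Python for a single grid cell. This is a minimal illustration under stated assumptions, not the AutoZyme kernel or xclim's code: the helper names (year_slices, mid_offset, first_run, season_one_year) are hypothetical, the calendar is Python's Gregorian datetime.date rather than noleap, NaN handling is ignored, and returning None when no start is found is a placeholder choice because the text above does not specify that case.

```python
from datetime import date, timedelta


def year_slices(dates):
    """Return (lo, hi) index pairs, one per calendar year, split where the year changes."""
    cuts = [i for i in range(1, len(dates)) if dates[i].year != dates[i - 1].year]
    bounds = [0] + cuts + [len(dates)]
    return list(zip(bounds[:-1], bounds[1:]))


def mid_offset(dates, lo, hi, month, day):
    """Offset within dates[lo:hi] of the day matching MM-DD, or None if absent."""
    for i in range(lo, hi):
        if dates[i].month == month and dates[i].day == day:
            return i - lo
    return None


def first_run(cond, window, lo, hi, value):
    """Start index of the first run of `window` consecutive `value` entries in cond[lo:hi]."""
    run = 0
    for i in range(lo, min(hi, len(cond))):
        run = run + 1 if cond[i] == value else 0
        if run == window:
            return i - window + 1
    return None


def season_one_year(cond, window, mid):
    """Return (start, end, length) for one year of daily booleans."""
    n = len(cond)
    # Phase 1: a run that starts before `mid` must finish before index mid + window - 1.
    start = first_run(cond, window, 0, mid + window - 1, True)
    if start is None:
        return None, None, None  # placeholder: the no-start result is this sketch's assumption
    # Phase 2: look for the first run of condition-false days at/after max(start, mid).
    end = first_run(cond, window, max(start, mid), n, False)
    # Fallback: with a start but no end, the season runs to the end of the year.
    length = (end - start) if end is not None else (n - start)
    return start, end, length


# Two non-leap years of daily dates; the condition is a made-up example series.
days = [date(2001, 1, 1) + timedelta(days=i) for i in range(730)]
lo, hi = year_slices(days)[0]
mid = mid_offset(days, lo, hi, 7, 1)              # "07-01"
cond = [60 <= i < 300 for i in range(hi - lo)]    # hypothetical: condition true on days 60..299
print(season_one_year(cond, window=5, mid=mid))   # (60, 300, 240)
```

Because the start search stops at index mid + window - 1 (exclusive), any run it finds necessarily begins before mid, so the run_start < mid acceptance needs no separate check in this sketch.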

Out-of-scope behavior

Silent, possibly wrong: inputs outside the supported scope are not flagged, and their results may be incorrect.

Detailed speedup table (9 runs)
Dataset                       Tier        Platform  Threads  Baseline  Optimized  Speedup  Memory          Concordance  Pass
gregorian_smooth_550x550x50y  ood_xlarge  Windows   1        5.09 min  462 ms     661.0×   33.8 → 21.0 GB               pass
gregorian_smooth_700x700x30y  ood_large   Windows   1        6.09 min  435 ms     840.9×   37.8 → 20.4 GB               pass
synth_350x350x30y             small       Windows   1        46.27 s   126 ms     367.4×   9.7 → 5.5 GB                 pass
synth_500x500x30y             medium      Windows   1        2.66 min  276 ms     577.3×   19.4 → 10.5 GB               pass
synth_700x700x30y             large       Windows   1        8.33 min  544 ms     918.2×   33.8 → 20.4 GB               pass
gregorian_smooth_700x700x30y  ood_large   macOS     1        5.13 min  825 ms     375.1×   20.0 → 20.5 GB               pass
synth_350x350x30y             small       macOS     1        43.14 s   152 ms     288.1×   7.5 → 5.6 GB                 pass
synth_500x500x30y             medium      macOS     1        1.74 min  246 ms     424.9×   12.2 → 10.7 GB               pass
synth_700x700x30y             large       macOS     1        5.29 min  1.00 s     316.3×   14.3 → 20.4 GB               pass
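
The headline figures can be rechecked from the nine rows above. The short script below is only an arithmetic cross-check: the numbers are copied from the table, and the variable names are illustrative.

```python
from statistics import median

# (dataset, platform, speedup, baseline_gb, optimized_gb), copied from the table above
runs = [
    ("gregorian_smooth_550x550x50y", "Windows", 661.0, 33.8, 21.0),
    ("gregorian_smooth_700x700x30y", "Windows", 840.9, 37.8, 20.4),
    ("synth_350x350x30y", "Windows", 367.4, 9.7, 5.5),
    ("synth_500x500x30y", "Windows", 577.3, 19.4, 10.5),
    ("synth_700x700x30y", "Windows", 918.2, 33.8, 20.4),
    ("gregorian_smooth_700x700x30y", "macOS", 375.1, 20.0, 20.5),
    ("synth_350x350x30y", "macOS", 288.1, 7.5, 5.6),
    ("synth_500x500x30y", "macOS", 424.9, 12.2, 10.7),
    ("synth_700x700x30y", "macOS", 316.3, 14.3, 20.4),
]

speedups = [r[2] for r in runs]
best_speedup = max(speedups)        # best of the 9 runs
median_speedup = median(speedups)   # middle value of the 9 runs

# Optimized / baseline peak memory; values above 1 mean the optimized run used more memory.
memory_ratio = {(r[0], r[1]): round(r[4] / r[3], 2) for r in runs}

print(best_speedup, median_speedup)  # 918.2 424.9
for key, ratio in memory_ratio.items():
    print(key, ratio)
```

The best and median speedups are taken over all nine runs on both platforms. The memory ratios also show that the reduction seen on Windows does not carry over to every macOS run: for synth_700x700x30y on macOS, optimized peak memory is higher than baseline.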

Frequently asked questions

Speeding up xclim
Why is xclim slow?

xclim is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work. On the largest synthetic benchmark dataset (synth_700x700x30y, Windows, 1 thread), the original takes 8.33 min where the AutoZyme path takes 544 ms (918.2× faster).

How do I make xclim faster?

Install AutoZyme and activate the xclim patch, then keep using xclim exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 918.2× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the xclim output?

No. The accelerated path returns bit-for-bit identical results to the original xclim implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
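
To make the two reported metrics concrete, here is a minimal sketch of how pct_exact and max_abs_diff could be computed for a pair of result arrays. It is not the AutoZyme concordance gate, whose implementation is not described here: the function name concordance is hypothetical, inputs are flat sequences of Python floats, and treating any NaN as matching any NaN is an assumption of the sketch.

```python
import math
import struct


def _bits(x):
    """Raw IEEE 754 double bit pattern of a Python float."""
    return struct.pack("<d", x)


def concordance(baseline, optimized):
    """Return (pct_exact, max_abs_diff) for two equal-length float sequences."""
    if len(baseline) != len(optimized):
        raise ValueError("length mismatch")
    if not baseline:
        return 1.0, 0.0
    exact = 0
    max_abs_diff = 0.0
    for a, b in zip(baseline, optimized):
        a_nan, b_nan = math.isnan(a), math.isnan(b)
        if a_nan or b_nan:
            if a_nan and b_nan:
                exact += 1               # assumption: any NaN matches any NaN
            else:
                max_abs_diff = math.inf  # NaN on one side only is a hard mismatch
            continue
        if _bits(a) == _bits(b):
            exact += 1                   # identical bit patterns
        else:
            max_abs_diff = max(max_abs_diff, abs(a - b))
    return exact / len(baseline), max_abs_diff


nan = float("nan")
print(concordance([1.0, 2.5, nan], [1.0, 2.5, nan]))  # (1.0, 0.0)
print(concordance([0.0], [-0.0]))                      # (0.0, 0.0)
```

The second call shows why the two numbers are reported together: 0.0 and -0.0 differ in their bit patterns but have an absolute difference of 0, so max_abs_diff alone would not establish bit-exactness.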

How do I install the xclim speedup?

Run pip install autozyme, then in Python import autozyme and call autozyme.activate("xclim"). The patch applies automatically the next time you call xclim.