Speed up ObsPy

ObsPy's Stream.filter is one of the slower steps in many earth and atmospheric sciences workflows. AutoZyme ships a verified, drop-in patch that runs up to 6.61× faster on the Windows benchmarks (one macOS run in the detailed table reaches 7.75×) and returns output within a strict, verified tolerance, with no change to how you call it.

Best speedup: 6.61×
Median speedup: 6.04×
Output equivalence: within tolerance
Best runtime: 1.33 min baseline → 12.05 s optimized
Datasets: 5
Pass rate: 9/9

Benchmark charts

Speedup distribution (Windows)
Each dot is one finalized dataset/thread run.

Thread sweep (Windows)
Speedup across finalized thread counts.
Dataset Tier Threads Baseline Optimized Speedup Memory
synthetic_60x3d_100hz medium 1 1.33 min 12.05 s 6.61× 29 → 24 GB
synthetic_60x3d_100hz medium 4 58.24 s 11.66 s 4.99× 29 → 27 GB
synthetic_60x3d_100hz medium 8 58.40 s 11.99 s 4.87× 29 → 24 GB
synthetic_60x10d_50hz_repeat15 ood_xlarge 4 11.00 min 2.91 min 3.71× 42 → 32 GB
synthetic_60x10d_50hz_repeat15 ood_xlarge 8 9.32 min 1.67 min 6.17× 49 → 39 GB
synthetic_24x10d_100hz_longtrace ood_large 4 1.25 min 16.42 s 5.03× 39 → 43 GB
synthetic_24x10d_100hz_longtrace ood_large 8 1.64 min 18.54 s 6.04× 39 → 43 GB
synthetic_60x4d_100hz large 4 1.26 min 15.26 s 4.96× 39 → 31 GB
synthetic_60x4d_100hz large 8 1.36 min 15.98 s 5.90× 39 → 32 GB
synthetic_60x1d_100hz small 1 20.28 s 3.89 s 5.22× 9.8 → 8.0 GB
synthetic_60x1d_100hz small 4 20.92 s 3.94 s 5.31× 9.8 → 8.8 GB
synthetic_60x1d_100hz small 8 20.39 s 4.17 s 4.89× 9.8 → 7.9 GB

Memory (Windows)
Baseline vs optimized peak memory at 8 threads.
Dataset Tier Memory Optimized/baseline Speedup
synthetic_60x10d_50hz_repeat15 ood_xlarge 49 → 39 GB 0.81× 6.17×
synthetic_24x10d_100hz_longtrace ood_large 39 → 43 GB 1.09× 6.04×
synthetic_60x4d_100hz large 39 → 32 GB 0.81× 5.90×
synthetic_60x3d_100hz medium 29 → 24 GB 0.81× 4.87×
synthetic_60x1d_100hz small 9.8 → 7.9 GB 0.80× 4.89×

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets obspy.core.stream.Stream.filter in obspy. The benchmarked result preserves the declared scientific output gate while reducing CPU runtime on the listed datasets.

Supported scope

Three coordinated patches.

(1) Stream.filter (fast_stream_filter) is filter-type-agnostic: it dispatches tr.filter(type, *args, **options) per Trace across a ThreadPoolExecutor (workers = min(len(self), auto_threads(cap=24))), falling back to a serial loop when workers<=1 or len(self)<=1. Any filter type, args, and options are forwarded unchanged, so bandstop, lowpass, highpass, lowpass_cheby_2, and the rest still work correctly; they are routed to the upstream per-trace functions, and the Stream-level parallelism is their only speedup.

(2) obspy.signal.filter.bandpass (fast_bandpass) is a line-for-line reimplementation of upstream bandpass. It keeps the same Nyquist edge logic, the same highpass fallback (with a warning) when freqmax>=Nyquist, the same ValueError when the low corner exceeds Nyquist, and the same sosfilt / zerophase np.flip path along an arbitrary axis. The only change is that the iirfilter SOS-coefficient design is memoized via lru_cache keyed on (corners, (freqmin,freqmax), df, rp, rs, 'band', ftype). The result is bit-exact to upstream for any combination of corners/freqmin/freqmax/df/rp/rs/ftype/zerophase/axis (the benchmark reports max_abs_diff=0.0, pearson=1.0).

(3) obspy.core.trace._get_function_from_entry_point is wrapped in lru_cache to amortize per-call entry-point resolution.

Correctly handles: bandpass with any well-formed numeric params; all other filter types via passthrough; single- and multi-trace Streams.
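The dispatch described in patch (1) can be sketched roughly as follows. This is a minimal illustration, not AutoZyme's source: auto_threads is reimplemented here as a hypothetical stand-in based on os.cpu_count(), parallel_filter is an illustrative name, and ToyTrace is a toy object that only mimics the tr.filter(type, *args, **options) call shape of an ObsPy Trace.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def auto_threads(cap=24):
    """Hypothetical stand-in: CPU count clamped to the range [1, cap]."""
    return max(1, min(cap, os.cpu_count() or 1))


def parallel_filter(traces, filter_type, *args, **options):
    """Call tr.filter(filter_type, *args, **options) on every trace."""
    workers = min(len(traces), auto_threads(cap=24))
    if workers <= 1 or len(traces) <= 1:
        # Serial fallback: nothing to gain from a pool.
        for tr in traces:
            tr.filter(filter_type, *args, **options)
        return traces
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so any worker exception is raised here.
        list(pool.map(lambda tr: tr.filter(filter_type, *args, **options), traces))
    return traces


class ToyTrace:
    """Toy stand-in for a Trace; records the arguments it was filtered with."""

    def __init__(self):
        self.calls = []

    def filter(self, filter_type, *args, **options):
        self.calls.append((filter_type, args, options))
        return self
```

Because the arguments are forwarded untouched, the sketch does not need to know which filter is being applied. Threads only help to the extent that the per-trace work releases the Python GIL, which this toy example does not demonstrate.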

Out-of-scope behavior

None declared. Filter types other than bandpass pass through to the upstream per-trace functions.

Detailed speedup table (9 runs)
Dataset Tier Platform Threads Baseline Optimized Speedup Memory Concordance Pass
synthetic_24x10d_100hz_longtrace ood_large Windows 8 1.64 min 18.54 s 6.04× 39.4 → 42.8 GB pass
synthetic_60x10d_50hz_repeat15 ood_xlarge Windows 8 9.32 min 1.67 min 6.17× 48.7 → 39.4 GB pass
synthetic_60x1d_100hz small Windows 4 20.92 s 3.94 s 5.31× 9.8 → 8.8 GB pass
synthetic_60x3d_100hz medium Windows 1 1.33 min 12.05 s 6.61× 29.3 → 23.7 GB pass
synthetic_60x4d_100hz large Windows 8 1.36 min 15.98 s 5.90× 39.0 → 31.5 GB pass
synthetic_24x10d_100hz_longtrace ood_large macOS 4 3.26 min 46.05 s 4.25× 27.4 → 29.4 GB pass
synthetic_60x1d_100hz small macOS 8 30.23 s 4.67 s 6.51× 11.8 → 8.9 GB pass
synthetic_60x3d_100hz medium macOS 8 1.88 min 14.18 s 7.75× 24.1 → 25.8 GB pass
synthetic_60x4d_100hz large macOS 4 2.81 min 28.08 s 6.01× 25.6 → 26.5 GB pass
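The headline figures can be cross-checked against the table above with a few lines of arithmetic. The sketch below copies the speedup column by hand; the variable names are illustrative only.

```python
from statistics import median

# (dataset, platform, threads, speedup) copied from the detailed table.
runs = [
    ("synthetic_24x10d_100hz_longtrace", "Windows", 8, 6.04),
    ("synthetic_60x10d_50hz_repeat15", "Windows", 8, 6.17),
    ("synthetic_60x1d_100hz", "Windows", 4, 5.31),
    ("synthetic_60x3d_100hz", "Windows", 1, 6.61),
    ("synthetic_60x4d_100hz", "Windows", 8, 5.90),
    ("synthetic_24x10d_100hz_longtrace", "macOS", 4, 4.25),
    ("synthetic_60x1d_100hz", "macOS", 8, 6.51),
    ("synthetic_60x3d_100hz", "macOS", 8, 7.75),
    ("synthetic_60x4d_100hz", "macOS", 4, 6.01),
]

windows = [speedup for _, platform, _, speedup in runs if platform == "Windows"]
all_speedups = [speedup for *_, speedup in runs]

best_windows = max(windows)        # matches the 6.61× headline
median_windows = median(windows)   # matches the 6.04× headline
best_overall = max(all_speedups)   # the macOS medium run
median_overall = median(all_speedups)
```

The best-speedup headline corresponds to the Windows rows, while the median comes out the same whether it is taken over the Windows rows or over all nine runs.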

Frequently asked questions

Speeding up ObsPy
Why is ObsPy slow?

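Stream.filter is CPU-bound, and the stock implementation leaves performance on the table. Per the supported scope, the gains come from filtering traces in parallel, memoizing the band-pass coefficient design, and caching entry-point resolution. On the synthetic_60x3d_100hz dataset (Windows, 1 thread) the original takes 1.33 min where the AutoZyme path takes 12.05 s (6.61× faster).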

How do I make ObsPy faster?

Install AutoZyme and activate the obspy patch, then keep using ObsPy exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 6.61× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the ObsPy output?

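Effectively no. The output is tolerance-equivalent: every benchmark dataset stays within a frozen concordance gate that allows up to about 0.6% drift from the original obspy result. For the bandpass path, the benchmark reports max_abs_diff=0.0 and pearson=1.0, i.e. no measured difference.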
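If you want to run a similar check on your own data, a tolerance gate can be sketched as below. The page does not define how drift is measured, so this sketch assumes it means the largest absolute difference relative to the largest absolute reference value; the function name, the returned keys, and that definition are all assumptions rather than AutoZyme's actual gate.

```python
import numpy as np


def concordance(reference, candidate, max_rel_drift=0.006):
    """Compare two waveforms and report whether they fall within the gate."""
    ref = np.asarray(reference, dtype=float)
    cand = np.asarray(candidate, dtype=float)
    if ref.shape != cand.shape or ref.size == 0:
        raise ValueError("inputs must be non-empty and have the same shape")
    max_abs_diff = float(np.max(np.abs(cand - ref)))
    scale = float(np.max(np.abs(ref)))
    # Assumed drift definition: worst absolute error over peak reference amplitude.
    rel_drift = max_abs_diff / scale if scale > 0.0 else max_abs_diff
    # Pearson correlation; this is NaN if either input is constant.
    pearson = float(np.corrcoef(ref.ravel(), cand.ravel())[0, 1])
    return {
        "max_abs_diff": max_abs_diff,
        "rel_drift": rel_drift,
        "pearson": pearson,
        "passed": rel_drift <= max_rel_drift,
    }
```

Passing the stock and patched outputs of the same Stream through such a function gives a quick local confirmation before switching a pipeline over.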

How do I install the obspy speedup?

From a shell, run pip install autozyme. Then, in Python, import autozyme and call autozyme.activate("obspy"). The patch applies automatically the next time you call obspy.core.stream.Stream.filter.