Speed up ObsPy: up to 6.61× faster, near-identical output

Benchmark charts

Speedup distribution

Each dot is one finalized dataset/thread run on Windows

synthetic_60x3d_100hz: 6.61×
synthetic_60x10d_50hz…: 6.17×
synthetic_24x10d_100h…: 6.04×
synthetic_60x4d_100hz: 5.90×
synthetic_60x1d_100hz: 5.31×

Thread sweep

Speedup across finalized thread counts on Windows

Memory

Baseline vs optimized peak memory on Windows

Series: baseline, optimized

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets obspy.core.stream.Stream.filter in obspy. The benchmarked result preserves the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: seismology, seismic, earthquake, waveform, stream filter.

Supported scope

Three coordinated patches.

(1) Stream.filter (fast_stream_filter) is filter-type-agnostic: it dispatches tr.filter(type, *args, **options) per Trace across a ThreadPoolExecutor (workers = min(len(self), auto_threads(cap=24))), falling back to a serial loop when workers<=1 or len(self)<=1. Any filter type, args, and options are forwarded unchanged, so bandstop, lowpass, highpass, lowpass_cheby_2, etc. all still work correctly; they are routed to the upstream per-trace functions, and the only speedup for them is the Stream-level parallelism.

(2) obspy.signal.filter.bandpass (fast_bandpass) is a line-for-line reimplementation of upstream bandpass: same Nyquist edge logic, same highpass fallback (with a warning) when freqmax>=Nyquist, same ValueError when the low corner>Nyquist, same sosfilt / zerophase np.flip path along an arbitrary axis. The only change is that the iirfilter SOS-coefficient design is memoized via lru_cache keyed on (corners, (freqmin,freqmax), df, rp, rs, 'band', ftype). This is bit-exact to upstream for any combination of corners/freqmin/freqmax/df/rp/rs/ftype/zerophase/axis (benchmark reports max_abs_diff=0.0, pearson=1.0).

(3) obspy.core.trace._get_function_from_entry_point is wrapped in lru_cache to amortize per-call entry-point resolution.

Correctly handles: bandpass with any well-formed numeric params; all other filter types via passthrough; single- and multi-trace Streams.
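The dispatch rule in patch (1) and the memoization in patch (2) can be illustrated with a minimal, standard-library-only sketch. This is not AutoZyme's code: FakeTrace, design_coefficients, filter_stream, and the body of auto_threads are hypothetical stand-ins, and the "filtering" is a placeholder multiplication rather than a real SOS filter. Only the worker formula, the serial fallback condition, and the idea of caching the design step on a hashable key come from the description above.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


def auto_threads(cap=24):
    """Hypothetical stand-in: CPU count clamped to the range [1, cap]."""
    return max(1, min(cap, os.cpu_count() or 1))


@lru_cache(maxsize=None)
def design_coefficients(corners, band, df, ftype):
    """Stand-in for an expensive filter-design step.

    Every argument is hashable (numbers, a string, a tuple), so lru_cache
    can key on them and reuse the result across traces.
    """
    freqmin, freqmax = band
    nyquist = 0.5 * df
    return (corners, freqmin / nyquist, freqmax / nyquist, ftype)


class FakeTrace:
    """Minimal trace-like object with an in-place filter() method."""

    def __init__(self, data, df):
        self.data = list(data)
        self.df = df

    def filter(self, type, **options):
        coeffs = design_coefficients(
            options.get("corners", 4),
            (options["freqmin"], options["freqmax"]),
            self.df,
            "butter",
        )
        # Placeholder "filtering": scale by the normalized low corner.
        self.data = [x * coeffs[1] for x in self.data]
        return self


def filter_stream(traces, type, **options):
    """Call tr.filter(...) on every trace, using threads when it can help."""
    workers = min(len(traces), auto_threads(cap=24))
    if workers <= 1 or len(traces) <= 1:
        # Serial fallback: nothing to parallelize.
        for tr in traces:
            tr.filter(type, **options)
        return traces
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so worker exceptions are raised here.
        list(pool.map(lambda tr: tr.filter(type, **options), traces))
    return traces
```

Each trace is modified independently, so no locking is needed around the per-trace work. Note that lru_cache keeps its cache consistent under threads but may run the wrapped function more than once for the same key if several threads miss at the same moment; that costs a little time and does not change the result.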

Out-of-scope behavior

handles all

Detailed speedup table (9 runs)

Dataset	Tier	Platform	Threads	Baseline	Optimized	Speedup	Memory	Concordance	Pass
`synthetic_24x10d_100hz_longtrace`	ood_large	Windows	8	1.64 min	18.54 s	6.04×	39.4 → 42.8 GB	—	pass
`synthetic_60x10d_50hz_repeat15`	ood_xlarge	Windows	8	9.32 min	1.67 min	6.17×	48.7 → 39.4 GB	—	pass
`synthetic_60x1d_100hz`	small	Windows	4	20.92 s	3.94 s	5.31×	9.8 → 8.8 GB	—	pass
`synthetic_60x3d_100hz`	medium	Windows	1	1.33 min	12.05 s	6.61×	29.3 → 23.7 GB	—	pass
`synthetic_60x4d_100hz`	large	Windows	8	1.36 min	15.98 s	5.90×	39.0 → 31.5 GB	—	pass
`synthetic_24x10d_100hz_longtrace`	ood_large	macOS	4	3.26 min	46.05 s	4.25×	27.4 → 29.4 GB	—	pass
`synthetic_60x1d_100hz`	small	macOS	8	30.23 s	4.67 s	6.51×	11.8 → 8.9 GB	—	pass
`synthetic_60x3d_100hz`	medium	macOS	8	1.88 min	14.18 s	7.75×	24.1 → 25.8 GB	—	pass
`synthetic_60x4d_100hz`	large	macOS	4	2.81 min	28.08 s	6.01×	25.6 → 26.5 GB	—	pass

Frequently asked questions

Speeding up ObsPy

Why is ObsPy slow?

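Stream.filter in stock obspy is CPU-bound. Per the supported scope, the patches target three costs: traces are filtered one at a time, the bandpass filter coefficients are redesigned on every call, and the filter entry point is re-resolved on every call. On the synthetic_60x3d_100hz benchmark dataset (Windows), the original takes 1.33 min where the AutoZyme path takes 12.05 s (6.61× faster).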

How do I make ObsPy faster?

Install AutoZyme and activate the obspy patch, then keep using ObsPy exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 6.61× faster on the Windows benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the ObsPy output?

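Effectively no. The output is tolerance-equivalent: every benchmark dataset stays within a frozen concordance gate, which allows up to about 0.6% drift from the original obspy result. For the bandpass path, the supported scope reports bit-exact agreement with upstream (max_abs_diff=0.0, pearson=1.0).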
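A check of this kind can be sketched in a few lines of plain Python. The page does not say how the gate defines drift, so the sketch assumes one plausible reading: the largest absolute deviation, relative to the peak amplitude of the reference. The function names and the 0.006 default are illustrative, not AutoZyme's API.

```python
import math


def max_abs_diff(reference, candidate):
    """Largest absolute sample-wise difference between two equal-length series."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


def pearson(reference, candidate):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(reference, candidate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_c = sum((c - mean_c) ** 2 for c in candidate)
    return cov / math.sqrt(var_r * var_c)


def within_gate(reference, candidate, max_rel_drift=0.006):
    """Assumed gate: worst deviation is at most max_rel_drift of the reference peak."""
    peak = max(abs(r) for r in reference)
    return max_abs_diff(reference, candidate) <= max_rel_drift * peak
```

For bit-exact output, max_abs_diff returns 0.0 and the gate passes trivially; the tolerance only matters for paths that reorder floating-point work.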

How do I install the obspy speedup?

Run pip install autozyme in a shell. Then, in Python, import autozyme and call autozyme.activate("obspy"). The patch applies automatically the next time you call obspy.core.stream.Stream.filter.