Speed up sarsen: up to 26.5× faster, identical output

Benchmark charts

Static evidence view; finalized rows for the benchmarked platform

Speedup distribution (chart, log scale; each dot is one finalized dataset/thread run on Mac): rome_13k 26.5×, rome_10k 11.9×, rome_8k 9.19×, rome_grd_gamma_12k 5.34×.

Thread sweep (chart): speedup across finalized thread counts on Mac for rome_13k, rome_10k, rome_8k, and rome_grd_gamma_12k.

Memory (chart): baseline vs. optimized peak memory on Mac.

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets sarsen. The benchmarked result passes the declared scientific output gate while reducing CPU runtime on the listed datasets.

Also searched as: SAR, synthetic aperture radar, terrain correction, sentinel-1, radar.

Supported scope

Correct (bit-exact, pass_rate 1.0, max_abs_diff 0.0) for: GRD products (GroundRangeSarProduct) processed in-memory with chunks=None, interp_method="nearest" (the upstream default), and correct_radiometry in {None (GTC), "gamma_nearest"}. Note that "gamma_nearest" still goes through the upstream do_terrain_correction radiometry chain but calls the patched simulate_acquisition/orbit/geocoding helpers, and is verified bit-exact at the ood_large tier.

The patch replaces 9 internal helpers (transform_dem_3d, convert_to_dem_3d, slant_range_time_to_ground_range, the three OrbitPolyfitInterpolator polyval fits, the two zero_doppler Newton kernels, simulate_acquisition, GroundRangeSarProduct.interp_sar, Sentinel1SarProduct.beta_nought) plus a beta_nought process-local cache; the public terrain_correction wrapper only scopes xr.set_options(use_bottleneck=False) and delegates to upstream.

Verified against upstream version sarsen 0.9.6.dev5+g6c5e37d1d on the Rome DEM / S1B GRD IW/VV product.

Out-of-scope behavior

Inputs outside the supported scope silently fall back to the upstream sarsen implementation.

Detailed speedup table (4 runs)

Dataset	Tier	Platform	Threads	Baseline	Optimized	Speedup	Memory	Concordance	Pass
`rome_10k`	medium	macOS	8	1.75 min	8.78 s	11.9×	25.2 → 17.7 GB	—	pass
`rome_13k`	large	macOS	4	7.72 min	17.50 s	26.5×	26.3 → 22.4 GB	—	pass
`rome_8k`	small	macOS	8	57.63 s	6.31 s	9.19×	20.3 → 12.5 GB	—	pass
`rome_grd_gamma_12k`	ood_large	macOS	4	6.19 min	1.16 min	5.34×	25.6 → 22.3 GB	—	pass
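The Speedup column can be sanity-checked as baseline time divided by optimized time. The sketch below recomputes it from the rounded values shown in the table, so small last-digit differences from the reported figures are expected; the `RUNS` dictionary and `speedup` helper are illustrative names, not part of AutoZyme or sarsen.

```python
# Baseline and optimized wall times copied from the table, converted to seconds.
# The table shows rounded values, so recomputed ratios can differ slightly
# from the reported speedup.
RUNS = {
    "rome_10k": (1.75 * 60, 8.78, 11.9),
    "rome_13k": (7.72 * 60, 17.50, 26.5),
    "rome_8k": (57.63, 6.31, 9.19),
    "rome_grd_gamma_12k": (6.19 * 60, 1.16 * 60, 5.34),
}


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Return how many times faster the optimized run is."""
    return baseline_s / optimized_s


for name, (baseline_s, optimized_s, reported) in RUNS.items():
    ratio = speedup(baseline_s, optimized_s)
    print(f"{name}: recomputed {ratio:.2f}x, reported {reported}x")
```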

Frequently asked questions

Speeding up sarsen

Why is sarsen slow?

sarsen is CPU-bound, and the stock implementation leaves performance on the table in its core numerical work. On the largest benchmark dataset (rome_13k, 4 threads, macOS), the original takes 7.72 min where the AutoZyme path takes 17.50 s (26.5× faster).

How do I make sarsen faster?

Install AutoZyme and activate the sarsen patch, then keep using sarsen exactly as before. For inputs in the supported scope, AutoZyme transparently substitutes the faster, output-validated path, which is 5.34× to 26.5× faster on the benchmark datasets, with no pipeline or API changes.

Does the AutoZyme speedup change the sarsen output?

No. The accelerated path returns bit-for-bit identical results to the original sarsen implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
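To make "bit-for-bit identical" concrete, here is a minimal sketch of how such a comparison could be computed on two flattened outputs. It is not AutoZyme's actual gate, which the page does not describe: the `concordance` and `bits` helpers, and the definitions of pass_rate (fraction of elements with identical bit patterns) and max_abs_diff (largest absolute difference over finite pairs), are assumptions made for illustration.

```python
import math
import struct
from typing import Sequence


def bits(value: float) -> bytes:
    """Return the IEEE 754 double-precision byte pattern of a float."""
    return struct.pack("<d", value)


def concordance(reference: Sequence[float], candidate: Sequence[float]) -> dict:
    """Compare two equal-length float sequences element by element."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    if not reference:
        raise ValueError("outputs must not be empty")

    # Bit-exact match: same byte pattern, so NaN matches an identical NaN.
    identical = sum(bits(r) == bits(c) for r, c in zip(reference, candidate))
    # Differences involving NaN or inf are undefined, so only finite pairs
    # contribute to max_abs_diff.
    diffs = [
        abs(r - c)
        for r, c in zip(reference, candidate)
        if math.isfinite(r) and math.isfinite(c)
    ]
    return {
        "pass_rate": identical / len(reference),
        "max_abs_diff": max(diffs, default=0.0),
    }


# Toy stand-ins for a baseline output and an optimized output.
baseline = [0.25, 1.5, float("nan"), -3.0]
optimized = [0.25, 1.5, float("nan"), -3.0]
report = concordance(baseline, optimized)
print(report)
```

Under these assumed definitions, a pass_rate of 1.0 together with a max_abs_diff of 0.0 corresponds to the "bit-exact" result reported in the table.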

How do I install the sarsen speedup?

From a shell, run pip install autozyme. Then, in Python, import autozyme and call autozyme.activate("sarsen"). The patch applies automatically the next time you call sarsen.
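The activation step can be sketched as below. The only AutoZyme call used is `autozyme.activate("sarsen")`, as given above; the `activate_sarsen_patch` wrapper and its guarded import are hypothetical additions so the script still runs, unpatched, on a machine where autozyme is not installed.

```python
# Shell step (run once, outside Python):  pip install autozyme
import importlib


def activate_sarsen_patch() -> bool:
    """Try to activate the AutoZyme sarsen patch; return True on success."""
    try:
        autozyme = importlib.import_module("autozyme")
    except ImportError:
        # autozyme is not installed: sarsen keeps running its stock code.
        return False
    autozyme.activate("sarsen")
    return True


patched = activate_sarsen_patch()
print("AutoZyme sarsen patch active:", patched)
```

After this runs, existing sarsen calls stay unchanged; per the supported scope above, only in-scope inputs take the accelerated path.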