Speed up Squidpy co_occurrence: up to 72.3× faster, identical output

Q: Why is Squidpy co_occurrence slow?

Squidpy co_occurrence is CPU-bound, and the stock implementation in squidpy leaves performance on the table in its core numerical work. On the largest benchmark dataset (four_i_mouse_cortex_full, Windows, 4 threads), the original takes 128.16 min where the AutoZyme path takes 2.18 min, reported as 72.3× faster.

Q: How do I make Squidpy co_occurrence faster?

Install AutoZyme and activate the squidpy patch, then keep using Squidpy co_occurrence exactly as before. AutoZyme transparently substitutes the faster, output-validated path, up to 72.3× faster on the benchmark datasets, with no pipeline or API changes.

Q: Does the AutoZyme speedup change the Squidpy co_occurrence output?

No. The accelerated path returns bit-for-bit identical results to the original squidpy implementation (maximum absolute difference 0), checked by a frozen concordance gate on every benchmark dataset.
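To make the identity claim concrete, here is a minimal sketch of how two co-occurrence arrays could be compared. It is not the frozen concordance gate itself: the function name concordance_report and the synthetic arrays are assumptions made for illustration, and the metric names simply echo those quoted in the supported-scope notes.

```python
import numpy as np

def concordance_report(reference: np.ndarray, candidate: np.ndarray) -> dict:
    """Compare two result arrays; NaNs in matching positions count as equal."""
    if reference.shape != candidate.shape:
        raise ValueError("arrays must have the same shape")
    # Restrict the numeric metrics to positions where both values are finite.
    finite = np.isfinite(reference) & np.isfinite(candidate)
    vals_ref, vals_cand = reference[finite], candidate[finite]
    abs_diff = np.abs(vals_ref - vals_cand)
    has_data = abs_diff.size > 1
    return {
        "identical": bool(np.array_equal(reference, candidate, equal_nan=True)),
        "max_abs_diff": float(abs_diff.max()) if has_data else 0.0,
        "q99_abs_diff": float(np.quantile(abs_diff, 0.99)) if has_data else 0.0,
        "pearson": float(np.corrcoef(vals_ref, vals_cand)[0, 1]) if has_data else float("nan"),
    }

# Synthetic stand-ins for the two outputs (hypothetical data, not benchmark results).
rng = np.random.default_rng(0)
baseline = rng.random((4, 4, 49))
report = concordance_report(baseline, baseline.copy())
shifted_report = concordance_report(baseline, baseline + 1e-6)
```

In practice the two inputs would be the occurrence arrays produced with and without the patch on the same data; an "identical" value of True corresponds to a maximum absolute difference of 0.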

Q: How do I install the squidpy speedup?

Run pip install autozyme in a shell, then in Python import autozyme and call autozyme.activate("squidpy"). The patch applies automatically the next time you call squidpy.gr.co_occurrence.
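The steps above can be sketched as a small guarded script. The autozyme.activate("squidpy") call is taken from the instructions; the wrapper function activate_squidpy_patch, the availability check, and the commented usage with an adata object are assumptions added for illustration.

```python
import importlib.util

def activate_squidpy_patch() -> bool:
    """Activate the AutoZyme squidpy patch if both packages are installed."""
    for name in ("autozyme", "squidpy"):
        # find_spec returns None when a top-level package cannot be found.
        if importlib.util.find_spec(name) is None:
            print(f"{name} is not installed; run `pip install {name}` first")
            return False
    import autozyme
    autozyme.activate("squidpy")  # call as given in the install instructions
    return True

patched = activate_squidpy_patch()

# After activation, squidpy is called as usual. Hypothetical usage, assuming an
# AnnData object `adata` with spatial coordinates and a "cluster" annotation:
#   import squidpy as sq
#   sq.gr.co_occurrence(adata, cluster_key="cluster")
```

The guard only avoids an ImportError when a package is missing; it does not verify that the patch took effect.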

Benchmark charts

Speedup distribution (Windows, log scale; each dot is one finalized dataset/thread run):

four_i_mouse_cortex_f…: 72.3×
four_i_mouse_cortex_8…: 36.5×
merfish_mouse_preoptic: 35.4×
slideseqv2_mouse_hipp…: 13.2×
seqfish_mouse_gastrul…: 5.58×

Thread sweep (chart): speedup across finalized thread counts on Windows.

Memory (chart): baseline vs optimized peak memory on Windows.

What is accelerated

The public API stays the same; AutoZyme replaces only the supported fast path.

This task targets squidpy.gr.co_occurrence in squidpy. The benchmarked result preserves the declared scientific output gate while reducing CPU runtime on the listed datasets.
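One common way to keep a public API unchanged while swapping the work underneath it is to rebind a module-level helper that the public function looks up at call time. The sketch below shows that general technique on a toy module; the names toy_lib, _helper, public_sum, and fast_helper are hypothetical and are not AutoZyme's or squidpy's code.

```python
import types

# Build a toy module standing in for a library (hypothetical names throughout).
# exec is used only so the example is self-contained in one file.
toy = types.ModuleType("toy_lib")
exec(
    "def _helper(values):\n"
    "    total = 0\n"
    "    for v in values:\n"
    "        total += v\n"
    "    return total\n"
    "\n"
    "def public_sum(values):\n"
    "    # _helper is resolved in the module globals on every call.\n"
    "    return _helper(values)\n",
    toy.__dict__,
)

def fast_helper(values):
    """Drop-in replacement with the same signature and the same result."""
    return sum(values)

original_helper = toy._helper
before = toy.public_sum([1, 2, 3])
toy._helper = fast_helper            # rebind the inner helper on the module
after = toy.public_sum([1, 2, 3])    # same public call, new inner path
```

Callers keep invoking public_sum unchanged, and restoring original_helper undoes the swap. This only works when the public function resolves the helper by name at call time rather than holding a direct reference to it.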

Supported scope

The patch replaces the inner helper squidpy.gr._ppatterns._co_occurrence_helper (register_patch targets=[("squidpy.gr._ppatterns","_co_occurrence_helper", fast_co_occurrence_helper)]), so it is invoked for every co_occurrence call regardless of public-API arguments.

By binding below the public entry point it correctly supports any cluster_key, any spatial_key, and copy=True/False. It supports any interval (int->linspace or array): upstream builds the float interval array before the helper is reached, and the helper only sees interval as a numeric array and bisects it, so non-default interval sizes/values work. It supports any n_splits (auto or explicit), because the helper consumes whatever tile collection co_occurrence built. It supports any n_jobs/backend: squidpy.parallelize chunks idx_splits across joblib workers, and the patched helper handles arbitrary sub-lists of triu pairs per chunk and re-derives its own numba thread count at call time via auto_threads.

Per-pair same_split symmetry and the divide-by-zero / zero-marginal NaN guards mirror upstream (the "if rs==0.0: continue" / "if m==0.0: continue" branches reproduce upstream's "np.sum==0 -> zeros" behavior).

There are two internal kernels: a fused 2D distance+bin+histogram path (spatial.shape[1]==2) and a runtime-dim nd fallback (spatial.shape[1]!=2). An all-tiles parallel-over-tiles kernel is used only when is_2d AND len(idx_splits) >= _ALL_TILES_THRESHOLD (default 2000, tunable via AUTOZYME_COOCCURRENCE_ALL_TILES env at import).

Concordance verified pearson_occ=1.0 and q99_abs_diff_occ <= 2e-6 across all five tiers (small/medium/large/ood_large/ood_xlarge), thread counts 1/4/14, pass_rate=1.0. Tested against squidpy 1.6.5.
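The kernel selection described above can be sketched as follows. This is an illustration of the stated conditions only, not AutoZyme's code: the functions all_tiles_threshold and choose_kernel and the returned labels are hypothetical, the fallback on an unparsable value is an assumption, and the sketch reads the environment variable at call time whereas the text says the real setting is read at import.

```python
import os
import numpy as np

def all_tiles_threshold(default: int = 2000) -> int:
    """Read the tile-count threshold from the environment, else use the default."""
    raw = os.environ.get("AUTOZYME_COOCCURRENCE_ALL_TILES")
    try:
        return int(raw) if raw is not None else default
    except ValueError:
        # Assumption: fall back to the default when the value is not an integer.
        return default

def choose_kernel(spatial: np.ndarray, idx_splits: list, threshold: int) -> str:
    """Return a label for the kernel the described rules would select."""
    is_2d = spatial.shape[1] == 2
    if is_2d and len(idx_splits) >= threshold:
        return "all_tiles_2d"   # parallel-over-tiles kernel
    if is_2d:
        return "fused_2d"       # fused distance + bin + histogram path
    return "nd_fallback"        # runtime-dimension fallback

coords_2d = np.zeros((10, 2))
coords_3d = np.zeros((10, 3))
few_pairs = [(0, 0), (0, 1), (1, 1)]
many_pairs = [(0, 0)] * 2000

kernel_small = choose_kernel(coords_2d, few_pairs, threshold=2000)
kernel_tiles = choose_kernel(coords_2d, many_pairs, threshold=2000)
kernel_3d = choose_kernel(coords_3d, many_pairs, threshold=2000)
```

With the default threshold, 2D inputs split into fewer than 2000 tile pairs take the fused path, and only non-2D coordinates reach the fallback.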

Out-of-scope behavior

None; the patch handles all co_occurrence calls.

Detailed speedup table (10 runs)

Dataset	Tier	Platform	Threads	Baseline	Optimized	Speedup	Memory	Concordance	Pass
`four_i_mouse_cortex_80k`	ood_large	Windows	8	11.82 min	18.52 s	36.5×	0.6 → 0.6 GB	—	pass
`four_i_mouse_cortex_full`	ood_xlarge	Windows	4	128.16 min	2.18 min	72.3×	0.9 → 0.9 GB	—	pass
`merfish_mouse_preoptic`	large	Windows	4	7.86 min	12.55 s	35.4×	0.7 → 0.7 GB	—	pass
`seqfish_mouse_gastrulation`	small	Windows	4	48.75 s	8.60 s	5.58×	6.2 → 0.6 GB	—	pass
`slideseqv2_mouse_hippocampus`	medium	Windows	8	2.18 min	10.03 s	13.2×	0.8 → 0.8 GB	—	pass
`four_i_mouse_cortex_80k`	ood_large	macOS	14	7.80 min	10.55 s	42.7×	2.0 → 0.7 GB	—	pass
`four_i_mouse_cortex_full`	ood_xlarge	macOS	—	80.62 min	1.11 min	72.3×	2.5 → 1.0 GB	—	pass
`merfish_mouse_preoptic`	large	macOS	8	4.97 min	7.59 s	42.1×	2.5 → 0.7 GB	—	pass
`seqfish_mouse_gastrulation`	small	macOS	14	41.92 s	3.02 s	13.6×	13.3 → 0.7 GB	—	pass
`slideseqv2_mouse_hippocampus`	medium	macOS	14	1.91 min	5.00 s	22.0×	2.1 → 0.9 GB	—	pass
