Performance optimisations#
This tutorial presents techniques for optimising the performance of HEALPix calculations.
Vectorisation#
Always prefer vectorised operations: pass whole arrays in a single call rather than calling the function once per point.
import time
import numpy as np
from healpix_geo.nested import lonlat_to_healpix
N = 100
lon = np.random.uniform(-180, 180, N)
lat = np.random.uniform(-90, 90, N)
# Good: vectorised (one call for all points)
t0 = time.perf_counter()
ipix = lonlat_to_healpix(lon, lat, depth=10, ellipsoid="WGS84")
t_vec = time.perf_counter() - t0
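# Bad: loop (one call per point)
t0 = time.perf_counter()
ipix_list = []
for i in range(N):
    # Slices of length 1 keep the inputs as arrays, as the function expects
    cell = lonlat_to_healpix(
        lon[i : i + 1], lat[i : i + 1], depth=10, ellipsoid="WGS84"
    )[0]
    ipix_list.append(cell)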
t_loop = time.perf_counter() - t0
print(f"Vectorised : {t_vec * 1000:.1f} ms ({N:,} points)")
print(f"Loop : {t_loop * 1000:.1f} ms ({N:,} points)")
print(f"Speedup : {t_loop / t_vec:.0f}× faster with vectorisation")
Vectorised : 0.7 ms (100 points)
Loop : 27.2 ms (100 points)
Speedup : 39× faster with vectorisation
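Coordinates often arrive as a list of records rather than as arrays, which is what tempts people into a per-point loop. The following is a minimal sketch of batching such records into arrays first. It uses only NumPy; the helper name `to_coordinate_arrays` and the sample `points` list are hypothetical and not part of healpix_geo.

```python
import numpy as np


def to_coordinate_arrays(points):
    """Split an iterable of (lon, lat) pairs into two 1-D float64 arrays."""
    coords = np.asarray(list(points), dtype=np.float64)
    if coords.size == 0:
        # No points: return two empty arrays so callers can still unpack
        return np.empty(0, dtype=np.float64), np.empty(0, dtype=np.float64)
    if coords.ndim != 2 or coords.shape[1] != 2:
        raise ValueError("expected an iterable of (lon, lat) pairs")
    # Column slices are strided views; copy them into contiguous arrays
    return np.ascontiguousarray(coords[:, 0]), np.ascontiguousarray(coords[:, 1])


# Hypothetical sample data: (longitude, latitude) in degrees
points = [(2.35, 48.86), (-0.13, 51.51), (13.40, 52.52)]
lon, lat = to_coordinate_arrays(points)
print(lon.shape, lat.shape)  # (3,) (3,)
```

The resulting `lon` and `lat` arrays can then be passed to a single `lonlat_to_healpix(lon, lat, depth=10, ellipsoid="WGS84")` call, as in the vectorised example above.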
Multi-threading#
The num_threads parameter controls how many threads are used for parallel execution.
import numpy as np
from healpix_geo.nested import lonlat_to_healpix
# Automatic (use all available CPU cores)
ipix = lonlat_to_healpix(lon, lat, depth=10, ellipsoid="WGS84", num_threads=0)
# Use 4 threads
ipix = lonlat_to_healpix(lon, lat, depth=10, ellipsoid="WGS84", num_threads=4)
# Sequential execution (single thread)
ipix = lonlat_to_healpix(lon, lat, depth=10, ellipsoid="WGS84", num_threads=1)
Tip
num_threads=0 uses all available CPU cores.
num_threads=1 disables parallelism.
num_threads>1 uses the specified number of threads.
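Which setting is fastest probably depends on the array size and the machine, so it can be worth measuring. Below is a minimal sketch of a timing helper that tries several `num_threads` values and keeps the best of a few repeats. The helper `time_thread_counts` and the placeholder `standin_convert` are hypothetical; the placeholder only mimics the calling shape (it ignores `num_threads`) so that the sketch runs without healpix_geo installed.

```python
import time

import numpy as np


def time_thread_counts(convert, lon, lat, thread_counts=(1, 2, 4, 0), repeat=3, **kwargs):
    """Return {num_threads: best wall time in seconds} for each thread count."""
    timings = {}
    for n in thread_counts:
        best = float("inf")
        for _ in range(repeat):
            t0 = time.perf_counter()
            convert(lon, lat, num_threads=n, **kwargs)
            # Keep the fastest run to reduce noise from other processes
            best = min(best, time.perf_counter() - t0)
        timings[n] = best
    return timings


def standin_convert(lon, lat, depth, num_threads=0):
    """Placeholder with a similar calling shape; NOT a HEALPix computation."""
    return np.floor((lon + 180.0) * 2**depth).astype(np.int64)


rng = np.random.default_rng(0)
lon = rng.uniform(-180, 180, 100_000)
lat = rng.uniform(-90, 90, 100_000)

timings = time_thread_counts(standin_convert, lon, lat, depth=10)
for n, seconds in timings.items():
    print(f"num_threads={n}: {seconds * 1000:.2f} ms")
```

With healpix_geo available, the same helper could be called as `time_thread_counts(lonlat_to_healpix, lon, lat, depth=10, ellipsoid="WGS84")`. For small inputs, the cost of coordinating threads may outweigh the gain, so `num_threads=1` is worth including in the comparison.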