
# geolith vs Planetiler

geolith was benchmarked head-to-head against Planetiler (Java) on identical input data and region bounds.

## Test Configuration

| Parameter | Value |
|---|---|
| Region | Western India (67.67, 14.39, 80.92, 24.72) |
| Max zoom | 15 |
| OSM PBF | `western-zone-latest.osm.pbf` (196 MB) |
| Overture | Buildings + Divisions (Hive-partitioned GeoParquet) |
| India boundary | GeoJSON |
| Natural Earth | SQLite |
| Land polygons | Shapefile (EPSG:3857) |
| Water polygons | Shapefile (EPSG:3857) |
| Machine | Apple M2 Max, 96 GB RAM, NVMe SSD |

Both tools processed the same layers: earth, water, boundaries, landuse, roads, transit, buildings, places, pois, and divisions. Planetiler additionally processed daylight landcover (0.3s wall time, negligible impact).

## Results

| Metric | geolith (Rust) | Planetiler (Java) |
|---|---|---|
| Wall time | 5:15 | 6:20 |
| CPU time (user) | 709 s | 3,343 s |
| CPU utilization | 297% | 895% |
| Output size | 2.0 GB | 3.4 GB |
| Peak memory | ~2 GB | 25 GB (`-Xmx24g`) |
| Tiles written | 1,612,666 | — |

### Summary

- 17% faster wall time (5:15 vs 6:20)
- 4.7x less CPU time (709 s vs 3,343 s)
- 41% smaller output (2.0 GB vs 3.4 GB)
- ~12x less memory (~2 GB vs 25 GB heap)

geolith achieves competitive performance with significantly lower resource usage. The smaller output is primarily due to tile content deduplication and efficient MVT encoding.
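The effect of tile content deduplication can be sketched with a content-addressed store: hash each encoded tile's bytes and keep identical payloads (such as repeated empty-ocean tiles) only once, with every tile coordinate pointing at the shared copy. The `TileStore` type and its fields below are hypothetical illustrations, not geolith's actual internals.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Content-addressed tile store: identical tile payloads are stored once
/// and referenced by every tile coordinate that shares them.
/// (Illustrative sketch, not geolith's real data structure.)
pub struct TileStore {
    payloads: Vec<Vec<u8>>,       // unique tile payloads
    by_hash: HashMap<u64, usize>, // content hash -> index into `payloads`
    index: Vec<(u64, usize)>,     // (tile id, payload index)
}

impl TileStore {
    pub fn new() -> Self {
        TileStore { payloads: Vec::new(), by_hash: HashMap::new(), index: Vec::new() }
    }

    pub fn insert(&mut self, tile_id: u64, bytes: Vec<u8>) {
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        let key = hasher.finish();
        // A production store would also compare bytes on hash collision.
        let idx = match self.by_hash.get(&key) {
            Some(&i) => i,
            None => {
                self.payloads.push(bytes);
                let i = self.payloads.len() - 1;
                self.by_hash.insert(key, i);
                i
            }
        };
        self.index.push((tile_id, idx));
    }

    pub fn unique_payloads(&self) -> usize { self.payloads.len() }
    pub fn tiles_written(&self) -> usize { self.index.len() }
}
```

With many duplicate tiles, `unique_payloads()` stays far below `tiles_written()`, which is where the archive-size saving comes from.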

## Phase Breakdown (geolith)

| Phase | Duration | Details |
|---|---|---|
| Phase 1: Overture GeoParquet | ~52 s | 6.4M features read, 66.3M tile features |
| Phase 1: OSM PBF (2-pass) | ~25 s | 31.8M nodes, 3.8M features |
| Phase 1: Natural Earth | <1 s | 18,939 features from 12 tables |
| Phase 1: Land polygons | ~78 s | 833K polygons from shapefile |
| Phase 1: Water polygons | ~90 s | 14.5K polygons from shapefile |
| Phase 1: India boundary | ~2 s | 1 GeoJSON feature |
| Phase 2: External sort | <1 s | 34 LZ4-compressed chunks, k-way merge |
| Phase 3: Tile encode | ~50 s | 1,612,666 tiles, gzip-compressed |
| Total | 5:15 | 9.5M input features, 77.5M tile features |
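The k-way merge in Phase 2 can be sketched with Rust's standard `BinaryHeap`: a min-heap holds the head of each sorted run, and popping the smallest element repeatedly yields one globally ordered stream. In geolith the runs are LZ4-compressed chunk files; in this sketch, in-memory vectors stand in for them.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// K-way merge of already-sorted runs via a min-heap.
/// Heap entries are (next value, run index, position within that run);
/// `Reverse` turns the max-heap into a min-heap.
pub fn kway_merge(runs: &[Vec<u64>]) -> Vec<u64> {
    let mut heap = BinaryHeap::new();
    for (r, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, r, 0usize)));
        }
    }
    let mut out = Vec::new();
    while let Some(Reverse((value, r, i))) = heap.pop() {
        out.push(value);
        // Refill the heap from the run we just consumed from.
        if let Some(&next) = runs[r].get(i + 1) {
            heap.push(Reverse((next, r, i + 1)));
        }
    }
    out
}
```

Each step does O(log k) work for k runs, so merging 34 chunks stays cheap, which matches the sub-second Phase 2 duration above.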

## geolith CLI

```shell
geolith \
    --data ./overture-data/ \
    --osm-pbf western-zone-latest.osm.pbf \
    --natural-earth natural_earth_vector.sqlite \
    --land-polygons ./land-polygons-split-3857/ \
    --water-polygons ./water-polygons-split-3857/ \
    --india-boundary india-boundary.geojson \
    --output western-zone.pmtiles \
    --bbox 67.67,14.39,80.92,24.72 \
    --max-zoom 15 \
    --node-cache
```

## Planetiler CLI

```shell
java -Xmx24g -jar protomaps-basemap-HEAD-with-deps.jar \
    --osm-path=western-zone-latest.osm.pbf \
    --output=western-zone-planetiler.pmtiles \
    --bounds=67.67,14.39,80.92,24.72 \
    --buildings_source=overture \
    --overture_buildings_path=./overture-buildings/ \
    --divisions_source=overture \
    --overture_divisions_path=./overture-divisions/ \
    --india_boundary_path=india-boundary.geojson \
    --nodemap-type=sparsearray --nodemap-storage=mmap \
    --download --force
```

## Expected Performance

| Dataset | Size | Zoom | Time | Output | Machine |
|---|---|---|---|---|---|
| Regional (western India) | ~196 MB OSM + Overture | 15 | ~5 min | ~2 GB PMTiles | M2 Max, 96 GB RAM |
| Regional (India full) | ~5 GB OSM + Overture | 15 | ~15–25 min | ~5–8 GB PMTiles | M2 Max, 96 GB RAM |
| Planet (all themes) | ~70 GB OSM + ~290 GB Overture | 15 | TBD | TBD | 32 cores, 64 GB RAM, NVMe |

Planet-scale benchmarks are planned. The regional result above shows that geolith outperforms Planetiler on an identical workload.

## Bottlenecks

Performance is typically bound by one of three factors:

1. **Disk I/O** — reading GeoParquet/PBF input and writing temporary sort files. Use NVMe SSDs for the data directory and `--tmpdir`.
2. **CPU** — feature processing (projection, clipping, simplification) is parallelized via rayon, so more cores shorten this stage.
3. **Memory** — the external sort keeps memory usage bounded; the main consumers are the Parquet reader buffers and the OSM node store.
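The bounded-memory behavior of the external sort (point 3) comes from a buffer-and-spill pattern: keys accumulate in a fixed-capacity buffer, and a full buffer is sorted and written out as one run before ingestion continues. The `ExternalSorter` type below is an illustrative sketch, with in-memory `Vec`s standing in for geolith's LZ4-compressed temp files.

```rust
/// Memory-bounded sorter: keys accumulate in a fixed-capacity buffer;
/// when it fills, the buffer is sorted and spilled as one run, so peak
/// memory stays near `capacity` keys regardless of total input size.
/// (Sketch only; real runs would be compressed files on disk.)
pub struct ExternalSorter {
    capacity: usize,
    buffer: Vec<u64>,
    runs: Vec<Vec<u64>>,
}

impl ExternalSorter {
    pub fn new(capacity: usize) -> Self {
        ExternalSorter { capacity, buffer: Vec::new(), runs: Vec::new() }
    }

    pub fn push(&mut self, key: u64) {
        self.buffer.push(key);
        if self.buffer.len() >= self.capacity {
            self.spill();
        }
    }

    fn spill(&mut self) {
        self.buffer.sort_unstable();
        self.runs.push(std::mem::take(&mut self.buffer));
    }

    /// Spill the tail, then merge all runs. A heap-based k-way merge would
    /// keep this step streaming; concatenating and re-sorting is the
    /// shortest correct stand-in for a sketch.
    pub fn finish(mut self) -> Vec<u64> {
        if !self.buffer.is_empty() {
            self.spill();
        }
        let mut out = self.runs.concat();
        out.sort_unstable();
        out
    }
}
```

This is why `--tmpdir` placement matters: every spill and every merge pass hits that disk.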

## Tuning Recommendations

| Factor | Recommendation |
|---|---|
| Disk speed | NVMe SSD for `--data`, `--tmpdir`, and `--output` paths |
| Thread count | Default (all cores) is usually optimal; reduce with `--threads` if sharing the machine |
| Temp directory | Place on the fastest available disk; avoid network mounts |
| Max zoom | Each additional zoom level roughly quadruples tile count. See Output Tuning. |
| Node cache | Use `--node-cache` to skip OSM pass 1 on subsequent runs with the same PBF |
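The quadrupling rule for max zoom follows from standard XYZ tile math: each zoom level doubles the tile grid in both axes. The helper below is a hypothetical illustration using plain Web Mercator formulas, not a geolith function, and it counts every tile in the bbox, including empty ones that a real run skips.

```rust
/// Tiles covering a lon/lat bbox at one zoom level, using standard XYZ
/// Web Mercator tile math. (Illustrative helper, not geolith's API.)
pub fn tile_count(west: f64, south: f64, east: f64, north: f64, zoom: u32) -> u64 {
    let n = (1u64 << zoom) as f64; // tiles per axis at this zoom
    let lon_to_x = |lon: f64| ((lon + 180.0) / 360.0 * n).floor();
    let lat_to_y = |lat: f64| {
        let r = lat.to_radians();
        ((1.0 - (r.tan() + 1.0 / r.cos()).ln() / std::f64::consts::PI) / 2.0 * n).floor()
    };
    let (x0, x1) = (lon_to_x(west), lon_to_x(east));
    let (y0, y1) = (lat_to_y(north), lat_to_y(south)); // tile y grows southward
    ((x1 - x0 + 1.0) * (y1 - y0 + 1.0)) as u64
}
```

For the western-India bbox, the count at z15 is roughly four times the count at z14, which is why each extra zoom level dominates total tile volume.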

## Running Micro-Benchmarks

```shell
# Run the full benchmark suite
cargo bench

# Run a specific benchmark file
cargo bench --bench tile_encode

# Filter by benchmark name
cargo bench -- "encode_tile"
```

Micro-benchmarks use Criterion.rs for statistical analysis. Results include confidence intervals and regression detection. Reports are saved to `target/criterion/` with HTML visualizations.