/ Technical Overview

GeoPBF — Lightweight Binary GIS Format

Protocol Buffers-based geospatial binary format.
Lightweight · Universal converter · ArrayBuffer-native — the wire format for browser GIS

Table of Contents
  1. Why GeoPBF?
  2. Binary Format Structure
  3. Delta Encoding × Varint
  4. GIS Conversion Hub
  5. Antimeridian Handling
  6. ArrayBuffer & Worker Architecture
  7. Gint Extension — 64-bit Coordinate Packing
  8. Comparison with Other Formats
  9. API Reference

1. Why GeoPBF?

GeoJSON is readable and convenient, but has critical limitations at scale. As a text format it is expensive to transfer, slow to parse, and blocks the browser's main thread. Shapefile is widely adopted but its multi-file structure is ill-suited for the Web. Tiled MVT fragments data by zoom level, increasing round-trip overhead.

GeoPBF is a binary format designed as the communication layer for browser GIS. Protocol Buffers compression efficiency, coordinate delta-encoding, and an ArrayBuffer-native design make transfer, decoding, and inter-Worker handoff uniformly fast.

2. Binary Format Structure

A GeoPBF file consists of two parts: a Header Section and a Body Section (FARRAY).

Header
fixed-length metadata
NAME · KEYS (global attribute dictionary) · PRECISION · BUFS (binary pool)
DESCRIPTION · LICENSE · ATTRIBUTION
Metadata can be updated O(1) via binary slice — no re-encoding required
FARRAY
feature array
FEATURE × N
  └ GEOMETRY: GTYPE · LENGTH · COORDS (delta-compressed)
  └ PROPERTIES: INDEX (KEYS ref) · VALUE (typed)
FieldTagTypeDescription
GTYPE8Varint0:Point 1:MPoint 2:Line 3:MLine 4:Poly 5:MPoly 6:Collection
LENGTH9Packed VarintVertex counts for rings and multi-part geometries
COORDS10Packed SVarintDelta-compressed coordinate stream X₀,Y₀,ΔX₁,ΔY₁,... (see below)
INDEX12Packed VarintIndices into the global KEYS dictionary
VALUE11Repeated MessageNULL/BOOL/INTEGER/FLOAT/STRING/DATE/COLOR/JSON/BLOB/IMAGE
O(1) Header Updates

Metadata such as NAME, DESCRIPTION, and LICENSE is stored independently in the header, allowing direct binary-slice writes without touching the feature body. Renaming a dataset or updating license information on a multi-gigabyte file completes instantly, with no re-encoding required.

3. Delta Encoding × Varint

Storing coordinates as floating-point text like GeoJSON means the longitude 139.7412345 alone costs 11 bytes. GeoPBF reduces this dramatically with two stages of compression.

① Raw float
139.74000
35.68000
139.74123
35.68123
139.74246
② Delta
139.74000
35.68000
+0.00123
+0.00123
+0.00123
③ × 10⁶ (int)
139740000
35680000
+1234
+1234
+1234
④ Varint bytes
4 bytes
4 bytes
2 bytes
2 bytes
2 bytes

The coordinate difference between adjacent vertices is typically just a few hundred metres, so after integer scaling the delta values are small. Varint encodes small integers in fewer bytes, giving GeoPBF a property that is ideal for geographic data: denser vertex clusters compress more aggressively.

Comparison with GeoJSON

A GeoJSON "coordinates":[139.741234,35.681234] pair is roughly 30 bytes (text). GeoPBF's delta-Varint representation costs 8 bytes for the first vertex and 2–3 bytes for every subsequent one. Real-world coastline data typically sees a 60–80% size reduction. Layering gzip via CompressionStream adds another 30–50%.

4. GIS Conversion Hub

GeoPBF is not merely a storage format — it acts as a conversion hub among major GIS formats, with decoding (import) and encoding (export) paths for all of them. Placing GeoPBF at the center enables any-to-any format conversion in a single step.

FormatReadWriteNotes
GeoJSONFile / Blob / Object all accepted
Shapefile✓ (.zip)✓ (.zip)Multi-file bundle handled as a single .zip
KMZ / KMLGoogle Earth compatible
GMLOGC standard
GPXGPS tracks & waypoints
FlatGeoBufStreaming-friendly binary
TopoJSONTopology-preserving I/O
GeoPBF✓ (native)✓ (native)Native fast-load
Gint✓ (decode)✓ (encode)Morton + VW extension (see §6)
Automatic Format Detection

Whatever you pass to geopbf(input) — a File, Blob, ArrayBuffer, or plain Object — the library auto-detects the format from the file extension, MIME type, or magic bytes and routes to the correct decoder. The caller never has to think about format.

// Any format, the same API
const pbf = await geopbf(file);          // File (Shapefile .zip, GeoJSON, KMZ ...)
const pbf = await geopbf(arrayBuffer);    // ArrayBuffer (fetch response, etc.)
const pbf = await geopbf(geojsonObject);  // GeoJSON object directly

// Export to any format
const geojson  = pbf.geojson;    // → GeoJSON object
const topojson = pbf.topojson;   // → TopoJSON (topology-preserving)
const shp      = pbf.shape();    // → Shapefile (.zip Blob)
const kmz      = pbf.kmz();      // → KMZ Blob
const gml      = pbf.gml();      // → GML string
const fgb      = await pbf.fgb(); // → FlatGeoBuf ArrayBuffer

// Save as native .geopbf (gzip-compressed)
await pbf.save('coastline');    // → downloads coastline.geopbf

5. Antimeridian Handling

The antimeridian (±180° longitude) is a well-known trap in geographic data. A polygon that straddles it — for example a geometry spanning the Bering Sea — is stored as a single ring in most formats, with coordinates jumping from +179° to −179°. Renderers, spatial indexes, and clipping operations all break silently on such data.

GeoPBF treats antimeridian correctness as a format invariant: every geometry stored in GeoPBF is guaranteed to be antimeridian-cut. No straddling polygon can exist inside the format.

Automatic at Encode Time — All Import Paths

setFeature() calls antimeridianFeature(q) on every feature before writing it to the buffer — regardless of the source format. GeoJSON, Shapefile, KMZ, GML, GPX, FlatGeoBuf, TopoJSON — all go through the same path. The caller never needs to pre-process geometries.

5.1 Spherical Great-Circle Intersection

The cut is not a naive ±180° coordinate clamp. antimeridianCut() computes the precise latitude at which each crossing segment intersects the antimeridian using the spherical formula — essential near the poles where meridian convergence is significant. After cutting, each sub-polygon is a valid closed ring with vertices inserted at the exact intersection latitude.

// Crossing detection: sign change + span > 180° means antimeridian crossing (not date line jump)
// p[i][0] * p[i+1][0] < 0  &&  abs(p[i][0] - p[i+1][0]) > 180

// Spherical intersection latitude (great-circle formula)
// x = sin(Δlat/2)·sin(avgLng)·cos(halfDeltaLng) - sin(avgLat)·cos(avgLng)·sin(halfDeltaLng)
// z = cos(lat0)·cos(lat1)·sin(Δlng)
// intersectLat = atan2(x, |z|)   ← geometrically exact on the sphere
Why This Matters for Gint

Morton codes require coordinates in a bounded domain. A geometry that straddles ±180° would produce Morton codes that jump across the entire integer space, breaking binary search and spatial proximity guarantees. By cutting at encode time, all Morton codes in a GeoPBF tile are spatially coherent with no special-casing needed downstream.

6. ArrayBuffer & Worker Architecture

GeoPBF's internal representation is consistently an ArrayBuffer. This is a deliberate design choice for deep integration with the browser's parallel processing layer (Web Workers), yielding three important properties.

Main Thread
geopbf(file)
pbf.arrayBuffer
worker.postMessage(buf, [buf])
— buffer detached after transfer —
Encoder Worker
gint encode
topology analysis
postMessage(result, [result])
zero-copy return
Render Worker
SharedArrayBuffer
concurrent multi-worker reads
LOD filter
Canvas / WebGL draw

6.1 Transferable — Zero-Copy Transfer

// ArrayBuffer moves from main thread → Worker with zero copy
// (ownership transfer, not a copy — no extra memory consumed)
worker.postMessage(pbf.arrayBuffer, [pbf.arrayBuffer]);
// After transfer pbf.arrayBuffer is detached (main thread can no longer access it)

// Worker side
onmessage = ({ data: buf }) => {
	const view = new DataView(buf);  // begin decoding immediately
	postMessage(result, [result]);    // return zero-copy
};

6.2 SharedArrayBuffer — Concurrent Multi-Worker Reads

// Expand decoded gint coordinate data into a SharedArrayBuffer
// → multiple render Workers can read concurrently with no copies
const sab = new SharedArrayBuffer(buf.byteLength);
new Uint8Array(sab).set(new Uint8Array(buf));

// N tile workers run LOD filter + draw in parallel
tileWorkers.forEach(w => w.postMessage({ sab, zoom, viewport }));

6.3 CompressionStream — Native gzip

// On save: browser-native gzip (no external library needed)
let blob = new Blob([buf]);
blob = await (new Response(
	blob.stream().pipeThrough(new CompressionStream("gzip"))
)).blob();
postMessage(new File([blob], `${name}.geopbf`, { type: "application/x-geopbf" }));
Why Workers Fit So Well

GeoJSON is an object graph (JSON). Before a Worker can touch it you need JSON.stringify → transfer → JSON.parse — a full serialize/deserialize cycle that costs hundreds of milliseconds on large datasets. GeoPBF is an ArrayBuffer from the very start, so it moves between Workers via zero-copy transfer, and the receiver decodes by advancing a pointer. The main thread is never blocked.

7. Gint Extension — 64-bit Coordinate Packing

Applying gint encoding to GeoPBF packs each vertex's coordinates into a single 64-bit integer that simultaneously stores both the Morton code and the VW weight.

L1 node
(anchor)
1
Morton code (63 bits) — fixed precision 10⁻⁷ deg
bit 63 = 1 → L1 (anchor vertex, always retained)
L2 node
(detail)
0
Morton code (58 bits)
VW rank
(6 bits)
bit 63 = 0 → L2 (detail vertex, LOD-filtered)  ·  bits 0-5 = VW weight rank (0–63)

Packing where (Morton) and which LOD (VW rank) into a single 64-bit integer means both spatial queries and LOD filtering reduce to simple integer arithmetic.

// L1 / L2 check (most-significant bit)
const isAnchor = (code >> 63n) === 1n;

// Extract Morton code (L2: upper 58 bits)
const mortonCode = code & 0x7FFFFFFFFFFFFFF8n;  // bits 3-62

// Extract VW weight rank (L2: lower 6 bits)
const vwRank = Number(code & 0x3Fn);  // bits 0-5 → 0–63

// LOD filter at zoom z (rank threshold corresponding to 1px²)
const minRank = rankFromZoom(z);  // lower threshold at higher zoom (more vertices)
const visible = isAnchor || vwRank >= minRank;

8. Comparison with Other Formats

Metric GeoJSON Shapefile MVT (tiles) GeoPBF
Encoding Text UTF-8 Binary (multi-file) PBF binary PBF + delta + Varint
File count 1 .shp + .dbf + .shx + … zoom × tile count 1
Transfer efficiency Low (large text) Medium Medium (zoom-split) High (binary + gzip)
Worker transfer JSON serialize required Conversion required ArrayBuffer ArrayBuffer zero-copy
LOD support None None Per-zoom files VW weight built-in
Spatial index None .shx (linear) Tile coords Morton 64-bit
Format conversion External library needed External library needed Limited 9 formats built-in
Topology None None None Arc sharing · boundary integrity

9. API Reference

// ── Load ───────────────────────────────────────────────
const pbf = await geopbf(input, options);
// input:   File | Blob | ArrayBuffer | GeoJSON object | URL string
// options: { name, gint, nocache, clean }
//   gint:    false → skip topology encode (gint conversion)
//   nocache: true  → bypass IndexedDB cache and re-fetch
//   clean:   true  → run topology integrity clean pass

// ── Data access ────────────────────────────────────────
pbf.length          // feature count
pbf.keys            // attribute name array ['name', 'rank', ...]
pbf.arrayBuffer     // raw ArrayBuffer (for Worker transfer)
pbf.geojson         // convert to GeoJSON FeatureCollection
pbf.topojson        // convert to TopoJSON (topology-preserving)
pbf.getFeature(i)   // decode only the i-th feature

// ── Export ─────────────────────────────────────────────
await pbf.save('name')      // download as .geopbf (gzip)
pbf.shape()               // Shapefile .zip
pbf.kmz()                 // KMZ
pbf.gml()                 // GML string
pbf.gpx()                 // GPX string
await pbf.fgb()           // FlatGeoBuf ArrayBuffer

// ── O(1) metadata update ───────────────────────────────
pbf.header({ name: 'New Name', description: '...', license: 'MIT' });

// ── Render ─────────────────────────────────────────────
pbf.draw(canvasCtx, { fill: '#5a8050', stroke: '#a8d47e' });

geopbf v1.0.0 · Kenji Yoshida · MIT License · 2026