mirror of
https://github.com/protomaps/PMTiles.git
synced 2026-02-04 02:41:09 +00:00
spec/ directory with v2 and v3 specs [#62,#41,#4]
40
README.md
@@ -44,46 +44,6 @@ See https://github.com/protomaps/PMTiles/tree/master/python/bin for library usag
## Specification

PMTiles is a binary serialization format designed for two main access patterns: over the network, via HTTP 1.1 Byte Serving (`Range:` requests), or via memory-mapped files on disk. **All integer values are little-endian.**

A PMTiles archive is composed of:
* A fixed-size, 512,000-byte header section
* Followed by any number of tiles in arbitrary format
* Optionally followed by any number of *leaf directories*

### Header
* The header begins with a 2-byte magic number, "PM"
* Followed by 2 bytes, the PMTiles specification version (currently 2).
* Followed by 4 bytes, the length of metadata (M bytes)
* Followed by 2 bytes, the number of entries in the *root directory* (N entries)
* Followed by M bytes of metadata, which **must be a JSON string with bounds, minzoom and maxzoom properties (new in v2)**
* Followed by N * 17 bytes, the root directory.

### Directory structure

A directory is a contiguous sequence of 17-byte entries. A directory can have at most 21,845 entries. **A directory must be sorted by Z, X and then Y order (new in v2).**

An entry consists of:
* 1 byte: the zoom level (Z) of the entry, with the top bit set to 1 instead of 0 to indicate that the offset/length points to a leaf directory and not a tile.
* 3 bytes: the X (column) of the entry.
* 3 bytes: the Y (row) of the entry.
* 6 bytes: the offset at which the tile begins in the archive.
* 4 bytes: the length of the tile, in bytes.

**All leaf directory entries follow non-leaf entries. All leaf directory entries in a single directory must have the same Z value. (new in v2).**

### Notes
* A full directory of 21,845 entries holds exactly a complete pyramid with 8 levels, or 1+4+16+64+256+1024+4096+16384.
* A PMTiles archive with fewer than 21,845 tiles should have a root directory and no leaf directories.
* Multiple tile entries can point to the same offset; this is useful for de-duplicating certain tiles, such as an empty "ocean" tile.
* Analogously, multiple leaf directory entries can point to the same offset; this can avoid inefficiently-packed small leaf directories.
* The tentative media type for PMTiles archives is `application/vnd.pmtiles`.

### Implementation suggestions
* PMTiles is designed to make implementing a writer simple. Reserve 512,000 bytes, then write all tiles, recording their entry information; then write all leaf directories; finally, rewind to 0 and write the header.
* The order of tile data in the archive is unspecified; an optimized implementation should arrange tiles on a 2D space-filling curve.
* PMTiles readers should cache directory entries by byte offset, not by Z/X/Y. This means that deduplicated leaf directories result in cache hits.

## Recipes
42
spec/v2/spec.md
Normal file
@@ -0,0 +1,42 @@
# PMTiles version 2

*Note: this is deprecated in favor of spec version 3.*

PMTiles is a binary serialization format designed for two main access patterns: over the network, via HTTP 1.1 Byte Serving (`Range:` requests), or via memory-mapped files on disk. **All integer values are little-endian.**

A PMTiles archive is composed of:
* A fixed-size, 512,000-byte header section
* Followed by any number of tiles in arbitrary format
* Optionally followed by any number of *leaf directories*

### Header
* The header begins with a 2-byte magic number, "PM"
* Followed by 2 bytes, the PMTiles specification version (currently 2).
* Followed by 4 bytes, the length of metadata (M bytes)
* Followed by 2 bytes, the number of entries in the *root directory* (N entries)
* Followed by M bytes of metadata, which **must be a JSON string with bounds, minzoom and maxzoom properties (new in v2)**
* Followed by N * 17 bytes, the root directory.
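
The header layout listed above can be read with a few lines of Python. This is a minimal sketch rather than the reference reader: `parse_v2_header` and the names of the returned keys are hypothetical, and the metadata is assumed to be UTF-8 encoded JSON.

```python
import json
import struct

# Fixed-width fields at the start of a v2 archive, little-endian, no padding:
# 2-byte magic, 2-byte version, 4-byte metadata length, 2-byte root entry count.
V2_PREFIX = struct.Struct("<2sHIH")
ENTRY_SIZE = 17


def parse_v2_header(buf: bytes) -> dict:
    """Parse the fixed fields and JSON metadata of a v2 header (hypothetical helper)."""
    magic, version, metadata_len, root_entries = V2_PREFIX.unpack_from(buf, 0)
    if magic != b"PM":
        raise ValueError("missing 'PM' magic number")
    if version != 2:
        raise ValueError(f"unsupported spec version {version}")
    metadata_start = V2_PREFIX.size  # the four fixed fields take 10 bytes
    metadata_end = metadata_start + metadata_len
    # Assumption: the JSON metadata is UTF-8 encoded.
    metadata = json.loads(buf[metadata_start:metadata_end].decode("utf-8"))
    return {
        "version": version,
        "metadata": metadata,
        "root_entries": root_entries,
        # The root directory starts immediately after the metadata.
        "root_offset": metadata_end,
        "root_length": root_entries * ENTRY_SIZE,
    }
```

The returned `root_offset` and `root_length` tell a reader which slice of the header section holds the root directory.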

### Directory structure

A directory is a contiguous sequence of 17-byte entries. A directory can have at most 21,845 entries. **A directory must be sorted by Z, X and then Y order (new in v2).**

An entry consists of:
* 1 byte: the zoom level (Z) of the entry, with the top bit set to 1 instead of 0 to indicate that the offset/length points to a leaf directory and not a tile.
* 3 bytes: the X (column) of the entry.
* 3 bytes: the Y (row) of the entry.
* 6 bytes: the offset at which the tile begins in the archive.
* 4 bytes: the length of the tile, in bytes.

**All leaf directory entries follow non-leaf entries. All leaf directory entries in a single directory must have the same Z value. (new in v2).**
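
As an illustration of the entry layout above, the following sketch decodes one 17-byte entry. The `V2Entry` type and `parse_v2_entry` function are hypothetical names introduced for this example; only the field widths and the leaf flag come from the text.

```python
from typing import NamedTuple

ENTRY_SIZE = 17


class V2Entry(NamedTuple):
    z: int
    x: int
    y: int
    offset: int
    length: int
    is_leaf: bool


def parse_v2_entry(buf: bytes, pos: int = 0) -> V2Entry:
    """Decode the 17-byte entry that starts at byte position `pos`."""
    raw = buf[pos:pos + ENTRY_SIZE]
    if len(raw) != ENTRY_SIZE:
        raise ValueError("truncated directory entry")
    z_byte = raw[0]
    return V2Entry(
        z=z_byte & 0x7F,                              # low 7 bits: zoom level
        x=int.from_bytes(raw[1:4], "little"),         # 3 bytes: column
        y=int.from_bytes(raw[4:7], "little"),         # 3 bytes: row
        offset=int.from_bytes(raw[7:13], "little"),   # 6 bytes: archive offset
        length=int.from_bytes(raw[13:17], "little"),  # 4 bytes: length in bytes
        is_leaf=bool(z_byte & 0x80),                  # top bit: leaf directory
    )
```

A whole directory can then be decoded by calling `parse_v2_entry(buf, i * ENTRY_SIZE)` for each entry index `i`.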

### Notes
* A full directory of 21,845 entries holds exactly a complete pyramid with 8 levels, or 1+4+16+64+256+1024+4096+16384.
* A PMTiles archive with fewer than 21,845 tiles should have a root directory and no leaf directories.
* Multiple tile entries can point to the same offset; this is useful for de-duplicating certain tiles, such as an empty "ocean" tile.
* Analogously, multiple leaf directory entries can point to the same offset; this can avoid inefficiently-packed small leaf directories.
* The tentative media type for PMTiles archives is `application/vnd.pmtiles`.
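
The pyramid arithmetic in the first note can be checked directly. The sketch below also works out how much of the 512,000-byte header section a full root directory would occupy; the "metadata budget" is a derived figure that assumes the whole header section is shared only by the fixed fields, the metadata and the root directory.

```python
ENTRY_SIZE = 17
HEADER_SIZE = 512_000
FIXED_FIELDS = 2 + 2 + 4 + 2  # magic, version, metadata length, root entry count

# Tiles in a complete pyramid covering zoom levels 0 through 7.
full_pyramid = sum(4 ** z for z in range(8))

# Bytes taken by a full root directory, and what would remain for JSON metadata.
root_bytes = full_pyramid * ENTRY_SIZE
metadata_budget = HEADER_SIZE - FIXED_FIELDS - root_bytes

print(full_pyramid, root_bytes, metadata_budget)  # 21845 371365 140625
```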

### Implementation suggestions
* PMTiles is designed to make implementing a writer simple. Reserve 512,000 bytes, then write all tiles, recording their entry information; then write all leaf directories; finally, rewind to 0 and write the header.
* The order of tile data in the archive is unspecified; an optimized implementation should arrange tiles on a 2D space-filling curve.
* PMTiles readers should cache directory entries by byte offset, not by Z/X/Y. This means that deduplicated leaf directories result in cache hits.
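
The writer procedure suggested above might look roughly like the following for a small archive. This is a sketch under stated assumptions, not the project's writer: `write_v2_archive` is a hypothetical function, it handles only archives small enough to need no leaf directories, it writes tiles in Z/X/Y order rather than along a space-filling curve, and it deduplicates tiles by comparing their bytes.

```python
import io
import json
from typing import BinaryIO, Dict, Tuple

HEADER_SIZE = 512_000
MAX_ENTRIES = 21_845


def write_v2_archive(out: BinaryIO,
                     tiles: Dict[Tuple[int, int, int], bytes],
                     metadata: dict) -> None:
    """Write a v2 archive with a root directory only (hypothetical helper)."""
    if len(tiles) > MAX_ENTRIES:
        raise ValueError("leaf directories are not handled in this sketch")
    out.write(b"\x00" * HEADER_SIZE)  # reserve the header section
    entries = []
    seen = {}  # tile bytes -> offset, so identical tiles are stored once
    for (z, x, y), data in sorted(tiles.items()):  # Z, X, Y order
        if data not in seen:
            seen[data] = out.tell()
            out.write(data)
        entries.append((z, x, y, seen[data], len(data)))
    meta = json.dumps(metadata).encode("utf-8")
    header = bytearray(b"PM")
    header += (2).to_bytes(2, "little")
    header += len(meta).to_bytes(4, "little")
    header += len(entries).to_bytes(2, "little")
    header += meta
    for z, x, y, offset, length in entries:
        header += bytes([z])
        header += x.to_bytes(3, "little") + y.to_bytes(3, "little")
        header += offset.to_bytes(6, "little") + length.to_bytes(4, "little")
    if len(header) > HEADER_SIZE:
        raise ValueError("metadata and root directory exceed the header section")
    out.seek(0)  # rewind and fill in the header
    out.write(header)
```

Calling it with an `io.BytesIO()` buffer is enough to try it out; with a real file, open it in `"wb"` mode so that `seek(0)` is permitted.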
91
spec/v3/spec.md
Normal file
@@ -0,0 +1,91 @@
# PMTiles version 3

## File structure

A PMTiles archive is a single-file archive of square tiles with five main sections:

1. A fixed-size, 127-byte **Header** that starts with the magic number `PMTiles`, followed by the spec version (currently `3`), and contains the offsets of the sections that follow.
2. A root **Directory**, described below. The Header and root Directory combined must be less than 16,384 bytes.
3. JSON metadata.
4. Optionally, a section of **Leaf Directories**, encoded the same way as the root.
5. The tile data.

## Entries

A Directory is a list of `Entries`, in ascending order by `TileId`:

    Entry = (TileId uint64, Offset uint64, Length uint32, RunLength uint32)

* `TileId` starts at 0 and corresponds to a cumulative position on the series of square Hilbert curves starting at z=0.
* `Offset` is the position of the tile in the file relative to the start of the data section.
* `Length` is the size of the tile in bytes.
* `RunLength` is how many times this tile is repeated: for example, `TileId=5, RunLength=2` means that the tile is present at IDs 5 and 6.
* If `RunLength=0`, the offset/length points to a Leaf Directory whose first entry has this `TileId`.
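
To make the "cumulative position" wording concrete, here is a sketch that maps a Z/X/Y coordinate to a `TileId`. It assumes the conventional Hilbert curve construction (the widely published `xy2d` routine) with the curve at each zoom level starting at x=0, y=0; the text above does not fix the orientation, so treat the exact numbering as an assumption. `zxy_to_tile_id` is a hypothetical name.

```python
from typing import Tuple


def _rotate(n: int, x: int, y: int, rx: int, ry: int) -> Tuple[int, int]:
    """Rotate/flip a quadrant so the sub-curve is oriented consistently."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def zxy_to_tile_id(z: int, x: int, y: int) -> int:
    """Cumulative Hilbert position of tile (z, x, y), counting from z=0."""
    n = 1 << z  # tiles per side at this zoom level
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("tile x/y outside zoom range")
    # Tiles on all lower zoom levels: 1 + 4 + ... + 4^(z-1).
    base = (4 ** z - 1) // 3
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return base + d
```

Under these assumptions the single z=0 tile gets ID 0, the four z=1 tiles get IDs 1 to 4, and z=2 starts at ID 5.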

## Directory Serialization

Entries are stored in memory as integers, but serialized to disk using these compression steps:
1. A little-endian varint indicating the number of entries.
2. Delta encoding of `TileId`
3. Zeroing of `Offset`:
    * `0` if it is equal to the `Offset` + `Length` of the previous entry
    * `Offset+1` otherwise
4. Varint encoding of all numbers
5. Columnar ordering: all `TileId`s, all `RunLength`s, all `Length`s, then all `Offset`s
6. Finally, general purpose compression as described by the `Header`'s `InternalCompression` field.
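
Steps 1 to 5 above can be sketched as a pair of functions. This is an illustrative round trip, not the reference implementation: it assumes "little-endian varint" means the unsigned base-128 varint used by Protocol Buffers, the `Entry` tuple and function names are hypothetical, and the final general-purpose compression (step 6) is left to the caller.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    tile_id: int
    offset: int
    length: int
    run_length: int


def write_varint(out: bytearray, value: int) -> None:
    """Append an unsigned base-128 varint, least significant group first."""
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next position) for the varint starting at `pos`."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


def serialize_directory(entries: List[Entry]) -> bytes:
    out = bytearray()
    write_varint(out, len(entries))              # step 1: entry count
    last_id = 0
    for e in entries:                            # step 2: delta-encoded TileIds
        write_varint(out, e.tile_id - last_id)
        last_id = e.tile_id
    for e in entries:                            # step 5: columnar RunLengths
        write_varint(out, e.run_length)
    for e in entries:                            # then all Lengths
        write_varint(out, e.length)
    for i, e in enumerate(entries):              # step 3: zeroed Offsets
        prev = entries[i - 1] if i > 0 else None
        if prev is not None and e.offset == prev.offset + prev.length:
            write_varint(out, 0)
        else:
            write_varint(out, e.offset + 1)
    return bytes(out)


def deserialize_directory(buf: bytes) -> List[Entry]:
    count, pos = read_varint(buf, 0)
    columns = []
    for _ in range(4):                           # four columns of `count` varints
        column = []
        for _ in range(count):
            value, pos = read_varint(buf, pos)
            column.append(value)
        columns.append(column)
    deltas, run_lengths, lengths, raw_offsets = columns
    entries, tile_id = [], 0
    for i in range(count):
        tile_id += deltas[i]
        if raw_offsets[i] == 0 and i > 0:        # contiguous with previous entry
            offset = entries[i - 1].offset + entries[i - 1].length
        else:
            offset = raw_offsets[i] - 1
        entries.append(Entry(tile_id, offset, lengths[i], run_lengths[i]))
    return entries
```

Storing `0` for contiguous offsets and small deltas for `TileId` keeps most varints to a single byte, which is what makes the later general-purpose compression effective.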

## Directory Hierarchy
* The number of entries in the root directory and leaf directories is up to the implementation.
* However, the compressed size of the header plus root directory is required in v3 to be under **16,384 bytes**. This is to allow latency-optimized clients to prefetch the root directory and guarantee it is complete. A sophisticated writer might need several attempts to optimize this.
* Root size, leaf sizes and depth should be configurable by the user to optimize for different trade-offs: cost, bandwidth, latency.
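
The prefetch described above amounts to a single HTTP range request for the first 16,384 bytes. A minimal sketch using the standard library follows; `range_header` and `prefetch_request` are hypothetical helpers, and the request object is only built here, not sent.

```python
import urllib.request

ROOT_PREFETCH_BYTES = 16_384


def range_header(offset: int, length: int) -> str:
    """Format a `Range` header value; HTTP byte ranges are inclusive."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


def prefetch_request(url: str) -> urllib.request.Request:
    """Build a request covering the header and the complete root directory."""
    return urllib.request.Request(
        url, headers={"Range": range_header(0, ROOT_PREFETCH_BYTES)}
    )
```

Passing the result to `urllib.request.urlopen` would perform the fetch; a server that honors the range answers with status 206 and at most 16,384 bytes.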

## Header Design

*Certain fields belonging to metadata in v2 are promoted to fixed-size header fields. This allows a map container to be initialized to the desired extent or center without blocking on the JSON metadata.*

The `Header` is 127 bytes, with little-endian integer values:

| offset (bytes) | description | width (bytes) |
| --- | --- | --- |
| 0 | magic number `PMTiles` | 7 |
| 7 | spec version, currently `3` | 1 |
| 8 | offset of root directory | 8 |
| 16 | length of root directory | 8 |
| 24 | offset of JSON metadata, possibly compressed by `InternalCompression` | 8 |
| 32 | length of JSON metadata | 8 |
| 40 | offset of leaf directories | 8 |
| 48 | length of leaf directories | 8 |
| 56 | offset of tile data | 8 |
| 64 | length of tile data | 8 |
| 72 | # of addressed tiles, 0 if unknown | 8 |
| 80 | # of tile entries, 0 if unknown | 8 |
| 88 | # of tile contents, 0 if unknown | 8 |
| 96 | boolean clustered flag | 1 |
| 97 | internal compression enum (0 = Unknown, 1 = None, 2 = Gzip, 3 = Brotli, 4 = Zstd) | 1 |
| 98 | tile compression enum | 1 |
| 99 | tile type enum (0 = Unknown/Other, 1 = MVT (PBF Vector Tile), 2 = PNG, 3 = JPEG, 4 = WEBP) | 1 |
| 100 | min zoom | 1 |
| 101 | max zoom | 1 |
| 102 | min longitude (IEEE 754 float) | 4 |
| 106 | min latitude | 4 |
| 110 | max longitude | 4 |
| 114 | max latitude | 4 |
| 118 | center zoom | 1 |
| 119 | center longitude | 4 |
| 123 | center latitude | 4 |
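
The table above maps onto a single `struct` format string. The sketch below follows the table as written in this document, including 4-byte IEEE 754 floats for the coordinates; the `HeaderV3` type, its field names and `parse_header` are hypothetical.

```python
import struct
from typing import NamedTuple

# "<" = little-endian, no padding: 7-byte magic, version byte, eleven uint64
# fields, six single-byte fields, four floats, center zoom byte, two floats.
HEADER = struct.Struct("<7sB11Q6B4fB2f")


class HeaderV3(NamedTuple):
    spec_version: int
    root_offset: int
    root_length: int
    metadata_offset: int
    metadata_length: int
    leaf_offset: int
    leaf_length: int
    data_offset: int
    data_length: int
    addressed_tiles: int
    tile_entries: int
    tile_contents: int
    clustered: bool
    internal_compression: int
    tile_compression: int
    tile_type: int
    min_zoom: int
    max_zoom: int
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    center_zoom: int
    center_lon: float
    center_lat: float


def parse_header(buf: bytes) -> HeaderV3:
    """Decode the fixed 127-byte header at the start of `buf`."""
    if len(buf) < HEADER.size:
        raise ValueError("need at least 127 bytes")
    magic, *fields = HEADER.unpack_from(buf, 0)
    if magic != b"PMTiles":
        raise ValueError("missing 'PMTiles' magic number")
    header = HeaderV3(*fields)
    if header.spec_version != 3:
        raise ValueError(f"unsupported spec version {header.spec_version}")
    # The clustered flag is stored as one byte; expose it as a bool.
    return header._replace(clustered=bool(header.clustered))
```

`HEADER.size` evaluates to 127, which is a quick check that the format string and the table agree.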

### Notes

* **# of addressed tiles**: the total number of tiles before run-length encoding, i.e. `Sum(RunLength)` over all entries.
* **# of tile entries**: the total number of entries across all directories where `RunLength > 0`.
* **# of tile contents**: the number of referenced blobs in the tile section, or the unique # of offsets. If the archive is completely deduplicated, this is equal to the # of unique tile contents. If there is no deduplication, this is equal to the number of tile entries above.
* **boolean clustered flag**: if `True`, blobs in the data section are generally ordered by Hilbert TileID. More concretely, this means that when traversing all entries in TileID order, the offsets are either contiguous with the immediately previous entry, or refer to a lesser offset (a deduplicated tile).
* **compression enum**: Mandatory. Tells the client how to decompress contents and which `Content-Encoding` header to provide to browsers.
* **tile type**: A hint as to the tile contents. Clients and proxies may use this to:
    * Automatically determine a visualization method
    * Provide a conventional MIME type in the HTTP `Content-Type` header
    * Enforce a canonical file path extension, e.g. `.mvt`, `.png`, `.jpeg`, `.webp`
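
The three counters defined in the notes above can be derived from the directory entries. The sketch below assumes the caller has already collected the entries from the root and all leaf directories into one list; `Entry` and `tile_statistics` are hypothetical names.

```python
from typing import Iterable, NamedTuple


class Entry(NamedTuple):
    tile_id: int
    offset: int
    length: int
    run_length: int


def tile_statistics(entries: Iterable[Entry]) -> dict:
    """Compute the header counters from entries gathered across all directories."""
    # Entries with RunLength == 0 point at leaf directories, not tiles.
    tiles = [e for e in entries if e.run_length > 0]
    return {
        "addressed_tiles": sum(e.run_length for e in tiles),
        "tile_entries": len(tiles),
        "tile_contents": len({e.offset for e in tiles}),
    }
```

For example, two entries that share an offset (a deduplicated tile) count twice toward `tile_entries` but only once toward `tile_contents`.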

### Organization

In most cases, the archive should be in the order `Header`, Root Directory, JSON Metadata, Leaf Directories, Tile Data. It is possible to relocate sections other than `Header` arbitrarily, but no current writers/readers take advantage of this.