This repository has been archived on 2024-07-04. You can view files and clone it, but cannot push or open issues or pull requests.
# Overview
This project contains useful info and tools for processing OSM maps into the mapsforge mapwriter file format (.map). It was developed to help generate offline map files for Locus on Android.
# Dependencies / Prereqs
To run this program you'll need the following:
* A Linux environment
* Python 3.x installed
* Java 1.8 or higher installed (OpenJDK works)
* wget installed
* gdal-bin
* python3-gdal (Ubuntu) installed
# Installation
Head over to the releases tab and download the latest release. Extract the archive and you're good to go. Everything is self-contained apart from the dependencies listed above.
# Running
To run the program, `cd` to the directory where you extracted the release and run `./process_maps.py` with at least one `--map-list /path/to/list.txt` parameter. The usage text also lists `--maps-dir` as required. See the `lists` directory for examples of how to format the map list files.
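The help text describes a map list as a text file with one map URL per line. As a rough illustration of that format, here is a minimal Python sketch that reads such a file. `read_map_list` is a hypothetical helper and not a function from this repo, and skipping blank lines is an assumption rather than confirmed behaviour of `process_maps.py`.

```python
from pathlib import Path


def read_map_list(path):
    """Return the map URLs from a list file, one URL per line.

    Hypothetical helper for illustration only. Blank lines are skipped,
    which is an assumption and not confirmed behaviour of process_maps.py.
    """
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:
            urls.append(line)
    return urls
```

Calling `read_map_list("lists/idaho.txt")` would return the URLs in that file as a list of strings, assuming the file follows the one-URL-per-line layout.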
### Program Usage
```
usage: process_maps.py [-h] --map-list MAP_LIST [--use-ram]
                       [--max-heap-space MAX_HEAP_SPACE]
                       [--output-map-name OUTPUT_MAP_NAME] --maps-dir MAPS_DIR
                       [--no-map-download] [--tag-mapping TAG_MAPPING]
                       [--tag-transform TAG_TRANSFORM]

optional arguments:
  -h, --help            show this help message and exit
  --map-list MAP_LIST   a text file with one map URL per line, can be
                        specified more than once
  --use-ram             use RAM for mapsforge processing -- WARNING mapsforge
                        uses 10x the map size in RAM for processing (ie. 100Mb
                        map = 1Gb RAM usage), you want a LOT of RAM for this
                        option
  --max-heap-space MAX_HEAP_SPACE
                        set the max heap space for the JVM, use standard -Xmx
                        values, default (1g) should be fine if not using
                        --use-ram argument
  --output-map-name OUTPUT_MAP_NAME
                        set the output .map and .poi file names
  --maps-dir MAPS_DIR   Where downloaded maps will be stored/read from
  --no-map-download     Do NOT download maps, re-use maps from maps-dir
  --tag-mapping TAG_MAPPING
                        Specify a custom tag mapping xml file for use with
                        mapsforge processing
  --tag-transform TAG_TRANSFORM
                        Specify a tag transform file for use PRIOR to
                        mapsforge processing (this is DIFFERENT than the
                        mapsforge mapping xml file)
```
### Examples
```
./process_maps.py --tag-transform ./openandromaps/tag-transform.xml --tag-mapping ./openandromaps/tag-mapping.min.xml --use-ram --max-heap-space 12g --maps-dir cache --no-map-download --output-map-name custom --map-list lists/custom.txt
./process_maps.py --tag-transform ./openandromaps/tag-transform.xml --tag-mapping ./openandromaps/tag-mapping.min.xml --use-ram --max-heap-space 16g --maps-dir cache --no-map-download --output-map-name pennsylvania --map-list lists/pennsylvania.txt
./process_maps.py --tag-transform ./openandromaps/tag-transform.xml --tag-mapping ./openandromaps/tag-mapping.min.xml --use-ram --max-heap-space 12g --maps-dir cache --no-map-download --output-map-name idaho --map-list lists/idaho.txt
./process_maps.py --tag-transform ./openandromaps/tag-transform.xml --tag-mapping ./openandromaps/tag-mapping.min.xml --use-ram --max-heap-space 24g --maps-dir cache --no-map-download --output-map-name australia --map-list lists/australia.txt
./process_maps.py --use-ram --max-heap-space 24g --maps-dir cache --no-map-download --output-map-name canada_ontario --map-list lists/ontario.txt
./process_maps.py --tag-transform ./openandromaps/tag-transform.xml --tag-mapping ./openandromaps/tag-mapping.min.xml --max-heap-space 8g --maps-dir cache --no-map-download --output-map-name midwest-2 --map-list lists/midwest.txt --map-list lists/ontario.txt
./process_maps.py --max-heap-space 8g --maps-dir cache --no-map-download --output-map-name usa_northeast --map-list lists/usa_northeast.txt
./process_maps.py --max-heap-space 8g --maps-dir cache --no-map-download --output-map-name usa_pacific --map-list lists/usa_pacific.txt
./process_maps.py --max-heap-space 8g --maps-dir cache --no-map-download --output-map-name usa_south --map-list lists/usa_south.txt
./process_maps.py --use-ram --max-heap-space 16g --maps-dir cache --no-map-download --output-map-name mexico --map-list lists/mexico.txt
./process_maps.py --use-ram --max-heap-space 24g --maps-dir cache --no-map-download --output-map-name mexico_central_america --map-list lists/mexico.txt --map-list lists/central_america.txt
./process_maps.py --use-ram --max-heap-space 12g --maps-dir cache --no-map-download --output-map-name central_america --map-list lists/central_america.txt
./process_maps.py --max-heap-space 24g --maps-dir cache --no-map-download --output-map-name south_america --map-list lists/south_america.txt
./process_maps.py --max-heap-space 8g --maps-dir cache --no-map-download --output-map-name netherlands --map-list lists/netherlands.txt
./process_maps.py --use-ram --max-heap-space 24g --maps-dir cache --no-map-download --output-map-name spain --map-list lists/spain.txt
./process_maps.py --max-heap-space 8g --maps-dir cache --no-map-download --output-map-name france --map-list lists/france.txt
./process_maps.py --use-ram --max-heap-space 16g --maps-dir cache --no-map-download --output-map-name belgium --map-list lists/belgium.txt
./process_maps.py --use-ram --max-heap-space 12g --maps-dir cache --no-map-download --output-map-name luxembourg --map-list lists/luxembourg.txt
./process_maps.py --use-ram --max-heap-space 12g --maps-dir cache --no-map-download --output-map-name portugal --map-list lists/portugal.txt
```
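The examples above differ only in a handful of flags, so they can be generated rather than typed out. The following Python sketch builds the argument list for a single run using only the flags shown in the usage text. `build_process_cmd` and its defaults are hypothetical and not part of this repo.

```python
import shlex


def build_process_cmd(output_name, map_lists, max_heap="8g", use_ram=False,
                      maps_dir="cache", tag_transform=None, tag_mapping=None):
    """Build the argv list for one process_maps.py run (illustrative only)."""
    cmd = ["./process_maps.py"]
    if tag_transform:
        cmd += ["--tag-transform", tag_transform]
    if tag_mapping:
        cmd += ["--tag-mapping", tag_mapping]
    if use_ram:
        cmd.append("--use-ram")
    cmd += ["--max-heap-space", max_heap,
            "--maps-dir", maps_dir,
            "--no-map-download",
            "--output-map-name", output_name]
    # --map-list may be given more than once, per the usage text.
    for map_list in map_lists:
        cmd += ["--map-list", map_list]
    return cmd


# Print a shell-ready command line for a two-list run.
print(shlex.join(build_process_cmd(
    "mexico_central_america",
    ["lists/mexico.txt", "lists/central_america.txt"],
    max_heap="24g", use_ram=True)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues if an output name or path ever contains spaces.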
# Caching maps for re-use / re-processing
The repo includes a `download_maps.py` script to help with downloading maps on a schedule or for caching. It uses the same logic as the main script but only downloads.
### Program usage
```
usage: download_maps.py [-h] --map-list MAP_LIST --map-dir MAP_DIR

optional arguments:
  -h, --help           show this help message and exit
  --map-list MAP_LIST  a text file with one map URL per line, can be specified
                       more than once
  --map-dir MAP_DIR    The directory where maps should be downloaded
```
### Example Usage
```
./download_maps.py --map-dir ./cache \
--map-list lists/central_america.txt
./download_maps.py --map-dir ./cache \
--map-list lists/midwest.txt
./download_maps.py --map-dir ./cache \
--map-list lists/south_america.txt
```
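For scheduled downloads, one option is a small wrapper that refreshes every list in turn. The sketch below is one possible approach, not part of this repo: `build_download_cmds` and `download_all` are hypothetical names, and it assumes every `*.txt` file in the `lists` directory is a map list.

```python
import subprocess
from pathlib import Path


def build_download_cmds(lists_dir="lists", map_dir="./cache"):
    """Return one download_maps.py argv list per *.txt file in lists_dir.

    Assumes every *.txt file in lists_dir is a map list (not verified).
    """
    cmds = []
    for list_file in sorted(Path(lists_dir).glob("*.txt")):
        cmds.append(["./download_maps.py",
                     "--map-dir", map_dir,
                     "--map-list", str(list_file)])
    return cmds


def download_all(lists_dir="lists", map_dir="./cache"):
    """Run each download in turn; check=True stops at the first failure."""
    for cmd in build_download_cmds(lists_dir, map_dir):
        subprocess.run(cmd, check=True)
```

A script that calls `download_all()` from the release directory could then be run by cron or a systemd timer to keep the cache fresh before re-processing with `--no-map-download`.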
# Performance
This section deserves expansion, but for now here are notes on improving performance on 'small' systems that may not have more than 64 GB of RAM available for larger maps.
* in disk mode, putting the source, tmp, and output directories on tmpfs gives a good speedup
* zram *should* work well if disk mode backed by tmpfs uses a little too much RAM
* `type=ram` uses closer to 15-20x the source map size in practice, not the 10x quoted in the help text
* stacking maps from multiple sources **increases** the memory footprint beyond the rough rules of thumb given by the mapsforge devs
* `threads=n` with n > 1 is a fast way to exhaust the heap space (as a general rule of thumb, don't combine it with `type=ram`)
* there *are* choke points that are not multi-threaded once the main osmosis 'bits' are done running; you just have to live with them
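The multipliers above lend themselves to a quick back-of-the-envelope check before picking a `--max-heap-space` value. The sketch below applies the 10x figure from the help text and the 15-20x range noted above to a list of source map sizes. `estimate_ram_mb` is a hypothetical helper, and the simple sum is likely an underestimate when stacking maps from multiple sources, as noted in the list.

```python
def estimate_ram_mb(source_sizes_mb, factor):
    """Rough RAM estimate in MB: total source size times a multiplier.

    Illustrative only; real usage depends on the map content and on
    how many sources are stacked.
    """
    return sum(source_sizes_mb) * factor


# Compare the documented 10x rule with the 15-20x seen in practice
# for a single 100 MB source map.
for factor in (10, 15, 20):
    print(f"{factor}x: {estimate_ram_mb([100], factor)} MB")
```

If the high end of the estimate exceeds the available RAM, that is a hint to stay in disk mode rather than pass `--use-ram`.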
### Look Into
If you're concerned about Java heap usage, ZGC (the Z Garbage Collector), announced at the FOSDEM 2018 conference, may be useful. I have not had success building a JRE/JDK from source, but you may find the following link useful.
* https://fosdem.org/2018/schedule/event/zgc/attachments/slides/2211/export/events/attachments/zgc/slides/2211/ZGC_FOSDEM_2018.pdf
# Licensing
All code is licensed under Apache 2.0 and all non-code is licensed under Creative Commons CC-BY-SA-3.0.