User Configurations

The user configuration file, config.yaml, is located in the user/ directory under the GP2 root.

The user configuration file is where Green Paths 2.0 is configured. It uses the YML file format. For more on YML syntax, see YML Basics.

User configurations can be thought of as belonging to five groups (YML dictionaries):

  1. project

  2. osm_network

  3. data_sources

  4. routing

  5. analysing

These group names must be included in the user configurations, and they must be spelled exactly as described here! Example of the groups (YML dictionaries) and formatting:

project:
    some_project_key: 123
osm_network:
    some_osm_key: "123"
data_sources:
    - some_data_source_name: "some name"
      some_data_source_key: 123

    - some_data_source_name2: "some name 2"
      some_data_source_key2: 456
routing:
    some_routing_key: 123
analysing:
    some_analysing_key: True
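The requirement that all five group names are present can be illustrated with a minimal Python sketch. It assumes the configuration has already been parsed into a plain dictionary; the helper name missing_groups is hypothetical and is not part of Green Paths 2.0.

```python
# Hypothetical helper, not part of Green Paths 2.0: checks that a parsed
# configuration (a plain dict) contains the five required group names.
REQUIRED_GROUPS = ("project", "osm_network", "data_sources", "routing", "analysing")


def missing_groups(config: dict) -> list:
    """Return the required group names that are absent from the config."""
    return [group for group in REQUIRED_GROUPS if group not in config]


# Parsed form of the example above, with the analysing group left out on purpose.
config = {
    "project": {"some_project_key": 123},
    "osm_network": {"some_osm_key": "123"},
    "data_sources": [{"some_data_source_name": "some name", "some_data_source_key": 123}],
    "routing": {"some_routing_key": 123},
}

print(missing_groups(config))
```

The actual check is done by the “validate” command; this sketch only shows the idea of comparing the top-level keys against the expected names.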

Attention

When an optional configuration (e.g. data_buffer) is not used, remove both the key and the value!

Hint

Where multiple examples are given here for a user configuration, remember to choose only one!

Tip

All strings can be written with or without apostrophes / quotation marks, e.g. aqi, 'aqi' or "aqi".

User Configuration Validation and Descriptor

While filling in the user configurations, the user can use the Descriptor command to describe the exposure data sources etc. NOTE: the Descriptor currently requires filled and valid user configurations, so it should be improved!

Users should use the “validate” command to check the validity of the user configurations while filling them in.

See User Interface Commands for more on using these commands.


Project Group

The project group configures the project-wide settings.

Group name in YML: project

project_crs

  • Type: integer

  • Required: mandatory

  • Explanation: The coordinate reference system (CRS) code to which all spatial data will be reprojected. Must be a projected CRS with metric units, not degrees.

  • Examples: 3879

Warning

The CRS should be projected and use meters as units, not degrees.

datas_coverage_safety_percentage

  • Type: integer | float

  • Required: optional

  • Explanation: The minimum percentage of the OSM road segments that the exposure data needs to cover. It is calculated by simply dividing the number of covered segments by the number of all segments in the OSM PBF. The extent of the OSM street network will thus affect the coverage percentage.

  • Default: 33

  • Example: 50
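The coverage calculation described above can be sketched in Python as follows. This is a minimal illustration; the function names are hypothetical, and the assumption that the check passes when the coverage is at least the configured percentage is not confirmed by this documentation.

```python
def coverage_percentage(covered_segments: int, all_segments: int) -> float:
    """Share of OSM segments that received exposure data, in percent."""
    if all_segments <= 0:
        raise ValueError("all_segments must be positive")
    return 100.0 * covered_segments / all_segments


def passes_coverage_check(covered_segments: int, all_segments: int,
                          safety_percentage: float = 33) -> bool:
    # Assumption: coverage equal to or above the configured value is accepted.
    return coverage_percentage(covered_segments, all_segments) >= safety_percentage


# 4 000 covered segments out of 10 000 in the OSM PBF extent.
print(coverage_percentage(4_000, 10_000))
print(passes_coverage_check(4_000, 10_000))                        # default of 33
print(passes_coverage_check(4_000, 10_000, safety_percentage=50))  # stricter value
```

Because the denominator is the number of all segments in the OSM PBF, a larger street network extent lowers the percentage for the same exposure data.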

Project YAML Group Examples

Example of the project group configurations with only the mandatory configurations, omitting all the optional “key: value” fields.

project:
  project_crs: 3879

Example of the project group configurations with optional configurations

project:
  project_crs: 3879
  datas_coverage_safety_percentage: 75

OSM Network Group

The OSM Network group configures the OSM PBF settings.

Name in YML: osm_network

osm_pbf_file_path

  • Type: string

  • Required: mandatory

  • Explanation: file path to the OSM pbf file.

  • Examples: user_data_dir/osm/hki.osm.pbf

Hint

The file path can be relative if the file is located within this project's root directory. Otherwise, use an absolute path.
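The relative versus absolute path rule can be sketched in Python like this. The function name and the project root /home/user/green_paths_2 are hypothetical, and the sketch uses POSIX-style paths only.

```python
from pathlib import PurePosixPath


def resolve_config_path(path: str, project_root: str) -> PurePosixPath:
    """Join relative paths to the project root; keep absolute paths as given."""
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        return candidate
    return PurePosixPath(project_root) / candidate


root = "/home/user/green_paths_2"  # hypothetical project root
print(resolve_config_path("user_data_dir/osm/hki.osm.pbf", root))
print(resolve_config_path("/data/osm/hki.osm.pbf", root))
```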

original_crs

  • Type: integer

  • Required: mandatory

  • Explanation: The original CRS code of the OSM network. It will be transformed to the project CRS if it is not the same.

  • Examples: 4326

segment_sampling_points_amount

  • Type: integer

  • Required: optional

  • Explanation: Forces a fixed number of sampling points per segment. If not given, the sampling points will be created based on the length of the segment (recommended).

  • Default: Generated by using segment length and raster cell size.

  • Examples: 5

Hint

It is recommended not to use this and to go with the default length-based value, unless there is a good reason!

Example of the osm_network group configurations with only the mandatory configurations, omitting all the optional “key: value” fields.

osm_network:
  osm_pbf_file_path: /user_data_dir/osm/hki.osm.pbf
  original_crs: 4326

Example of the osm network group configurations with optional configurations

osm_network:
  osm_pbf_file_path: /user_data_dir/osm/hki.osm.pbf
  original_crs: 4326
  segment_sampling_points_amount: 10

Data Sources

Data Sources configures the exposure data sources and their individual settings.

These items are YML list items, so each one starts with the character “-”. There can be one or more data sources. See the example below.

Group name in YML: data_sources

name

  • Type: string

  • Required: mandatory

  • Explanation: Name for the exposure data source. It can be anything, but it needs to be the same for the same data throughout the user configurations. Prefer short names!

  • Example: aqi

Warning

Data source name needs to be the same in routing and analysing configurations.

filepath

  • Type: string

  • Required: mandatory

  • Explanation: filepath to the data.

  • Example: /user_data_dir/data/gvi_green.shp

Hint

The file path can be relative if the file is located within this project's root directory. Otherwise, use an absolute path.

data_type

  • Type: string

  • Required: optional, recommended

  • Explanation: Data type of the exposure data, can be: “raster” or “vector”.

  • Example: raster

Hint

This field is not mandatory; the data type will be determined from the file name if it is not given. Giving it is recommended for more robust execution.
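Determining the data type from the file name could look roughly like the following Python sketch. The extension lists and the function name are assumptions made for illustration; they are not the actual list used by Green Paths 2.0, which is one more reason to set data_type explicitly.

```python
from pathlib import PurePosixPath

# Assumption: illustrative extension lists, not the ones GP2 actually uses.
RASTER_EXTENSIONS = {".tif", ".tiff", ".nc"}
VECTOR_EXTENSIONS = {".shp", ".gpkg"}


def infer_data_type(filepath: str) -> str:
    """Guess "raster" or "vector" from the file extension."""
    suffix = PurePosixPath(filepath).suffix.lower()
    if suffix in RASTER_EXTENSIONS:
        return "raster"
    if suffix in VECTOR_EXTENSIONS:
        return "vector"
    raise ValueError(f"Cannot infer data type from {filepath!r}; set data_type explicitly")


print(infer_data_type("/user_data_dir/data/gvi_green.shp"))
print(infer_data_type("/user_data_dir/data/aqi.nc"))
```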

original_crs

  • Type: integer

  • Required: mandatory

  • Explanation: The original CRS code of the data. It will be transformed to the project CRS if it is not the same.

  • Examples: 3879

min_data_value

  • Type: integer | float

  • Required: mandatory

  • Explanation: The theoretical minimum value of the data source.

  • Examples: 0.0, 1

Hint

Use the theoretical value so that the data does not skew the results!

max_data_value

  • Type: integer | float

  • Required: mandatory

  • Explanation: The theoretical maximum value of the data source.

  • Examples: 5, 97.9

Hint

Use the theoretical value so that the data does not skew the results!
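Why the theoretical bounds matter can be shown with a small Python sketch of min-max scaling. It assumes that exposure values are scaled linearly to 0-1 with min_data_value and max_data_value; the function name is hypothetical and the exact scaling used by Green Paths 2.0 may differ.

```python
def normalize(value: float, min_data_value: float, max_data_value: float) -> float:
    """Min-max scale a raw exposure value to the range 0-1."""
    if max_data_value <= min_data_value:
        raise ValueError("max_data_value must be greater than min_data_value")
    return (value - min_data_value) / (max_data_value - min_data_value)


# An AQI value of 2 scaled with the theoretical 1-5 bounds ...
print(normalize(2, 1, 5))
# ... and the same value scaled with a hypothetical observed range of 2-4,
# which would make it look like the cleanest possible air.
print(normalize(2, 2, 4))
```

With the observed range, the same raw value maps to a different point on the 0-1 scale for every dataset, so results would not be comparable.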

good_exposure

  • Type: boolean

  • Required: mandatory

  • Explanation: Determines whether the exposure values are treated as positive (bad) or negative (good) weights for the road segments. A positive weight adds cost to a segment and makes it more expensive to traverse; a negative weight makes it cheaper.

  • Examples: True

Hint

Make sure this is correct! True means good exposure like greenery (decreasing traversal cost), False means bad exposure like noise or air quality (increasing traversal cost).

Warning

Air quality is a bad exposure, yet a low air quality value, e.g. 1.2 (on a 1-5 scale), actually means clean air. Such a segment will still be slightly penalized compared to segments that do not have any value. This should only be a problem with sparse exposure data.
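The effect described in this warning can be illustrated with a Python sketch. It assumes linear scaling to 0-1, a sign flip for good exposures, and a factor of 0 for segments without data; the function name is hypothetical.

```python
from typing import Optional


def exposure_factor(value: Optional[float], min_value: float, max_value: float,
                    good_exposure: bool) -> float:
    """Signed, scaled exposure: negative for good exposure, positive for bad."""
    if value is None:
        return 0.0  # assumption: a segment without data gets no weighting
    scaled = (value - min_value) / (max_value - min_value)
    return -scaled if good_exposure else scaled


# Clean air (1.2 on a 1-5 scale) still yields a small positive penalty ...
print(exposure_factor(1.2, 1, 5, good_exposure=False))
# ... whereas a segment with no AQI value at all gets none.
print(exposure_factor(None, 1, 5, good_exposure=False))
# Maximum greenery yields the largest possible reduction.
print(exposure_factor(97.9, 0.0, 97.9, good_exposure=True))
```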

data_buffer

  • Type: integer | float

  • Required: optional

  • Explanation: Isotropic buffer for vector data, in meters. Can be used to increase the effect of points or lines etc. Should be used with caution and with a good reason, as it can twist the results.

  • Example: 5

Warning

Use only with good reason, know what you are doing.

data_column

  • Type: string

  • Required: mandatory (vector) | not used for raster

  • Explanation: The name of the data field “column” in the data source.

  • Example: db_hi

no_data_value

  • Type: integer | float

  • Required: optional

  • Explanation: Value that marks no data (no exposure raster value found for the segment). If this is given, segments with the no data value do not get any good or bad weighting from the exposure data sources that have no data for them.

  • Examples: 0.0, 1

  • Note: Set this if the data has some specific value for no_data, e.g. -999. The no data will be filtered out and not used for routing or analysing exposure.
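Filtering out the no data value can be sketched in Python as follows. The function name and the sample values are hypothetical; the sketch only shows the idea of dropping the marker value before the data is used.

```python
def drop_no_data(values, no_data_value):
    """Keep only the values that differ from the configured no_data_value."""
    return [value for value in values if value != no_data_value]


# Hypothetical sampled values where -999 marks missing data.
sampled = [12.5, -999, 40.0, -999, 7.1]
print(drop_no_data(sampled, no_data_value=-999))
```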

layer_name

  • Type: string

  • Required: optional (vector), recommended

  • Explanation: For vector data that might have multiple layers (e.g. GPKG), the name of the layer. If not given and only one layer is available, that layer will be used. Otherwise an error will be raised.

  • Example: comb_gvi

raster_cell_resolution

  • Type: integer | float

  • Required: mandatory (vector), optional (raster)

  • Explanation: The resolution (in meters) that the exposure raster will have. If this is given for a raster data source, the raster will be reprojected to this cell resolution.

  • Example: 20

save_raster_file

  • Type: boolean

  • Required: optional

  • Explanation: Decides if the exposure raster should be saved to cache for inspections etc.

  • Default: False

  • Example: True

custom_processing_function

  • Type: string

  • Required: optional, experimental

  • Explanation: Experimental: if a data set needs some pre-pre-processing, a function needs to be written manually to globals in custom_functions.py, and the name of that function is given here. It is recommended to process the exposure data sources beforehand so that no pre-pre-processing is needed. This exists mainly for the AQI .nc data for Helsinki.

Data sources YAML Group Examples

Example of the data sources group configurations with the mandatory configurations. Most of the optional “key: value” fields are omitted.

gvi_lines is vector data and aqi is raster data. Note that some configuration fields are needed for vector but not for raster, e.g. raster_cell_resolution.

data_sources:
  - name: 'gvi_lines'
    filepath: /user_data_dir/data/gvi_lines.shp
    data_column: Comb_GVI
    no_data_value: 0
    min_data_value: 0.0
    max_data_value: 97.9
    good_exposure: True
    raster_cell_resolution: 10
    original_crs: 3879

  - name: "aqi"
    filepath: /user_data_dir/data/aqi.nc
    original_crs: 4326
    data_column: AQI
    no_data_value: 1
    min_data_value: 1
    max_data_value: 5
    good_exposure: False

Example of the data sources group configurations with optional configurations

data_sources:
  - name: 'gvi_lines'
    filepath: /user_data_dir/data/gvi_lines.shp
    data_type: vector
    data_buffer: 10
    save_raster_file: True
    data_column: Comb_GVI
    no_data_value: 0
    min_data_value: 0.0
    max_data_value: 97.9
    good_exposure: True
    raster_cell_resolution: 10
    original_crs: 3879

  - name: "aqi"
    filepath: /user_data_dir/data/aqi.nc
    data_type: raster
    original_crs: 4326
    data_column: AQI
    no_data_value: 1
    min_data_value: 1
    max_data_value: 5
    good_exposure: False
    raster_cell_resolution: 10
    save_raster_file: True 
    custom_processing_function: convert_aq_nc_to_tif_and_scale_offset

Routing Group

Routing group configures the routing settings.

Name in YML: routing

transport_mode

  • Type: string

  • Required: mandatory

  • Explanation: travelling mode. Options: walking, cycling.

  • Example: walking

travel_speed

  • Type: integer | float

  • Required: optional (the mode-specific default is used if omitted)

  • Explanation: define travelling speed in km/h.

  • Example: 5.5

  • Defaults: 5.0 (walking), 15.0 (cycling)
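How travel_speed turns a segment length into a traversal time can be sketched in Python. The function name is hypothetical, and it is an assumption that traversal time is simply length divided by speed.

```python
def traversal_time_seconds(length_m: float, travel_speed_kmh: float) -> float:
    """Time to traverse a segment, with the speed given in km/h."""
    if travel_speed_kmh <= 0:
        raise ValueError("travel_speed_kmh must be positive")
    # 1 km/h equals 1/3.6 m/s, so seconds = metres * 3.6 / (km/h).
    return length_m * 3.6 / travel_speed_kmh


print(traversal_time_seconds(100, 5.0))   # walking default
print(traversal_time_seconds(100, 15.0))  # cycling default
```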

od_crs

  • Type: integer

  • Required: mandatory

  • Explanation: CRS of the origin-destination (OD) files. Both need to be in the same CRS.

  • Example: 3879

origins

  • Type: string

  • Required: mandatory

  • Explanation: File path to the origin(s) file. Supported file types: gpkg, shp, csv. A csv file needs od_lon_name and od_lat_name.

  • Example: user_folder/origins.shp

destinations

  • Type: string

  • Required: mandatory

  • Explanation: File path to the destination(s) file. Supported file types: gpkg, shp, csv. A csv file needs od_lon_name and od_lat_name.

  • Example: user_folder/destinations.shp

od_lon_name

  • Type: string

  • Required: mandatory only for csv OD file, otherwise optional.

  • Explanation: name of the longitude column in the OD csv.

  • Example: lon

od_lat_name

  • Type: string

  • Required: mandatory only for csv OD file, otherwise optional.

  • Explanation: name of the latitude column in the OD csv.

  • Example: lat
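The role of od_lon_name and od_lat_name for csv OD files can be illustrated with a Python sketch that uses only the standard library. The function name and the sample coordinates are hypothetical; Green Paths 2.0 may read the file differently.

```python
import csv
import io


def read_od_points(csv_text: str, od_lon_name: str, od_lat_name: str):
    """Return (lon, lat) tuples from CSV text using the configured column names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row[od_lon_name]), float(row[od_lat_name])) for row in reader]


# Hypothetical OD file content with columns named "lon" and "lat".
sample = "id,lon,lat\n1,24.9384,60.1699\n2,24.9500,60.1800\n"
print(read_od_points(sample, od_lon_name="lon", od_lat_name="lat"))
```

If the configured names do not match the csv header, the lookup fails with a KeyError, which is why these two fields are mandatory for csv OD files.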

precalculate

  • Type: boolean

  • Required: optional

  • Explanation: defines if segment weights should be precalculated to the network before routing. Using precalculate should be faster, especially for larger calculations. If this is False, will calculate segment costs while routing.

  • Example: False

  • Default: True

exposure_parameters

  • Type: dictionary

  • Required: optional

  • Explanation: defines the individual settings for each exposure data source. Fields: name, sensitivity, allow_missing_data. Needs list(s) of dicts, see the Routing YAML examples.

  • Example:

      - name: gvi_lines
        sensitivity: 2.5
      - name: aqi
        sensitivity: 2.5
        allow_missing_data: false

  • Default: allow_missing_data = True

Hint

name: needs to be the same as in data sources

sensitivity: the weight used in the formula to weight the exposure factor derived from the exposure data. Formula: traversal time + (traversal time * sensitivity * exposure factor). All exposure factors are normalized to the range 0-1, and for positive (good) exposures they are made negative.

allow_missing_data: Experimental feature. If set to False, route finding will crash if any segment does not have an exposure value. Most likely should not be used! Default is True.
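The sensitivity formula can be written out as a short Python sketch for a single exposure data source. The function name and the numbers are illustrative, and how several data sources are combined is not shown because it is not described here.

```python
def weighted_cost(traversal_time: float, sensitivity: float, exposure_factor: float) -> float:
    """traversal time + (traversal time * sensitivity * exposure factor)."""
    return traversal_time + traversal_time * sensitivity * exposure_factor


# A 60 second segment with good exposure (factor made negative) gets cheaper.
print(weighted_cost(60.0, 1.5, -0.5))
# The same segment with bad exposure (positive factor) gets more expensive.
print(weighted_cost(60.0, 2.5, 0.25))
# A sensitivity of 0 leaves the traversal time unchanged.
print(weighted_cost(60.0, 0.0, 0.25))
```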

Attention

Every exposure data source needs to be given a name and a sensitivity. If exposure results are wanted for the paths from some data source, but that data source should not be included in the path optimization, its sensitivity should be set to 0.

E.g. a user wants to find air quality optimized paths and would also like to know the amount of greenery, but only wants to route based on air quality. In that case, set the sensitivity of greenery (and of any other exposure data source) to 0.

Routing YAML Group Examples

Example of the routing configurations with only the mandatory configurations, omitting all the optional “key: value” fields.

Note that this example is using gpkg OD’s.

routing:
  transport_mode: walking
  origins: /user_folder/some_origins_point(s).gpkg
  destinations: /user_folder/some_destination(s)_points.gpkg
  od_crs: 27700
  exposure_parameters:
    - name: gvi_lines
      sensitivity: 1.5
    - name: aqi
      sensitivity: 1.25

Hint

Using relatively small sensitivities (weights) produced the most optimal exposure routes, some even too optimal, neglecting time too much.

“Best” results were gained with sensitivities (weights) of 1.5, 2.5 and 5. Using too large sensitivities (weights), e.g. 10 or 20, decreased the cost for positive exposure so much that all segments got cheap. Read more in the documentation section (and thesis).

Note that this example is using csv OD’s, so od_lon_name and od_lat_name need to be defined.

routing:
  transport_mode: cycling
  travel_speed: 5
  precalculate: True
  od_lon_name: long
  od_lat_name: lat
  origins: /user_folder/some_origins_point(s).csv
  destinations: /user_folder/some_destination(s)_points.csv
  od_crs: 27700
  exposure_parameters:
    - name: gvi_lines
      sensitivity: 1.5
      allow_missing_data: False
    - name: aqi
      sensitivity: 2.5

Analysing Group

The analysing group configures the settings of the last module, which analyses the results.

Name in YML: analysing

keep_geometry

  • Type: boolean

  • Required: optional

  • Explanation: Defines whether geometries should be included in the final results. If they are, the final output file will be .gpkg; if not, it will be .csv.

  • Default: False

  • Example: True

Warning

Including geometries in mass calculations will take more time, and the final file will be larger!

save_output_name

  • Type: string

  • Required: optional

  • Explanation: Custom name for the final output file.

  • Default: “output_results_[time_of_finish]”

  • Example: london_routes_greenery_lit

cumulative_ranges

  • Type: dictionary

  • Required: optional

  • Explanation: Custom ranges used to divide the results, saved to the final output as a new column/field. Needs the main dict (header) cumulative_ranges, which should have the data source names as dicts and the ranges as a list of lists; see the Analysing YAML examples.

  • Example:

      gvi_lines:
        - [0,10]
        - [10.01, 20]
        - [20.01, 50]
      aqi:
        - [0, 0.99]
        - [1, 1.99]
        - [2, 2.99]
        - [3, 3.99]
        - [4, 5]

Attention

Exposure data source names need to be exactly the same as defined earlier.
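One possible reading of cumulative ranges is sketched below in Python: the length of a route is summed per exposure range. This interpretation, the function name, the inclusive bounds and the sample segments are all assumptions made for illustration; the actual output of the analysing module may be computed differently.

```python
def cumulative_by_range(segments, ranges):
    """Sum segment lengths per [low, high] range, treating both bounds as inclusive.

    segments: iterable of (exposure_value, length_m) pairs.
    ranges: list of [low, high] lists, as in the cumulative_ranges configuration.
    """
    totals = {(low, high): 0.0 for low, high in ranges}
    for value, length_m in segments:
        for low, high in ranges:
            if low <= value <= high:
                totals[(low, high)] += length_m
                break  # each segment is counted in at most one range
    return totals


# Hypothetical route segments as (gvi value, length in meters).
route = [(5.0, 120.0), (15.0, 80.0), (8.0, 40.0), (35.0, 60.0)]
gvi_ranges = [[0, 10], [10.01, 20], [20.01, 50]]
print(cumulative_by_range(route, gvi_ranges))
```

Under this reading, a value that falls into a gap between two ranges (e.g. 10.005 above) would not be counted in any range, so the ranges should be chosen to match the precision of the data.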

Analysing YAML Group Examples

Example of the analysing group configurations. The group has only optional parameters; this example sets all of them.

analysing:
    keep_geometry: True
    save_output_name: london_routes_greenery_fam
    cumulative_ranges:
      gvi_lines:
        - [0,10]
        - [10.01, 20]
        - [20.01, 50]
      aqi:
        - [0, 0.99]
        - [1, 1.99]
        - [2, 2.99]
        - [3, 3.99]
        - [4, 5]

Complete User Configuration YAML example

Here is a full example of a filled user/config.yaml. This configuration uses the vector gvi_lines and the raster aqi data sets. All data will be reprojected to the project_crs of 3879. The exposure raster from gvi_lines will be created with a 10 m cell resolution, and the aqi raster will be reprojected to match this 10 m resolution.

The route finding will use walking with a speed of 5 km/h. It will weight the greenery gvi_lines values a little more than the aqi values. The weights for the segments will be precalculated, as there seem to be thousands of OD points.

The geometries will not be kept for such large mass calculations. The resulting exposures will be grouped into cumulative ranges.

Attention

Note the importance of correct indentation!

user/config.yaml

project:
  project_crs: 3879

osm_network:
  osm_pbf_file_path: /user_data_dir/osm/hki.osm.pbf
  original_crs: 4326

data_sources:
  - name: 'gvi_lines'
    filepath: /user_data_dir/data/gvi_lines.shp
    data_type: vector # optional
    data_buffer: 10
    save_raster_file: True # optional
    data_column: Comb_GVI
    no_data_value: 0
    min_data_value: 0.0
    max_data_value: 97.9
    good_exposure: True
    raster_cell_resolution: 10
    original_crs: 3879

  - name: "aqi"
    filepath: /user_data_dir/data/aqi.nc
    data_type: raster # optional
    original_crs: 4326
    data_column: AQI
    no_data_value: 1
    min_data_value: 1
    max_data_value: 5
    good_exposure: False
    raster_cell_resolution: 10 # optional
    save_raster_file: True # optional
    custom_processing_function: convert_aq_nc_to_tif_and_scale_offset # optional

routing:
  transport_mode: walking
  travel_speed: 5
  origins: /user_folder/thousands_origins_point(s).gpkg
  destinations: /user_folder/thousands_destination(s)_points.gpkg
  od_crs: 27700
  exposure_parameters:
    - name: gvi_lines
      sensitivity: 1.5
    - name: aqi
      sensitivity: 1.25

analysing:
    keep_geometry: False
    save_output_name: example_routes_greenery_airquality
    cumulative_ranges:
      gvi_lines:
        - [0,10]
        - [10.01, 20]
        - [20.01, 50]
      aqi:
        - [0, 0.99]
        - [1, 1.99]
        - [2, 2.99]
        - [3, 3.99]
        - [4, 5]