Introduction

The geocoder R package aims to make managing large amounts of geocodes easier. Instead of loading geocodes from storage formats such as shapefiles or GeoJSON, data analysts can keep them in their own offline or remote database of geocodes.

UK geocodes case

The UK Office for National Statistics (ONS) Open Geography Portal hosts various files that data scientists use to plot data on maps, including details of the different boundary layers.

Round trip through MongoDB

To showcase the capabilities of the geocoder package, we can do a round trip: start from the pseudo-standard R spatial sf class and return to it through MongoDB’s geospatial querying. That is:

  • take a shapefile from UK’s Open Geography Portal

  • turn it into an sf object

  • write it into a MongoDB collection and create a spatial index on it

  • reassemble the output from find queries back into an sf object

One of the UK census boundaries is called the Middle Layer Super Output Area (MSOA), which can be found here.

msoa.folder = file.path(tempdir(), "msoa_folder")
msoa.zip = file.path(tempdir(), "msoa.zip")
# download and unpack only if the zip file is not already there
if(!file.exists(msoa.zip)) {
  download.file(
    paste0("https://opendata.arcgis.com/datasets/",
           "f341dcfd94284d58aba0a84daf2199e9_0.zip"),
    msoa.zip)
  unzip(msoa.zip, exdir = msoa.folder)
}
# read the shape file using `sf`
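# full.names = TRUE returns a path that read_sf can open from any working directory
msoa.sf = sf::read_sf(
  list.files(msoa.folder, pattern = "England_and_Wales\\.shp$",
             full.names = TRUE))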
class(msoa.sf)
# substring(geojsonsf::sf_geojson(msoa.sf[1,]), 1, 350)

The data is now in a clean sf object, and each row can be converted into a GeoJSON feature ready for geocoder to import into MongoDB:

geocoder::gc_import_sf(msoa.sf, collection = "msoa")
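To make the import step more concrete, here is a minimal sketch of what such an import could involve. It is not geocoder's actual implementation. The toy sf object, the `code` column, the helper `import_features`, and the connection defaults (`db = "test"`, `url = "mongodb://localhost"`) are all assumptions for illustration. It uses `geojsonsf::sf_geojson()` with `atomise = TRUE` to get one feature per row, and mongolite to insert the features and create a `2dsphere` index.

```r
library(sf)

# Toy stand-in for msoa.sf: two point features with a hypothetical `code` column.
toy.sf <- st_sf(
  code = c("A", "B"),
  geometry = st_sfc(st_point(c(-0.1, 51.5)), st_point(c(-3.2, 51.5)),
                    crs = 4326)
)

# MongoDB's 2dsphere index expects WGS84 longitude/latitude, so reproject
# first when the source uses another CRS (this is a no-op for the toy data).
toy.wgs84 <- st_transform(toy.sf, 4326)

# atomise = TRUE yields one GeoJSON Feature string per row of the sf object.
features <- as.character(geojsonsf::sf_geojson(toy.wgs84, atomise = TRUE))
length(features)

# Hypothetical helper (not part of geocoder): insert the feature strings and
# index their geometry. Calling it requires a running MongoDB instance.
import_features <- function(features, collection = "msoa", db = "test",
                            url = "mongodb://localhost") {
  con <- mongolite::mongo(collection = collection, db = db, url = url)
  con$insert(features)                         # one JSON string per document
  con$index(add = '{"geometry": "2dsphere"}')  # spatial index on the geometry
  invisible(con)
}
```

Storing each row as its own document keeps the features individually queryable, which is what makes the later lookup by code possible.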

We can now query the database with lists of MSOA codes and easily return GeoJSON formatted “features” ready to be reassembled into sf.
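The query and reassembly steps might look roughly like the sketch below. This is an illustration under assumptions rather than geocoder's own API. The helpers `find_features` and `features_to_sf`, the field name `properties.msoa11cd`, and the connection defaults are hypothetical. The two hard-coded feature strings stand in for what a find query would return, so the reassembly part runs without a database.

```r
# Two GeoJSON Feature strings, standing in for the output of a find query.
features <- c(
  '{"type":"Feature","properties":{"code":"A"},"geometry":{"type":"Point","coordinates":[-0.1,51.5]}}',
  '{"type":"Feature","properties":{"code":"B"},"geometry":{"type":"Point","coordinates":[-3.2,51.5]}}'
)

# Wrap the features in a FeatureCollection and parse it into one sf object.
features_to_sf <- function(features) {
  fc <- paste0('{"type":"FeatureCollection","features":[',
               paste(features, collapse = ","), "]}")
  geojsonsf::geojson_sf(fc)
}

round.trip.sf <- features_to_sf(features)
class(round.trip.sf)

# Hypothetical query helper (assumed field name and connection defaults).
# It builds an $in query from a vector of codes and returns the matching
# documents as JSON strings. Calling it requires a running MongoDB instance.
find_features <- function(codes, field = "properties.msoa11cd",
                          collection = "msoa", db = "test",
                          url = "mongodb://localhost") {
  con <- mongolite::mongo(collection = collection, db = db, url = url)
  query <- sprintf('{"%s": {"$in": %s}}', field, jsonlite::toJSON(codes))
  # Drop MongoDB's _id so each document is a plain GeoJSON Feature;
  # json() reads one batch of records from the cursor.
  con$iterate(query = query, fields = '{"_id": 0}')$json()
}
```

With a populated collection, the two helpers would chain as `features_to_sf(find_features(codes))`, where `codes` is a character vector of MSOA codes.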

Reproducible example

A reproducible example for this R Markdown document (the UK MSOA example is too large) would be a slice of Uber’s Vancouver land area price data.

Let us retrieve the data and plot the land:

Visual check that both entries are the same: