Distributed/Big Data Geospatial Processing Tools
Work in progress. I will write about each approach in more detail later.
For now, this is just a summary of the tools I have found for connecting to Hadoop and running geospatial processing on a large dataset. I am working with a ~100 GB Hive table, which is only a small subset of the original dataset.
- http://geospark.datasyslab.org/
- https://pypi.org/project/geopyspark/
- https://github.com/Esri/gis-tools-for-hadoop/wiki
- Kinetica GPU Database – Graph solver and Match solver
- PySpark Python libraries
- SpatialHadoop
- Alteryx – using the Connect In-DB tool to connect to Hadoop
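
Until the detailed write-ups are ready, here is a minimal sketch of what the plain PySpark route from the list could look like: a pure-Python haversine distance function wrapped as a Spark UDF and applied to a table of points. This is only an illustration under assumptions, not a recommended setup. The column names `lat` and `lon`, the output column `dist_km`, and the helper `add_distance_column` are all hypothetical, and the dedicated libraries above provide spatial indexes and joins that a row-by-row UDF does not.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    # Rows with missing coordinates produce a null distance instead of an error.
    if None in (lat1, lon1, lat2, lon2):
        return None
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def add_distance_column(df, ref_lat, ref_lon, lat_col="lat", lon_col="lon"):
    """Add a dist_km column to a Spark DataFrame (column names are assumptions)."""
    # Imported lazily so haversine_km can be tried without a Spark install.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    dist_udf = F.udf(haversine_km, DoubleType())
    return df.withColumn(
        "dist_km",
        dist_udf(F.col(lat_col), F.col(lon_col),
                 F.lit(float(ref_lat)), F.lit(float(ref_lon))),
    )


if __name__ == "__main__":
    # One degree of longitude on the equator is roughly 111.2 km.
    print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 1))
```

With a Hive-enabled session (`SparkSession.builder.enableHiveSupport().getOrCreate()`), the table could be loaded with `spark.table("mydb.points")`, where `mydb.points` is a placeholder name, and then passed to `add_distance_column(df, 40.0, -74.0)`. Python UDFs are slow on tables of this size, which is one reason to look at the purpose-built tools in the list.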