GeoSpark@Twitter || GeoSpark Discussion Board
GeoSpark is listed as an Infrastructure Project on the official Apache Spark Third Party Projects page.
GeoSpark is a cluster computing system for processing large-scale spatial data. GeoSpark extends Apache Spark / SparkSQL with a set of out-of-the-box Spatial Resilient Distributed Datasets (SRDDs) / SpatialSQL that efficiently load, process, and analyze large-scale spatial data across machines.
GeoSpark contains three modules:
Name | API | Spark compatibility | Dependency |
---|---|---|---|
GeoSpark-core | RDD | Spark 2.X/1.X | Spark-core |
GeoSpark-SQL | SQL/DataFrame | SparkSQL 2.1 and later | Spark-core, Spark-SQL, GeoSpark-core |
GeoSpark-Viz | RDD | Spark 2.X/1.X | Spark-core, GeoSpark-core |
- Core: GeoSpark SpatialRDDs and Query Operators.
- SQL: SQL interfaces for GeoSpark core.
- Viz: Visualization extension of GeoSpark core.
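
To give a feel for what a spatial range query over partitioned data involves, here is a minimal, self-contained Python sketch. It is not GeoSpark's API or implementation: the uniform grid, the function names `partition_points` and `range_query`, and the sample data are all assumptions made for illustration. The idea is that points are grouped into spatial partitions, and a query window only needs to scan the partitions it overlaps.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)
Cell = Tuple[int, int]


def partition_points(points: List[Point], cell_size: float) -> Dict[Cell, List[Point]]:
    """Assign each point to a uniform grid cell (a hypothetical stand-in for a spatial partition)."""
    partitions: Dict[Cell, List[Point]] = defaultdict(list)
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        partitions[cell].append((x, y))
    return partitions


def range_query(partitions: Dict[Cell, List[Point]], window: Box, cell_size: float) -> List[Point]:
    """Return the points inside the window, scanning only the grid cells the window overlaps."""
    min_x, min_y, max_x, max_y = window
    hits: List[Point] = []
    for cx in range(int(min_x // cell_size), int(max_x // cell_size) + 1):
        for cy in range(int(min_y // cell_size), int(max_y // cell_size) + 1):
            # .get avoids creating empty partitions for cells that hold no points
            for x, y in partitions.get((cx, cy), []):
                if min_x <= x <= max_x and min_y <= y <= max_y:
                    hits.append((x, y))
    return sorted(hits)


# Illustrative data only
points = [(1.0, 1.0), (2.5, 3.5), (9.0, 9.0), (4.0, 4.0), (-1.0, -1.0)]
parts = partition_points(points, cell_size=5.0)
result = range_query(parts, (3.0, 3.0, 10.0, 10.0), cell_size=5.0)
```

In a distributed setting each partition would live on a different machine and be scanned in parallel; the sketch keeps everything in one process so that the pruning step is easy to see.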
Please visit the GeoSpark website for details and documentation.
- GeoSpark 1.1.3 is released. This release contains a critical bug fix for the GeoSpark-core RDD API. Release notes || Maven Coordinate.
- GeoSpark 1.1.2 is released. This release contains several bug fixes. Thanks to Lucas C. for the patch! Release notes || Maven Coordinate.
- GeoSpark 1.1.0 is released. This release contains new SQL functions, custom Quad-Tree/R-Tree index serializers, and bug fixes. GeoSpark 1.1.0 supports Apache Spark 2.3. Note that the GeoSparkSQL Maven Coordinate has changed. Release notes || Maven Coordinate. (Thanks to Zongsi Zhang for contributing the index serializer patch!)
- The GeoSpark wiki has moved to the new GeoSpark website! Users are welcome to contribute tutorials and stories by making a PR!