geomesa
GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.
Top Related Projects
GeoTrellis is a geographic data processing engine for high performance applications.
Official GeoTools repository
The JTS Topology Suite is a Java library for creating and manipulating vector geometry.
PostGIS spatial database extension to PostgreSQL [mirror]
Quick Overview
GeoMesa is an open-source, distributed spatio-temporal database built on top of Apache Accumulo, HBase, Google Bigtable, and Apache Cassandra. It adds geospatial querying and analytics to these cloud-native databases, enabling efficient storage, retrieval, and analysis of large-scale geospatial data.
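Spatio-temporal indexes over key-value stores are commonly built on space-filling curves, which turn a multi-dimensional position into a single sortable key. The sketch below shows the general idea with a two-dimensional Z-order (Morton) curve. It is an illustration only, not GeoMesa's implementation: the object name, the 16-bit resolution, and the helper names are all assumptions made for this example.

```scala
object ZOrderSketch {
  // Bits of resolution per dimension (an assumption for this sketch).
  val Bits = 16

  // Map a value in [min, max] onto an integer cell in [0, 2^Bits - 1].
  def normalize(value: Double, min: Double, max: Double): Int = {
    val cells = 1L << Bits
    val scaled = ((value - min) / (max - min) * cells).toLong
    math.max(0L, math.min(cells - 1, scaled)).toInt
  }

  // Interleave bits: x fills the even bit positions, y the odd ones.
  def interleave(x: Int, y: Int): Long = {
    var z = 0L
    var i = 0
    while (i < Bits) {
      z |= ((x >> i) & 1L) << (2 * i)
      z |= ((y >> i) & 1L) << (2 * i + 1)
      i += 1
    }
    z
  }

  // Encode a longitude/latitude pair as one sortable key.
  def encode(lon: Double, lat: Double): Long =
    interleave(normalize(lon, -180.0, 180.0), normalize(lat, -90.0, 90.0))
}
```

Points that are close in space tend to share key prefixes, so a bounding-box query can be answered by scanning a small number of key ranges in a sorted store.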
Pros
- Scalable and distributed architecture for handling large volumes of geospatial data
- Supports multiple cloud-native databases as backend storage
- Integrates well with popular geospatial tools and frameworks like GeoServer and Apache Spark
- Provides advanced spatio-temporal indexing for efficient querying
Cons
- Steep learning curve for users new to distributed geospatial systems
- Configuration and setup can be complex, especially for large-scale deployments
- Limited documentation for some advanced features and use cases
- Performance may vary depending on the chosen backend database and configuration
Code Examples
- Creating a GeoMesa data store:
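import org.geotools.data.DataStoreFinder
import org.locationtech.geomesa.accumulo.data.AccumuloDataStore
import org.locationtech.geomesa.accumulo.data.AccumuloDataStoreParams._
import scala.collection.JavaConverters._
// Placeholder connection values. Parameter names can differ between GeoMesa
// releases, so check AccumuloDataStoreParams for the version you use.
val params = Map(
  InstanceIdParam.key -> "myInstance",
  ZookeepersParam.key -> "localhost:2181",
  UserParam.key -> "user",
  PasswordParam.key -> "password",
  CatalogParam.key -> "myCatalog"
)
// DataStoreFinder expects a java.util.Map, so convert the Scala map
val ds = DataStoreFinder.getDataStore(params.asJava).asInstanceOf[AccumuloDataStore]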
- Writing features to GeoMesa:
import org.geotools.data.Transaction
import org.locationtech.geomesa.utils.geotools.SimpleFeatureTypes
// "*" marks geom as the default geometry attribute
val sft = SimpleFeatureTypes.createType("example", "name:String,*geom:Point:srid=4326")
ds.createSchema(sft)
val writer = ds.getFeatureWriterAppend("example", Transaction.AUTO_COMMIT)
// next() returns a new, empty feature to populate
val feature = writer.next()
feature.setAttribute("name", "GeoMesa Point")
feature.setAttribute("geom", "POINT(0 0)")
writer.write()
writer.close()
- Querying data from GeoMesa:
import org.geotools.data.{Query, Transaction}
import org.geotools.filter.text.ecql.ECQL
// select features whose geometry falls inside the bounding box
val filter = ECQL.toFilter("BBOX(geom, -10, -10, 10, 10)")
val query = new Query("example", filter)
val reader = ds.getFeatureReader(query, Transaction.AUTO_COMMIT)
while (reader.hasNext) {
  val feature = reader.next()
  println(feature.getAttribute("name"))
}
reader.close()
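Queries against spatio-temporal data often combine a bounding box with a time window. The following is a small sketch of assembling such an ECQL string with plain string helpers. The object and method names are hypothetical, and `dtg` is an assumed date attribute that the "example" schema above does not define.

```scala
import java.time.Instant

object EcqlSketch {
  // Spatial predicate: BBOX(attribute, minX, minY, maxX, maxY).
  def bbox(attr: String, minX: Double, minY: Double, maxX: Double, maxY: Double): String =
    s"BBOX($attr, $minX, $minY, $maxX, $maxY)"

  // Temporal predicate: attribute DURING start/end, with ISO-8601 instants.
  def during(attr: String, start: Instant, end: Instant): String =
    s"$attr DURING $start/$end"

  // Join predicates so that all of them must hold.
  def and(filters: String*): String = filters.mkString(" AND ")
}
```

The resulting string can be passed to `ECQL.toFilter` in the same way as the literal filter in the query example.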
Getting Started
To get started with GeoMesa:
- Add GeoMesa dependencies to your project:
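// 3.5.0 is shown as an example version; the README section of this page lists 5.3.0 as the latest release
libraryDependencies ++= Seq(
  "org.locationtech.geomesa" %% "geomesa-accumulo-datastore" % "3.5.0",
  "org.locationtech.geomesa" %% "geomesa-utils" % "3.5.0"
)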
- Set up a compatible backend database (e.g., Accumulo, HBase, or Cassandra)
- Create a GeoMesa data store using the appropriate parameters for your backend
- Define your data schema and start writing and querying geospatial data
For more detailed instructions, refer to the official GeoMesa documentation.
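The schema strings used in the examples, such as "name:String,*geom:Point:srid=4326", follow a compact `name:Type:option=value` layout in which a leading `*` marks the default geometry. As a rough illustration of that layout, here is a minimal parser sketch. It is not the parser GeoMesa or GeoTools use, the `Attribute` type and `parse` function are hypothetical, and it only handles the simple forms shown on this page.

```scala
object SpecSketch {
  // One attribute from a spec fragment such as "*geom:Point:srid=4326".
  final case class Attribute(
    name: String,
    binding: String,
    isDefaultGeometry: Boolean,
    options: Map[String, String]
  )

  def parse(spec: String): List[Attribute] =
    spec.split(",").toList.map(_.trim).filter(_.nonEmpty).map { part =>
      part.split(":").toList match {
        case rawName :: binding :: opts =>
          // Keep only well-formed key=value options.
          val options = opts.flatMap { opt =>
            opt.split("=", 2) match {
              case Array(k, v) => Some(k -> v)
              case _           => None
            }
          }.toMap
          Attribute(rawName.stripPrefix("*"), binding, rawName.startsWith("*"), options)
        case _ =>
          throw new IllegalArgumentException(s"Expected name:Type in '$part'")
      }
    }
}
```

For the example spec, this yields a `name` attribute of type `String` and a `geom` attribute of type `Point` flagged as the default geometry with `srid` set to `4326`.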
Competitor Comparisons
GeoTrellis is a geographic data processing engine for high performance applications.
Pros of GeoTrellis
- Focuses on raster data processing and analysis, making it more specialized for certain geospatial tasks
- Provides a high-performance, distributed processing engine for large-scale geospatial data
- Offers strong integration with Apache Spark for distributed computing
Cons of GeoTrellis
- Limited support for vector data compared to GeoMesa
- Steeper learning curve due to its functional programming approach in Scala
- Smaller community and ecosystem compared to GeoMesa
Code Comparison
GeoTrellis (Scala):
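import geotrellis.proj4.LatLng
import geotrellis.raster._
import geotrellis.raster.io.geotiff._
// read a single-band GeoTIFF and reproject it to WGS84 latitude/longitude
val tiff = SinglebandGeoTiff("path/to/tiff.tif")
val raster = tiff.raster
val reprojected = raster.reproject(tiff.crs, LatLng)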
GeoMesa (Scala):
import org.geotools.data.DataStoreFinder
import org.locationtech.geomesa.utils.geotools.SimpleFeatureTypes
import scala.collection.JavaConverters._
// "hbase.catalog" selects the HBase data store; the table name is a placeholder
val params = Map("hbase.catalog" -> "myCatalog")
val dataStore = DataStoreFinder.getDataStore(params.asJava)
val sft = SimpleFeatureTypes.createType("example", "name:String,*geom:Point:srid=4326")
dataStore.createSchema(sft)
Both libraries offer powerful geospatial processing capabilities, but GeoTrellis excels in raster data handling and distributed processing, while GeoMesa provides more comprehensive support for vector data and various data stores.
Official GeoTools repository
Pros of GeoTools
- More comprehensive and mature library with a wider range of geospatial functionalities
- Larger community and more extensive documentation
- Better integration with other OGC standards and GIS software
Cons of GeoTools
- Heavier and more complex, which can lead to longer learning curves
- May be overkill for simpler geospatial projects
- Less focus on big data and distributed processing capabilities
Code Comparison
GeoTools example:
SimpleFeatureType schema = DataUtilities.createType("Location", "geom:Point,name:String");
SimpleFeatureBuilder featureBuilder = new SimpleFeatureBuilder(schema);
GeometryFactory geometryFactory = JTSFactoryFinder.getGeometryFactory();
Point point = geometryFactory.createPoint(new Coordinate(10, 20));
featureBuilder.add(point);
featureBuilder.add("Example Point");
SimpleFeature feature = featureBuilder.buildFeature(null);
GeoMesa example:
import java.util.Date
val sft = SimpleFeatureTypes.createType("Example", "name:String,dtg:Date,*geom:Point:srid=4326")
val featureBuilder = new SimpleFeatureBuilder(sft)
// add values in schema order
featureBuilder.add("example")
featureBuilder.add(new Date())
featureBuilder.add("POINT(10 20)") // the WKT string is converted to a Point
val feature = featureBuilder.buildFeature("1")
GeoMesa focuses more on distributed storage and querying of geospatial data, while GeoTools provides a broader set of geospatial tools and utilities. GeoMesa builds upon GeoTools, extending its capabilities for big data scenarios.
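Several examples on this page pass geometries as WKT strings such as "POINT(10 20)" and rely on the library to convert them. To make that step less opaque, here is a minimal sketch of parsing a WKT point with a regular expression. It is not the converter GeoTools uses. The object and function names are hypothetical, and the pattern only covers plain decimal coordinates.

```scala
object WktSketch {
  // Matches "POINT(x y)" or "POINT (x y)", case-insensitively, with decimal coordinates.
  private val PointPattern =
    """(?i)\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*""".r

  // Returns (x, y) when the whole string is a WKT point, otherwise None.
  def parsePoint(wkt: String): Option[(Double, Double)] = wkt match {
    case PointPattern(x, y) => Some((x.toDouble, y.toDouble))
    case _                  => None
  }
}
```

Note that WKT orders coordinates as x then y, so for geographic data the longitude comes first.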
The JTS Topology Suite is a Java library for creating and manipulating vector geometry.
Pros of JTS
- Focused on core geometry operations and algorithms
- Lightweight and easy to integrate into various Java projects
- Widely adopted and well-established in the geospatial community
Cons of JTS
- Geometry operations are essentially 2D: coordinates can carry Z values, but most algorithms ignore them
- Lacks built-in support for distributed processing and big data
Code Comparison
JTS (simple geometry creation):
GeometryFactory factory = new GeometryFactory();
Point point = factory.createPoint(new Coordinate(1, 1));
LineString line = factory.createLineString(new Coordinate[]{
    new Coordinate(0, 0), new Coordinate(1, 1)
});
GeoMesa (creating a simple feature):
val sft = SimpleFeatureTypes.createType("example", "name:String,*geom:Point:srid=4326")
// the static build method takes a Java Object[] of values, so pass an Array[AnyRef]
val feature = SimpleFeatureBuilder.build(sft, Array[AnyRef]("feature1", "POINT(1 1)"), "id1")
Summary
JTS (the JTS Topology Suite) is a core library for geometric operations, while GeoMesa is a suite of tools for distributed spatial data processing. JTS excels in providing fundamental geometry algorithms and is widely used in various geospatial applications. GeoMesa, on the other hand, focuses on big data processing and distributed computing for geospatial data, leveraging platforms like Apache Accumulo, HBase, and Cassandra.
While JTS is more suitable for projects requiring basic geometric operations, GeoMesa is better suited for large-scale geospatial data processing and analysis in distributed environments.
PostGIS spatial database extension to PostgreSQL [mirror]
Pros of PostGIS
- Mature and widely adopted spatial extension for PostgreSQL
- Extensive documentation and community support
- Seamless integration with standard SQL queries
Cons of PostGIS
- Limited scalability for massive datasets compared to distributed systems
- Potentially slower performance for complex spatial operations on large datasets
- Requires PostgreSQL as the underlying database system
Code Comparison
PostGIS:
SELECT ST_Distance(
ST_GeomFromText('POINT(0 0)'),
ST_GeomFromText('LINESTRING(2 0, 0 2)')
);
GeoMesa:
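val filter = ECQL.toFilter("DWITHIN(geom, POINT(0 0), 2, kilometers)")
// getFeatureReader takes a Query, so wrap the filter in one for the "example" feature type
val reader = dataStore.getFeatureReader(new Query("example", filter), Transaction.AUTO_COMMIT)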
Summary
PostGIS is a robust spatial extension for PostgreSQL, offering extensive functionality and SQL integration. GeoMesa, on the other hand, is designed for distributed big data systems, providing better scalability for massive datasets. PostGIS excels in traditional relational database environments, while GeoMesa is more suitable for handling large-scale geospatial data in distributed computing frameworks.
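For reference, the PostGIS query in the comparison computes the planar distance from a point to a line segment. The sketch below works through the same calculation in plain Scala, so the expected value for POINT(0 0) and LINESTRING(2 0, 0 2) can be checked by hand. The object and function names are hypothetical, and the code handles a single segment only.

```scala
object DistanceSketch {
  // Planar distance from point (px, py) to the segment (ax, ay)-(bx, by).
  def pointToSegment(px: Double, py: Double,
                     ax: Double, ay: Double,
                     bx: Double, by: Double): Double = {
    val dx = bx - ax
    val dy = by - ay
    val lengthSq = dx * dx + dy * dy
    // Projection of the point onto the segment, clamped to [0, 1].
    val t =
      if (lengthSq == 0.0) 0.0
      else math.max(0.0, math.min(1.0, ((px - ax) * dx + (py - ay) * dy) / lengthSq))
    math.hypot(px - (ax + t * dx), py - (ay + t * dy))
  }
}
```

With the values from the query, `DistanceSketch.pointToSegment(0, 0, 2, 0, 0, 2)` projects the origin onto the midpoint (1, 1) of the segment and returns the square root of 2, about 1.414. The result is in coordinate units, whereas the GeoMesa DWITHIN filter states its distance in kilometers, so the two snippets are not direct equivalents.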
README
GeoMesa is an open source suite of tools that enables large-scale geospatial querying and analytics on distributed computing systems. GeoMesa provides spatio-temporal indexing on top of the Accumulo, HBase and Cassandra databases for massive storage of point, line, and polygon data. GeoMesa also provides near real time stream processing of spatio-temporal data by layering spatial semantics on top of Apache Kafka. Through GeoServer, GeoMesa facilitates integration with a wide range of existing mapping clients over standard OGC (Open Geospatial Consortium) APIs and protocols such as WFS and WMS. GeoMesa supports Apache Spark for custom distributed geospatial analytics.
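One general way to index time alongside space is to split each timestamp into a coarse bin plus an offset within that bin, so that records from the same period sort together. The sketch below illustrates this with week-long bins. The bin size and all names are assumptions made for this example, since the README does not describe how the index keys are laid out.

```scala
object TimeBinSketch {
  private val SecondsPerWeek = 7L * 24 * 60 * 60

  // Split an epoch-second timestamp into
  // (weeks since 1970-01-01T00:00:00Z, seconds into that week).
  def bin(epochSeconds: Long): (Long, Long) =
    (Math.floorDiv(epochSeconds, SecondsPerWeek), Math.floorMod(epochSeconds, SecondsPerWeek))
}
```

Using floor division keeps the offset non-negative for timestamps before 1970 as well.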
GeoMesa is a member of the LocationTech working group of the Eclipse Foundation.
Join the Community
Documentation
- Main documentation
- Upgrade Guide
- Quick Starts: Accumulo | HBase | Cassandra | Kafka | Redis | FileSystem
- Tutorials
Downloads
Latest release: 5.3.0 - Accumulo | HBase | Cassandra | Kafka | Redis | FileSystem | PostGIS
Verifying Downloads
Downloads hosted on GitHub include SHA-256 hashes and gpg signatures (.asc files). To verify a download using gpg, import the appropriate key:
gpg2 --keyserver hkp://pool.sks-keyservers.net --recv-keys CD24F317
Then verify the file:
gpg2 --verify geomesa-accumulo_2.12-5.3.0-bin.tar.gz.asc geomesa-accumulo_2.12-5.3.0-bin.tar.gz
The keys currently used for signing are:
Key ID | Name |
---|---|
CD24F317 | Emilio Lahr-Vivaz <elahrvivaz(-at-)ccri.com> |
1E679A56 | James Hughes <jnh5y(-at-)ccri.com> |
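Besides the gpg signatures, the downloads include SHA-256 hashes. As a minimal sketch of checking a hash from Scala code, the following uses the JDK's `MessageDigest`. The object and method names are hypothetical, and the published hash is assumed to be a hex string.

```scala
import java.security.MessageDigest

object ChecksumSketch {
  // Hex-encoded SHA-256 digest of a byte array.
  def sha256Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256").digest(bytes).map(b => f"${b & 0xff}%02x").mkString

  // Compare against a published hash, ignoring case and surrounding whitespace.
  def matches(bytes: Array[Byte], expectedHex: String): Boolean =
    sha256Hex(bytes).equalsIgnoreCase(expectedHex.trim)
}
```

For a real archive, the bytes could be read with `java.nio.file.Files.readAllBytes`, although streaming the file through the digest would be kinder to memory for large downloads.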
Maven Integration
GeoMesa is hosted on Maven Central. To include it as a dependency, add the desired modules, for example:
<dependency>
  <groupId>org.locationtech.geomesa</groupId>
  <artifactId>geomesa-accumulo-datastore_2.12</artifactId>
  <version>5.3.0</version>
</dependency>
GeoMesa provides a bill-of-materials module, which can simplify version management:
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.locationtech.geomesa</groupId>
      <artifactId>geomesa-bom_2.12</artifactId>
      <version>5.3.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
GeoMesa depends on several third-party libraries that are only available in separate repositories. To include GeoMesa in your project, add the following repositories to your pom:
<repositories>
  <!-- geotools -->
  <repository>
    <id>osgeo</id>
    <url>https://repo.osgeo.org/repository/release</url>
  </repository>
  <!-- confluent -->
  <repository>
    <id>confluent</id>
    <url>https://packages.confluent.io/maven/</url>
  </repository>
</repositories>
Nightly Snapshots
Snapshot versions are published nightly to the Eclipse repository:
<repository>
  <id>geomesa-snapshots</id>
  <url>https://repo.eclipse.org/content/repositories/geomesa-snapshots</url>
  <releases>
    <enabled>false</enabled>
  </releases>
  <snapshots>
    <enabled>true</enabled>
  </snapshots>
</repository>
Spark Runtimes
GeoMesa publishes spark-runtime JARs for integration with Spark environments like Databricks. These shaded JARs include all the required dependencies in a single artifact. When importing through Maven, all transitive dependencies can be excluded. There are Spark runtime JARs available for most of the different DataStore implementations:
<dependency>
  <groupId>org.locationtech.geomesa</groupId>
  <artifactId>geomesa-gt-spark-runtime_2.12</artifactId>
  <version>5.3.0</version>
  <exclusions>
    <exclusion>
      <!-- if groupId wildcards are not supported, the two main ones are jline:* and org.geotools:* -->
      <groupId>*</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
These JARs are also included in the Downloads bundles, above.
sbt Integration
Similarly, integration with sbt is straightforward:
// Add necessary resolvers
resolvers ++= Seq(
  "osgeo" at "https://repo.osgeo.org/repository/release",
  "confluent" at "https://packages.confluent.io/maven"
)
// Select desired modules
libraryDependencies ++= Seq(
  "org.locationtech.geomesa" %% "geomesa-utils" % "5.3.0"
)
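The `%%` operator in the sbt snippet appends the Scala binary version to the module name, which is why the Maven examples spell out artifact IDs such as `geomesa-accumulo-datastore_2.12`. The sketch below mimics that naming rule for Scala 2.x versions. The object and function names are hypothetical, and this is not sbt's own code.

```scala
object CrossVersionSketch {
  // Scala 2.x binary version: "2.12.18" -> "2.12".
  def binaryVersion(scalaVersion: String): String =
    scalaVersion.split('.').toList match {
      case major :: minor :: _ => s"$major.$minor"
      case _                   => scalaVersion
    }

  // Artifact ID as it would appear in a Maven <artifactId> element.
  def artifactId(module: String, scalaVersion: String): String =
    s"${module}_${binaryVersion(scalaVersion)}"
}
```

For example, `CrossVersionSketch.artifactId("geomesa-utils", "2.12.18")` gives `geomesa-utils_2.12`.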
Building from Source
Requirements:
- Git
- Java JDK 11
- Apache Maven 3.6.3 or later
- Docker (only required for running unit tests)
Use Git to download the source code. Navigate to the destination directory, then run:
git clone git@github.com:locationtech/geomesa.git
cd geomesa
The project is built using Maven. To build, run:
mvn clean install -DskipTests
The full build takes quite a while. To speed it up, you may use multiple threads (-T 1.5C).
To run unit tests, omit the -DskipTests flag (note: requires Docker to be available).
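In Maven's `-T` option, a plain number is an absolute thread count and a trailing `C` makes the number a per-core multiplier, so `-T 1.5C` requests 1.5 threads per CPU core. The sketch below estimates the resulting thread count. The names are hypothetical, and the truncation and minimum of one thread are assumptions about Maven's rounding rather than confirmed behavior.

```scala
object MavenThreadsSketch {
  // Interpret a -T value: "8" is an absolute count, "1.5C" is a per-core multiplier.
  def threads(setting: String, cores: Int): Int = {
    val n =
      if (setting.endsWith("C")) (setting.dropRight(1).toDouble * cores).toInt // assumed truncation
      else setting.toInt
    math.max(1, n) // assumed floor of one thread
  }
}
```

On an 8-core machine, `MavenThreadsSketch.threads("1.5C", 8)` evaluates to 12 under these assumptions.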
Build with Bloop Compile Server
GeoMesa also provides experimental support for the Bloop compile server, which provides fast incremental compilation. To export the GeoMesa build to Bloop, run:
./build/scripts/bloop-export.sh
For more information on using Bloop, refer to the Bloop documentation.
Build with Zinc Compile Server
GeoMesa also provides experimental support for the Zinc compile server, which provides fast incremental compilation. However, please note that Zinc is no longer actively maintained.
To use an existing Zinc server, run Maven with -Pzinc. GeoMesa provides a helper script at build/mvn, which is a wrapper around Maven that downloads and runs Zinc automatically:
build/mvn clean install -T8 -DskipTests
If the Zinc build fails with an error finding "javac", try setting the JAVA_HOME environment variable to point to the root of your JDK. Example from a Mac:
JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_51.jdk/Contents/Home" build/mvn clean install
Scala Cross Build
To build for a different Scala version (e.g. 2.13), run the following script, then build as normal:
./build/scripts/change-scala-version.sh 2.13
Building on OS X
When building on OS X and using Docker Desktop in a non-default configuration, you may need to edit ~/.testcontainers.properties to contain the following:
docker.client.strategy=org.testcontainers.dockerclient.UnixSocketClientProviderStrategy