locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.

Top Related Projects

  • rasterio: reads and writes geospatial raster datasets
  • GeoPandas: Python tools for geographic data
  • GDAL: an open source, MIT-licensed translator library for raster and vector geospatial data formats
  • Shapely: manipulation and analysis of geometric objects
  • QGIS: a free, open source, cross-platform (Linux/Windows/macOS) geographical information system (GIS)

Quick Overview

GeoTrellis is an open-source geographic data processing engine for high-performance applications. It is designed to work with large-scale geospatial data, providing fast processing of raster and vector datasets. GeoTrellis is written in Scala and uses Apache Spark for distributed processing.

Pros

  • High-performance processing of large-scale geospatial data
  • Seamless integration with Apache Spark for distributed computing
  • Supports both raster and vector data operations
  • Extensive set of geospatial operations and algorithms

Cons

  • Steep learning curve, especially for those unfamiliar with Scala
  • Limited documentation and examples for some advanced features
  • Requires significant computational resources for large datasets
  • Smaller community compared to some other geospatial libraries

Code Examples

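  1. Reading a GeoTIFF file:
import geotrellis.raster.io.geotiff.reader.GeoTiffReader

val tiff = GeoTiffReader.readSingleband("path/to/file.tif")
  2. Performing a raster operation:
import geotrellis.raster._
import geotrellis.raster.io.geotiff.reader.GeoTiffReader

val raster: Raster[Tile] = GeoTiffReader.readSingleband("path/to/file.tif").raster
// Double every data cell; NODATA cells are left unchanged
val result = raster.mapTile(tile => tile.map(cell => if (isData(cell)) cell * 2 else cell))
  3. Creating a vector feature:
import geotrellis.vector._

val point = Point(0, 0)
val feature = Feature(point, Map("name" -> "Example Point"))
  4. Reprojecting a raster:
import geotrellis.proj4._
import geotrellis.raster._
import geotrellis.raster.io.geotiff.reader.GeoTiffReader

val raster: Raster[Tile] = GeoTiffReader.readSingleband("path/to/file.tif").raster
val sourceCRS = CRS.fromEpsgCode(4326)  // WGS 84 longitude/latitude
val targetCRS = CRS.fromEpsgCode(3857)  // Web Mercator
val reprojected = raster.reproject(sourceCRS, targetCRS)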
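
To give a feel for what the EPSG:4326 to EPSG:3857 reprojection above does to coordinates, here is a minimal sketch of the spherical Web Mercator formula in plain Scala. It is an illustration only, not how GeoTrellis implements reprojection (the library delegates CRS handling to Proj4j, and reprojecting a raster also resamples its cells). The object name, function name, and sample point are hypothetical.

```scala
object WebMercatorSketch {
  // Radius of the sphere used by Web Mercator (WGS 84 semi-major axis, in metres).
  val EarthRadius = 6378137.0

  /** Converts longitude/latitude in degrees (EPSG:4326) to Web Mercator metres (EPSG:3857). */
  def toWebMercator(lonDeg: Double, latDeg: Double): (Double, Double) = {
    val x = EarthRadius * math.toRadians(lonDeg)
    val y = EarthRadius * math.log(math.tan(math.Pi / 4 + math.toRadians(latDeg) / 2))
    (x, y)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample point, chosen only for illustration.
    val (x, y) = toWebMercator(-75.1652, 39.9526)
    println(f"x = $x%.1f m, y = $y%.1f m")
  }
}
```

The formula is undefined at the poles, which is why Web Mercator maps are usually clipped at roughly 85 degrees of latitude.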

Getting Started

To get started with GeoTrellis, add the following dependencies to your build.sbt file:

libraryDependencies ++= Seq(
  "org.locationtech.geotrellis" %% "geotrellis-raster" % "3.6.3",
  "org.locationtech.geotrellis" %% "geotrellis-vector" % "3.6.3",
  "org.locationtech.geotrellis" %% "geotrellis-spark" % "3.6.3"
)

Then, import the necessary modules in your Scala code:

import geotrellis.raster._
import geotrellis.vector._
import geotrellis.spark._

You can now start using GeoTrellis functions and classes in your project.
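
As a first check that the dependencies resolve, a small program along these lines may help. It is a sketch that assumes geotrellis-raster is on the classpath; the object name `FirstTile` and the helper `doubleData` are hypothetical, while `IntArrayTile`, `NODATA`, `isData`, and `isNoData` come from `geotrellis.raster`.

```scala
import geotrellis.raster._

object FirstTile {
  // Double every cell that holds data; NODATA cells are passed through unchanged.
  def doubleData(tile: Tile): Tile =
    tile.map(z => if (isData(z)) z * 2 else z)

  def main(args: Array[String]): Unit = {
    // A tile with 3 columns and 2 rows, containing one NODATA cell.
    val tile: Tile = IntArrayTile(Array(1, 2, NODATA, 4, 5, 6), 3, 2)
    val doubled = doubleData(tile)

    println(doubled.get(0, 0))           // value at column 0, row 0
    println(isNoData(doubled.get(2, 0))) // whether the NODATA cell survived the map
  }
}
```

Guarding with `isData` matters for integer tiles because NODATA is stored as `Int.MinValue`, so arithmetic on it would silently overflow into an ordinary-looking value.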

Competitor Comparisons

Rasterio reads and writes geospatial raster datasets

Pros of rasterio

  • Written in Python, making it more accessible to data scientists and GIS professionals
  • Simpler API and easier to get started with for basic raster operations
  • Better integration with other Python libraries like NumPy and scikit-image

Cons of rasterio

  • Less performant for large-scale distributed processing compared to GeoTrellis
  • Limited support for vector data operations and advanced geospatial analytics
  • Fewer built-in functionalities for complex geospatial workflows

Code Comparison

rasterio example:

import rasterio

with rasterio.open('example.tif') as src:
    data = src.read()
    profile = src.profile

GeoTrellis example:

import geotrellis.raster.io.geotiff.reader.GeoTiffReader

val tiff = GeoTiffReader.readSingleband("example.tif")
val tile = tiff.tile
val extent = tiff.extent

Both libraries provide methods for reading raster data, but rasterio's Python syntax is generally more concise and familiar to many data scientists. GeoTrellis, being Scala-based, offers strong typing and functional programming paradigms, which can be advantageous for complex, distributed processing tasks.

Python tools for geographic data

Pros of GeoPandas

  • Easier to learn and use, especially for those familiar with pandas
  • Better integration with the Python data science ecosystem
  • More extensive documentation and community support

Cons of GeoPandas

  • Less performant for large-scale geospatial data processing
  • Limited support for distributed computing
  • Fewer advanced geospatial analysis capabilities

Code Comparison

GeoPandas:

import geopandas as gpd

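# Read two layers (file names are placeholders)
gdf1 = gpd.read_file("data.shp")
gdf2 = gpd.read_file("other.shp")

# Perform a spatial join; `predicate` replaced the deprecated `op` argument in GeoPandas 0.10
result = gpd.sjoin(gdf1, gdf2, how="inner", predicate="intersects")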

GeoTrellis:

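import geotrellis.vector._
import geotrellis.shapefile.ShapeFileReader

// Read two shapefiles (requires the geotrellis-shapefile module; file names are placeholders)
val features = ShapeFileReader.readMultiPolygonFeatures("data.shp")
val otherFeatures = ShapeFileReader.readMultiPolygonFeatures("other.shp")

// Perform a spatial join by pairing features whose geometries intersect (a simple nested loop)
val joined = for {
  a <- features
  b <- otherFeatures
  if a.geom.intersects(b.geom)
} yield (a, b)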

GeoTrellis offers more advanced geospatial processing capabilities and better performance for large datasets, while GeoPandas provides a more user-friendly interface and better integration with the Python ecosystem. The choice between the two depends on the specific requirements of the project, such as data size, processing complexity, and the preferred programming language.

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.

Pros of GDAL

  • Broader support for geospatial data formats and operations
  • More mature and widely adopted in the geospatial community
  • Extensive command-line utilities for data processing

Cons of GDAL

  • Steeper learning curve, especially for non-GIS specialists
  • Less integrated with big data processing frameworks
  • Primarily C/C++ based, which may be less accessible for some developers

Code Comparison

GDAL (Python bindings):

from osgeo import gdal
dataset = gdal.Open("example.tif")
band = dataset.GetRasterBand(1)
data = band.ReadAsArray()

GeoTrellis (Scala):

import geotrellis.raster.io.geotiff.reader.GeoTiffReader
val tiff = GeoTiffReader.readSingleband("example.tif")
val tile = tiff.tile

GeoTrellis focuses on distributed processing of geospatial data using Scala and Apache Spark, making it well-suited for big data applications. It offers a more functional programming approach and integrates seamlessly with other Spark-based workflows.

GDAL, on the other hand, provides a comprehensive toolkit for working with geospatial data across various formats and coordinate systems. It's widely used in the GIS industry and offers bindings for multiple programming languages, making it versatile for different development environments.

Manipulation and analysis of geometric objects

Pros of Shapely

  • Simpler API and easier to learn for beginners
  • Lightweight and focused on geometric operations
  • Better documentation and more examples available

Cons of Shapely

  • Geometric operations are planar (2D); z-coordinates can be stored but are ignored in analysis
  • Lacks advanced geospatial analysis capabilities
  • No built-in support for distributed processing

Code Comparison

Shapely:

from shapely.geometry import Point, Polygon

point = Point(0, 0)
polygon = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
is_within = point.within(polygon)

GeoTrellis:

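import geotrellis.vector._

val point = Point(0, 0)
// The exterior ring must be closed: the last point repeats the first
val polygon = Polygon(LineString(Point(0, 0), Point(1, 0), Point(1, 1), Point(0, 1), Point(0, 0)))
val isWithin = polygon.contains(point)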

Both libraries provide similar functionality for basic geometric operations, but GeoTrellis offers more advanced features for large-scale geospatial data processing. Shapely is more accessible for Python developers and simpler projects, while GeoTrellis is better suited for complex, distributed geospatial applications in Scala.
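
One detail in the two snippets above is easy to miss: the test point (0, 0) sits on the polygon's boundary, and both `within` and `contains` are false for boundary points. The sketch below shows this with JTS directly, on the assumption that JTS is on the classpath (geotrellis-vector extends JTS, per the package list in the README). The object and value names are hypothetical.

```scala
import org.locationtech.jts.geom.{Coordinate, GeometryFactory}

object BoundarySketch {
  private val gf = new GeometryFactory()

  // Unit square; the ring is closed by repeating the first coordinate.
  val square = gf.createPolygon(Array(
    new Coordinate(0.0, 0.0), new Coordinate(1.0, 0.0), new Coordinate(1.0, 1.0),
    new Coordinate(0.0, 1.0), new Coordinate(0.0, 0.0)))

  val corner = gf.createPoint(new Coordinate(0.0, 0.0)) // lies on the boundary
  val inside = gf.createPoint(new Coordinate(0.5, 0.5)) // lies in the interior

  def main(args: Array[String]): Unit = {
    println(square.contains(corner)) // false: boundary points are not "contained"
    println(square.covers(corner))   // true: covers also accepts boundary points
    println(square.contains(inside)) // true
  }
}
```

When boundary points should count as inside, `covers` is usually the predicate to reach for.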

QGIS is a free, open source, cross platform (lin/win/mac) geographical information system (GIS)

Pros of QGIS

  • Comprehensive GUI for geospatial data visualization and analysis
  • Extensive plugin ecosystem for additional functionality
  • Supports a wide range of geospatial data formats and operations

Cons of QGIS

  • Steeper learning curve for non-GIS professionals
  • Performance can be slower for large datasets compared to GeoTrellis
  • Less suitable for distributed processing of big geospatial data

Code Comparison

QGIS (Python):

layer = QgsVectorLayer("path/to/shapefile.shp", "layer_name", "ogr")
if not layer.isValid():
    print("Layer failed to load!")
QgsProject.instance().addMapLayer(layer)

GeoTrellis (Scala):

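// Assumes an implicit SparkContext in scope; bucket and prefix are placeholders
val rdd: RDD[(ProjectedExtent, Tile)] = S3GeoTiffRDD.spatial("bucket", "key")
// Collect layer metadata, then cut the source tiles to a 512x512 layout
val (zoom, metadata) = rdd.collectMetadata[SpatialKey](FloatingLayoutScheme(512))
val tiled: RDD[(SpatialKey, Tile)] = rdd.tileToLayout(metadata.cellType, metadata.layout, Bilinear)
tiled.cache()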

GeoTrellis focuses on distributed processing of geospatial data using Scala and Apache Spark, making it more suitable for big data applications. QGIS, on the other hand, provides a user-friendly interface for various GIS tasks and is more accessible to users without programming experience. The choice between the two depends on the specific use case and technical requirements of the project.

README

GeoTrellis

GeoTrellis is a Scala library and framework that provides APIs for reading, writing and operating on geospatial raster and vector data. GeoTrellis also provides helpers for these same operations in Spark and for performing MapAlgebra operations on rasters. It is released under the Apache 2 License.

Please visit the project site for more information as well as some interactive demos.

You're also welcome to ask questions and talk to developers (let us know what you're working on!) via Gitter.

Getting Started

GeoTrellis is currently available for Scala 2.12 and 2.13, using Spark 3.3.x.

To get started with SBT, simply add the following to your build.sbt file:

libraryDependencies += "org.locationtech.geotrellis" %% "geotrellis-raster" % "<latest version>"

To grab the latest SNAPSHOT, RC or milestone build, add these resolvers:

// maven central snapshots
resolvers ++= Seq(
  "central-snapshots" at "https://central.sonatype.com/repository/maven-snapshots/"
)

// or eclipse snapshots
resolvers ++= Seq(
  "eclipse-releases" at "https://repo.eclipse.org/content/groups/releases",
  "eclipse-snapshots" at "https://repo.eclipse.org/content/groups/snapshots"
)

If you are just getting started with GeoTrellis, we recommend familiarizing yourself with the geotrellis-raster package, but it is just one of the many available. The complete list of published GeoTrellis packages includes:

  • geotrellis-accumulo: Accumulo store integration for GeoTrellis
  • geotrellis-accumulo-spark: Accumulo store integration for GeoTrellis + Spark
  • geotrellis-cassandra: Cassandra store integration for GeoTrellis
  • geotrellis-cassandra-spark: Cassandra store integration for GeoTrellis + Spark
  • geotrellis-gdal: GDAL bindings for GeoTrellis
  • geotrellis-geotools: Conversions to and from GeoTools Vector and Raster data
  • geotrellis-hbase: HBase store integration for GeoTrellis
  • geotrellis-hbase-spark: HBase store integration for GeoTrellis + Spark
  • geotrellis-layer: Datatypes to describe sets of rasters
  • geotrellis-macros: Performance optimizations for GeoTrellis operations
  • geotrellis-proj4: Coordinate Reference systems and reproject (Scala wrapper around Proj4j)
  • geotrellis-raster: Raster data types and operations, including MapAlgebra
  • geotrellis-raster-testkit: Testkit for testing geotrellis-raster types
  • geotrellis-s3: Amazon S3 store integration for GeoTrellis
  • geotrellis-s3-spark: Amazon S3 store integration for GeoTrellis + Spark
  • geotrellis-shapefile: Read ESRI Shapefiles into GeoTrellis data types via GeoTools
  • geotrellis-spark: Geospatially enables Spark and provides primitives for external data stores
  • geotrellis-spark-pipeline: DSL for geospatial ingest jobs using GeoTrellis + Spark
  • geotrellis-spark-testkit: Testkit for testing geotrellis-spark code
  • geotrellis-store: Abstract interfaces for storage services, with concrete implementations for local and Hadoop filesystems
  • geotrellis-util: Miscellaneous GeoTrellis helpers
  • geotrellis-vector: Vector data types and operations extending JTS
  • geotrellis-vector-testkit: Testkit for testing geotrellis-vector types
  • geotrellis-vectortile: Experimental vector tile support, including reading and writing

A more complete feature list can be found on the Module Hierarchy page of the GeoTrellis documentation. If you're looking for a specific feature or operation, we suggest searching there or reaching out on Gitter.

For older releases, check the complete list of packages and versions available at locationtech-releases.

Hello Raster

scala> import geotrellis.raster._
import geotrellis.raster._

scala> import geotrellis.raster.render.ascii._
import geotrellis.raster.render.ascii._

scala> import geotrellis.raster.mapalgebra.focal._
import geotrellis.raster.mapalgebra.focal._

scala> val nd = NODATA
nd: Int = -2147483648

scala> val input = Array[Int](
     nd, 7, 1, 1,  3, 5, 9, 8, 2,
      9, 1, 1, 2,  2, 2, 4, 3, 5,
      3, 8, 1, 3,  3, 3, 1, 2, 2,
      2, 4, 7, 1, nd, 1, 8, 4, 3)
input: Array[Int] = Array(-2147483648, 7, 1, 1, 3, 5, 9, 8, 2, 9, 1, 1, 2,
2, 2, 4, 3, 5, 3, 8, 1, 3, 3, 3, 1, 2, 2, 2, 4, 7, 1, -2147483648, 1, 8, 4, 3)

scala> val iat = IntArrayTile(input, 9, 4)  // 9 and 4 here specify columns and rows
iat: geotrellis.raster.IntArrayTile = IntArrayTile([I@278434d0,9,4)

// The renderAscii method is mostly useful when you're working with small tiles
// which can be taken in at a glance.
scala> iat.renderAscii(AsciiArtEncoder.Palette.STIPLED)
res0: String =
∘█  ▚▜██▖
█  ▖▖▖▜▚▜
▚█ ▚▚▚ ▖▖
▖▜█ ∘ █▜▚

scala> val focalNeighborhood = Square(1)  // a 3x3 square neighborhood
focalNeighborhood: geotrellis.raster.mapalgebra.focal.Square =
 O  O  O
 O  O  O
 O  O  O

scala> val meanTile = iat.focalMean(focalNeighborhood)
meanTile: geotrellis.raster.Tile = DoubleArrayTile([D@7e31c125,9,4)

scala> meanTile.getDouble(0, 0)  // Should equal (1 + 7 + 9) / 3
res1: Double = 5.666666666666667
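
The result above, (1 + 7 + 9) / 3, suggests that the focal mean averages only the data cells that fall inside the grid, skipping NODATA and anything beyond the edge. The plain-Scala sketch below reproduces that behaviour without GeoTrellis. It is an illustration consistent with the REPL output, not the library's implementation, and the names `FocalMeanSketch`, `NoData`, and `focalMean` are hypothetical.

```scala
object FocalMeanSketch {
  // Stand-in for the integer NODATA value shown in the REPL output (Int.MinValue).
  val NoData: Int = Int.MinValue

  // The same 9-column, 4-row grid as in the REPL session.
  val input: Array[Int] = Array(
    NoData, 7, 1, 1,      3, 5, 9, 8, 2,
         9, 1, 1, 2,      2, 2, 4, 3, 5,
         3, 8, 1, 3,      3, 3, 1, 2, 2,
         2, 4, 7, 1, NoData, 1, 8, 4, 3)

  /** Mean of the data cells in the (2r+1) x (2r+1) window centred on (col, row).
    * Cells outside the grid and NODATA cells are skipped.
    * Returns NaN when the window holds no data cells.
    */
  def focalMean(cells: Array[Int], cols: Int, rows: Int, col: Int, row: Int, r: Int = 1): Double = {
    val values = for {
      y <- (row - r) to (row + r)
      if y >= 0 && y < rows
      x <- (col - r) to (col + r)
      if x >= 0 && x < cols
      v = cells(y * cols + x)
      if v != NoData
    } yield v.toDouble
    if (values.isEmpty) Double.NaN else values.sum / values.size
  }

  def main(args: Array[String]): Unit = {
    // Top-left corner: the window holds NODATA, 7, 9 and 1, so the mean is 17 / 3.
    println(focalMean(input, 9, 4, 0, 0))
  }
}
```

The same function applied to the NODATA cell at column 4, row 3 averages its five data neighbours (3, 3, 3, 1, 1), so a focal mean can fill small gaps as a side effect.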

Documentation

Documentation is available at geotrellis.io/documentation.

Scaladocs for the master branch are available here.

Further examples and documentation of GeoTrellis use-cases can be found in the docs/ folder.

Contributing

Feedback and contributions to the project, no matter what kind, are always very welcome. A CLA is required for contribution; see Contributing for more information. Please refer to the Scala style guide for formatting patches to the codebase.

Where is our commit history and contributor list prior to Nov 2016?

The entire old history is available in the _old/master branch.

Why?

In November 2016, GeoTrellis moved its repository from the GeoTrellis GitHub organization to its current home in the LocationTech GitHub organization. In the process of moving our repository, we went through an IP review process. Because the Eclipse Foundation only reviews a snapshot of the repository, and not all of its history, we had to start from a clean master branch.

Unfortunately, we lost our commit and contributor count in the move. These are significant statistics for a repository, and our current counts make us look younger than we are. GeoTrellis has been an open source project since 2011. This is what our contributor and commit count looked like before the move to LocationTech:

[Image: commit and contributor count before the LocationTech move]

Along with counts, we want to make sure that all the awesome people who contributed to GeoTrellis before the LocationTech move can still be credited on a contributors page. For posterity, the following contributors page is preserved as it was before the move:

https://github.com/lossyrob/geotrellis-before-locationtech/graphs/contributors

Tie Local History to Old History

You can also tie your local clone's master history to the old history by running

> git fetch origin refs/replace/*:refs/replace/*

if origin points to https://github.com/locationtech/geotrellis. This will allow you to see the old history for commands like git log.