GeoKettle

NEW! Current version: 2.5 // Licence: LGPL

Download GeoKettle on the Spatialytics Market.

Find information and documentation on GeoKettle in the Documentation Center.

Get support and interact with GeoKettle users in the Spatialytics forum.

Report a bug or request a new feature for GeoKettle in the Trac bug/issue tracking system.

Read news and information about GeoKettle on the Spatialytics.ORG blog.

Stay tuned for all the GeoKettle news via Twitter @GeoKettle.

What is GeoKettle?

Learn more about what is NEW in GeoKettle 2.5 here.

GeoKettle is a powerful, metadata-driven spatial ETL tool dedicated to integrating different spatial data sources in order to build and update geospatial data warehouses. GeoKettle enables the Extraction of data from data sources; the Transformation of data in order to correct errors, cleanse the data, change the data structure and make the data compliant with defined standards; and the Loading of transformed data into a target Database Management System (DBMS) in OLTP or OLAP/SOLAP mode, a GIS file or a geospatial web service.
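The three phases described above can be illustrated with a minimal sketch. This is plain Python, not GeoKettle itself (GeoKettle transformations are designed graphically and run by its engine); the sample CSV, the function names `extract`, `transform` and `load`, and the in-memory `warehouse` list are all assumptions made for illustration only.

```python
import csv
import io

# Hypothetical input: a CSV string stands in for a real data source.
RAW = """id,name,lon,lat
1, Quebec ,-71.21,46.81
2,Montreal,-73.57,45.50
3,Broken,not-a-number,45.00
"""

def extract(text):
    """Extraction: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: cleanse values, drop invalid rows, restructure."""
    cleaned = []
    for row in rows:
        try:
            lon, lat = float(row["lon"]), float(row["lat"])
        except ValueError:
            continue  # discard rows whose coordinates cannot be parsed
        cleaned.append({
            "id": int(row["id"]),
            "name": row["name"].strip(),       # trim stray whitespace
            "wkt": f"POINT ({lon} {lat})",     # restructure into WKT text
        })
    return cleaned

def load(rows, target):
    """Loading: append the transformed rows to the target (here, a list)."""
    target.extend(rows)
    return len(rows)

warehouse = []  # stands in for a DBMS table, GIS file or web service
loaded = load(transform(extract(RAW)), warehouse)
```

In this sketch the row with an unparsable longitude is dropped during transformation, so only the two valid rows reach the target.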

GeoKettle is a spatially enabled version of the generic ETL tool Kettle (Pentaho Data Integration). GeoKettle also benefits from the geospatial capabilities of mature, robust and well-known open source libraries such as JTS, GeoTools, deegree, OGR and, via a plugin, Sextante.

GeoKettle has been released under the LGPL.


Geospatial-specific features:


Extract data from:

  • Spatial database types: PostGIS, Oracle Spatial, MySQL, Microsoft SQL Server 2008*, Ingres* and IBM DB2*
  • SOLAP (Spatial OLAP) system: GeoMondrian
  • Geo files (data formats): Shapefile, GML, KML, OGR
  • OGC Web services: Sensor Observation Service (SOS), Catalogue Web Service (CSW)

*Not natively supported; can be used with some modifications.


Transformation of data:

Calculating:

  • Buffers
  • Centroid
  • Random point on surface
  • Area
  • Length
  • Distance
  • Intersection
  • Union
  • Envelope
  • Boundary
  • Convex hull
  • Difference
  • Symmetric difference
  • Inverse geometry
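To make a few of these measures concrete, here is a small pure-Python sketch of area, length, centroid and envelope for planar coordinates. It is not GeoKettle's implementation (GeoKettle relies on libraries such as JTS for geometry operations); the function names and the sample rectangle are assumptions for illustration, and the formulas assume a simple, closed ring in a projected (planar) coordinate system.

```python
import math

def ring_area(ring):
    """Unsigned area of a closed ring, using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def line_length(coords):
    """Sum of the Euclidean lengths of consecutive segments."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def ring_centroid(ring):
    """Area-weighted centroid of a closed ring with non-zero area."""
    a = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        cross = x1 * y2 - x2 * y1
        a += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return (cx / (3.0 * a), cy / (3.0 * a))

def envelope(coords):
    """Bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical sample: a 4 x 3 rectangle, first vertex repeated to close it.
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
```

For the sample rectangle, `ring_area` gives 12.0, `line_length` gives the perimeter 14.0, `ring_centroid` gives (2.0, 1.5) and `envelope` gives (0, 0, 4, 3).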

Geoprocessing:

  • General:
    • Clip
    • Clip with rectangle
    • Split multiparts
  • Points:
    • Delaunay algorithm
  • Lines:
    • Polylines -> polygons
    • Simplify lines
    • Smooth lines
    • Polylines to single segments
    • Split polylines at nodes
    • Split polylines with points
  • Polygons:
    • Simplify polygons
    • Remove holes
    • Polygons -> polylines
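As an illustration of what a line simplification step does, the sketch below uses the Douglas-Peucker algorithm, a common choice for this task. The source does not state which algorithm GeoKettle's "Simplify lines" uses, so this is an assumption; the function names, the sample line and the tolerance values are hypothetical as well.

```python
import math

def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(coords, tolerance):
    """Douglas-Peucker: drop vertices closer than tolerance to the chord."""
    if len(coords) < 3:
        return list(coords)
    first, last = coords[0], coords[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(coords) - 1):
        d = _point_segment_distance(coords[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]  # every interior vertex is within tolerance
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(coords[:index + 1], tolerance)
    right = simplify(coords[index:], tolerance)
    return left[:-1] + right

# Hypothetical sample: a nearly straight stretch followed by a sharp peak.
line = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
simplified = simplify(line, 0.1)
```

With a tolerance of 0.1, the vertex (1, 0.05) is removed because it lies within 0.05 of the chord from (0, 0) to (2, 0), while the peak at (3, 2) is kept. A larger tolerance removes more vertices.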


Load data into a target format:

  • Spatial database loads
  • Spatial data warehouse population
  • Data formats: Shapefile, GML, KML, OGR
  • OGC Web services: Catalogue Web Service (CSW)


Environment:

Cartographic viewer to preview your transformations, including map customization tools and basic cartographic functions.

[Figure: GeoKettle Geo Preview]


Generic ETL Features:


Extract data from:

  • 35+ database types: MySQL, PostgreSQL, Oracle, …
  • Data warehouse types: Mondrian
  • XML files
  • XLS files
  • Web services
  • Xbase files (dBase, FoxPro, etc.)
  • File system information
  • Generated data
  • MS Access files
  • LDAP
  • Other flat files: text files, Excel files, CSV files


Transformation of data:

  • Engine-based data transfer (no code generator)
  • Looking up data in databases, files or memory
  • Calculating
  • Scripting: JavaScript, SQL, RegExp
  • Splitting
  • Mapping
  • Selecting
  • Partitioning
  • Filtering
  • Merging
  • Joining
  • Duplicating
  • Clustering (MPP)
  • Pivoting


Load data into a target format:

  • Database loads
  • Data warehouse population
  • Partitioned loading
  • Bulk loading
  • Parallel loading
  • Clustering


Environment:

  • Full GUI to edit every transformation option
  • Command line tools: execute jobs and transformations
  • Web server: remote execution and clustering, well suited to cloud computing environments for processing very large datasets
  • Programming API for Java
  • Plugin eco-system
