Notes: Platforms for Big EO data

The following text consists of extracts from the research article cited below. It has been copied with very few edits and is presented here for study purposes.

An Overview of Platforms for Big Earth Observation Data Management and Analysis https://www.mdpi.com/692396  

  1. Google Earth Engine:

Google Earth Engine (GEE) is a cloud-based platform for large-scale scientific analysis and visualization of geospatial datasets, offered as a free service based on Google’s infrastructure.

GEE provides a JavaScript API and a Python API for data management and analysis. There are four object types to represent data that can be manipulated by its API. 


The Image type represents raster data, consisting of one or more bands, each with a name, data type, scale, and projection. A time series of Images is represented by the ImageCollection type. Vector data is represented by the Feature type, which consists of a geometry (point, line, or polygon) and a list of attributes. The FeatureCollection type represents groups of related Features and provides functions to manipulate this data, such as sorting, filtering, and visualization.
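
To make the relationship between the four types easier to picture, here is a minimal sketch in plain Python. It is not the Earth Engine API: the class names only mirror the types described above, and every field and method name (`acquired`, `sorted_by_date`, `filter`, `sort`) is an assumption made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


# Hypothetical stand-ins for the four object types described above.
# They are NOT the Earth Engine classes, only a picture of the structure.
@dataclass
class Band:
    name: str
    data_type: str    # e.g. "uint16"
    scale: float      # pixel size in metres
    projection: str   # e.g. "EPSG:4326"


@dataclass
class Image:
    bands: List[Band]
    acquired: str     # ISO date (assumed field), used to order a time series


@dataclass
class ImageCollection:
    images: List[Image] = field(default_factory=list)

    def sorted_by_date(self) -> List[Image]:
        # ISO dates sort correctly as plain strings.
        return sorted(self.images, key=lambda img: img.acquired)


@dataclass
class Feature:
    geometry_type: str                 # "Point", "LineString" or "Polygon"
    coordinates: list
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FeatureCollection:
    features: List[Feature] = field(default_factory=list)

    def filter(self, key: str, value: Any) -> "FeatureCollection":
        # Keep only the features whose attribute `key` equals `value`.
        kept = [f for f in self.features if f.attributes.get(key) == value]
        return FeatureCollection(kept)

    def sort(self, key: str) -> "FeatureCollection":
        # Order features by one attribute, ascending.
        return FeatureCollection(sorted(self.features, key=lambda f: f.attributes[key]))


# Usage: three made-up field points, filtered by crop and sorted by area.
fields = FeatureCollection([
    Feature("Point", [85.3, 27.7], {"crop": "rice", "area_ha": 2.0}),
    Feature("Point", [84.0, 28.2], {"crop": "maize", "area_ha": 1.0}),
    Feature("Point", [83.5, 27.5], {"crop": "rice", "area_ha": 0.5}),
])
rice_fields = fields.filter("crop", "rice").sort("area_ha")
```

In the real platform these objects live on Google's servers and are manipulated through the JavaScript or Python API; the sketch only shows how a collection type groups the single-item type and adds filtering and sorting on top.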


To process and analyze data available in the GEE public catalog or data from the user’s private repository, GEE provides an operator library for the object types.


The GEE library has over 800 functions for handling big EO data sets. Despite this large number of built-in functions, they are closed, and users cannot update or extend their basic functionality. While GEE provides a friendly environment for scientists, implementing procedures that are not available through the GEE API functions requires significant user effort. Besides that, GEE only offers programming interfaces that support pixel-based processing, and it has no native region-based methods such as image segmentation or large-scale time series analysis. Users can share their scripts and assets with other users of the platform. Nevertheless, it is important to keep in mind that these scripts use algorithms implemented internally by the platform, and that these algorithms are closed and cannot be extended on the server side.


  1. Sentinel Hub:

Sentinel Hub (SH) is a platform developed by Sinergise that provides Sentinel data access and visualization services. It is a private platform with public access (https://www.sentinel-hub.com). Unlike Google Earth Engine, SH restricts access to functionality according to payment plan. The free plan only allows viewing, selecting and downloading raw data. Paid plans add data access through OGC protocols and a specific API, data processing, mobile application data access, higher resource access limits, and technical support.


SH uses the concepts of Data Source, Instance and Layer to represent the data available in its services. Data Source is an abstraction equivalent to the GEE ImageCollection.


An Instance in the SH platform acts as a distinct OGC service that can be configured to provide a set of Layers that fulfill user needs. Each Layer is associated with one or more bands of a specific Data Source and a processing script. These scripts, which SH calls Evalscripts, are applied to each pixel of the data requested by the user. A script cannot access the data of a pixel’s neighborhood during execution, so it is essentially limited to operations between bands.
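
The per-pixel restriction described above can be illustrated with a small sketch. This is a Python analogy, not Evalscript syntax, and the helper names (`ndvi_script`, `apply_per_pixel`) are assumptions invented for the example. The band names B04 (red) and B08 (near infrared) follow Sentinel-2 naming. The point is that the script receives only the band values of one pixel, so band arithmetic such as NDVI is possible while anything that needs neighbouring pixels is not.

```python
from typing import Callable, Dict, List

Pixel = Dict[str, float]   # band name -> value for ONE pixel
Grid = List[List[float]]   # rows of values for a single band


def ndvi_script(pixel: Pixel) -> float:
    """Per-pixel band arithmetic: sees only this pixel's own band values."""
    red, nir = pixel["B04"], pixel["B08"]
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def apply_per_pixel(bands: Dict[str, Grid], script: Callable[[Pixel], float]) -> Grid:
    """Call `script` once per pixel; no neighbouring values are ever passed in."""
    names = list(bands)
    rows = len(bands[names[0]])
    cols = len(bands[names[0]][0])
    return [
        [script({name: bands[name][r][c] for name in names}) for c in range(cols)]
        for r in range(rows)
    ]


# Usage with a made-up 2x2 scene.
scene = {
    "B04": [[0.25, 0.5], [0.75, 0.0]],
    "B08": [[0.75, 0.5], [0.25, 0.0]],
}
ndvi = apply_per_pixel(scene, ndvi_script)
```

A smoothing filter or an image segmentation could not be written as such a script, because each call would need the values of surrounding pixels.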


  1. Open Data Cube:

Open Data Cube (ODC), previously known as the Australian Geoscience Data Cube (AGDC), is an analytical framework composed of a series of data structures and tools that facilitate the organization and analysis of EO data. It is available under the Apache 2.0 license as a suite of applications.

The source code of ODC and its tools is open and is officially distributed through dozens of git repositories (https://github.com/opendatacube).


  1. SEPAL

The System for Earth Observation Data Access, Processing and Analysis for Land Monitoring (SEPAL) is a cloud computing platform developed for the automatic monitoring of land cover. It combines cloud services, such as Google Earth Engine and Amazon Web Services Cloud (AWS), with free software, such as Orfeo Toolbox, GDAL, RStudio, R Shiny Server, SNAP Toolkit and OpenForis Geospatial. The main focus of this platform is on building an environment with preconfigured tools and on managing the use of computational resources in the cloud, to make it easier for scientists to search, access, process and analyze EO data, especially in countries with poor Internet connectivity and few computational resources.

 

SEPAL is an initiative of the Forestry Department of the United Nations Food and Agriculture Organization (FAO) and is financed by Norway. Its source code (https://github.com/openforis/sepal) is available under the MIT license. It can be accessed through a web portal (https://sepal.io).


  1.  JEODPP

The Joint Research Center (JRC) Earth Observation Data and Processing Platform (JEODPP) is a closed solution developed since 2016 by the JRC for the storage and processing of large volumes of Earth observation data. This platform has features for interactive data processing and visualization, virtual desktop and batch data processing. This platform uses a set of servers for data storage and another set for processing. The storage servers use the EOS distributed file system and store the data in its original format, with only pyramidal representations added to speed up the reading and visualization of the data.
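
The idea behind a pyramidal representation can be sketched as follows. The source does not say how JEODPP builds its pyramids, so the 2x2 averaging used here is an assumption, as are the function names. The sketch also assumes a square grid whose side length is a power of two.

```python
from typing import List

Grid = List[List[float]]


def downsample_2x(grid: Grid) -> Grid:
    """Halve the resolution by averaging each 2x2 block (needs even dimensions)."""
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, len(grid[0]), 2)
        ]
        for r in range(0, len(grid), 2)
    ]


def build_pyramid(base: Grid) -> List[Grid]:
    """Return [full resolution, 1/2, 1/4, ...] down to a single pixel."""
    levels = [base]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downsample_2x(levels[-1]))
    return levels


# Usage with a made-up 4x4 raster.
raster = [
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
    [5.0, 5.0, 7.0, 7.0],
    [5.0, 5.0, 7.0, 7.0],
]
pyramid = build_pyramid(raster)
```

A viewer zoomed far out can read one of the small, coarse levels instead of the full-resolution data, which is why such levels speed up reading and visualization.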

The JEODPP does not have tools to facilitate the sharing of analysis among researchers. This capability is only available for internal use of JRC and there is no source code available for implementation in other institutions.


  1.  pipsCloud

pipsCloud is a proprietary solution developed by Chinese research institutions for the management and processing of large volumes of EO data based on cloud computing. The file system used in pipsCloud is HPGFS, a proprietary file system also developed by Chinese institutions and which is not available for use by third parties. Its cloud environment is implemented in the organization’s internal infrastructure using OpenStack technology, which allows the construction of virtualized services infrastructure.


  1. OpenEO

The OpenEO project started in October 2017 to meet the need to consolidate available technologies for storing, processing and analyzing large volumes of EO data. This demand arises from the difficulty that many users of EO data have in migrating their data analytics to cloud-based processing platforms. In many cases the main reason is not technical, but the fear of becoming dependent on the provider of the chosen platform. OpenEO aims to reduce these concerns by providing a mechanism for scientists to develop their applications and analyses using a single standard that can be processed in different systems, which also makes it easier to compare providers. With this approach, OpenEO aims to lower the entry barriers for the EO community to cloud computing technologies and big EO data analysis platforms.

To this end, the system is being developed as a common, open-source interface (https://github.com/Open-EO, Apache 2.0 license) that facilitates integration between EO data storage and analysis systems and applications of the European Copernicus program.
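
The single-standard idea can be pictured as a workflow described as plain data that any compliant back end could interpret. The sketch below is modelled on the openEO process-graph JSON layout as I understand it; the process names and arguments should be checked against the openEO specification. The collection name `EXAMPLE_SENTINEL2`, the extent values and the helpers `referenced_nodes` and `validate` are hypothetical.

```python
import json
from typing import Any, Dict, Iterator

# A workflow as data: load imagery, compute NDVI, save the result.
process_graph: Dict[str, Dict[str, Any]] = {
    "load": {
        "process_id": "load_collection",
        "arguments": {
            "id": "EXAMPLE_SENTINEL2",  # hypothetical collection name
            "spatial_extent": {"west": 85.2, "south": 27.6, "east": 85.5, "north": 27.8},
            "temporal_extent": ["2020-01-01", "2020-02-01"],
            "bands": ["B04", "B08"],
        },
    },
    "ndvi": {
        "process_id": "ndvi",
        "arguments": {"data": {"from_node": "load"}, "nir": "B08", "red": "B04"},
    },
    "save": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "ndvi"}, "format": "GTiff"},
        "result": True,
    },
}


def referenced_nodes(value: Any) -> Iterator[str]:
    """Yield every node name referenced through a {"from_node": ...} entry."""
    if isinstance(value, dict):
        if "from_node" in value:
            yield value["from_node"]
        for item in value.values():
            yield from referenced_nodes(item)
    elif isinstance(value, list):
        for item in value:
            yield from referenced_nodes(item)


def validate(graph: Dict[str, Dict[str, Any]]) -> None:
    """Check for exactly one result node and no dangling references."""
    results = [name for name, node in graph.items() if node.get("result")]
    if len(results) != 1:
        raise ValueError("expected exactly one result node")
    for name, node in graph.items():
        for ref in referenced_nodes(node["arguments"]):
            if ref not in graph:
                raise ValueError(f"{name} references unknown node {ref}")


validate(process_graph)
# The same JSON text could, in principle, be sent to different providers.
payload = json.dumps({"process_graph": process_graph}, indent=2)
```

Because the workflow is only a JSON document, the analysis is not tied to one provider's programming interface, which is the dependency concern described above.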

Text extracted and copied from Gomes, V., Queiroz, G., & Ferreira, K. (2020). An Overview of Platforms for Big Earth Observation Data Management and Analysis. Remote Sensing, 12(8), 1253. https://doi.org/10.3390/rs12081253

