EO Exploitation Platform Common Architecture
ADES Design Document
EOEPCA.SDD.xxx
EUROPEAN SPACE AGENCY CONTRACT REPORT |
TELESPAZIO VEGA UK Ltd |
AMENDMENT HISTORY
This document shall be amended by releasing a new edition of the document in its entirety.
The Amendment Record Sheet below records the history and issue status of this document.
Table 1. Amendment Record Sheet
ISSUE | DATE | REASON |
---|---|---|
0.1 | 14/04/2020 | Initial in-progress draft |
0.1.1 | 09/06/2020 | Update Design with ADES release 0.1 |
0.1.2 | 06/07/2020 | Updated Design for ADES release 0.1. Added Functional Design and External Interfaces sections |
0.2.0 | 14/09/2020 | Updated Design for ADES release 0.2. Updated Functional Design section |
0.2.1 | 07/10/2020 | Updated Design for ADES release 0.2. Updated Functional Design and Components Design section |
1. Introduction
1.1. Purpose and Scope
This document presents the ADES Design for the Common Architecture.
1.2. Structure of the Document
- Section 2 - Overview
-
Provides an overview of the ADES component, within the context of the wider Common Architecture design.
- Section 3 - Components Design
-
Provides the design of the ADES component.
1.3. Reference Documents
The following is a list of Reference Documents with a direct bearing on the content of this document.
Reference | Document Details | Version |
---|---|---|
EOEPCA - Master System Design Document |
Issue 1.0, |
|
EOEPCA - Use Case Analysis |
Issue 1.0, |
|
Exploitation Platform - Functional Model, |
Issue 1.0, |
|
Thematic Exploitation Platform Open Architecture, |
Issue 1, |
|
OGC Web Service Common, |
06-121r9, |
|
OGC Web Processing Service, |
05-007r7, |
|
OGC® WPS 2.0 Interface Standard, |
14-065r2, |
|
OGC WPS 2.0 REST/JSON Binding Extension, Draft, |
1.0-draft |
|
OGC OWS Context Conceptual Model, |
12-080r2, |
|
OGC OWS Context Atom Encoding Standard, |
12-084r2, |
|
Common Workflow Language Specifications, |
v1.0.2 |
|
OGC Testbed-13, EP Application Package Engineering Report, |
17-023, |
|
OGC Testbed-13, Application Deployment and Execution Service Engineering Report, |
17-024, |
|
OGC Testbed-14, Application Package Engineering Report, |
18-049r1, |
|
OGC Testbed-14, ADES & EMS Results and Best Practices Engineering Report, |
18-050r1, |
|
OGC Testbed-14: WPS-T Engineering Report, |
18-036r1, |
|
OGC API Hackathon 2019 Engineering Report, |
19-062, |
|
OpenSearch GEO: OpenSearch Geo and Time Extensions, |
10-032r8, |
|
OpenSearch EO: OGC OpenSearch Extension for Earth Observation, |
13-026r9, |
|
OGC EO Dataset Metadata GeoJSON(-LD) Encoding Standard, |
17-003r1/17-084 |
|
OGC OpenSearch-EO GeoJSON(-LD) Response Encoding Standard, |
17-047 |
|
The Payment Card Industry Data Security Standard, |
v3.2.1 |
|
CEOS OpenSearch Best Practise, |
v1.2, |
|
OpenID Connect Core 1.0, |
v1.0, |
|
OGC Catalogue Services 3.0 Specification - HTTP Protocol Binding (Catalogue Services for the Web), |
v3.0, |
|
OGC Web Map Server Implementation Specification, |
v1.3.0, |
|
OGC Web Map Tile Service Implementation Standard, |
v1.0.0, |
|
OGC Web Feature Service 2.0 Interface Standard – With Corrigendum, |
v2.0.2, |
|
OGC Web Coverage Service (WCS) 2.1 Interface Standard - Core, |
v2.1, |
|
Web Coverage Processing Service (WCPS) Language Interface Standard, |
v1.0.0, |
|
Amazon Simple Storage Service REST API, |
API Version 2006-03-01 |
|
OpenAPI Specification, |
API Version 3.0.3 2020-02-20 |
1.4. Terminology
The following terms are used in the Master System Design.
Term | Meaning |
---|---|
Admin |
User with administrative capability on the EP |
Algorithm |
A self-contained set of operations to be performed, typically to achieve a desired data manipulation. The algorithm must be implemented (codified) for deployment and execution on the platform. |
Analysis Result |
The Products produced as output of an Interactive Application analysis session. |
Analytics |
A set of activities aimed at discovering, interpreting and communicating meaningful patterns within the data. Analytics considered here are performed manually (or in a semi-automatic way) on-line with the aid of Interactive Applications. |
Application Artefact |
The 'software' component that provides the execution unit of the Application Package. |
Application Deployment and Execution Service (ADES) |
Web Processing Service that incorporates the Docker execution engine, and is responsible for the execution of the processing service (as a WPS request) within the ‘target’ Exploitation Platform. |
Application Descriptor |
A file that provides the metadata part of the Application Package. Provides all the metadata required to accommodate the processor within the WPS service and make it available for execution. |
Application Package |
A platform independent and self-contained representation of a software item, providing executable, metadata and dependencies such that it can be deployed to and executed within an Exploitation Platform. Comprises the Application Descriptor and the Application Artefact. |
Bulk Processing |
Execution of a Processing Service on large amounts of data specified by AOI and TOI. |
Code |
The codification of an algorithm performed with a given programming language - compiled to Software or directly executed (interpreted) within the platform. |
Compute Platform |
The Platform on which execution occurs (this may differ from the Host or Home platform where federated processing is happening) |
Consumer |
User accessing existing services/products within the EP. Consumers may be scientific/research or commercial, and may or may not be experts of the domain |
Data Access Library |
An abstraction of the interface to the data layer of the resource tier. The library provides bindings for common languages (including python, Javascript) and presents a common object model to the code. |
Development |
The act of building new products/services/applications to be exposed within the platform and made available for users to conduct exploitation activities. Development may be performed inside or outside of the platform. If performed outside, an integration activity will be required to accommodate the developed service so that it is exposed within the platform. |
Discovery |
User finds products/services of interest to them based upon search criteria. |
Execution |
The act to start a Processing Service or an Interactive Application. |
Execution Management Service (EMS) |
The EMS is responsible for the orchestration of workflows, including the possibility of steps running on other (remote) platforms, and the on-demand deployment of processors to local/remote ADES as required. |
Expert |
User developing and integrating added-value to the EP (Scientific Researcher or Service Developer) |
Exploitation Tier |
The Exploitation Tier represents the end-users who exploit the services of the platform to perform analysis, or using high-level applications built-in on top of the platform’s services |
External Application |
An application or script that is developed and executed outside of the Exploitation Platform, but is able to use the data/services of the EP via a programmatic interface (API). |
Guest |
An unregistered User or an unauthenticated Consumer with limited access to the EP’s services |
Home Platform |
The Platform on which a User is based or from which an action was initiated by a User |
Host Platform |
The Platform through which a Resource has been published |
Identity Provider (IdP) |
The source for validating user identity in a federated identity system, (user authentication as a service). |
Interactive Application |
A stand-alone application provided within the exploitation platform for on-line hosted processing. Provides an interactive interface through which the user is able to conduct their analysis of the data, producing Analysis Results as output. Interactive Applications include at least the following types: console application, web application (rich browser interface), remote desktop to a hosted VM. |
Interactive Console Application |
A simple Interactive Application for analysis in which a console interface to a platform-hosted terminal is provided to the user. The console interface can be provided through the user’s browser session or through a remote SSH connection. |
Interactive Remote Desktop |
An Interactive Application for analysis provided as a remote desktop session to an OS-session (or directly to a 'native' application) on the exploitation platform. The user will have access to a number of applications within the hosted OS. The remote desktop session is provided through the user’s web browser. |
Interactive Web Application |
An Interactive Application for analysis provided as a rich user interface through the user’s web browser. |
Key-Value Pair |
A key-value pair (KVP) is an abstract data type that includes a group of key identifiers and a set of associated values. Key-value pairs are frequently used in lookup tables, hash tables and configuration files. |
Kubernetes (K8s) |
Container orchestration system for automating application deployment, scaling and management. |
Login Service |
An encapsulation of Authenticated Login provision within the Exploitation Platform context. The Login Service is an OpenID Connect Provider that is used purely for authentication. It acts as a Relying Party in flows with external IdPs to obtain access to the user’s identity. |
EO Network of Resources |
The coordinated collection of European EO resources (platforms, data sources, etc.). |
Object Store |
A computer data storage architecture that manages data as objects. Each object typically includes the data itself, a variable amount of metadata, and a globally unique identifier. |
On-demand Processing Service |
A Processing Service whose execution is initiated directly by the user on an ad-hoc basis. |
Platform (EP) |
An on-line collection of products, services and tools for exploitation of EO data |
Platform Tier |
The Platform Tier represents the Exploitation Platform and the services it offers to end-users |
Processing |
A set of pre-defined activities that interact to achieve a result. For the exploitation platform, comprises on-line processing to derive data products from input data, conducted by a hosted processing service execution. |
Processing Result |
The Products produced as output of a Processing Service execution. |
Processing Service |
A non-interactive data processing that has a well-defined set of input data types, input parameterisation, producing Processing Results with a well-defined output data type. |
Products |
EO data (commercial and non-commercial) and Value-added products made available through the EP. It is assumed that the Hosting Environment for the EP makes available an existing supply of EO Data |
Resource |
An entity, such as a Product, Processing Service or Interactive Application, which is of interest to a user, is indexed in a catalogue and can be returned as a single meaningful search result |
Resource Tier |
The Resource Tier represents the hosting infrastructure and provides the EO data, storage and compute upon which the exploitation platform is deployed |
Reusable Research Object |
An encapsulation of some research/analysis that describes all aspects required to reproduce the analysis, including data used, processing performed etc. |
Scientific Researcher |
Expert user with the objective to perform scientific research. Having minimal IT knowledge with no desire to acquire it, they want the effort for the translation of their algorithm into a service/product to be minimised by the platform. |
Service Developer |
Expert user with the objective to provide a performing, stable and reliable service/product. Having deeper IT knowledge or a willingness to acquire it, they require deeper access to the platform IT functionalities for optimisation of their algorithm. |
Software |
The compilation of code into a binary program to be executed within the platform on-line computing environment. |
Systematic Processing Service |
A Processing Service whose execution is initiated automatically (on behalf of a user), either according to a schedule (routine) or triggered by an event (e.g. arrival of new data). |
Terms & Conditions (T&Cs) |
The obligations that the user agrees to abide by in regard of usage of products/services of the platform. T&Cs are set by the provider of each product/service. |
Transactional Web Processing Service (WPS-T) |
Transactional extension to WPS that allows ad-hoc deployment / undeployment of user-provided processors. |
User |
An individual using the EP, of any type (Admin/Consumer/Expert/Guest) |
Value-added products |
Products generated from processing services of the EP (or external processing) and made available through the EP. This includes products uploaded to the EP by users and published for collaborative consumption |
Visualisation |
To obtain a visual representation of any data/products held within the platform - presented to the user within their web browser session. |
Web Coverage Service (WCS) |
OGC standard that provides an open specification for sharing raster datasets on the web. |
Web Coverage Processing Service (WCPS) |
OGC standard that defines a protocol-independent language for the extraction, processing, and analysis of multi-dimensional coverages representing sensor, image, or statistics data. |
Web Feature Service (WFS) |
OGC standard that makes geographic feature data (vector geospatial datasets) available on the web. |
Web Map Service (WMS) |
OGC standard that provides a simple HTTP interface for requesting geo-registered map images from one or more distributed geospatial databases. |
Web Map Tile Service (WMTS) |
OGC standard that provides a simple HTTP interface for requesting map tiles of spatially referenced data using the images with predefined content, extent, and resolution. |
Web Processing Services (WPS) |
OGC standard that defines how a client can request the execution of a process, and how the output from the process is handled. |
Workspace |
A user-scoped 'container' in the EP, in which each user maintains their own links to resources (products and services) that have been collected by a user during their usage of the EP. The workspace acts as the hub for a user’s exploitation activities within the EP |
1.5. Glossary
The following acronyms and abbreviations have been used in this report.
Term | Definition |
---|---|
AAI |
Authentication & Authorization Infrastructure |
ABAC |
Attribute Based Access Control |
ADES |
Application Deployment and Execution Service |
ALFA |
Abbreviated Language For Authorization |
AOI |
Area of Interest |
API |
Application Programming Interface |
CMS |
Content Management System |
CWL |
Common Workflow Language |
DAL |
Data Access Library |
EMS |
Execution Management Service |
EO |
Earth Observation |
EP |
Exploitation Platform |
FUSE |
Filesystem in Userspace |
GeoXACML |
Geo-specific extension to the XACML Policy Language |
IAM |
Identity and Access Management |
IdP |
Identity Provider |
JSON |
JavaScript Object Notation |
K8s |
Kubernetes |
KVP |
Key-value Pair |
M2M |
Machine-to-machine |
OGC |
Open Geospatial Consortium |
PDE |
Processor Development Environment |
PDP |
Policy Decision Point |
PEP |
Policy Enforcement Point |
PIP |
Policy Information Point |
RBAC |
Role Based Access Control |
REST |
Representational State Transfer |
SSH |
Secure Shell |
TOI |
Time of Interest |
UMA |
User-Managed Access |
VNC |
Virtual Network Computing |
WCS |
Web Coverage Service |
WCPS |
Web Coverage Processing Service |
WFS |
Web Feature Service |
WMS |
Web Map Service |
WMTS |
Web Map Tile Service |
WPS |
Web Processing Service |
WPS-T |
Transactional Web Processing Service |
XACML |
eXtensible Access Control Markup Language |
2. Overview
2.1. Building Block Overview
The ADES provides WPS 1.0 & 2.0 service interfaces as well as an OGC API Processes compliant API. It is responsible for the execution of the processing service (as a WPS request) within the ‘target’ Exploitation Platform (i.e. one that is close to the data). In the global scenario, we assume that the EMS ensures that the processor is deployed as a WPS service before it is invoked.
As illustrated in the EOEPCA Master System Design Document (ref. [EOEPCA-MSD]), the ADES provides an execution engine that handles the complexities of constructing the jobs and interfacing with the processing cluster. See the context within the EOEPCA, as depicted below in Figure 1:
The main responsibilities of the ADES are:
-
Validate and accept an execution request from the EMS
-
Submit the process execution to the processing cluster
-
Monitor the process execution
-
Retrieve the processing results
In order to accomplish the execution and monitoring steps above, the ADES is also responsible for the following operations:
-
Data Stage-In for the process inputs
-
Data Stage-Out for the process outputs
Those operations are internal sub-steps of the Processing Service execution operation.
2.1.1. Execution
The ADES invokes a Processing Service when a processing job is submitted to the WPS Server.
A processing job execution can be summarized as the execution of the following major steps on the Processing Cluster:
-
Data Stage-In
-
Processing
-
Data Stage-Out
The Workflow Executor is responsible for the internal orchestration of the three steps above.
Additionally, when instructing the processing cluster to execute the various steps (stage-in/out or processing), the ADES also provides the reference of the Docker image to use.
2.1.1.1. Data Stage-In phase
Data Stage-In is the process of retrieving the processing inputs locally. Processing inputs are provided as EO Catalogue references, and the ADES is responsible for translating those references into inputs available for the local processing.
2.1.1.2. Application Processing phase
Processing is the core step of the Execute operation. During this step, input data is transformed into output data.
The ADES supports it by executing the Application Workflow in Common Workflow Language ([CWL-spec]) provided by the user when the Application Package was deployed.
2.1.1.3. Data Stage-Out phase
Data Stage-Out is the process of persistently storing the outputs of the processing and publishing their result catalogues to the resource managers.
ADES retrieves the processing outputs and automatically stores them onto an external persistent storage. Additionally, ADES publishes the metadata of the outputs onto a Catalogue, exposing the OpenSearch interface, and provides their references as an output.
2.2. Static Architecture
The ADES is based on a Web API framework architecture. It primarily offers an OWS Server with the OGC WPS 1.0 & 2.0 services, and it exposes a REST/JSON interface compliant with OGC API Processes [OGC-API-PROC] and described according to the OpenAPI specification [OPENAPI].
Figure 2 shows an overview of the ADES building block. It provides the framework for the WPS Server to manage and execute the processing services that interact with the external Kubernetes cluster via the Calrissian tool.
The WPS Service provides the external interface of the ADES, available to the EMS only. It enables internal code and configuration for creating the Processing Services via the Deploy/Undeploy WPS operations.
Then, the WPS service triggers various functions of the Processing Services for the Execute, GetStatus, GetResult and Dismiss WPS operations.
Internally, every Processing Service deployed on the ADES uses a common library that relies on Calrissian to submit, monitor, retrieve the results of, and dismiss a processing execution.
Section Components Design contains a detailed description of the ADES components.
2.3. Use Cases
The following diagram describes the main use cases of the ADES
2.3.1. Deploy Processing Service
As an EMS service, I want to deploy (and un-deploy) a processing service by posting a reference to an OWS Context from a Resource Catalogue that references a Common Workflow Language (CWL) document, so that I can make it available (or remove it) for execution, on behalf of my User.
2.3.2. Execute Service as Job
As an EMS service, I want to Execute an available processing service via the WPS & API interface, so that I can submit an execution, on behalf of my User.
2.3.3. Monitor Job
As an EMS service, I want to get the status of a given processing job, via the WPS & API interface, so that I can monitor a current or past execution, and provide the information back to my User.
2.3.4. Retrieve Result
As an EMS service, I want to get the reference of the processing results of a given processing job, via the WPS & API interface, so that I can provide the information back to my User.
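From the client (EMS) side, the Monitor Job and Retrieve Result use cases can be pictured as a simple polling loop. The following is only an illustrative sketch: `fetch_status` is a hypothetical callable standing in for whatever request the client issues to read the job status, and the status names are assumed to follow the OGC API Processes status codes (`accepted`, `running`, `successful`, `failed`, `dismissed`).

```python
import time
from typing import Callable

# Terminal states, assuming the OGC API Processes status codes.
TERMINAL_STATES = {"successful", "failed", "dismissed"}


def wait_for_job(fetch_status: Callable[[], str],
                 interval_s: float = 5.0,
                 max_polls: int = 120) -> str:
    """Poll a job until it reaches a terminal state and return that state.

    fetch_status is a hypothetical callable that returns the current
    status string of the job (for example by querying the status endpoint).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval_s)
    raise TimeoutError("job did not finish within the polling budget")
```

Once a terminal `successful` state is returned, the client would then request the job result to obtain the references to the processing outputs.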
2.4. Required Resources
2.4.1. Data
2.4.1.1. OWS Context
The OGC Web Services Context Document (OWS Context) was created to allow a set of configured information resources (service set) to be passed between applications primarily as a collection of services.
In the ADES, an OWS Context is used as the catalogue entry for referencing the Application Package.
2.4.1.2. Common Workflow Language ([CWL-spec])
Common Workflow Language (CWL, [CWL-spec]) is used as the application package descriptor. It covers the following elements necessary to describe the application:
* A directed acyclic graph orchestrating the steps in order, mapping workflow inputs/outputs to step inputs/outputs
* Steps describing a command line with their input/output
* CWL specification extensions that may be used to provide the additional information elements
The application package is thus composed of a CWL file with the role of the application descriptor.
The container reference is included in the CWL as a requirement.
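To make the role of the container requirement concrete, the sketch below shows a minimal, hypothetical application descriptor (written as the Python equivalent of a CWL document packed with `$graph`) and a helper that collects the container references it declares. The application identifiers, the command and the image name are invented for illustration; only `DockerRequirement` and `dockerPull` are taken from the CWL specification.

```python
from typing import Any, Dict, List

# Hypothetical descriptor: one Workflow wrapping one CommandLineTool.
APP_CWL: Dict[str, Any] = {
    "cwlVersion": "v1.0",
    "$graph": [
        {
            "class": "Workflow",
            "id": "my-app",
            "inputs": {"input_reference": "Directory"},
            "outputs": {
                "results": {"type": "Directory",
                            "outputSource": "step_1/results"}
            },
            "steps": {
                "step_1": {
                    "run": "#my-tool",
                    "in": {"input_reference": "input_reference"},
                    "out": ["results"],
                }
            },
        },
        {
            "class": "CommandLineTool",
            "id": "my-tool",
            "baseCommand": "run-my-app",
            # The container reference is carried as a CWL requirement.
            "requirements": {
                "DockerRequirement": {"dockerPull": "example.org/my-app:0.1"}
            },
            "inputs": {
                "input_reference": {"type": "Directory",
                                    "inputBinding": {"position": 1}}
            },
            "outputs": {
                "results": {"type": "Directory",
                            "outputBinding": {"glob": "."}}
            },
        },
    ],
}


def container_images(cwl_doc: Dict[str, Any]) -> List[str]:
    """Return the dockerPull references declared in a CWL document."""
    images = []
    for item in cwl_doc.get("$graph", [cwl_doc]):
        requirements = item.get("requirements", {})
        # CWL allows requirements as a map or as a list of {"class": ...}.
        if isinstance(requirements, list):
            requirements = {r["class"]: r for r in requirements}
        docker = requirements.get("DockerRequirement")
        if docker and "dockerPull" in docker:
            images.append(docker["dockerPull"])
    return images
```

For the descriptor above, `container_images(APP_CWL)` yields the single image reference declared by the tool.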
3. Components Design
3.1. ADES Core Framework
3.1.1. Overview and Purpose
The ADES Core Framework is the main software container for all the components composing the ADES.
3.1.2. Software Reuse and Dependencies
All software and their dependencies are described in the next sections
3.1.3. Interfaces
All interfaces are external and described in the External Interfaces section
3.1.4. Data
Not applicable
3.1.5. Applicable Resources
3.2. WPS Server
3.2.1. Overview and Purpose
The WPS Server provides the processing interfaces of the ADES. Basically, it exposes 2 web services:
-
OGC Web Services Common [OGC-COMMON] hosting WPS 1.0.0 & 2.0.0 processing services
-
OGC API Processes [OGC-API-PROC], exposing the processing services through a RESTful interface described in OpenAPI
3.2.1.1. WPS 1.0 & 2.0 compliance
"ZOO-Kernel implements and complies with the WPS 1.0.0 and the WPS 2.0.0 standards edited by the Open Geospatial Consortium".
The Dismiss operation, which is "only available in WPS 2.0.0, it lets the client ask the server to stop a running service and remove any file it created", requires changes in the source code to support the EMS operations. The current ZOO-Kernel implementation of Dismiss relies on removing local processes from the execution environment, which prevents the management of remote processes. In this context, the ZOO-Kernel third-party software is modified to allow managing remote execution via APIs, in particular when the remote process exposes an OGC WPS-T 2.0.0 interface. An upstream contribution of these changes to the open-source ZOO-Project is foreseen.
3.2.1.2. OGC API Processes compliance
The ZOO-Project product, available at www.zoo-project.org, is an open-source implementation of both the Web Processing Service (WPS) standard (version 1.0.0 and 2.0) and the OGC API - Processes specification. The ZOO-Kernel is the main component, written in C, which enables the deployment of geospatial processes on the Web in a way that conforms to OGC standards.
3.2.2. Software Reuse and Dependencies
The following open-source software is reused:
-
ZOO-Kernel http://zoo-project.org/docs/kernel/
3.2.3. Interfaces
The WPS Server exposes the OGC OWS services WPS 1.0 & WPS 2.0 as well as an OGC API Processes interface. According to their respective specifications, they allow processing services to be deployed, executed, monitored and dismissed.
More details in the External Interfaces section.
3.2.4. Data
Not Applicable.
3.2.5. Applicable Resources
3.3. Workflow Executor
3.3.1. Overview and Purpose
The workflow executor is the component in charge of the scheduling and the execution of the processing jobs.
When a Processing Service is triggered by the WPS Server via a job submission, the latter invokes the workflow executor to perform the steps necessary for the application to execute properly:
- Prepare the workflow execution
  - On the Processing Cluster
    - Create the application namespace if not present
    - Provision storage for the job
  - Create the main CWL by integrating
    - job inputs
    - stage-in/out
    - application workflow
- Execute the Data Stage-In
- Execute the Application Workflow
- Execute the Data Stage-Out
- Dispose of the workflow execution, cleaning the storage (e.g. delete the volume)
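The ordering of the steps above can be sketched as a small driver function. This is not the ADES implementation: the step names and the `steps` mapping of callables are hypothetical placeholders, and the only point illustrated is that the dispose step runs even if stage-in, processing or stage-out fails.

```python
from typing import Callable, Dict, List

# Hypothetical names for the three phases executed on the cluster.
PHASES = ("stage_in", "process", "stage_out")


def run_job(job_id: str,
            steps: Dict[str, Callable[[str], None]],
            log: List[str]) -> None:
    """Run prepare, the three phases in order, then always dispose.

    steps maps a step name to a callable taking the job identifier.
    log collects the names of the steps that completed.
    """
    steps["prepare"](job_id)      # namespace, storage, main CWL
    log.append("prepare")
    try:
        for name in PHASES:
            steps[name](job_id)
            log.append(name)
    finally:
        # Clean up the job storage whatever the outcome of the phases.
        steps["dispose"](job_id)
        log.append("dispose")
```

A failure in any phase propagates to the caller after the cleanup has run, so the job can be reported as failed without leaving provisioned storage behind.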
3.3.2. Software Reuse and Dependencies
The following open-source software is reused:
3.3.3. Data
-
Job inputs
3.3.4. Applicable Resources
3.4. Processing Service
3.4.1. Overview and Purpose
Processing Services are the representation of the Application Package deployed on the ADES.
Indeed, when a user deploys a service on the ADES, the latter creates a Processing Service from the Application Package.
Internally, a Processing Service represents 3 items:
-
the Application Workflow in Common Workflow Language ([CWL-spec])
-
the WPS describeProcess generated from the Application Workflow inputs, outputs and from the OWS Context description (title, description)
-
the Application Image in [Docker]
The following diagram shows the Processing Service’s items as components in the ADES once deployed and the relationships with the other components
The Processing Service component will be invoked by the Workflow Executor when a processing job is submitted to the WPS Server.
3.4.2. Software Reuse and Dependencies
The following open-source software is reused:
-
CWL Tool Common Workflow Language ([CWL-spec])
-
Calrissian https://github.com/Duke-GCB/calrissian
-
ZOO-Services http://zoo-project.org/docs/services/index.html
3.4.3. Data
-
OWS Context
-
CWL
3.4.4. Applicable Resources
4. Functional Design
4.1. Application Deployment to Web Processing Service
4.1.1. Overview and Purpose
This is the main function in charge of building a Processing Service from an Application Package.
Concretely, this function reads an Application Item as an OWS Context referencing the Application Workflow. With all the information in those documents, it builds a Processing Service describing:
-
General information about the service: identifier, title, abstract
-
Input parameters definition: identifiers, types, options, default values,…
-
Output results options: identifiers, types, format
Regardless of the deployment mechanism, the Application Workflow is used to convey the information that describes the process inputs/outputs into WPS Server data model.
The mapping between CWL and WPS follows the work performed in the OGC Testbed 16.
All software and their dependencies are described in the next sections.
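As a rough illustration of how the Application Workflow can feed the WPS Server data model, the sketch below derives a process description from CWL input and output declarations. The shape of the resulting dictionary and the function names are assumptions made for this example; they do not reproduce the OGC Testbed 16 mapping. It also assumes that each CWL type is a plain string, with a trailing `?` marking an optional parameter.

```python
from typing import Any, Dict, List


def _describe_inputs(cwl_inputs: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn CWL input declarations into simple input descriptions."""
    described = []
    for name, spec in cwl_inputs.items():
        if isinstance(spec, str):          # shorthand form "name: type"
            spec = {"type": spec}
        cwl_type = str(spec["type"])
        entry = {
            "id": name,
            "title": spec.get("label", name),
            "type": cwl_type.rstrip("?"),
            # Optional if marked with "?" or if a default is provided.
            "required": not cwl_type.endswith("?") and "default" not in spec,
        }
        if "default" in spec:
            entry["default"] = spec["default"]
        described.append(entry)
    return described


def describe_process(identifier: str, title: str, abstract: str,
                     cwl_inputs: Dict[str, Any],
                     cwl_outputs: Dict[str, Any]) -> Dict[str, Any]:
    """Build a hypothetical process description from workflow metadata."""
    outputs = []
    for name, spec in cwl_outputs.items():
        cwl_type = spec if isinstance(spec, str) else spec["type"]
        outputs.append({"id": name, "type": str(cwl_type)})
    return {
        "id": identifier,        # general information, e.g. from the OWS Context
        "title": title,
        "abstract": abstract,
        "inputs": _describe_inputs(cwl_inputs),
        "outputs": outputs,
    }
```

Here the identifier, title and abstract would come from the Application Item, while the input and output descriptions come from the workflow itself.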
4.1.2. Data Input/Output
-
Input
-
Application Item as an OWS Context describing the application and referencing an Application Workflow
-
-
Output
4.1.3. Applicable Resources
4.2. Job Submission
4.2.1. Overview and Purpose
The submission of a job starts a process instantiation and execution of an application. Basically, the ADES executes the Service Component built during the application deployment.
This function triggers 3 main operations sequentially: Data Stage-In, Workflow Execution and Data Stage-Out, each described in the sections below.
4.2.2. Data Input/Output
-
Input
-
Job Execute Request
-
-
Output
-
Job Identifier (status location)
-
4.3. Data Stage-In
This is a pre-processing step that provisions all the data needed for the workflow execution and referenced in the job execute request.
4.3.1. Overview and Purpose
The Data Stage-In identifies all the Application Workflow inputs that are of the type Directory. For each of them, if the input value is in URI form, it tries to resolve it as a resource manager link and copies the referenced data items to the processing job workspace. It also creates a STAC catalog JSON document referencing the locally copied data. The local directory path containing the catalog.json file and the data is passed to the corresponding CWL input at Workflow Execution.
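The selection of the inputs to stage can be sketched as follows. This is a minimal illustration under stated assumptions: the workflow is available as a parsed CWL dictionary with inputs in map form, only the plain `Directory` and `Directory?` types are considered, and the function names are hypothetical. The actual resolution and copy of the referenced data is not shown.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse


def directory_inputs(workflow: Dict[str, Any]) -> List[str]:
    """Return the names of workflow inputs declared with type Directory."""
    names = []
    for name, spec in workflow.get("inputs", {}).items():
        cwl_type = spec if isinstance(spec, str) else spec.get("type")
        if cwl_type in ("Directory", "Directory?"):
            names.append(name)
    return names


def is_uri(value: Any) -> bool:
    """True if value is a string with both a scheme and a network location."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return bool(parsed.scheme and parsed.netloc)


def inputs_to_stage(workflow: Dict[str, Any],
                    job_inputs: Dict[str, Any]) -> Dict[str, str]:
    """Select the job inputs that are Directory-typed and given as URIs."""
    return {
        name: job_inputs[name]
        for name in directory_inputs(workflow)
        if is_uri(job_inputs.get(name))
    }
```

Each selected entry would then be resolved, copied into the job workspace, and replaced by the local directory path before the workflow is executed.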
4.3.2. Data Input/Output
-
Input
-
Job Execute Request
-
Workflow (CWL)
-
-
Output
-
Input data folder
-
Path to catalog referencing input data folder
-
4.3.3. Applicable Resources
4.4. Workflow Execution
Calrissian is responsible for orchestrating and executing the CWL defining the application. It interacts directly with Kubernetes to submit jobs on different pods.
4.4.1. Overview and Purpose
4.4.2. Data Input/Output
-
Data Staged-in catalog
-
Application CWL
-
Data produced catalog
4.4.3. Applicable Resources
4.5. Data Stage-Out
This is a post-processing step that retrieves all the results produced by the workflow execution, then copies and publishes them in a resource manager.
4.5.1. Overview and Purpose
The Data Stage-Out identifies all the Application Workflow outputs that are of the type Directory. For each of them, it creates a STAC catalog JSON document referencing the locally produced data with relative links. Then the local directories are stored persistently in the Resource Manager store. A root catalog referencing all the directory catalogs is also stored with the results and published in the Resource Manager Catalog.
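The catalog layout described above can be illustrated with plain dictionaries. The sketch below is an assumption-laden example, not the ADES code: it uses the STAC 1.0.0 catalog fields (`type`, `stac_version`, `id`, `description`, `links`), a hypothetical folder layout in which each output directory holds its own `catalog.json`, and invented function names.

```python
from typing import Any, Dict, List

STAC_VERSION = "1.0.0"  # assumed version, for illustration only


def directory_catalog(output_id: str,
                      item_paths: List[str]) -> Dict[str, Any]:
    """Catalog for one output directory, linking its items relatively."""
    return {
        "type": "Catalog",
        "stac_version": STAC_VERSION,
        "id": output_id,
        "description": f"Results of output '{output_id}'",
        "links": [{"rel": "item", "href": f"./{path}"}
                  for path in item_paths],
    }


def root_catalog(job_id: str, output_ids: List[str]) -> Dict[str, Any]:
    """Root catalog referencing one child catalog per output directory."""
    return {
        "type": "Catalog",
        "stac_version": STAC_VERSION,
        "id": job_id,
        "description": f"Results of job '{job_id}'",
        "links": [{"rel": "child", "href": f"./{output_id}/catalog.json"}
                  for output_id in output_ids],
    }
```

Because every link is relative, the whole results folder can be copied to persistent storage without rewriting the catalogs.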
4.5.2. Data Input/Output
-
Input
-
Job Results folder
-
Catalog referencing results (optional)
-
-
Output
-
Reference to catalog published to the resource manager
-
4.5.3. Applicable Resources
5. External Interfaces
This section focuses on the available functionality published over interfaces.
5.1. Description
The ADES provides WPS 1.0 & 2.0 service interfaces as well as an OGC API Processes compliant API. It is responsible for the execution of the processing service (as a WPS request) within the ‘target’ Exploitation Platform (i.e. one that is close to the data). In the global scenario, we assume that the EMS ensures that the processor is deployed as a WPS service before it is invoked.
Refer to ADES component documentation - https://eoepca.github.io/proc-ades
5.2. Web Processing Services
The ADES exposes interfaces for the processing based on the OGC "Web Processing Service". This service port is available in 3 versions corresponding to the OGC WPS evolution.
The next two sections describe:
-
The first two versions, implementing the OGC Web Service Common
-
The latest API implementing a REST interface described in OpenAPI
5.2.1. WPS 1.0.0 & 2.0.0
The OpenGIS® Web Processing Service (WPS) Interface Standard provides rules for standardizing how inputs and outputs (requests and responses) for geospatial processing services, such as polygon overlay. The standard also defines how a client can request the execution of a process, and how the output from the process is handled. It defines an interface that facilitates the publishing of geospatial processes and clients’ discovery of and binding to those processes. The data required by the WPS can be delivered across a network or they can be available at the server.
-
Applicable standards - [OGC-WPS] [OGC-WPS2]
Base URL: https://ades.eoepca.org/ows
URL | Methods | Function |
---|---|---|
GET, POST |
OGC Web Service Common port
-
Description - The WPS server is self-contained: it provides an initial endpoint that a WPS client can use to determine the server’s capabilities.
-
Operations - GET, POST according to the payload of the requested service
-
Usage examples
https://ades.eoepca.org/ows?service=WPS&version=1.0.0&request=GetCapabilities
-
Full references : [OGC-WPS] [OGC-WPS2]
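As an illustration of the key-value-pair usage example above, the sketch below builds request URLs against the base URL given in this section. The helper name `wps_kvp_url` is hypothetical, and the process identifier used in the second call is an assumption about how a deployed application is named on a given ADES instance.

```python
from urllib.parse import urlencode

BASE_URL = "https://ades.eoepca.org/ows"  # base URL given in this section


def wps_kvp_url(request: str, version: str = "1.0.0", **extra: str) -> str:
    """Build a WPS key-value-pair GET URL for the given operation."""
    params = {"service": "WPS", "version": version, "request": request, **extra}
    return BASE_URL + "?" + urlencode(params)


# Capabilities document, as in the usage example above
capabilities_url = wps_kvp_url("GetCapabilities")

# Description of one process (identifier is an assumption for illustration)
describe_url = wps_kvp_url("DescribeProcess", identifier="vegetation-index")
```

The resulting URLs can be fetched with any HTTP client; requests carrying an XML payload would use POST against the same endpoint.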
5.2.2. OGC API Processes
The OGC API - Processes enables the execution of computing processes and the retrieval of metadata describing their purpose and functionality. Typically, these processes combine raster, vector, and/or coverage data with well-defined algorithms to produce new raster, vector, and/or coverage information.
-
Applicable standards - [OGC-API-PROC]
Base URL: https://ades.eoepca.org/api
Resource | Method | Description | Operation |
---|---|---|---|
/ |
GET |
landing page of this API |
|
/conformance |
GET |
Lists all requirements classes specified in the standard (e.g., OGC API - Processes Part 1: Core) that the server conforms to |
|
/processes |
GET |
Lists all available processes this server offers. |
|
/processes/{id} |
GET |
Describes a process. |
|
/processes/{id}/jobs |
GET |
Lists available jobs of a process. |
|
/processes/{id}/jobs |
POST |
Submits a new job. |
|
/processes/{id}/jobs/{jobID} |
GET |
Shows the status of a job. |
|
/processes/{id}/jobs/{jobID} |
DELETE |
Cancels a job execution and removes it from the jobs list. |
|
/processes/{id}/jobs/{jobID}/result |
GET |
Lists available results of a job. In case of a failure, lists exceptions instead. |
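The resources in the table above map onto HTTP requests as sketched below. The sketch only constructs the requests and does not send them. The function names are hypothetical, and the JSON shape of the job submission body is an assumption: the exact execute payload depends on the version of the OGC API Processes specification implemented by the server.

```python
import json
from urllib.request import Request

API_BASE = "https://ades.eoepca.org/api"  # base URL given in this section


def list_processes() -> Request:
    """GET /processes"""
    return Request(API_BASE + "/processes", method="GET")


def submit_job(process_id: str, inputs: dict) -> Request:
    """POST /processes/{id}/jobs with a JSON body (payload shape is an assumption)."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return Request(
        API_BASE + "/processes/" + process_id + "/jobs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def job_status(process_id: str, job_id: str) -> Request:
    """GET /processes/{id}/jobs/{jobID}"""
    return Request(API_BASE + "/processes/" + process_id + "/jobs/" + job_id, method="GET")


def job_result(process_id: str, job_id: str) -> Request:
    """GET /processes/{id}/jobs/{jobID}/result"""
    return Request(
        API_BASE + "/processes/" + process_id + "/jobs/" + job_id + "/result",
        method="GET",
    )


def dismiss_job(process_id: str, job_id: str) -> Request:
    """DELETE /processes/{id}/jobs/{jobID}"""
    return Request(API_BASE + "/processes/" + process_id + "/jobs/" + job_id, method="DELETE")
```

Each returned object can be passed to `urllib.request.urlopen` to perform the call; a typical client submits a job, polls `job_status` until the job finishes, then reads `job_result`.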
5.3. Application Package
The Application Package is the interface for the application developer. It contains 3 essential elements:
-
The Application Item is the entry point of the application in a Resource Manager. This item is encoded in an OWS Context document.
-
The Application Workflow is the processor graph describing the execution steps of the software processor.
-
The Application Image is the container image including the processor software and its dependencies.
In the following sections, we use the vegetation-index application as an example to illustrate the specifications and conventions of the Application Package.
5.3.1. Application Item
The Application Item is the entry point of the application in a Resource Manager. It is an OWS Context document designed to be searchable and queryable as an Atom entry.
Below is the Application Item of the vegetation-index application.
We describe the essential elements of this document.
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"> (1)
<title type="text">Result for OpenSearch query in eoepca-services</title>
<id>https://catalog.terradue.com:443/eoepca-services/search?uid=vegetation-index&amp;format=atom</id>
<updated>2020-10-12T07:32:51.013305Z</updated>
<link rel="self" type="application/atom+xml" title="Reference link" href="https://catalog.terradue.com/eoepca-services/search?uid=vegetation-index&amp;format=atom" />
<link rel="search" type="application/opensearchdescription+xml" title="OpenSearch Description link" href="https://catalog.terradue.com/eoepca-services/description" />
<entry> (2)
<id>https://catalog.terradue.com:443/eoepca-services/search?format=atom&amp;uid=vegetation-index</id>
<title type="text">Vegetation index</title> (3)
<summary type="html">Vegetation index processor</summary> (4)
<published>2020-10-12T07:32:51.01297Z</published>
<updated>2020-10-25T10:34:19.721262Z</updated> (5)
<link rel="self" type="application/atom+xml" title="Reference link" href="https://catalog.terradue.com/eoepca-services/search?format=atom&amp;uid=vegetation-index" /> (6)
<owc:offering code="http://www.opengis.net/eoc/applicationContext/cwl" xmlns:owc="http://www.opengis.net/owc/1.0"> (7)
<owc:content type="application/cwl" href="https://raw.githubusercontent.com/EOEPCA/proc-ades/develop/test/sample_apps/vegetation-index/vegetation-index.cwl" /> (8)
</owc:offering>
<identifier xmlns="http://purl.org/dc/elements/1.1/">vegetation-index</identifier> (9)
</entry>
<totalResults xmlns="http://a9.com/-/spec/opensearch/1.1/">1</totalResults>
<startIndex xmlns="http://a9.com/-/spec/opensearch/1.1/">1</startIndex>
<itemsPerPage xmlns="http://a9.com/-/spec/opensearch/1.1/">20</itemsPerPage>
<os:Query role="request" xmlns:dct="http://purl.org/dc/terms/" xmlns:param="http://a9.com/-/spec/opensearch/extensions/parameters/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:os="http://a9.com/-/spec/opensearch/1.1/" geo:uid="vegetation-index" />
</feed>
1 | The Atom Feed enclosing the Application Item. In our case, the feed is the result of an OpenSearch query to a Catalog. |
2 | The Application Item as an Atom Entry |
3 | The Application Title used to label the Processing Service deployed on the ADES. |
4 | The Application Summary used as the description of the Processing Service deployed on the ADES. |
5 | The Application Date used to timestamp the Processing Service deployed on the ADES. |
6 | Following the OpenSearch protocol, the self link is the persistent link to the resource. This is often managed automatically by the Resource Manager. |
7 | The OWS Context offering declaring the Application Workflow with the specific code http://www.opengis.net/eoc/applicationContext/cwl . |
8 | The actual Application Workflow resource. It can be either enclosed in the document or referenced with a link, as in this example. The MIME type is application/cwl . |
9 | The Application Identifier used to uniquely identify the Processing Service deployed on the ADES. |
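To make the role of the annotated elements concrete, the sketch below reads an Application Item and extracts the fields listed above. It is an illustration only: the function name and the returned dictionary keys are hypothetical, and the embedded sample is a trimmed copy of the entry shown in this section rather than a full catalog response.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "owc": "http://www.opengis.net/owc/1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}
CWL_OFFERING = "http://www.opengis.net/eoc/applicationContext/cwl"

# Trimmed copy of the Application Item shown above
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title type="text">Vegetation index</title>
    <summary type="html">Vegetation index processor</summary>
    <updated>2020-10-25T10:34:19.721262Z</updated>
    <owc:offering code="http://www.opengis.net/eoc/applicationContext/cwl"
                  xmlns:owc="http://www.opengis.net/owc/1.0">
      <owc:content type="application/cwl"
                   href="https://raw.githubusercontent.com/EOEPCA/proc-ades/develop/test/sample_apps/vegetation-index/vegetation-index.cwl" />
    </owc:offering>
    <identifier xmlns="http://purl.org/dc/elements/1.1/">vegetation-index</identifier>
  </entry>
</feed>"""


def read_application_item(atom_xml: str) -> dict:
    """Extract identifier, title, summary, date and CWL reference from the first entry."""
    entry = ET.fromstring(atom_xml).find("atom:entry", NS)
    workflow_href = None
    for offering in entry.findall("owc:offering", NS):
        # only the offering carrying the CWL code declares the Application Workflow
        if offering.get("code") == CWL_OFFERING:
            content = offering.find("owc:content", NS)
            if content is not None:
                workflow_href = content.get("href")
    return {
        "identifier": entry.findtext("dc:identifier", namespaces=NS),
        "title": entry.findtext("atom:title", namespaces=NS),
        "summary": entry.findtext("atom:summary", namespaces=NS),
        "updated": entry.findtext("atom:updated", namespaces=NS),
        "workflow_href": workflow_href,
    }


item = read_application_item(SAMPLE)
```

This sketch handles only a workflow referenced by link; a workflow enclosed directly in the offering, which the format also allows, would need to be read from the content element body instead.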
5.3.2. Application Workflow
The Application Workflow is the core document of the processor, written in Common Workflow Language ([CWL-spec]). It describes, as a Directed Acyclic Graph (DAG), the execution of the user processor, defining the commands to be executed with their arguments as well as the inputs/outputs.
The ADES relies on Common Workflow Language ([CWL-spec]) tooling, via [Calrissian], to execute the workflow. The ADES therefore fully supports the Common Workflow Language ([CWL-spec]) specification, which is listed among the Reference Documents and is not explained in detail here.
The purpose of this section is to describe the CWL elements used in the functional design of the ADES.
Below is the Application Workflow, in Common Workflow Language ([CWL-spec]), of the vegetation-index application.
We describe the essential elements of this document.
cwlVersion: v1.0
$graph:
- class: CommandLineTool
  id: clt
  baseCommand: vegetation-index
  inputs:
    inp1:
      inputBinding:
        position: 1
        prefix: --input-reference
      type: Directory
    inp2:
      inputBinding:
        position: 2
        prefix: --aoi
      type: string
  outputs:
    results:
      outputBinding:
        glob: .
      type: Any
  hints:
    DockerRequirement:
      dockerPull: terradue/eoepca-vegetation-index:0.1  # (7)
  requirements:
    EnvVarRequirement:
      envDef:
        PATH: /opt/anaconda/envs/env_vi/bin:/opt/anaconda/envs/env_vi/bin:/opt/anaconda/envs/env_default/bin:/opt/anaconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        PREFIX: /opt/anaconda/envs/env_vi
    ResourceRequirement: {}
- class: Workflow  # (1)
  doc: Vegetation index processor
  id: vegetation-index
  inputs:  # (2)
    aoi:
      doc: Area of interest in WKT
      label: Area of interest
      type: string  # (3)
    input_reference:
      doc: EO product for vegetation index
      label: EO product for vegetation index
      stac:catalog: catalog.json
      stac:collection: source
      type: Directory[]  # (3)
  label: Vegetation index
  outputs:  # (4)
  - id: wf_outputs
    outputSource:
    - node_1/results
    type:
      items: Directory  # (5)
      type: array
  requirements:
  - class: ScatterFeatureRequirement  # (6)
  steps:
    node_1:
      in:
        inp1: input_reference
        inp2: aoi
      out:
      - results
      run: '#clt'
      scatter: inp1
      scatterMethod: dotproduct
1 | The Workflow class object of the Common Workflow Language ([CWL-spec]) is the main element used by the ADES to extract the information about the inputs/outputs of the Application. |
2 | The inputs section defines all the inputs necessary for the command. When the Application Package is deployed as a Processing Service, the inputs are used by the WPS Server to declare the processing inputs at the Web Processing Services interface. |
3 | The type used for an input is replicated at the Web Processing Services interface level to describe the expected input. A specific convention applies when using the type Directory; it is explained in the Data Stage-In section. |
4 | The outputs section defines the outputs of the command. When the Application Package is deployed as a Processing Service, the outputs are used by the WPS Server to declare the processing outputs at the Web Processing Services interface. |
5 | The type used for an output is replicated at the Web Processing Services interface level to describe the produced output. A specific convention applies when using the type Directory; it is explained in the Data Stage-Out section. |
6 | The scattering function of CWL is reproduced in the ADES by [Calrissian], which uses different virtual machine instances to run the process in parallel. |
7 | All the Application Images specified in the Application Workflow are pulled and used to instantiate the processing containers on the cluster. |
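The extraction of processing inputs described in notes 1 to 3 can be sketched as follows. This is not the ADES code: the output field names (`id`, `title`, `abstract`, `type`, `multiple`) are a hypothetical stand-in for the WPS input description, and the workflow is given as an already parsed dictionary (a trimmed copy of the example above) to keep the sketch free of third-party YAML libraries.

```python
# Trimmed, already parsed copy of the vegetation-index Application Workflow
CWL_DOC = {
    "cwlVersion": "v1.0",
    "$graph": [
        {"class": "CommandLineTool", "id": "clt"},
        {
            "class": "Workflow",
            "id": "vegetation-index",
            "label": "Vegetation index",
            "doc": "Vegetation index processor",
            "inputs": {
                "aoi": {
                    "label": "Area of interest",
                    "doc": "Area of interest in WKT",
                    "type": "string",
                },
                "input_reference": {
                    "label": "EO product for vegetation index",
                    "doc": "EO product for vegetation index",
                    "type": "Directory[]",
                },
            },
        },
    ],
}


def find_workflow(cwl_doc: dict) -> dict:
    """Return the first object of class Workflow found in the $graph."""
    return next(node for node in cwl_doc["$graph"] if node.get("class") == "Workflow")


def describe_inputs(workflow: dict) -> list:
    """Turn the CWL inputs mapping into simple input descriptions."""
    described = []
    for name, spec in workflow["inputs"].items():
        cwl_type = spec["type"]
        is_array = cwl_type.endswith("[]")  # CWL shorthand for an array type
        described.append({
            "id": name,
            "title": spec.get("label", name),
            "abstract": spec.get("doc", ""),
            "type": cwl_type[:-2] if is_array else cwl_type,
            "multiple": is_array,
        })
    return described


process_inputs = describe_inputs(find_workflow(CWL_DOC))
```

The sketch assumes inputs declared as a mapping with string types, as in the example; CWL also permits list-style inputs and structured type objects, which a complete implementation would have to handle.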
5.3.3. Application Image
One or more Application Images can be used in the [Application Package] to support the execution of the processor. Each Application Image is specified as a [Docker] image in the Application Workflow.
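As an illustration, the images referenced by an Application Workflow can be collected by looking for the DockerRequirement entries of each object in the graph. The function name is hypothetical, the workflow is again given as a parsed dictionary trimmed from the example above, and how the ADES itself gathers and pulls the images is not specified here.

```python
# Trimmed, already parsed copy of the vegetation-index Application Workflow
CWL_DOC = {
    "cwlVersion": "v1.0",
    "$graph": [
        {
            "class": "CommandLineTool",
            "id": "clt",
            "hints": {
                "DockerRequirement": {
                    "dockerPull": "terradue/eoepca-vegetation-index:0.1"
                }
            },
        },
        {
            "class": "Workflow",
            "id": "vegetation-index",
            "requirements": [{"class": "ScatterFeatureRequirement"}],
        },
    ],
}


def application_images(cwl_doc: dict) -> list:
    """List the distinct dockerPull references found in hints and requirements."""
    images = []
    for node in cwl_doc.get("$graph", []):
        for section in ("hints", "requirements"):
            entries = node.get(section) or {}
            # CWL accepts a mapping keyed by class or a list of {class: ...} objects
            if isinstance(entries, list):
                entries = {entry.get("class"): entry for entry in entries}
            image = entries.get("DockerRequirement", {}).get("dockerPull")
            if image and image not in images:
                images.append(image)
    return images


images = application_images(CWL_DOC)
```

The sketch only reads `dockerPull`; other DockerRequirement fields defined by CWL are ignored.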
<< End of Document >>