General recommendations for service implementation and operation.

NEANIAS aims at contributing to the materialization of the European Open Science Cloud (EOSC) by delivering innovative thematic services in the Underwater, Atmospheric and Space research sectors.

To effectively deliver, operate and monitor such services, NEANIAS relies on facilities provided by the consortium or by individual partners, accessed and used through the procedures and guidelines described in the Delivery Activities Methodology and Plan, a document delivered as a task of WP7. The requirements for these facilities are based on well-established practices and mature tools, with directions dictated both by the adopted Service Management scheme, i.e. FitSM, and by an internal survey of NEANIAS Service Providers. Moreover, the document defines the methodologies employed for managing the software codebase lifecycle (i.e. development, building, testing, licensing, releasing and monitoring), along with proposed quality metrics concerning the operation of the NEANIAS services.

The Delivery Activities Methodology and Plan begins with a description of the service management processes employed in NEANIAS and of the software implementation facilities, along with the procedures, methodologies and guidelines used to implement those processes. Next, it reports on procedures and guidelines concerning software documentation, provides source code licensing recommendations, and reports on facilities for supplying effective feedback and resolving issues concerning the source code and operation of the NEANIAS services. Lastly, it addresses monitoring and alerting facilities, describes the recommended quality metrics to be collected, and describes the infrastructures employed to provision the NEANIAS services, together with infrastructure-specific access and usage procedures.

This article presents some general recommendations and remarks concerning service implementation and operation, extracted from the deliverable, which may be applicable to other open science cloud service projects.

1. Cloud Infrastructure Reliability

Compared to on-premises infrastructures, deploying services on the cloud requires a paradigm shift: cloud datacenters are usually built using cheap hardware, with the idea of quickly replacing failed components. Applications in the cloud should therefore not rely on the availability of the underlying infrastructure, and should implement high availability (HA) at the application layer. To this end, several service providers leverage chaos engineering practices [1] to ensure that services are designed to remain available despite failures in the underlying systems.

As failures may also occur in the infrastructures hosting the NEANIAS services, the recommendation is to adopt high availability service design principles and to deploy services across different geographical regions.
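As an illustration of handling infrastructure failures at the application layer, the following minimal Python sketch shows a client that falls back to a replica in another region when the first one is unreachable. It is only a sketch under stated assumptions: the region names, the `fake_send` transport and the `AllRegionsFailed` exception are hypothetical and are not part of any NEANIAS component.

```python
from typing import Callable, Sequence


class AllRegionsFailed(Exception):
    """Raised when no regional replica could serve the request."""


def call_with_failover(regions: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each region in order and return the first successful response."""
    failed = []
    for region in regions:
        try:
            return send(region)
        except ConnectionError:
            # Remember the failure and fall through to the next region.
            failed.append(region)
    raise AllRegionsFailed(f"all regions failed: {failed}")


def fake_send(region: str) -> str:
    """Hypothetical transport that pretends 'region-a' is down."""
    if region == "region-a":
        raise ConnectionError("region-a unreachable")
    return f"served from {region}"


result = call_with_failover(["region-a", "region-b"], fake_send)
```

A real deployment would more likely place this logic in a load balancer or service mesh, but the principle is the same: the loss of a single region should not surface to the caller as an outage.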

2. Infrastructure as Code

The main idea of the Infrastructure as Code (IaC) paradigm is to use textual infrastructure descriptions, which can then be employed to execute automatic service deployments and which can be managed through revision control systems. This makes it possible to automate deployments and to keep the documentation of the deployment process always up to date.

Thus, the recommendation is to employ IaC-oriented tools and formats, such as Dockerfiles, Kubernetes resource files and Ansible playbooks, while the use of binary images or artifacts is not recommended.

3. Service Registration and Configuration

According to the Service Oriented Architecture (SOA) paradigm, the process by which a consumer locates a required service often requires some discovery mechanism. Within the NEANIAS ecosystem, the Service Catalog will describe the services following the EOSC standards for service description, registration and configuration, and will facilitate the discoverability of the service offering both to end users and to other services.

This cataloguing mechanism, though, includes several steps, possibly involving offline communications, quality checks and other processes. After a service has been successfully catalogued, running instances of the service need to be registered and differentiated based on their capacity, quality of service, configuration and a number of other service-specific characteristics. This registration does not necessarily match the purpose of the Service Catalog, which is a customer-facing list of all live services offered, along with relevant information about these services.

Regardless of whether the registry of running service instances is maintained within the same enabling service as the Service Catalog or as a separate stand-alone service, the ability to dynamically discover and utilize instances of a service is paramount to the proper decoupling of different services, and can become a key enabling element when serving specific use cases with targeted requirements.
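To make the distinction between a catalogue entry and a running instance more concrete, the sketch below models a very small in-memory instance registry in Python. It is purely illustrative: the `Instance` fields, the `InstanceRegistry` class, the "viz" service name and the example.org URLs are all assumptions made for this example, not the NEANIAS design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    service: str
    url: str
    capacity: int                   # hypothetical unit, e.g. concurrent jobs
    tags: frozenset = frozenset()   # free-form, service-specific characteristics


class InstanceRegistry:
    """In-memory registry of running service instances."""

    def __init__(self):
        self._instances = {}

    def register(self, instance):
        self._instances.setdefault(instance.service, []).append(instance)

    def discover(self, service, min_capacity=0, required_tags=()):
        """Return matching instances, highest capacity first."""
        wanted = frozenset(required_tags)
        matches = [
            inst for inst in self._instances.get(service, [])
            if inst.capacity >= min_capacity and wanted <= inst.tags
        ]
        return sorted(matches, key=lambda inst: inst.capacity, reverse=True)


registry = InstanceRegistry()
registry.register(Instance("viz", "https://viz-1.example.org", 4, frozenset({"gpu"})))
registry.register(Instance("viz", "https://viz-2.example.org", 16, frozenset({"gpu", "staging"})))
registry.register(Instance("viz", "https://viz-3.example.org", 8))

gpu_instances = registry.discover("viz", min_capacity=4, required_tags={"gpu"})
```

A consumer asks for instances by service name and characteristics rather than by a fixed address, which is what keeps services decoupled from each other's deployment details.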

As with the dynamic discovery of services, the ability to dynamically retrieve service configuration and parametrize the operation of each service is a key enabler for some automatic service provisioning use cases. For this capability to become available, the related configuration needs to be centrally controlled and provided to bootstrapped services at the appropriate time.
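One possible way to picture centrally controlled configuration is a store that holds global defaults plus per-service and per-environment overrides, which a service resolves when it starts. The Python sketch below assumes such a layered layout; the `CENTRAL_CONFIG` structure, the setting names and the "viz" service are hypothetical and chosen only for illustration.

```python
# Hypothetical contents of a central configuration store.
CENTRAL_CONFIG = {
    "defaults": {"log_level": "INFO", "max_jobs": 4},
    "services": {
        "viz": {
            "common": {"max_jobs": 8},
            "staging": {"log_level": "DEBUG"},
        },
    },
}


def resolve_config(store, service, environment):
    """Merge defaults, service-wide and environment-specific settings.

    Later layers override earlier ones.
    """
    service_cfg = store["services"].get(service, {})
    resolved = dict(store["defaults"])
    resolved.update(service_cfg.get("common", {}))
    resolved.update(service_cfg.get(environment, {}))
    return resolved


staging_cfg = resolve_config(CENTRAL_CONFIG, "viz", "staging")
production_cfg = resolve_config(CENTRAL_CONFIG, "viz", "production")
```

Because the service only knows its own name and environment, the same build can be provisioned automatically in different environments without embedding environment-specific settings.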

4. Service Versioning and Upgrading

During the lifetime of the NEANIAS services, upgrades of the service components, the exposed API and possibly the underlying data model may be required. These changes need to be communicated to the service clients, and the continuity of the offered service needs to be ensured. To this end, a versioning scheme needs to be employed to differentiate between different service level offerings and their respective functionality.
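The text does not prescribe a particular versioning scheme. As one possible example, the Python sketch below assumes a semantic-versioning style "major.minor.patch" number, where a change of the major number signals a breaking change, and shows how a client could check whether a given service version still meets its needs. The `Version` type and the `is_compatible` helper are hypothetical names introduced for this illustration.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def is_compatible(client_requires: str, service_offers: str) -> bool:
    """True if the offered version can serve a client built against client_requires."""
    required = Version.parse(client_requires)
    offered = Version.parse(service_offers)
    # Same major number means no breaking change was introduced; the service
    # must also be at least as new as the version the client was built against.
    return offered.major == required.major and offered >= required
```

Under this assumption, a client built against 1.2.0 can keep using a service upgraded to 1.4.1, while an upgrade to 2.0.0 has to be communicated to clients as a breaking change.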

When a service is upgraded and its version number changes accordingly, the process must be well defined and reproducible in an automated fashion to the highest possible degree. This covers all possible changes, including changes to the data model, service components, configuration and hosting environment.

5. Testing & Development Infrastructures

The development cycle of a service requires that it pass through various stages, each with a respective infrastructure that supports the service provision lifecycle. These include:

  • Development – The service is actively developed.
  • Testing / Quality Assurance – The service is tested to ensure proper operation and integration.
  • Staging – Pre-production environment where the service is used, tested and evaluated in a “close to production” environment.
  • Production – The service is fully operational and serving clients.

Throughout these stages, depending on the nature and purpose of the service, a hosting environment will be needed to cover the execution of the service itself, underpinning dependencies, computing and storage requirements, etc. This environment and corresponding setup will differ from service to service.

Service providers need to formalize their requirements per service for each of the key aspects affected by the hosting environment. This includes both what they require themselves and whether and how they can, on demand, serve client requests within each of these environments.

Some, non-exhaustive, options include:

  • Provide a mock service for integration purposes.
  • Provide sandbox environments within existing installations of the service.
  • Use production or otherwise shared instances of the service.
  • For data-dependent services, provide synthetic datasets.

Concerning infrastructural requirements, infrastructure providers need to be able to provide isolated pools of resources to cover these service requirements. With respect to resource utilization and accounting, non-production instances that consume data and processing resources from other services need to do so under a common agreement, possibly throttling their resource consumption based on service provider constraints.
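Throttling of this kind can be implemented in many ways. As one common technique, the Python sketch below shows a token bucket that a service provider could apply to requests coming from non-production instances. The `TokenBucket` class and its parameters are assumptions made for this example and are not taken from the deliverable.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                  # tokens added per second
        self.capacity = capacity          # maximum burst size
        self._clock = clock               # injectable clock, useful for testing
        self._tokens = float(capacity)    # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True and consume tokens if the request fits the budget."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill according to elapsed time, never exceeding the capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A provider could, for instance, keep one bucket per consuming test or staging instance and reject or delay requests for which `allow()` returns False, with the rate and capacity set according to the common agreement.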

 

[1] The Netflix Simian Army.

 

April 2020.

 

NEANIAS is a Research and Innovation Action funded by the European Union under the Horizon 2020 research and innovation programme via grant agreement No. 863448.