European Open Science Cloud and its role in Horizon Europe
The European Open Science Cloud (EOSC) aims to build and exploit the web of Scientific Data: an open space that will open up access to, and use of, the research data produced by publicly funded European researchers. EOSC will also provide data services, e.g. curation and storage, as well as the novel software tools and computing resources needed for data interoperability and interdisciplinary reuse. EOSC is an initiative of the European Commission that started in 2015 to promote open science and pave the way to new forms of collaboration and of access to research results and processes. Now, in 2020, it has reached its current organizational set-up, with the strategic aim of being implemented as a Partnership of the Horizon Europe framework programme in 2021.
Regina Ciancio (NFFA.EU): To start, could you please remind us very briefly of the ambitious objectives of the project, and of how research in Physics contributes to it and benefits from it?
Giorgio Rossi (EOSC & UniMi): The basic idea of EOSC is that by creating a space of free access to and exploitation of research data, one can expect a higher return in knowledge than today. More analyses, with different aims, of the whole body of available data would become possible. Openly accessible datasets need to be organized and described according to the FAIR (Findable, Accessible, Interoperable and Reusable) criteria. In this way, other researchers can access these datasets and use the EOSC analysis tools. New analyses and combinations of data from different sources can be expected, with a strong potential for new knowledge.
In the community of astronomers and astrophysicists it is a well-established practice to store observational data and make them available for analysis independently of the initial observation proposal and research group, e.g. for thesis work, with great benefit to science.
The new EOSC tools for collaborative work, data discovery and analysis will be productive if FAIR data are abundant. In that case, artificial intelligence methods will be efficient and will be of substantial help to researchers. In this connection, there is a clear opportunity for physics laboratories to develop instrumentation and methodologies for the automatic generation of FAIR data and metadata (FAIR-by-design) that require no extra data-care effort from the researcher beyond quality control and reproducibility.
To this end, developers must implement new hardware and new acquisition algorithms that can be of great benefit for physics and analytical laboratories.
RC: Are there any particular actions to promote multidisciplinarity?
GR: Interoperability is the key concept. FAIR datasets will be used and reused with software tools generated by different scientific communities, making interdisciplinarity possible. Multidisciplinarity is mandatory to address complex issues such as those in the mission-oriented research of Horizon Europe (climate, health, energy, sustainable cities, society…).
RC: Are there any actions that facilitate access to research infrastructures, especially the small ones?
GR: Research Infrastructures (RIs) are key to the development of EOSC, as they can ensure both the sustainability of FAIR data generation, curation and storage and, most essentially, control over data quality. Each RI's scientific community of reference de facto assures data quality.
Small RIs will align with the best standards of FAIR data management and will benefit from all the EOSC services.
RC: Can we be satisfied with the current state of play?
GR: EOSC will be implemented as a co-planned Strategic Partnership of Horizon Europe. EU Member States (MS) and Associated Countries (AC) can join and contribute in two ways: through membership of the EOSC Association AISBL, in particular by mandating a national legal organization, or through institutional members such as universities, research performing organizations, research infrastructures, or any other entities interested in EOSC. ICDI (the Italian Computing and Data Infrastructure) brings together the Italian research infrastructures (IRs) and public research bodies (EPRs) and is open to universities. ICDI is the mandated organization for Italy and one of the four founders of the EOSC Association. MS and AC will also contribute through an independent board or forum that will develop a coherent strategy for EOSC and will monitor the impact of EOSC on the European Research Area.
RC: What role will EOSC play in Horizon Europe, and with regard to research infrastructures (physics in particular)?
GR: On the one hand, the research infrastructures will be obliged to align with the production of FAIR data whenever their scientific output is publicly funded. On the other hand, the development of EOSC will increase data traffic and the demand for analytical and storage resources. RIs will have to cope with this development and address the costs it generates.
RC: NFFA-Europe Pilot will define new schemes of Research Infrastructures. How will NEP contribute to EOSC?
GR: NEP will implement what a Joint Research Activity (JRA) dedicated to FAIR data management and the metadata archive designed in NFFA-Europe. NEP will continue the design work and set up the Metastore as a service to access all metadata generated by the infrastructure. The ambition is to run the first EOSC-ready nanoscience open data infrastructure.
RC: What is the main challenge of this significant pilot project at the implementation level?
GR: The critical question, in my opinion, is how to predict the growth curve of EOSC's activity volume and productivity. It is necessary to project scenarios at 5, 10 and 15 years in order to anticipate the critical issues in the general planning of research resources. The scenarios range from laboratories generating FAIR data, to data networks, to archival storage, to analytical software, numerical simulation codes and HPC, Cloud or Edge computing resources. These forecasts need to be updated continuously to check the sustainability of EOSC and the scientific return it will generate. In the construction phase, research infrastructures must urgently tackle the need for specialized human resources. The same need is urgent for the full deployment of the EOSC services open to research, innovation, and perhaps society. Specialized training for Data Scientists is one crucial aspect. Another is the training of Data Stewards, who facilitate the interface between researchers and innovation operators and the use of FAIR data and advanced discovery and analysis tools. Finally, promoting a general literacy that enables the correct use of research data in all disciplines is a necessity that will have to permeate the universities and the whole education system quickly.