1. What is the purpose of the data collection/generation and its relation to the objectives of Feel++?
The types and formats of data depend on the purpose of the data. Some datasets are provided for benchmarking and others for verification and validation. See Feel++ Data Types to understand what can be provided.
In general, the Feel++ library will not use existing data, except for basic examples used in testing. Feel++ applications, on the other hand, will reuse data from different sources, mainly third parties. Such data will generally be obtained directly from the third parties interested in the application outcomes. Some research groups will also reuse their own data in the pilots.
In some cases, public repositories or public databases hold part of the input data (e.g., material properties), while in other cases the data will be kept private.
Since the Cemosis e-Infrastructure provides a data repository, partners collaborating on applications will aim to use this tool as much as possible. We assume, however, that this will not always be feasible: some simulations with some Feel++ applications might be run on the third party’s premises because of access restrictions on confidential data.
In the case of the Feel++ library, most of the data is generated by the Feel++ consortium (through testing and the gathering of several metrics for validation purposes), with the exception of existing input examples.
Likewise, communication data is generated by the consortium, although it results from questionnaires answered by third parties external to the project (different stakeholders in the Modelling, Simulation and Optimization domain).
Feel++ applications, on the other hand, use data from third parties as input to the pilot applications in some cases. The origin of the data will therefore vary with each application.
At this stage, the following external data sources have been identified:
Vivabrain: ICUBE Laboratory handling the MRI;
Eye2Brain: Eugene and Marilyn Glick Eye Institute in Indianapolis;
HiFiMagnet: LNCMI National Lab for High Field Magnet;
PO: PlasticOmnium automotive.
The size of the data will also depend on the kind of data and on other aspects, such as the specific application and the tools involved.
The main variation is across the Feel++ applications, since each of them also uses different formats:
Eye2Brain: a few MB to hundreds of MB;
HiFiMagnet: a few MB to hundreds of MB.
Taking into account the two main categories of data we will deal with, we consider that the data could be useful to different stakeholders:
- Data related to validation
Other researchers in the same field (HPC, Cloud, e-Infrastructures, MADFs) could be interested in this data to run their own experiments and compare solutions, as could industry willing to participate in e-Infrastructures by provisioning resources and software.
- Data used and generated in the Feel++ applications
Any researchers, industry players and even policy makers interested in simulation results, depending on the domain of each Feel++ application. We expect that other stakeholders will also be interested in the data generated in the project. However, when input data is provided by third parties, special care must be taken regarding privacy and confidentiality before sharing data.