Until now, these videos and photos have been difficult to find via the internet because they are not cataloged and described consistently. Led by GEOMAR, a team from the Helmholtz Association of German Research Centers has developed a universal data standard to facilitate the global use of images. The new metadata format is now presented in the journal Nature Scientific Data.
Life in the deep sea is increasingly being documented with high-resolution cameras attached to remotely operated or autonomous underwater vehicles. Experts analyze these images scientifically to obtain information about life in the open water and on the seafloor, as well as about geological structures. Enormous amounts of such photo and video data are stored on the servers of marine research institutes worldwide – but catalogued very differently. To make this wealth of data internationally usable, important search terms and information, such as the position of the diving robot during the recording, the camera technology used, and the names of the expedition and the scientists involved, must be stored in the image file in a universally readable format.
In order to ensure this, a working group of the Helmholtz Association of German Research Centers, involving GEOMAR Helmholtz Centre for Ocean Research in Kiel, the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI) and the Helmholtz Centre Hereon, has developed an internationally standardized metadata format for underwater images. Experts from the DataHub, a data initiative of the Helmholtz Research Field Earth and Environment, and the Helmholtz Metadata Collaboration (HMC) were also involved. The proposal is presented in the current issue of the journal Nature Scientific Data.
Dr. Timm Schoening, lead author of the article and data scientist at GEOMAR, says: “There have already been efforts worldwide for several years to make data universally accessible. With our consistent metadata standard, we create the conditions for scientific photos and videos from the deep sea to also become internationally accessible in accordance with this initiative. And we are making software available to make this standard usable.”
The underwater vehicle ABYSS is deployed from the research vessel Meteor. Photo: Tim Benedikt von See, GEOMAR
The new format builds on the internationally recognized “FAIR” principles for sustainable research data management. The acronym stands for “findable, accessible, interoperable and reusable”. Files that other researchers can locate and retrieve based on their metadata are called FAIR Digital Objects (FDOs). The FAIR data format for underwater imagery presented here has been named “image FAIR Digital Objects” (iFDO). In a way, an iFDO is a digital index card that concisely summarizes all the aspects that matter for an image. It contains not only descriptive information about the image data itself, but also persistent web links to the image data.
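To illustrate the idea of such an index card, the following Python sketch shows what a minimal iFDO-style record could look like. The field names and all values here are simplified assumptions for illustration, not the authoritative iFDO specification.

```python
# Illustrative sketch of an iFDO-style metadata record.
# Field names and values are simplified assumptions, not the
# authoritative iFDO specification.
ifdo = {
    # Header: information that applies to the whole image set
    "image-set-header": {
        "image-set-name": "EXAMPLE_AUV_dive",  # hypothetical set name
        "image-set-uuid": "00000000-0000-0000-0000-000000000000",  # placeholder
        "image-set-handle": "https://example.org/data/EXAMPLE_AUV_dive",  # placeholder link
    },
    # Items: per-image metadata, keyed by file name
    "image-set-items": {
        "img_0001.jpg": {
            "image-datetime": "2022-05-12T14:03:10+00:00",
            "image-latitude": -12.345,   # position of the vehicle
            "image-longitude": -45.678,  # during the recording
        },
    },
}
```

Because the record is plain structured data (stored, for example, as YAML or JSON), the same fields can be read by humans, search engines, and analysis software alike.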
The Helmholtz team has presented the iFDO concept internationally several times already. “Our approach draws great interest,” reports Dr. Schoening. “Therefore, we are confident that it will serve as a template for a new international standard for underwater imagery.”
Complementing the iFDO metadata format, the group has developed several software tools that make it possible to adopt the iFDO format for various biological or geological interpretations. Another idea is to equip camera systems in the future to automatically generate metadata in iFDO format while an image is being taken. How well this works was tested by GEOMAR researchers during expedition M182 with the research vessel METEOR in the Atlantic Ocean. Cameras aboard the autonomous underwater vehicles ANTON, LUISE and ABYSS, as well as on towed instruments and stationary moorings, stored iFDO metadata directly during operation. “This experience was very positive and still led to some additions to the documentation and software tools – the iFDOs themselves worked out very well during this expedition,” says Timm Schoening.
A particular challenge with image and video recordings is that a computer cannot readily evaluate them. This is different with temperature or depth measurements: the numerical values can easily be stored and displayed in a diagram. To a computer, a video is just a data stream of pixels. Therefore, each object in the image material must first be marked and defined – for example, an elongated object identified as a sea cucumber. Experts refer to this as annotation and use specialized software for the purpose, such as BIIGLE (Bio-Image Indexing and Graphical Labelling Environment), which was developed at Bielefeld University and is also used at GEOMAR.
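As a rough sketch of what such an annotation might record, the snippet below marks a polygon in an image and derives its bounding box. The record layout is hypothetical and does not reproduce BIIGLE's actual export format.

```python
# Hypothetical annotation record: marking an elongated object in an
# image as a sea cucumber via a polygon outline (layout is illustrative,
# not BIIGLE's actual export format).
annotation = {
    "image": "img_0001.jpg",
    "label": "sea cucumber",
    "shape": "polygon",
    "points": [[412, 300], [498, 310], [505, 342], [420, 335]],  # pixel coordinates
}

def bounding_box(points):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of a polygon."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

Once objects are marked this way, software can count, measure, or filter them, which a raw pixel stream does not allow.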
“We included the annotation step directly in the data format when we developed the iFDOs. And we built the functionality into the BIIGLE software, so that this widely used tool already supports the iFDO format,” says Dr. Schoening. “Those are two big advantages that have already gotten a lot of attention in our presentations: iFDOs can be used not only as a standard for metadata, but also as a standard for annotations, and there is usable software that supports the format.”
This is where the FAIR principles pay off, as they enable effective reuse of the data: the image data and annotations can also be used to train machine learning (ML) algorithms. Because both are already available in the iFDOs in a FAIR data format, developing such ML algorithms becomes significantly less complex.
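A minimal sketch of this reuse, assuming annotations are embedded per image under a hypothetical "image-annotations" key: the function collects (image, label) pairs that an ML training pipeline could consume directly.

```python
# Sketch: turning iFDO-style annotations into labelled training pairs.
# The "image-annotations" key and record layout are assumptions for
# illustration, not the official iFDO field names.
def training_pairs(ifdo_items):
    """Yield (image_name, label) pairs from annotated image items."""
    for name, meta in ifdo_items.items():
        for annotation in meta.get("image-annotations", []):
            yield name, annotation["label"]

items = {
    "img_0001.jpg": {"image-annotations": [{"label": "sea cucumber"}]},
    "img_0002.jpg": {"image-annotations": []},  # not yet annotated
}
pairs = list(training_pairs(items))  # → [("img_0001.jpg", "sea cucumber")]
```

Because the labels already sit next to the image links in one standardized structure, no ad-hoc conversion between lab-specific catalogues is needed before training.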
Original publication:
Schoening, T., Durden, J.M., Faber, C. et al. (2022): Making marine image data FAIR, Nature Scientific Data, doi: https://doi.org/10.1038/s41597-022-01491-3