HCMR-µCT

High impact use-cases

Pipeline

Digitisation procedure

Available metadata (one line per metadata term: metadata_id; metadata_description; metadata_remark)

Metadata term	Info point explanation
Specimen ID	A unique identifier for the specimen in the format mCT-xxxxx (where x = incrementing number from 00001 to 99999, with preceding zeros)
Scan ID	A unique code of the format scan-xxxxx (where x = incrementing number from 00001 to 99999, with preceding zeros)
Sample Category	The category to which the specimen belongs, e.g. Zoology, Botany
Scientific name	The lowest taxonomic name to which the specimen has been identified
Taxonomic Group	The general taxonomic group to which the specimen belongs, e.g. Polychaeta, Insecta
Specimen Description	A verbatim description of the specimen that conveys its nature at a glance
Provider Institute	Institution which provided the specimen
Specimen Provider	Person who provided the specimen
Material	The material of the scanned sample e.g. soft tissue
Fixation Type	Original fixation type of the specimen e.g. formalin
Preservation Medium	Preservation medium of the specimen e.g. ethanol
Contrast Enhancement Method	Short name of the chemical used e.g. PTA
Scope of Scan	Aim of scan
Scan date	Start date of the scanning in the format MM/DD/YYYY
Scanned By	The person who performed the scan
Sample Holder	A description of the sample holder e.g. pipette tip
Scanning Medium	The medium that surrounds the sample during scanning e.g. air, ethanol
Scanned Part	Part of the specimen that has been scanned e.g. anterior part, full specimen
Digital Device Type	The brand (manufacturer) of the digital device used for the scan
Voltage kV	The voltage in kilovolts (kV)
Current uA	The current in microamperes (μA)
Filter	The type of filter used for the scan, e.g. Aluminium
Zoom (um)	The resolution of the scan in μm (zoom level) e.g. 1.24
Camera Resolution	Camera resolution settings in pixels e.g. 4000
Exposure Time (ms)	The exposure time in milliseconds used for scanning
360	Whether the scan used a 360° or a 180° rotation
Random Movement	Random movement value
Averaging	Frame averaging value
Oversize Settings	The number of oversize parts (vertical & horizontal) used for scanning
Dataset	Download the dataset (nifti format)
Micro-CT Images	Download the micro-CT images (jpg format) available for this dataset
Video File	Download the micro-CT video (mp4 format) available for this dataset
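
The identifier and date formats given in the table can be checked automatically before a record is archived. The following is a minimal sketch in Python, assuming only the formats stated above (mCT-xxxxx, scan-xxxxx, MM/DD/YYYY). The function names are hypothetical and are not part of any HCMR system.

```python
import re
from datetime import datetime

# Patterns derived from the formats stated in the metadata table.
SPECIMEN_ID = re.compile(r"mCT-\d{5}")
SCAN_ID = re.compile(r"scan-\d{5}")


def is_valid_id(value: str, pattern: re.Pattern) -> bool:
    """Return True if value is the prefix plus five digits in 00001..99999."""
    return bool(pattern.fullmatch(value)) and not value.endswith("-00000")


def format_specimen_id(number: int) -> str:
    """Build a Specimen ID with preceding zeros from an incrementing number."""
    if not 1 <= number <= 99999:
        raise ValueError("specimen number must be between 1 and 99999")
    return f"mCT-{number:05d}"


def parse_scan_date(value: str) -> str:
    """Parse a MM/DD/YYYY scan date and return it as ISO 8601 (YYYY-MM-DD).

    Raises ValueError if the value does not match the expected format.
    """
    return datetime.strptime(value, "%m/%d/%Y").date().isoformat()
```

For example, `is_valid_id("mCT-00042", SPECIMEN_ID)` accepts a well-formed identifier, while `is_valid_id("mCT-123", SPECIMEN_ID)` rejects one without the preceding zeros.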

Standardisation protocols

Existing protocols for standardisation: Yes, HCMR has standardized protocols for metadata archiving.

Figure 2: Schema of the steps involved in creating the metadata management system

If no, describe the protocols you are planning to adopt during Synthesys+: n/a
Prepare graphical representation (workflow) of standardisation protocols: for HCMR, see Fig. 2.

The metadata collected for each micro-CT project are maintained in a relational database with well-defined semantics for its tables and columns. Including this information in the metadata catalogue of the LifeWatchGreece portal is important for two reasons: (a) the metadata will be integrated with information from other sources, which expands the knowledge available about them (e.g. the taxonomic information of a species will be linked to the species referred to in a particular micro-CT project); and (b) the metadata will gain visibility and become searchable and browsable through the LifeWatch Data Services.

Because the metadata catalogue of the LifeWatchGreece portal has been implemented using semantic web technologies, a set of sub-activities is required.

Semantics

Data Normalization: in this step, the metadata harvested from the microCT relational databases are normalized with respect to their structure. More specifically, the harvested data are exported from the relational database as CSV resources and are then structurally transformed to XML. This is required for the subsequent steps (i.e. implementation of schema mappings and data transformation). Further activities, such as data cleaning or normalization of specific types (e.g. dates), can be carried out in this step, although they are not triggered in the case of microCT.
Schema Mappings: as described above, the metadata in the catalogues of the LifeWatchGreece project are modelled using semantic web technologies. More specifically, they are modelled using MarineTLO (Tzitzikas et al. 2016), an extension of CIDOC-CRM (ISO 21127:2014) that can be used for modelling marine domain resources. In this step we therefore define the schema mappings needed to transform the microCT XML resources (derived from the previous step) into MarineTLO-based descriptions. This is carried out using the X3ML mapping definition language (Marketakis et al. 2017), which allows describing declaratively which parts of the source data (i.e. the XML resources) are mapped to particular classes and instances of the target model (i.e. MarineTLO), and how. The result of this step is a set of X3ML descriptions that can be used in the next step to carry out the transformation. Note that as long as the structure of the microCT relational databases (and, as a result, of the corresponding CSV and XML resources) does not change, the X3ML definitions remain the same and no updates are required.
Data Transformation: this step takes as input the microCT XML resources (derived from the first step) and the X3ML definitions (derived from the second step), and generates the MarineTLO-based descriptions in the form of an RDF dataset. This activity is carried out using the X3ML Engine[8].
Transformed Data Ingest: the last step of this workflow imports the transformed RDF datasets with microCT metadata into the metadata catalogues of the LifeWatchGreece portal. This is carried out using the LifeWatchGreece Data Services API. From this point onwards, the microCT metadata are also searchable and browsable from the Data Services.
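
The structural CSV-to-XML transformation of the Data Normalization step can be illustrated with a short sketch. This is not the HCMR implementation: the element names (`records`, `record`), the column-name sanitising rule, and the sample row are assumptions made for illustration, and only the Python standard library is used. The later steps (X3ML mappings, the X3ML Engine and the ingest through the Data Services API) are not covered.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET


def xml_name(column: str) -> str:
    """Turn a CSV column header into a usable XML element name (assumed rule).

    Runs of non-word characters become underscores, and names that would
    start with a digit (e.g. "360") get a "field_" prefix.
    """
    name = re.sub(r"\W+", "_", column.strip()).strip("_")
    return name if name and not name[0].isdigit() else f"field_{name}"


def csv_to_xml(csv_text: str, root_tag: str = "records", row_tag: str = "record") -> str:
    """Convert CSV text (with a header row) to an XML string, one element per row."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            field = ET.SubElement(record, xml_name(column))
            field.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample row using column names from the metadata table.
sample = "Specimen ID,Scan ID,Zoom (um)\nmCT-00001,scan-00001,1.24\n"
xml_text = csv_to_xml(sample)
```

Only the structure changes here; the values are copied unmodified, which matches the statement that cleaning and type normalization are not triggered for microCT.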

The following figures (Fig. 3 and 4) show the indicative modelling, with respect to MarineTLO, of a microCT Specimen resource and of the microCT scan event.

Figure 3: The indicative modelling with respect to MarineTLO of a microCT Specimen resource

Figure 4: The indicative modelling with respect to MarineTLO of a microCT scan event

Storage capabilities

The HCMR server that hosts and distributes the raw data produced by the micro-CT is a virtual machine on HCMR's central computer-systems hosting infrastructure (a Proxmox cluster), with 4 CPUs, 4 GB RAM and 16 TB of storage on the central storage infrastructure (96 TB raw in RAID-6).

Data access
