INTEGRATED DATA AND INFORMATION MANAGEMENT SYSTEM TO COLLECT AND ANALYZE BIOLOGICAL DATA FOR CLIMATE CHANGE RESEARCH
O. SERGEYEVA, D. SLIPETSKYY, V. GORBUNOV, V. VLADYMYROV*
Institute of Biology of the Southern Seas, 2, Nakhimov av., Sevastopol, 99011 Ukraine
Abstract. The integrated data and information management system (InDIMS) has been developed at the Institute of Biology of the Southern Seas (Sevastopol, Ukraine). The system gives convenient access to all types of data and information necessary for effective scientific work, including both the data collected within the institute and data from different national and international sources. The system can be used for a number of tasks, including climate change research.
Keywords: biological data, metadata, database, Internet, climate change
AIMS AND BACKGROUND
To work effectively with data, each scientist at a marine research institution first needs full and convenient access to the institution's data: the data he or she has obtained personally, the historical data and information ever collected in the institution (including accompanying data from other disciplines), and all publications, including "gray literature". To minimize the time needed to find, collect and analyze the required data and information, it is desirable to load them into an integrated data and information management system (InDIMS). Integrating data and information in a modern institutional information management system makes it possible to present data and metadata effectively outside the institute, combine data and information from different sources, apply quality control procedures, and analyze large combined datasets to understand the global nature of phenomena such as climate change, over-fishing, and other changes in ecosystems [1].
A system built on similar principles has been developed and is actively used by the Flanders Marine Institute (VLIZ). It is called the Integrated Marine Information System (IMIS) (http://www.vliz.be/imis/index.php), and its first release was in 2001 [2]. The objective of the IMIS database is to provide information on all topics relevant to marine sciences: the people with their expertise, institutions and their mandate, publications, etc. Different types of 'knowledge items' (persons, institutes, projects, maps, datasets etc.) correspond to different modules in the system, each with its own entry into the database. IMIS collects information relevant to marine sciences in Flanders or in the southern part of the North Sea.
RESULTS AND DISCUSSION
In contrast to IMIS, which was written entirely by the VLIZ programmers, the data management group of the Institute of Biology of the Southern Seas (Sevastopol, Ukraine) decided to use freely available software components wherever possible and to develop its own components only where no suitable ready-made solution exists. Also, unlike IMIS, the IBSS InDIMS focuses at the first stage only on IBSS data and metadata.
An up-to-date information system has to connect with the corresponding data providers and data centers around the world for online data exchange and verification, and has to provide scientists with the most recent data and information. For this reason a web-based approach was chosen for InDIMS (Fig. 1).
Figure 1. Structure of web based Integrated Data and Information Management System (InDIMS).
A prototype of such a system is now being created at IBSS. At the moment it includes 4 main components:
1) Metadatabase of marine expeditions performed by the IBSS scientists on board the IBSS research vessels and the vessels of other institutions.
2) Institutional database (DB) that includes the data ever collected by the IBSS scientists.
3) Electronic repository that contains all IBSS scientists' publications available in digital form.
4) Black Sea marine species checklists based on wiki technologies (under development).
The Institute of Biology of the Southern Seas holds a huge number of multidisciplinary datasets collected since 1871. At the moment only part of them is digitized. The vast majority of the digitized data was included in information products based on the OceanBase database management system (OceanBase has been developed and is supported by the Database Laboratory of the Marine Hydrophysical Institute, Sevastopol, Ukraine) [3].
The structure of the metadatabase (http://data.ibss.org.ua/Data/Cruises.apsx) of marine expeditions performed by the IBSS scientists on board the IBSS research vessels and the vessels of other institutions was designed to fit the international CSR (Cruise Summary Reports) standard developed within the EU project SEA-SEARCH (2002-2005) (www.sea-search.net). Within the EU project SeaDataNet (http://www.seadatanet.org) the CSR standard was upgraded to improve interrelationships and to use common vocabularies wherever possible. The structure of the IBSS metadatabase for marine expeditions will be adapted to this new standard as soon as it is ready for the whole community. Some effort will also be needed to update all existing cruise records in the IBSS database and to synchronize these updates with the online Cruise Summary Reports database supported by Deutsches Ozeanographisches Datenzentrum (DOD).
At present (August 2008) the IBSS database of marine expeditions contains CSR information on 70 cruises and 5468 stations worldwide (Fig. 2), and it is being actively populated with newly digitized and newly obtained metadata.
Figure 2. IBSS cruises information and a map of stations.
The IBSS data management system (http://data.ibss.org.ua/) is being filled with all data ever collected by the IBSS scientists. Data previously available only in standalone systems and databases developed within different national and international projects (Data on the Indian Ocean Ecosystem, Plankton biodiversity and biovariability in the Indian and Atlantic Oceans, Data on plankton and environmental characteristics, and others) are being transferred to the IBSS data management system. A detailed list of IBSS data products and their descriptions can be found on the IBSS website (http://www.ibss.org.ua/Default.aspx?tabid=325).
Where needed, a repeated quality check is applied, and where possible, additional metadata that were omitted because of the limitations of storage systems in the 1990s are being added to the datasets. Thus, for every plankton sample the following fields have been added: instrument, net type, mesh size, preservation, handling, analysis methods, principal investigators, etc.
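The per-sample fields listed above can be pictured as a small record with a completeness check. The following Python sketch is only an illustration under stated assumptions: the class name, the attribute names, the unit of mesh size and the sample values are hypothetical and are not taken from the IBSS database schema.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class PlanktonSampleMetadata:
    """Hypothetical record for the per-sample fields named in the text."""
    instrument: Optional[str] = None
    net_type: Optional[str] = None
    mesh_size_um: Optional[float] = None  # unit (micrometres) is an assumption
    preservation: Optional[str] = None
    handling: Optional[str] = None
    analysis_method: Optional[str] = None
    principal_investigators: List[str] = field(default_factory=list)


def missing_fields(sample: PlanktonSampleMetadata) -> List[str]:
    """Return the names of metadata fields that are still empty."""
    empty = []
    for f in fields(sample):
        value = getattr(sample, f.name)
        if value is None or value == "" or value == []:
            empty.append(f.name)
    return empty


# Illustrative values only; they do not describe a real IBSS sample.
sample = PlanktonSampleMetadata(net_type="Juday net", mesh_size_um=150.0)
still_missing = missing_fields(sample)
print(still_missing)
```

A check of this kind could help data managers see which of the newly introduced fields still have to be recovered from the primary sources.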
Figure 3. Some outputs from the IBSS database
The IBSS data management system has been built on up-to-date web technologies and has a transparent, intuitive interface. The system provides the user with the following services:
- Selection of multidisciplinary data within the required geographical region.
- Verification of the taxonomic information.
- Integration into international systems of data and metadata exchange.
- Fast estimation of the state of the environment according to different criteria.
- Data visualization (vertical profiles, maps, etc.).
- Export of data in formats suitable for further analysis and visualization in different systems such as ODV (Ocean Data View), Surfer, Excel, DIVA, GIS, etc.
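Two of the services listed above, regional selection and export, can be sketched in a few lines of Python. This is a minimal illustration, not the InDIMS implementation: the `Station` record, the function names and the station values are invented for the example, and the output is generic tab-delimited text rather than the specific import layout of ODV, Surfer or any other tool.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Station:
    """Hypothetical station record with two measured parameters."""
    cruise: str
    station: str
    lat: float
    lon: float
    temperature_c: float
    salinity: float


def select_in_region(stations: Iterable[Station], south: float, north: float,
                     west: float, east: float) -> List[Station]:
    """Keep stations inside a latitude/longitude box.

    Assumes the box does not cross the 180-degree meridian.
    """
    return [s for s in stations
            if south <= s.lat <= north and west <= s.lon <= east]


def export_tab_delimited(stations: Iterable[Station]) -> str:
    """Write the stations as tab-delimited text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["Cruise", "Station", "Latitude", "Longitude",
                     "Temperature", "Salinity"])
    for s in stations:
        writer.writerow([s.cruise, s.station, s.lat, s.lon,
                         s.temperature_c, s.salinity])
    return buf.getvalue()


# Invented stations: one inside a rough Black Sea box, one far outside it.
all_stations = [
    Station("C01", "1", 44.6, 33.5, 21.3, 18.0),
    Station("C02", "7", -10.0, 60.0, 27.9, 35.1),
]
selected = select_in_region(all_stations, south=40.0, north=47.0,
                            west=27.0, east=42.0)
exported = export_tab_delimited(selected)
print(exported)
```

A real export would have to follow the column conventions of the target program, which are not described in this paper.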
The metadata are fully accessible to the entire scientific community, while the data themselves are accessible after registration. Some of the data are restricted and can be obtained only directly from the originators.
The database is being filled with data on a permanent basis. However, at the moment fewer values have been loaded into InDIMS than we would like (see the table below):
Salinity, Temperature, Bioluminescence
Cu, Hg_Inorganic, Hg_Organic, P, Mn, Ni, Zn
(Species level data)
Phytoplankton abundance, biomass, counts
The IBSS data management group is currently working on providing biogeographical records to OBIS (http://www.iobis.org), a marine biogeographic information system that concentrates on datasets recording particular species (or higher taxonomic groups) at particular marine locations and at particular times. IBSS will become a distributed data contributor. This means that IBSS keeps the dataset locally and sets up a server that can respond to OBIS queries using a free software package called DiGIR (Distributed Generic Information Retrieval) to communicate with the
The IBSS electronic repository (http://repository.ibss.org.ua/dspace/) is an information system that makes it possible to collect, store, index, search and automatically exchange descriptions and content of scientific publications, and to exchange them freely via the Internet using international standards. At the moment the electronic repository exists as a separate system (Fig. 4), but it will be integrated into InDIMS early next year. Currently (August 2008) the repository contains 469 publications, and their number is growing fast. An additional role of the repository is to make the publications of IBSS scientists (especially those published in the "gray literature") accessible to the entire scientific community and to make IBSS "more visible" within the international scientific community.
[Screenshot of the IBSS Institutional Repository home page. The page states that the repository preserves and distributes digital collections, including IBSS serials, theses, research reports, preprints, papers and image collections; that it was set up on the basis of DSpace software modified within the OceanDocs project; and that it provides a stable, secure and accessible repository for the long-term preservation of digital resources created by IBSS and its staff members.]
Figure 4. IBSS electronic repository home page.
By setting up the electronic repository and providing access to the institute's publications, IBSS supports the Open Archives Initiative (OAI, http://www.openarchives.org). OAI develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. Open access (OA) means free, immediate, permanent, full-text, online access, for any user, web-wide, to digital scientific and scholarly material, primarily research articles published in peer-reviewed journals.
The IBSS repository is being harvested by several well-known systems such as OAIster (http://www.openarchives.org/), Avano (http://www.ifremer.fr/avano/), the University of Illinois OAI-PMH Data Provider Registry (http://gita.grainger.uiuc.edu), OpenDOAR (http://www.opendoar.org/) and ROAR (http://roar.eprints.org). This means broader access to the IBSS publications and scientific results.
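Harvesters of this kind typically use the OAI-PMH protocol: they request records with the `ListRecords` verb and read Dublin Core metadata from the XML response. The Python sketch below shows the idea without making a network call. The base URL and the sample response are hypothetical placeholders, not the actual IBSS endpoint or its data; only the verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

# Namespace of Dublin Core elements inside an oai_dc record.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_titles(response_xml: str) -> List[str]:
    """Collect all dc:title values found in an OAI-PMH response."""
    root = ET.fromstring(response_xml.strip())
    return [el.text for el in root.findall(".//dc:title", NS)]


# Hypothetical endpoint; a real harvester would use the repository's own URL.
url = list_records_url("http://repository.example.org/oai/request")

# Hand-written sample response, trimmed to the elements used here.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example publication title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""
titles = extract_titles(SAMPLE_RESPONSE)
print(url)
print(titles)
```

A production harvester would also have to handle resumption tokens and error responses, which are omitted here.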
While developing InDIMS we faced a problem that is usual for countries using a non-Latin alphabet. In Ukraine the situation is aggravated by the use of both Russian and Ukrainian in documents. Moreover, at least 5 different standards exist for transliterating names from Russian or Ukrainian, and this year new transliteration rules were introduced in Ukraine, which, incidentally, are not followed by all official organizations. The result is frustrating, because you never know whether you are dealing with the same person or not (Vladimir Vladimirov and Volodymyr Vladymyrov: one person or two?).
Thus the name of our colleague Денис Слипецкий can be: Denis Slipetsky, Denys Slipetskyy, D. Ya. Slipetskiy, etc.
To solve this problem, a special structure was introduced in the database tables where staff information is stored. Since in our case the name of a person in Russian is unambiguous, it is used as the primary key, and all variants that data managers encounter while filling the database from primary sources are added to the person as aliases. The aliases are stored in the database in XML format, which makes it possible to search and retrieve information rather quickly. Several procedures were also developed to transliterate names according to existing standards such as ISO 9. These developments will improve search effectiveness where personal information is important, for example for a publication author or a principal investigator.
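The alias scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the InDIMS code: the XML element names (`aliases`, `alias`), the `PersonRegistry` class and the in-memory dictionary standing in for the database table are hypothetical, and the transliteration table covers only the few ISO 9 letter pairs needed for the example name.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Partial ISO 9 mapping: only the letters used in the example below.
ISO9_SUBSET = {"д": "d", "е": "e", "н": "n", "и": "i", "с": "s",
               "л": "l", "п": "p", "ц": "c", "к": "k", "й": "j"}


def transliterate_iso9(text: str) -> str:
    """Transliterate with the partial table; unknown characters pass through."""
    out = []
    for ch in text:
        latin = ISO9_SUBSET.get(ch.lower(), ch)
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


def aliases_to_xml(aliases: List[str]) -> str:
    """Serialize the alias list as a small XML fragment."""
    root = ET.Element("aliases")
    for alias in aliases:
        ET.SubElement(root, "alias").text = alias
    return ET.tostring(root, encoding="unicode")


def aliases_from_xml(xml_text: str) -> List[str]:
    """Read the alias list back from its XML form."""
    return [el.text for el in ET.fromstring(xml_text).findall("alias")]


class PersonRegistry:
    """Maps the Russian name (primary key) to an XML list of aliases."""

    def __init__(self) -> None:
        self._aliases_xml: Dict[str, str] = {}

    def add_alias(self, russian_name: str, alias: str) -> None:
        stored = self._aliases_xml.get(russian_name)
        current = aliases_from_xml(stored) if stored else []
        if alias not in current:
            current.append(alias)
        self._aliases_xml[russian_name] = aliases_to_xml(current)

    def find(self, name: str) -> Optional[str]:
        """Return the primary key of the person known under this name."""
        wanted = name.casefold()
        for russian_name, xml_text in self._aliases_xml.items():
            known = [russian_name] + aliases_from_xml(xml_text)
            if any(wanted == k.casefold() for k in known):
                return russian_name
        return None


registry = PersonRegistry()
key = "Денис Слипецкий"
for variant in ["Denis Slipetsky", "Denys Slipetskyy", "D. Ya. Slipetskiy",
                transliterate_iso9(key)]:
    registry.add_alias(key, variant)

print(transliterate_iso9(key))
print(registry.find("denys slipetskyy"))
```

With this layout, any spelling met in a primary source resolves to the same person record, and a transliteration routine can generate additional aliases automatically.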
One of the main advantages of InDIMS is the possibility of integrating not only metadata but also the data themselves with additional information from different resources. For example, one can obtain geographical information on the data together with detailed metadata, information about the person responsible for the particular data, and appropriate bibliographic references that are accessible online. InDIMS automatically combines many entities from different sources on one page, and the number of entities available for integration is growing continuously.
Many researchers consider data management technical, boring and an (un)necessary evil, so data management is often insufficiently planned, or not planned at all, and is assigned a low priority [4]. For many years IBSS researchers kept their data and information on paper somewhere in their desks or in scattered files on local computers.
The integrated data and information management system (InDIMS) will make it possible to introduce new information and data management procedures at IBSS and will give the user convenient access to all types of data and information necessary for effective scientific work, including both the data collected within the institute and data from different national and international sources.
Acknowledgements. This work was supported in part by OCEAN-UKRAINE, a project supported by the Flemish Government - Department of Foreign Affairs.
REFERENCES
1. E. VANDEN BERGHE, M.J. COSTELLO (2007). Ocean Biodiversity Informatics - an emerging field of science. In: Vanden Berghe, E. et al. (Ed.). Proceedings of 'Ocean Biodiversity Informatics': an international conference on marine biodiversity data management, Hamburg, Germany, 29 November - 1 December, 2004.
2. J. HASPESLAGH, E. VANDEN BERGHE (2001). IMIS: Integrated Marine Information System. IAMSLIC Conference Proceedings 2001, pp. 37-63.
3. V.L. VLADIMIROV, V.G. LYUBARTSEV, V.V. MIROSHNICHENKO (2003). Integrated Multidiscipline Marine Environmental Databases: OceanBase System - effective tool to manage integrated databases. In: Integrated Technologies for Environmental Monitoring and Information Production. Edited by N.B. Harmancioglu, S.D. Ozkul, O. Fistikoglu, and P. Geerders. Kluwer Academic Publishers, Dordrecht/Boston/London, pp. 241-248.
4. J. SEYS, J. MEES, E. VANDEN BERGHE, P. PISSIERSENS (2003). Marine Data Management: we can do more, but can we do better? Ocean Challenge, Vol. 13, No. 2, pp. 20-24.