PROJECT DATA MANAGEMENT
Integrative Data Management and Processing
The overall goal of this project is to provide a knowledge management, sharing, and processing platform that enables the CRC/TR to store data sustainably, to access and process data efficiently, and to explore new, integrative, and reproducible methods for generating knowledge from data. Such a platform will not only benefit the work of individual researchers, but is also crucial for communicating data and knowledge across projects and CRC/TR funding periods. Moreover, it is needed to guarantee the sustainability of the CRC/TR results beyond the funding period. Providing such a platform is challenging not only because of the huge expected data volume, but also because of the heterogeneity of the data in terms of type and processing needs, and because in most projects raw data is extensively processed with proprietary software before it can be used to answer research questions.
The main research issues to be addressed in the first phase are threefold: (1) The design and implementation of a data storage concept that guarantees high reliability and scales to large amounts of data. This requires a hierarchical concept that takes into account the different requirements of individual projects regarding the efficiency of data access. (2) The design and implementation of a metadata database that stores information describing the acquisition, quality, provenance, and interpretation of data, including links to the actual data and processes. (3) The design and implementation of a collaborative platform, as part of a virtual research environment, that enables the efficient analysis of data across the locations and participating institutions in Jena and Würzburg. The goal is to make this data analysis possible in an interactive fashion for a large class of relevant problems and to open avenues for inter-project exploration and knowledge generation.
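To make item (2) more concrete, the following is a minimal sketch of what a single metadata entry and a provenance lookup might look like. It is an illustration only, not the project's actual schema: the names `MetadataRecord`, `MetadataStore`, and `lineage`, all field names, and the example URIs are hypothetical, and an in-memory dictionary stands in for the database.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataRecord:
    """Hypothetical metadata entry describing one stored dataset."""
    data_uri: str              # link to the actual data in the storage layer
    acquisition: str           # how the data was acquired (instrument, settings)
    quality: str               # quality assessment of the data
    interpretation: str = ""   # what the data is taken to show
    process: str = ""          # provenance: processing step that produced the data
    derived_from: List[str] = field(default_factory=list)  # provenance: input URIs


class MetadataStore:
    """In-memory stand-in for the metadata database, keyed by data URI."""

    def __init__(self) -> None:
        self._records: Dict[str, MetadataRecord] = {}

    def add(self, record: MetadataRecord) -> None:
        self._records[record.data_uri] = record

    def lineage(self, data_uri: str) -> List[str]:
        """Return every dataset the given one depends on, nearest inputs first."""
        seen: List[str] = []
        queue = list(self._records[data_uri].derived_from)
        while queue:
            uri = queue.pop(0)
            if uri in seen:
                continue
            seen.append(uri)
            parent = self._records.get(uri)
            if parent is not None:
                queue.extend(parent.derived_from)
        return seen


# Usage with made-up example entries: a raw image, a processed image, a figure.
store = MetadataStore()
store.add(MetadataRecord("store://raw/img001", "confocal microscope", "unchecked"))
store.add(MetadataRecord("store://proc/img001", "computed", "checked",
                         process="deconvolution",
                         derived_from=["store://raw/img001"]))
store.add(MetadataRecord("store://fig/fig1", "computed", "checked",
                         process="plotting",
                         derived_from=["store://proc/img001"]))
print(store.lineage("store://fig/fig1"))
```

Keeping the link to the data (`data_uri`) separate from the descriptive fields mirrors the split described above between the storage concept and the metadata database; a production system would more likely use a database and an established provenance vocabulary.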
Rather than setting up such a platform from scratch, the solution will be based on a state-of-the-art data management platform that already provides a rich set of core functionalities. However, significant research effort is needed to adapt and extend it to meet the specific requirements of the CRC/TR. By considerably extending a popular platform with cutting-edge data management and processing capabilities, the work of this project will benefit not only the CRC/TR, but also the wider research community.
Beyond the technological development of the platform, another core topic of this project is the design and implementation of innovative training concepts that give the participating CRC/TR scientists a thorough understanding of data management and processing issues.
Over the 12-year course of the CRC/TR, we intend to develop a central software platform to store, manage, and share the relevant data and processing information of all projects, ensuring the reproducibility and sustainability of research results. In addition, the platform will allow interactive access to, and systematic analysis of, the large and heterogeneous data. Thus, this project will not only provide CRC/TR scientists with a tool to organize and analyze their data, but also foster the exchange of tools and the sharing of knowledge.
- Friedrich Schiller University Jena
|Combining P-Plan and the REPRODUCE-ME ontology to achieve semantic enrichment of scientific experiments using interactive notebooks||2018||Samuel, S. and König-Ries, B.||Posters & Demo Track at the 5th Extended Semantic Web Conference (ESWC), Crete, Greece|
|The story of an experiment: A provenance-based semantic approach towards research reproducibility||2018||Samuel, S., Groeneveld, K., Taubert, F., Walther, D., Kache, T., Langenstück, T., König-Ries, B., Bücker, H. M., and Biskup, C.||11th Intl. Conf. on Semantic Web Applications and Tools for Health Care and Life Sciences, Antwerp, Belgium|
|ProvBook: Provenance-based semantic enrichment of interactive notebooks for reproducibility||2018||Samuel, S. and König-Ries, B.||Posters & Demo Track at the 17th International Semantic Web Conference (ISWC), Monterey, California|
|Towards Reproducibility of Microscopy Experiments||2017||Samuel, S., Taubert, F., Walther, D., König-Ries, B., and Bücker, H. M.||D-Lib Magazine|
|REPRODUCE-ME: Ontology-based Data Access for Reproducibility of Microscopy Experiments||2017||Samuel, S. and König-Ries, B.||14th European Semantic Web Symposium (ESWS)|
|On the reproducibility of biological image workflows by annotating computational results automatically||2017||Taubert, F. and Bücker, H. M.||IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 1538–1545|
|Automatic differentiation of computer programs in the time and frequency domain||2017||Bücker, H. M. and Walther, D.||Proceedings of the 2017 European Conference on Electrical Engineering and Computer Science (EECS), Bern, Switzerland, November 17–19, 2017, 335–340, Los Alamitos, CA, USA, 2017. IEEE Computer Society|
|Integrative data management for reproducibility of microscopy experiments||2017||Samuel, S.||PhD Symposium at the 14th Extended Semantic Web Conference (ESWC), Portoroz, Slovenia|
|A quality management workflow proposal for a biodiversity data regulation repository||2014||Owonibi, M. and König-Ries, B.||Proc. of the 2nd International Workshop on Modeling and Management of Big Data (MOBiD’14)|
|RIOS: Efficient I/O in reverse direction||2014||Willkomm, J., Bischof, C. H., and Bücker, H. M.||Software: Practice and Experience|
|Explorative Analysis of Heterogeneous, Unstructured, and Uncertain Data: A Computer Science Perspective on Biodiversity Research||2014||Beckstein, C., Böcker, S., Bogdan, M., Bruehlheide, H., Bücker, H. M., Denzler, J., Dittrich, P., Grosse, I., Hinneburg, H., König-Ries, B., Löffler, F., Marz, M., Müller-Hannemann, M., Winter, M., and Zimmermann, W.||Proceedings of the 3rd International Conference on Data Management Technologies and Applications|
|A conceptual model for data management in the field of ecology||2014||Chamanara, J. and König-Ries, B.||Ecological Informatics|
|A new metric enabling an exact hypergraph model for the communication volume in distributed-memory parallel applications||2013||Fortmeier, O., Bücker, H. M., Fagginger Auer, B. O., and Bisseling, R. H.||Parallel Computing|
|Diverse or uniform? Intercomparison of two major German project databases for interdisciplinary functional biodiversity research||2012||Lotz, T., Nieschulze, J., Bendix, J., Dobbermann, M., and König-Ries, B.||Ecological Informatics|
|Solving a parameter estimation problem in a three-dimensional conical tube on a parallel and distributed software infrastructure||2011||Bücker, H. M., Fortmeier, O., and Petera, M.||Journal of Computational Science|
|Parallel re-initialization of level set functions on distributed unstructured tetrahedral grids||2011||Fortmeier, O. and Bücker, H. M.||Journal of Computational Physics|
|EFCOSS: An interactive environment facilitating optimal experimental design||2010||Rasch, A. and Bücker, H. M.||ACM Transactions on Mathematical Software|
|Diane: A matchmaking-centered framework for automated service discovery, composition, binding, and invocation on the web||2007||Küster, U., König-Ries, B., Klein, M., and Obreiter, P.||International Journal of Electronic Commerce|