Pilot Project: Enabling WFSI Data Integration Through a Managed Data Platform
Shreyas Cholia | LBNL
This project will highlight the value of a data management platform and repository for SERDP Wildland Fire Science Initiative (WFSI) data. It will demonstrate a pilot data platform that handles the end-to-end data lifecycle for selected WFSI datasets, including data integration, harmonization, curation, publication, and access, to enable end-use fire modeling and analysis scenarios.
This pilot effort will implement and deploy a software system to enable an integrated approach to archiving data and metadata for WFSI. The project will demonstrate a core set of features and best practices to support WFSI data. Key features include (1) automated, user-friendly web-based interfaces for data publication, search, and download; (2) collaborative data curation capabilities; (3) long-lived identifiers with persistent locations; and (4) support for well-defined data and metadata formats. The project will harmonize data and metadata for selected WFSI datasets on the data platform, a process that is also expected to identify potential gaps in current datasets and opportunities to improve data standards. These data will be put through a curation pipeline in collaboration with wildland fire subject matter experts, and the project will demonstrate an end-use data analysis or modeling scenario to showcase how a managed and integrated dataset can enable scientific discovery. The demonstration will leverage the software used by the Department of Energy Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) data repository (https://data.ess-dive.lbl.gov/).
The demonstration will directly benefit SERDP & ESTCP by providing a path toward a long-term data management system for WFSI projects. It will highlight the importance of a well-managed, persistent data publication platform that can serve data based on the FAIR (Findable, Accessible, Interoperable, Reusable) principles. It will demonstrate the value of long-term stewardship of data: enabling data reuse, improving data standardization, and supporting reproducible research. In the context of fire models, this can facilitate comparisons between existing models while providing an integration point across datasets. A successful demonstration will ultimately help illuminate the path to a common set of data management best practices that will significantly streamline data integration and publication across the diverse spectrum of WFSI projects.