Modern computational and experimental sciences face a major challenge in coping with the sheer volume and complexity of the data they produce. Storing, reading, querying, analyzing, and sharing data are tasks common across virtually all areas of science, yet advances in data management infrastructure, particularly I/O, have not kept pace with our ability to collect and produce scientific data. Our proposed work consists of three thrust areas that address these contemporary challenges:
We are developing high-performance I/O middleware that makes effective use of computational platforms, and deploying it through the HDF5 software.
We are developing new auto-tuning and transparent data re-organization techniques, and extending our existing work in easy-to-use, high-level APIs that expose scientific data models.
We are extending query-based techniques, and developing novel in situ analysis capabilities for HDF5 data.
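The auto-tuning thrust above rests on a simple idea: empirically measure the performance of candidate I/O parameter settings and select the best one, rather than relying on fixed defaults. The sketch below illustrates that search loop for a single parameter (write chunk size) using only plain file I/O; the function names, candidate sizes, and file-based benchmark are illustrative assumptions, not the actual HDF5 auto-tuning framework.

```python
import os
import tempfile
import time


def time_write(chunk_size: int, total_bytes: int = 4 * 1024 * 1024) -> float:
    """Write `total_bytes` to a temporary file in chunks of `chunk_size`
    and return the elapsed wall-clock time in seconds."""
    payload = b"x" * chunk_size
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            f.write(payload)
            written += chunk_size
        f.flush()
        os.fsync(f.fileno())  # force data to storage so the timing is honest
        elapsed = time.perf_counter() - start
    os.remove(path)
    return elapsed


def autotune_chunk_size(candidates):
    """Benchmark each candidate chunk size and return the fastest one."""
    timings = {size: time_write(size) for size in candidates}
    return min(timings, key=timings.get)


# Hypothetical candidate settings; a real tuner would draw these from a
# larger parameter space (chunking, alignment, collective buffering, ...).
best = autotune_chunk_size([4 * 1024, 64 * 1024, 1024 * 1024])
print(best)
```

In practice the same measure-and-select loop generalizes to multi-dimensional parameter spaces (HDF5 chunk shapes, MPI-IO hints, file system striping), where exhaustive search is replaced by heuristic or model-guided search.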
While our research is driven by close collaborations with a broad range of DOE science codes, we will deploy the new capabilities on DOE production supercomputing facilities (such as NERSC and ALCF), thereby benefiting the broader computational science community.