SDMA&V

Scientific Data Management, Analysis and Visualization at Extreme Scale

The Scientific Data Management and Analysis at Extreme Scale program funds innovative basic research in computer science for the management and analysis of extreme-scale scientific data in the context of petascale and exascale computers with heterogeneous multi-core architectures.

The value of scientific data is realized only when data are effectively analyzed and results are presented to the science community, policy makers, and the public in an understandable way.

The challenges of analyzing massive scientific data sets are compounded by data complexity. That complexity results from heterogeneous methods and devices for data generation and capture and from the inherently multi-scale, multi-physics nature of many sciences, which yields data with hundreds of attributes or dimensions that span multiple spatial and temporal scales. The combination of massive scale and complexity is such that high-performance computers will be needed to analyze the data, as well as to generate them through modeling and simulation.

Core Team

Program Manager: Lucy Nowell
E-mail: Lucy.Nowell@science.doe.gov

Projects

CINEMA
The goal of the CODES project is to use highly parallel simulation to explore the design of exascale storage architectures and distributed data-intensive science facilities. Increasingly, science endeavors rely heavily on data management, analysis, and storage as part of the discovery process. To...
Supporting Co-Design of Extreme-Scale Systems with In Situ Visual Analysis of Event-Driven Simulations
Adding Data Management Services to Parallel File Systems
This project will create a fundamental shift in the design and development of tools for next-generation scientific data by focusing on efficiency and scientist productivity that will be important to tackle some of the nation’s biggest scientific challenges. Improved usability of tools to harness...
Scalable, In-situ Data Clustering Data Analysis for Extreme Scale Scientific Computing
ECRP: Data Exploration at the Exascale
ECRP: Combating the Data Movement Bottleneck
DAX
A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale
High Performance Decoupling of Tightly-Coupled Data Flows
Extreme-Scale Distribution-Based Data Analysis
An I/O Platform for Exascale Data Models, Analysis and Performance
ECRP: Efficient graph kernels for extreme scale analysis of environmental community data
This project focuses on a key set of problems facing our community as we move towards the exascale regime and respond to challenges resulting from ever-increasing amounts of scientific data from simulations and experiments. First, we aim to better understand how to effectively take advantage of an...
IDEALS: Improving Data Exploration and Analysis at Large Scale
ECRP: Images Across Domains, Experiments, Algorithms and Learning
An Information-Theoretic Framework for Enabling Extreme-Scale Science Discovery
In situ Indexing and Query Processing of AMR Data
Exploration of Exascale In Situ Visualization and Analysis Approaches
Performance Understanding and Analysis for Exascale Data Management Workflows
Dynamic Non-Hierarchical File Systems for Exascale Storage
Optimizing the Energy Usage and Cognitive Value of Extreme Scale Data Analysis Approaches
Optimizing Power Usage for Data-Intensive Workflows and Algorithms on Modern Computing Architectures
Runtime System for I/O Staging in Support of In-Situ Processing of Extreme Scale Data
Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery
Domain-Specific Languages for in situ Data Analysis and Visualization on Emerging Architectures
Scientific Data Management in the Exascale Era
SDS
Scientific Data Services (SDS) – Autonomous Data Management on Exascale Infrastructure
ECRP: Scalable and Energy-Efficient Methods for Interactive Exploration of Scientific Data
Towards Exascale: High Performance Visualization and Analytics
UDA
Usable Data Abstractions for Next-Generation Scientific Workflows
A Unified Data-Driven Approach for Programming In Situ Analysis and Visualization
Scalable and Power Efficient Data Analytics for Hybrid Exascale Systems
XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem

Resources

Handouts

As computers continue their exponential growth in computing power, they also produce and consume an ever-increasing amount of data. This increase in...
The advent of the exascale era is forcing a re-examination of the computational processes for science, and particularly those for scientific data...
As scientists eagerly anticipate the benefits of extreme-scale computing, roadblocks to science discovery at scale threaten to impede their progress....
High-performance computing relies on ever finer threading. Recent advances in processor technology include greater numbers of cores, hyperthreading,...
Modern computational and experimental sciences face a major challenge in coping with the sheer volume and complexity of data produced. Storing,...

Posters

Decaf poster Jan. 2015 PI Meeting
High-performance computing relies on ever finer threading. Recent advances in processor technology include greater numbers of cores, hyperthreading,...
Modern computational and experimental sciences face a major challenge in coping with the sheer volume and complexity of data produced. Storing,...

Quad Charts

Xanalytics Quad Chart Oct 2013
SDMX Quad Chart Oct 2013
RSVP Quad Chart Oct 2013
InfoTheory Quad Chart Oct 2013
HPVA Quad Chart Oct 2013
Data Analysis at Extreme
Adding Data Management to Parallel File Systems
ExaHDF5 Quad Chart Oct 2013

Solicitations

The Office of Advanced Scientific Computing Research (ASCR) in the Office of Science (SC), U.S. Department of Energy (DOE), hereby invites...
The Office of Advanced Scientific Computing Research (ASCR) of the Office of Science (SC), U.S. Department of Energy (DOE), hereby announces its...