Democratizing data-driven scientific discovery

In a groundbreaking initiative aimed at democratizing data-driven scientific discovery, the National Science Data Fabric (NSDF) and the Cornell High Energy Synchrotron Source (CHESS) have collaborated in establishing a trans-disciplinary approach for integrated data delivery and access to research data visualization, shared storage, networking, and computing resources.

The NSDF pilot, a collaborative effort connecting an open network of institutions, offers a modular and easily accessible data delivery environment. Configurable for individual and shared scientific use, this environment operates at the best economies of scale, filling a crucial gap in the current computational infrastructure. Funded by the National Science Foundation, the pilot embraces equity in access to data and cyberinfrastructure resources, benefiting a wide range of scientific domains. The active involvement of Historically Black Colleges, the Minority Serving Cyberinfrastructure Consortium, and Hispanic Serving Institutions informs the NSDF’s development, advancing inclusivity in data-driven science.

The vision of the NSDF is to establish a globally connected infrastructure that transcends the limitations of extreme data. The mission is clear: to democratize access to large-scale scientific data by developing scalable solutions for data storage, movement, and processing – deployable on various platforms, including commodity hardware and cloud computing.

Werner Sun, Director of CHESS IT, shed light on this transformative collaboration: “CHESS is collaborating with NSDF to develop a set of applications for data-intensive science, focusing on real-time visualization of large three-dimensional datasets. The eventual goal is for CHESS to be part of this national cyberinfrastructure, facilitating the transport of CHESS data across the country for analysis by other researchers.”

Commissioned in November 2023, the NSDF Entry Point at CHESS serves as a customized server connecting CHESS to NSDF storage, compute, and networking components. This Entry Point empowers CHESS users with NSDF dashboards for easy-to-use and scalable tools, offering a complete software stack for accessing data services while simplifying the intricacies of high-speed data movements.

A pivotal development in this collaboration is the implementation of the NSDF dashboard built on OpenViSUS technology, a data-intensive analytics and visualization platform that streamlines data collection, improves data quality, and increases scientific productivity. By facilitating real-time visualization of large three-dimensional datasets collected at CHESS, OpenViSUS enables experimenters to perform preliminary analysis at the beamline, with data visualized in as little as 20 minutes. NSDF dashboards integrated into the system provide interactive data quality monitoring, allowing researchers to identify and address issues during data collection. These dashboards can be accessed onsite or offsite, allowing remote users to monitor experimental progress from their home institutions.


Image: Screenshot of the NSDF dashboard showing two linked views of x-ray scattering intensity in a single 3D volume