Speaker
Description
Research and discovery are increasingly computation- and data-intensive, interdisciplinary, and collaborative. However, reproducing results remains a significant challenge. Scholarly publications are often disconnected from the data and software that produced their results, making reproducibility difficult. Researchers today generate vast amounts of data, code, and software tools that need to be shared, but sharing remains challenging, especially when data is large or sensitive. Moreover, funding agencies increasingly require researchers to share the data used to generate results, yet data is only valuable if the results derived from it can be reproduced. A key challenge is that reproducible artifacts are typically created only after the research is complete, and their creation is hindered by a lack of standards and insufficient motivation. Despite growing recognition of the importance of reproducibility, the research community still lacks comprehensive tools and platforms to support reproducible practices throughout the research cycle, as well as a culture that educates and trains researchers on the topic.
This presentation will introduce SHARED (Secure Hub for Access, Reliability, and Exchange of Data), a new initiative at the University of Chicago to develop a comprehensive platform for data-driven research and data management. We will discuss the challenges and opportunities of reproducibility in computational research and strategies for capturing reproducible artifacts throughout the research process. Additionally, we will share progress on building a community of practice to democratize reproducibility in scientific research.