Project Details
Description
Innovation in chemistry and materials is a key driver of economic development, prosperity, and a rising standard of living. It also offers solutions to pressing problems in energy, environmental sustainability, and resources that shape our society. This research program is designed to boost the chemistry community's capacity to address these challenges by transforming the process that creates underlying innovation. The research promotes a shift away from trial-and-error searches and towards rational design. This approach combines traditional chemical research with modern data science by introducing tools such as machine learning into the chemical context. This project enables and advances this emerging field by building a cyberinfrastructure that makes data-driven research a viable and widely accessible proposition for the chemistry community, and thereby an integral part of the chemical enterprise. Tools and methods developed in this research provide the means for the large-scale exploration of chemical space and for a better understanding of the hidden mechanisms that determine the behavior of complex chemical systems. These insights can potentially accelerate, streamline, and ultimately transform the chemical development process. The project also tackles the concomitant need to adapt education to this new research landscape in order to adequately equip the next generation of scientists and engineers, to build a competent and skilled workforce for the cutting-edge R&D of the future, and to ensure the competitiveness of US students in the international job market. By promoting minority participation in this promising field, it contributes to a sustained push towards equal opportunity in our society. This project thus promotes the progress of science and advances prosperity and welfare as stated in NSF's mission.
While there is growing agreement on the value of data-driven discovery and rational design, this approach is still far from being a mainstay of everyday research in the chemistry community. This work addresses three key obstacles: (i) data-driven research is beyond the scope and reach of most chemists due to a lack of available and accessible tools, (ii) many fundamental and practical questions on how to make data science work for chemical research remain unresolved, and (iii) data science is not part of the formal training of chemists, and much of the community thus lacks the necessary experience and expertise to utilize it. This research centers around the creation of an open, general-purpose software ecosystem that fuses in silico modeling, virtual high-throughput screening, and big data analytics (i.e., the use of machine learning, informatics, and database technology for the validation, mining, and modeling of resulting data sets) into an integrated research infrastructure. A key consideration is to make this ecosystem as comprehensive, robust, and user-friendly as possible, so that it can readily be employed by interested researchers without the need for extensive expert knowledge. It also serves as a development platform and testbed for innovation in the underlying methods, algorithms, and protocols, i.e., it allows the community to systematically and efficiently evaluate the utility and performance of different techniques, including new ones that are being introduced as part of this project. A meta machine learning approach is being developed to establish guidelines and best practices that provide added value to the cyberinfrastructure. The work is driven by concrete molecular design problems, which serve to demonstrate the efficacy of the overall approach. 
The educational challenges that arise from the qualitative novelty of data-driven research and its inherent interdisciplinarity are addressed by leveraging a new graduate program in Computational and Data-Enabled Science and Engineering for cross-cutting course and curricular developments, the creation of interactive teaching materials, and a skill-building hackathon initiative. This award is jointly made with the Division of Chemistry's Chemical Theory, Models and Computational Methods Program.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
| Status | Finished |
|---|---|
| Effective start/end date | 03/01/18 → 08/31/24 |
Funding
- National Science Foundation: $561,685.00