NSF 2D Materials Data Framework Training Workshop
November 11-15, 2018
This Materials Data Training Workshop was sponsored by the NSF and designed for graduate students and post-docs from research teams recently awarded NSF-2D Materials Data Framework Data Supplements. This four-day workshop was organized by the Platform for the Accelerated Realization, Analysis, and Discovery of Interface Materials (PARADIM), an NSF Materials Innovation Platform (MIP), in partnership with the NIST Office of Data and Informatics.
The workshop mission was to provide hands-on training to develop data-intensive knowledge and skills for DMR-2D research groups.
Dates: November 11-15, 2018. Location: Johns Hopkins University’s Mt. Washington Conference Center, Baltimore, MD.
Details: We are in the midst of a data revolution. The confluence of information-rich measurement techniques and the computing capability to store and analyze that information is rapidly changing how data is collected, distributed, analyzed, and interpreted. The Materials Genome Initiative and the NSF Materials Innovation Platforms are designed to tap into this revolution as applied to materials. This was the first in a series of data workshops building upon the data supplements recently awarded to multiple NSF DMR teams. It provided a series of training activities in the data sciences for students and post-docs, with profound implications for workforce development.
Specific curricular goals were for participants to be able to:
- Set up and navigate within a Python environment, with emphasis on PARADIM MIP applications
- Use Jupyter notebooks for data analysis and presentation
- Understand Python coding for control flow, data frames, plotting methods and basic statistics
- Access public materials datasets and MIP data through APIs
- Use the notebook interface for data mining and manipulation
- Use version control for their code and analysis (via GitHub)
Specific topics included:
- Bash shell
- The basics of Python
- Python packages and scripting for data analysis
- Introduction to databases and SQL scripting
- Git version control and the use of GitHub
- Introduction to Materials Domain Python packages
- Use of APIs for access to Materials datasets
- Basics of data mining, wrangling, and visualization
We gratefully acknowledge funding support from the National Science Foundation’s Division of Materials Research (Award #1853842). Dr. Eva Campo of the NSF provided the leadership and impetus for this workshop. Dr. Lisa Lewis, the AAAS Science and Technology Fellow at the NSF, provided additional help, insight, and encouragement. Claudia Johnson, NSF Contractor, provided administrative support throughout the planning process. We are also grateful for the critical organizational support and hands-on assistance provided by a team from NIST that included Chandler Becker, Daniel Wheeler, Gretchen Greene, Jonathan Guyer, and Kamal Choudhary.
JHU students were a central part of the team, with particular help from Nick Carey and the HEMI Data Rabble (Ali Rachidi, Connor Krill, and Alex Laubscher).