Description
Developed by the NCI Training Team, this online course showcases how you can apply Artificial Intelligence and Machine Learning to your materials science workflows.
Machine learning (ML) models have demonstrated human-level, or even superhuman, performance in many tasks, from playing traditional board games to image classification. In this AI/ML in Materials Science course, we will discuss how ML is poised to have a similarly transformative impact on materials science. Applied to large data sets, ML techniques can be used to discover novel technological materials, to model complex systems at an accuracy beyond the reach of traditional computational techniques, and to enhance the accuracy and speed of interpreting characterisation data. A key focus of this course will be the heterogeneity and scarcity of materials data, the challenges these characteristics present for ML, and potential approaches to overcome them.
Prerequisites
Basic programming experience with Python is required. Familiarity with Python packages such as NumPy, pandas and scikit-learn is advantageous.
Attendees will ideally know some basic theory of Machine Learning and Deep Learning, and intend to use AI/ML and supercomputers to boost their research.
We will use the NCI ARE service and the Gadi Supercomputer. Attendees are encouraged to review the ARE User Guide for background information.
Recommended Materials
Course lectures:
- Basics of ML: “Machine Learning” by Andrew Ng
- Basics of TensorFlow: “DeepLearning.AI TensorFlow Developer Professional Certificate” by Laurence Moroney
Books:
- Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (second edition) by Aurélien Géron
Objectives
This course is designed to help researchers apply AI/ML in Materials Science and take advantage of the Gadi supercomputer to boost their research. The objectives of this course are to:
- Understand the basic concepts of ML in the Materials Science domain
- Gain knowledge of the strengths and weaknesses of different model representations (e.g. composition-based vs structure-based)
- Become familiar with the Gadi environment and with tools that could accelerate the implementation of ML workflows
Learning Outcomes
- Know how to use the Python machine learning package scikit-learn
- Become familiar with some commonly used supervised ML methods including linear regression, SVM, random forest, etc.
- Understand the featurization process in the study of solid-state materials, which produces the features used as input to ML models
- Perform structural relaxation using Bayesian Optimisation and predict formation energies using a graph neural network
- Practice on the benchmark test suite (Matbench) and compare the performance of different algorithms
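To give a flavour of the composition-based featurization mentioned above, here is a minimal sketch using only the Python standard library. It is an illustration rather than course material: the function names `parse_formula` and `featurize` and the small `ATOMIC_MASS` table are assumptions made for this example, and it handles only simple formulas without brackets. It turns a chemical formula into a fixed-length vector of element fractions plus a composition-weighted mean atomic mass, which is the kind of input a supervised model could consume.

```python
import re
from collections import defaultdict

# Illustrative subset of standard atomic masses (in u); an assumption for this
# sketch. A real workflow would draw on a complete table of elemental properties.
ATOMIC_MASS = {
    "O": 15.999, "Na": 22.990, "Cl": 35.45,
    "Si": 28.085, "Ti": 47.867, "Fe": 55.845,
}

# Fixed element order so that every formula maps to a vector of the same length.
ELEMENTS = tuple(sorted(ATOMIC_MASS))


def parse_formula(formula):
    """Return {element: count} for a simple formula without brackets, e.g. 'Fe2O3'."""
    counts = defaultdict(float)
    # An element symbol followed by an optional integer or decimal amount.
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] += float(amount) if amount else 1.0
    return dict(counts)


def featurize(formula):
    """Return element fractions (in ELEMENTS order) followed by the mean atomic mass."""
    counts = parse_formula(formula)
    unknown = set(counts) - set(ATOMIC_MASS)
    if unknown:
        raise ValueError(f"elements not in the illustrative table: {sorted(unknown)}")
    total = sum(counts.values())
    fractions = [counts.get(el, 0.0) / total for el in ELEMENTS]
    mean_mass = sum(counts[el] / total * ATOMIC_MASS[el] for el in counts)
    return fractions + [mean_mass]


if __name__ == "__main__":
    for formula in ("Fe2O3", "NaCl", "TiO2"):
        print(formula, featurize(formula))
```

A vector like this ignores crystal structure entirely, which is one reason structure-based representations such as the graphs used by graph neural networks are treated as a separate topic.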