NCI supports the research community through its code optimisation expertise, modifying popular software so that it runs more efficiently at the large scales of our supercomputer. ModEM3D is a popular code used in geology research, in particular magnetotellurics (MT), the study of electric and magnetic fields deep within the Earth’s crust. Developed originally by researchers at Oregon State University, ModEM3D provides the software required for processing large grids of data captured by instruments in the field.

Processing these large arrays of data requires intense computation to turn the electric and magnetic field measurements into a three-dimensional representation of the underground geology. Building this map of underground structures and ore bodies requires thousands of data points at a time and a complex series of calculations to get to the final product. Originally developed for smaller computing platforms, the ModEM3D code now comes up against some limitations when used on dozens of processors at once.

NCI staff have analysed and re-engineered the code to remove certain critical bottlenecks and make it functional at these large scales. A major issue was memory: much of the computing takes place across many parallel processes running over multiple compute nodes, but each worker process sends its calculated figures to a single master process for storage. When that master process exceeds its memory limit, the entire calculation can crash. To get round the problem, NCI implemented a memory management scheme in which the calculated figures are temporarily stored on the high-performance filesystems of Raijin, NCI’s supercomputer, rather than held in memory. In this way, the memory requirements of the code can be dramatically reduced.
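The idea of moving intermediate results out of the master process's memory and onto a filesystem can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheme NCI implemented in ModEM3D: the `DiskBackedStore` class, the file naming and the `run_demo` function are all hypothetical, and a temporary directory stands in for a high-performance scratch filesystem.

```python
import os
import tempfile
from array import array


class DiskBackedStore:
    """Hypothetical store that keeps each worker's results in a file, not in RAM."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, worker_id):
        return os.path.join(self.directory, f"worker_{worker_id}.bin")

    def put(self, worker_id, values):
        # Write the worker's results straight to disk as raw doubles.
        with open(self._path(worker_id), "wb") as f:
            array("d", values).tofile(f)

    def get(self, worker_id):
        # Read one worker's results back only when they are needed.
        path = self._path(worker_id)
        data = array("d")
        count = os.path.getsize(path) // data.itemsize
        with open(path, "rb") as f:
            data.fromfile(f, count)
        return list(data)


def run_demo(n_workers=4, n_values=1000):
    # The temporary directory is an assumption standing in for scratch storage.
    with tempfile.TemporaryDirectory() as scratch:
        store = DiskBackedStore(scratch)
        for w in range(n_workers):
            # In a real run these values would arrive from worker processes.
            store.put(w, [float(w)] * n_values)
        # The master holds only one worker's block in memory at a time.
        total = 0.0
        for w in range(n_workers):
            total += sum(store.get(w))
        return total
```

With this layout the master's peak memory use depends on the size of one worker's block rather than on the number of workers, at the cost of extra file I/O.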

NCI staff also implemented hybrid parallelisation in the code to improve the scalability of time-consuming functions. With multithreading enabled for key processes, the code can use more CPU cores than before. Optimising the parallel work in this way removes some of the slowdowns that occur in each calculation, leading to faster and smoother performance overall. Execution time for certain functions fell by more than half, with improvements visible in almost every function.
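Hybrid parallelisation usually means two levels of work division: processes spread across compute nodes, and threads inside each process sharing that process's memory. The sketch below shows only that two-level structure and makes several assumptions: a plain loop stands in for the separate processes, Python threads stand in for the threads of a compiled code, and `kernel` is a hypothetical placeholder for a time-consuming function. Because of Python's global interpreter lock, this example illustrates the decomposition and will not itself run faster.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def kernel(chunk):
    # Hypothetical stand-in for a time-consuming numerical routine.
    return sum(x * x for x in chunk)


def rank_compute(rank_items, n_threads):
    # Inner level: threads within one process share its portion of the work.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(kernel, split(rank_items, n_threads)))


def hybrid_sum_of_squares(items, n_ranks=2, n_threads=4):
    # Outer level: each part would go to a separate process on a real system;
    # here a loop visits the parts one after another.
    return sum(rank_compute(part, n_threads) for part in split(items, n_ranks))
```

The design point is that adding threads raises the number of cores in use without raising the number of processes, so per-process overheads such as duplicated data do not grow with the core count.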

Overall, the optimisations that NCI has engineered in the ModEM3D code make it much more capable of handling large datasets at supercomputer scales. As the improvements gain acceptance across the research community, these small gains compound to make for a more seamless research experience.

This research highlight was originally published in the 2018-2019 NCI Annual Report.