More ways to access NCI’s petascale data collections
Did you know that, in addition to managing Australia’s largest supercomputer, NCI also plays a vital role in managing and integrating petascale data collections across the environment, water and geophysics research communities?
NCI provides researchers with access to an ever-expanding national repository of Earth systems, Earth observations, climate and weather, and geophysical reference data (see https://datacatalogue.nci.org.au). This includes the latest Australian weather model products (APS), climate model and gridded products, satellite imagery from Landsat, Sentinel and Himawari, and national and state geophysics data products. Many of these significant data collections have never before been accessible as integrated collections that enable interdisciplinary analysis by the research community or the broader public. At NCI, these collections have now been prepared, standardised and quality assured, and are progressively being made available for access and analysis. Preparing them for release has taken the combined efforts of NCI and its partner organisations: the Bureau of Meteorology, Geoscience Australia, ANU and CSIRO, as well as NCRIS capabilities such as Astronomy (AAL), Marine (IMOS) and Terrestrial (TERN).
The work of NCI means that data from decades of publicly funded collection is now easily accessible online. Spread over 31 different collections and 12 different fields of research, NCI’s RDS data collections are a major step towards greater sharing of knowledge among Australian scientists.
Making these data collections available is a key outcome of NCI’s High Performance Data node under the NCRIS RDSI/RDS project, which supported the underlying storage and organisation of 10 petabytes of nationally significant data collections, approximately one fifth of the national total funded by the project.
NCI has made this data available to users in a number of ways through its National Environmental Research Data Interoperability Platform (http://nci.org.au/services/vdi/nerdip/). The platform allows users to access and analyse the data directly on the Raijin supercomputer, through interactive desktops using NCI’s Virtual Desktop Infrastructure service, or remotely through data services.
NCI’s data services and data collections are also accessible from widely used Virtual Laboratories and data portals. Examples include the NCRIS NeCTAR Clouds in Marine and Ecosystem science, the Climate and Weather Science Lab (CWSLab), the AuScope Virtual Geophysics Laboratory, and the Australian Geoscience Data Cube. NCI’s open and accessible data collections are a significant component of the national infrastructure being provided, with wide-ranging impacts on agriculture, transport, energy and the environment. Industry groups and government departments are already making extensive use of this new trove of data.
NCI began running data-intensive training courses in 2016, with more sessions planned for 2017. As a result of the courses held so far, many researchers are now discovering the virtual environments, laboratories and range of tools they can use to work with our nationally significant data collections. We encourage users to keep an eye out for upcoming training sessions, which will be advertised in NCI’s monthly newsletters and through dedicated email campaigns.