NCI increases the value and reach of research-ready datasets by providing high-quality, high-performance data-intensive services.
NCI’s data services allow users, data portals and external science cloud environments to access, interact with and extract value from our data collections. Our approach to data services is focused on working with data as a living and connected resource, developing software, and providing portals and network protocols for accessing data.
By supporting programmatic data access methods, we help researchers use repeatable processes that benefit from the quality and management practices of our data repository.
Our data services provide a number of ways of interacting with and extracting value from data, including our open source software, which is under ongoing development to make better use of our HPC capability and data collections. In some cases, we also provide the ability to interrogate datasets through federated data service technologies across high-speed networks. This supports our international collaborations, both in virtual research environments and in advancing federated data systems. See all our user-facing data services below.
NCI’s GeoNetwork portal is our primary catalogue of data collections and is regularly updated as new data expands our collection of datasets. It provides an easily browsable and searchable interface to discover the datasets we manage, including descriptions of what each dataset contains, key scientific details about how it was created, and a range of access and data ownership details. Each record has a persistent digital object identifier that can be used in publications and as a shareable link to the dataset.
Virtual Desktop Infrastructure
NCI’s Virtual Desktop Infrastructure (VDI) is a complementary service to the NCI supercomputer, providing an interactive scientific desktop environment loaded with an extensive library of software packages and giving access to the datasets within NCI’s internal high-speed network.
The VDI supports command-line and scripting access, as well as programming and analysis using Python, Jupyter Notebooks, R and other discipline-specific tools and libraries. It also enables some scientific data visualisation and analysis that is not available on the supercomputer, including desktop scientific tools that provide reference installations for researchers having difficulty working remotely.
For more information, please read the VDI User Guide.
NCI’s THREDDS server is a high-performance and high-availability installation of Unidata's Thematic Real-time Environmental Distributed Data Services (THREDDS). THREDDS serves many of NCI’s open data collections at the file level, as well as some aggregations. It provides many different types of services to allow individual files to be selected, as well as more advanced services such as OPeNDAP, NetCDF subsetting, OGC WCS and WMS. The THREDDS server is programmatically accessible, which is how many advanced tools and portals use our service.
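As an illustration of the programmatic access described above, the following minimal sketch builds a request URL for a THREDDS NetCDF subsetting service. The host name, dataset path and variable name are hypothetical placeholders, not NCI's actual endpoints, and the `/ncss/` path layout is an assumption that can differ between THREDDS versions. The query parameters (`var`, `north`, `south`, `east`, `west`, `time_start`, `time_end`, `accept`) follow the conventions commonly used by the THREDDS NetCDF Subset Service.

```python
from urllib.parse import urlencode

# Hypothetical base URL, used only for illustration.
THREDDS_BASE = "https://thredds.example.org/thredds"


def ncss_subset_url(dataset_path, variables, bbox, time_range, base=THREDDS_BASE):
    """Build a NetCDF Subset Service request URL.

    bbox is (west, south, east, north) in degrees;
    time_range is (start, end) as ISO 8601 strings.
    """
    west, south, east, north = bbox
    if south > north:
        raise ValueError("south must not exceed north")
    start, end = time_range
    params = [
        ("var", ",".join(variables)),  # one or more variable names
        ("north", north),
        ("south", south),
        ("east", east),
        ("west", west),
        ("time_start", start),
        ("time_end", end),
        ("accept", "netcdf"),  # ask for the subset as a NetCDF file
    ]
    # Assumed path layout; some THREDDS versions use /ncss/grid/ instead.
    return f"{base}/ncss/{dataset_path}?{urlencode(params)}"


url = ncss_subset_url(
    "some/collection/file.nc",          # hypothetical dataset path
    ["tas"],                            # hypothetical variable name
    (110.0, -45.0, 155.0, -10.0),       # roughly the Australian region
    ("2000-01-01T00:00:00Z", "2000-12-31T00:00:00Z"),
)
print(url)
```

Building the request as a URL keeps the subset definition repeatable: the same string can be stored in a script, shared with collaborators, or passed to any HTTP client.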
The NCI-developed GSKY service (pronounced ji-skee) provides high-performance access to environmental and geophysical data based on data aggregation, data cubes and coordinate transformations. GSKY allows users to interact with entire datasets and the information they contain using standard community protocols. Because GSKY follows well-known protocols used throughout the geospatial community, many client software services can access data through it.
Transparently to users, GSKY uses a distributed back-end data server and a parallel I/O design to respond quickly to data requests. We have enabled GSKY on many of our large data collections, providing real-time access. This is particularly relevant for collections such as large time series from satellite earth observations, which span decades and comprise petabytes of data.
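To show what access through a standard geospatial protocol can look like, here is a minimal sketch that builds an OGC WMS 1.3.0 `GetMap` request. It assumes WMS is among the protocols the service exposes; the endpoint URL, layer name and timestamp are hypothetical placeholders. One detail worth noting is that WMS 1.3.0 with `CRS=EPSG:4326` expects the bounding box in latitude-first order.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, for illustration only.
WMS_ENDPOINT = "https://gsky.example.org/ows"


def wms_getmap_url(layer, bbox_lonlat, time, width=512, height=512,
                   endpoint=WMS_ENDPOINT):
    """Build a WMS 1.3.0 GetMap URL for a (min_lon, min_lat, max_lon, max_lat) box."""
    min_lon, min_lat, max_lon, max_lat = bbox_lonlat
    # WMS 1.3.0 with EPSG:4326 uses latitude,longitude axis order.
    bbox = f"{min_lat},{min_lon},{max_lat},{max_lon}"
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",              # empty string requests the default style
        "CRS": "EPSG:4326",
        "BBOX": bbox,
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TIME": time,              # selects one time slice of the layer
    }
    return f"{endpoint}?{urlencode(params)}"


url = wms_getmap_url(
    "example_layer",                      # hypothetical layer name
    (110.0, -45.0, 155.0, -10.0),
    "2015-01-01T00:00:00.000Z",
)
# Decode the query string again to inspect what will be sent.
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(query["BBOX"][0])
```

Any WMS-aware client, such as a desktop GIS or a web mapping library, issues requests of this shape, which is why following the standard lets existing tools work without service-specific code.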
To find out what is available through GSKY, visit our User Guide.
Earth System Grid Federation
NCI manages the Australian node of the international Earth System Grid Federation (ESGF). NCI is a core node of the ESGF, and delivers many reference data collections to researchers alongside peer centres including the US Department of Energy, NASA, UK Centre for Data Analysis, and the German climate research centre (DKRZ). The ESGF allows users to query and access the worldwide archive of Coupled Model Intercomparison Project (CMIP) data. Notably, NCI is the only ESGF node in the southern hemisphere, providing our users with direct, programmatic access to the data.
All Australian CMIP data is published from NCI via our node. NCI also has a local cached copy of high-demand variables from the international data and publishes it for broader access. Similarly, NCI’s ESGF node is deeply integrated with the international centres so that key variables of the Australian data can be selected and automatically replicated at the other nodes across the federation. To view all stored CMIP data, visit our data portal.
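The programmatic querying of CMIP data described above can be sketched as a request to an ESGF search endpoint. This is a minimal illustration only: the host name is a hypothetical placeholder, and the facet names and values (`project`, `variable_id`, `experiment_id`) are assumptions based on common CMIP6 conventions, so the facets actually available depend on the project being queried.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical ESGF search endpoint, for illustration only.
ESGF_SEARCH = "https://esgf-node.example.org/esg-search/search"


def esgf_search_url(facets, limit=10, endpoint=ESGF_SEARCH):
    """Build a dataset search URL from a dict of facet constraints."""
    params = {
        "type": "Dataset",                   # search datasets rather than files
        "format": "application/solr+json",   # request a JSON response
        "latest": "true",                    # only the latest dataset versions
        "limit": limit,                      # maximum number of results returned
    }
    params.update(facets)
    return f"{endpoint}?{urlencode(params)}"


url = esgf_search_url(
    {"project": "CMIP6", "variable_id": "tas", "experiment_id": "historical"},
    limit=5,
)
# Decode the query string again to inspect what will be sent.
query = parse_qs(urlsplit(url).query)
print(sorted(query))
```

Expressing the search as facet constraints means the same query can be rerun later to pick up newly published or replicated datasets.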
Copernicus SARA Data Portal
The Australasian Copernicus Data Hub uses a data server managed by NCI for discovering and accessing the European Space Agency’s multi-petabyte Sentinel satellite data. This data portal, the Sentinel Australasia Regional Access (SARA), provides fast access to a data cache at NCI of selected datasets, particularly for the Australasian region, replicated from the master site maintained in Europe. The service is used by researchers and by various state government departments, which use it to make internal copies.
Optical Astronomy data services
NCI publishes Optical Astronomy datasets using IVOA-compliant services, such as an installation of the CSIRO ASKAP Data Archive TAP data service.
The SkyMapper data service is fully described on the SkyMapper website. NCI supports its Virtual Observatory (VO) services, including an IVOA catalogue service (including TAP), Cone Search (including SCS), and Image cutouts (including SIAP).
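As a rough illustration of the Cone Search service mentioned above, the following sketch builds a request in the style of the IVOA Simple Cone Search (SCS) protocol, which takes a sky position and a search radius in decimal degrees through the `RA`, `DEC` and `SR` parameters. The endpoint URL is a hypothetical placeholder, and the example coordinates are arbitrary.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical cone search endpoint, for illustration only.
CONE_ENDPOINT = "https://vo.example.org/scs/query"


def cone_search_url(ra_deg, dec_deg, radius_deg, endpoint=CONE_ENDPOINT):
    """Build a Simple Cone Search URL; all angles are in decimal degrees."""
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("right ascension must be within [0, 360] degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be within [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    params = {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg}
    return f"{endpoint}?{urlencode(params)}"


url = cone_search_url(10.68, -41.27, 0.05)
# Decode the query string again to inspect what will be sent.
query = parse_qs(urlsplit(url).query)
print(url)
```

Validating the coordinates before sending the request catches the most common mistakes, such as swapping right ascension and declination, without a round trip to the server.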