Since 2005, Dr Adam Lewis and the team at Geoscience Australia's National Earth & Marine Observations Group have worked to maximise the value and impact of images taken from space by earth observation satellites.
Their Data Cube prototype, which uses high-performance computing and RDSI-funded storage facilities at NCI, represents a new approach to analysing these massive datasets and making them publicly accessible.
"The Data Cube evolved because we've become good at receiving and calibrating large amounts of satellite data," says Dr Lewis.
"We now have hundreds of Terabytes of data covering all of Australia going back 30 years, and we wanted to begin applying it to big problems. Once we knew this is what we wanted to do, it became clear we needed to organise the data differently."
In the past, scientists would calibrate a few satellite images and use them for a particular purpose. That approach did not scale to the level of analysis or access that Geoscience Australia and its partners envisioned.
The Data Cube concept is to chop the Landsat images into tiles and stack them in time sequences covering the same area of ground. This creates something like a geographic time machine, unlocking decades of information about the Australian landscape.
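The tile-and-stack idea can be sketched in a few lines of Python. This is only an illustration of the concept described here, not Geoscience Australia's implementation: the tile size, the function names and the use of plain nested lists as "images" are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

TILE_SIZE = 2  # hypothetical tile edge length in pixels, chosen for illustration


def chop_into_tiles(scene, tile_size=TILE_SIZE):
    """Split a 2D grid (a list of rows) into square tiles keyed by (tile_row, tile_col)."""
    tiles = {}
    for r0 in range(0, len(scene), tile_size):
        for c0 in range(0, len(scene[0]), tile_size):
            tile = [row[c0:c0 + tile_size] for row in scene[r0:r0 + tile_size]]
            tiles[(r0 // tile_size, c0 // tile_size)] = tile
    return tiles


def build_cube(scenes, tile_size=TILE_SIZE):
    """Stack tiles covering the same ground in time order.

    `scenes` is an iterable of (acquisition_date, grid) pairs.
    Returns {tile_key: [(acquisition_date, tile), ...]} sorted by date.
    """
    cube = defaultdict(list)
    for acquired, grid in scenes:
        for key, tile in chop_into_tiles(grid, tile_size).items():
            cube[key].append((acquired, tile))
    for stack in cube.values():
        stack.sort(key=lambda item: item[0])  # oldest observation first
    return dict(cube)


def pixel_time_series(cube, key, row, col):
    """Read one ground location back through time from a single tile stack."""
    return [(acquired, tile[row][col]) for acquired, tile in cube[key]]
```

Organised this way, a question about one location touches only the stack for its tile rather than every full image in the archive.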
The National Flood Risk Information Project (NFRIP) is the first large test case for the Data Cube. The project aims to improve the quality, availability and accessibility of flood information in Australia and to raise community awareness of flood risks. Analysis of the Data Cube shows where satellites have observed surface water in the past, and the result will be a key dataset available through the public portal developed under NFRIP.
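One way to summarise where surface water has been observed is as a per-pixel frequency over a stack of observations. The sketch below is a hypothetical illustration, not the NFRIP method: it assumes each observation has already been classified as water (`True`), dry (`False`) or unusable, for example because of cloud (`None`), and the function name is invented for the example.

```python
def water_frequency(stack):
    """Fraction of clear observations in which each pixel was classified as water.

    `stack` is a list of equally sized 2D grids covering the same ground.
    Each cell is True (water), False (dry) or None (no clear observation).
    Pixels with no clear observation at all get None.
    """
    n_rows = len(stack[0])
    n_cols = len(stack[0][0])
    frequency = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            # Keep only the observations where this pixel was clearly seen.
            clear = [grid[r][c] for grid in stack if grid[r][c] is not None]
            out_row.append(sum(clear) / len(clear) if clear else None)
        frequency.append(out_row)
    return frequency
```

Dividing by the number of clear observations, rather than by the total number of images, avoids treating a cloudy pass as evidence that a location was dry.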
"Sometimes places get wet that people forget about," says Dr Lewis. "For example, in middle Australia there are large flood plains where water moves slowly. People forget it's a flood plain, and they build something on it.
"Using the Data Cube we can look at how often surface water has been seen from satellites at locations across Australia. Manually, this would take 8-10 years to do. Now we can run the analysis in 8 hours."
The Data Cube requires a great deal of both storage and compute capacity, which is why the RDSI-funded storage at NCI, home to the Southern Hemisphere's most powerful supercomputer and fastest filesystems, is so important.
"Traditionally high performance computing uses lots of CPU but not much disk space. This needs lots of storage and lots of compute. The RDSI investment opens the door for us and others to use large datasets in the HPC environment," says Dr Lewis.
This has been a global problem: agencies such as NASA, the US Geological Survey and the European Space Agency have been able to put only a small fraction of their satellite data into the public domain where it can be used.
"With most scientific datasets, a limiting factor is how many people can look at, analyse, and extract information from the data. This solution means more people can extract that information.
"As a strategic investment, the RDSI is critical for us to build this capability. And we'll keep adding to it. We have about a petabyte of storage now and this will keep growing."
With the Data Cube computing and storage environment in place at NCI, Geoscience Australia is positioned to apply the satellite imagery to other problems.
"The next thing might be to look at all the places across Australia that have been affected by bushfires in a particular year," says Dr Lewis. "Or to analyse how much soil in an area is bare earth, and how that affects runoff and water quality into places like the Great Barrier Reef. It's all connected."
This article was originally published on the RDSI website: https://www.rdsi.edu.au/use-cases