Scientific data today comes from a large number of geospatial instruments: satellites, ground-based radar and seismic sensors, to name a few. The question is how we can more easily carry out data-intensive research across the many kinds of extremely large and complex datasets that come from these sources. NCI has developed a new online tool that makes analysing and visualising this data much easier.

For a long time, researchers have worked with data by downloading datasets that are often made up of millions of individual files. To cope with those datasets and their peculiarities, researchers often have to organise the files and laboriously work across them to find the information they need. The sheer complexity means that researchers compromise on the scope of their work, or feel limited in what they can do with the data.

NCI has created a new data service, called GSKY, which makes this old and inefficient way of working with data obsolete. GSKY accesses and analyses large geospatial datasets on NCI's cloud and high-performance computing systems, and then delivers the results to a user's device or website. For example, hundreds of time-series and geospatially overlapping datasets can be seamlessly merged so that a researcher can focus on the information rather than on handling data files. Furthermore, using GSKY's processing capability, that data can be analysed on the fly with user-provided algorithms to extract new information over both space and time.
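To give a feel for what merging geospatially overlapping data involves, here is a minimal sketch in Python. It is not GSKY's implementation: the article does not describe how GSKY resolves overlaps, so the "first valid value wins" rule, the `mosaic` function and the tile layout below are all assumptions made for illustration. Each tile is a small grid placed at a row and column offset on a shared output grid, and `None` marks a cell with no data.

```python
from typing import List, Optional, Tuple

Grid = List[List[Optional[float]]]
# A tile is (row_offset, col_offset, grid) on the shared output grid.
Tile = Tuple[int, int, Grid]


def mosaic(tiles: List[Tile], height: int, width: int) -> Grid:
    """Merge overlapping tiles into one grid.

    Assumed rule (hypothetical, not taken from GSKY): tiles are applied
    in list order and the first non-missing value for a cell is kept.
    """
    out: Grid = [[None] * width for _ in range(height)]
    for row_off, col_off, grid in tiles:
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                if value is None:
                    continue  # no data in this tile for this cell
                rr, cc = row_off + r, col_off + c
                inside = 0 <= rr < height and 0 <= cc < width
                if inside and out[rr][cc] is None:
                    out[rr][cc] = value
    return out


# Two 2x2 tiles that overlap in the middle column of a 2x3 output grid.
tile_a: Tile = (0, 0, [[1.0, None], [1.0, 1.0]])
tile_b: Tile = (0, 1, [[2.0, 2.0], [2.0, 2.0]])
merged = mosaic([tile_a, tile_b], height=2, width=3)
print(merged)
```

In the overlapping column, the gap in the first tile is filled from the second tile, while cells the first tile already covers are left alone. A real service would also have to reproject and resample the inputs onto a common grid before a step like this.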

Behind the scenes, GSKY works out how to manipulate the datasets so that they work together seamlessly. In large-scale environmental analysis, for example, images from different satellites can come in different shapes and sizes, environmental survey data can come in many different formats, and even urban boundary maps may need to be taken into account. For a user of GSKY, working with data is as easy as choosing from a list of available datasets, specifying a region and time frame, and asking GSKY to analyse the information as harmonised data. GSKY then returns the requested results, which can be delivered over the network to a client application or visualised in an online map.
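The three choices described above (a dataset, a region and a time) map naturally onto a single web request. The sketch below shows one way a client might build such a request, assuming the service is exposed through an OGC Web Map Service (WMS) 1.3.0 style interface, which this article does not state. The endpoint `https://example.org/ows`, the layer name `example_layer` and the function `build_getmap_url` are hypothetical placeholders, not real GSKY names. Only the standard WMS GetMap parameter names are used.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getmap_url(
    base_url: str,
    layer: str,
    bbox: Tuple[float, float, float, float],
    time: str,
    width: int = 512,
    height: int = 512,
) -> str:
    """Build a WMS 1.3.0 GetMap URL for one layer, region and time.

    bbox is (min_lon, min_lat, max_lon, max_lat). CRS:84 is used because
    it takes longitude first, which avoids the latitude-first axis order
    that EPSG:4326 has in WMS 1.3.0.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,          # the chosen dataset
        "STYLES": "",             # default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),  # the chosen region
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": "image/png",
        "TIME": time,             # the chosen time
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer; the bounding box roughly covers Australia.
url = build_getmap_url(
    "https://example.org/ows",
    "example_layer",
    (112.0, -44.0, 154.0, -10.0),
    "2018-01-01T00:00:00Z",
)
query = parse_qs(urlsplit(url).query)  # parse it back to check the fields
print(url)
```

The client never touches individual data files in a request like this: it names what it wants and where, and the server is left to find, harmonise and render the underlying data.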

One example of such a use is GEOGLAM-RAPP, an interactive online map produced by the Group on Earth Observations Global Agricultural Monitoring (GEOGLAM) initiative for tracking Rangeland and Pasture Productivity (RAPP). GEOGLAM-RAPP allows users to track and analyse the condition of global rangelands used for activities such as agriculture and livestock production.

Information that can be displayed using GEOGLAM-RAPP includes:

  • Vegetation cover
  • Normalised Difference Vegetation Index
  • Monthly rainfall
  • Monthly soil moisture
  • Global land use and land cover
  • Livestock density

Users of GEOGLAM-RAPP can view one or more of the above datasets at a local or global scale, with at-a-glance comparisons of multiple datasets. The time-series datasets can be resolved down to a specific week within the last several decades for further investigation, or played back in full to show changes over time. Using GSKY to access datasets promises to make innovative environmental research even better.