National Computational Infrastructure


The NCI Years (2007 onwards)

The NCRIS investment framework for eResearch was progressively implemented from 2007 onwards, with the signing of funding agreements by the Commonwealth Government and the lead organisations for each of NCI (with ANU), ARCS (with VPAC), and ANDS (with Monash University).  Indeed, it can be argued that APAC spawned the initial activities of each, with:

  • NCI conceptualised as solely a supercomputing facility, built from the base of the successful APAC National Facility program;
  • ARCS inheriting the grid and collaboration program activities of APAC, together with a governance structure based around the original APAC partners; and
  • ANDS building data discovery, access and publication services from the early roots established during the APAC years.

For NCI, its evolution has occurred in two phases that are not entirely independent, and in ways significantly different from those of its antecedent.

The First Phase

The Funding Agreement that established NCI (known as the NCRIS NCI Project) was executed by ANU and the Commonwealth Government in June 2007 for an amount of $26M, with the objectives being to:

  • Procure/sustain a capability computing system commensurate with international standing;
  • Advise Government on further “specialised system” investments to provide for targeted application areas;
  • Establish/operate a merit-based open access allocation scheme for researchers at publicly-funded research organisations;
  • Provide support and expertise services; and
  • Maintain a supported strategic plan for national computational needs.

With regard to the major system acquisition, the goal at the time was to procure a capability system with performance comparable to that of the “Track 2” systems then being established in the USA by the National Science Foundation. Quite early, it became clear that, with the funds available, this would be a difficult goal to realise without the injection of further investment and co-investment. And so began the process of building the partnership and governance model that exists today, under the leadership of a Steering Committee chaired by Emeritus Professor Mark Wainwright, a former Vice-Chancellor of UNSW, and through the office of Professor Robin Stanton, a Pro Vice-Chancellor at ANU, and the ANU Delegate of the NCRIS NCI Contract.

With the Steering Committee (the antecedent of today’s Board) established in 2007, and expanded in 2008 to include, as institutional members, three of the national agencies—CSIRO, the national science agency, the Bureau of Meteorology, and Geoscience Australia—alongside ANU as the host organisation, the first steps were taken to establish the sustaining partnership that is now in place. With the sails set, ANU’s delivery on the goals of the contract began with the appointment of the foundation Director of NCI, Professor Lindsay Botten, in 2008. As in the APAC era, all services in this first phase of NCI were provided through the ANU Supercomputing Facility, which was accountable to the NCI Director and to the Steering Committee.

Early in 2008, as ANU was about to tender for a new peak system (under the NCRIS agreement), the Bureau of Meteorology was planning to approach the market for a new operational weather forecasting system. With strong synergies apparent, it was agreed that BoM and ANU would issue a joint tender, with the goal of procuring interoperable systems that would facilitate research opportunities and collaboration in climate and weather science, and which also had the potential to enhance the service robustness of operational weather forecasting. This procurement took place under the governance of a Joint Steering Committee comprising BoM, ANU/NCI and CSIRO representatives. The evaluations were completed by October 2008, by which time the world was in the tightening grip of the Global Financial Crisis (GFC), which spanned the entire period of the contract negotiations with the successful tenderer, Sun Microsystems.

A contract was executed by ANU and Sun in March 2009 for a 16 rack, 12,000 core, 140 TFlop Sun Constellation known as Vayu—a distributed memory cluster based on Intel Xeon Nehalem technology with a QDR Infiniband interconnect. The US dollar exchange rate had dropped by approximately 25 per cent during contract negotiations, thereby reducing the scale of the procurement. The first phase (one-eighth) of the Sun system at NCI was installed in September 2009, more than replacing the capability of the previous SGI Altix 3700 system, with subsequent upgrades in February and April of 2010 bringing the system to its full capacity. BoM completed its negotiations with Sun at the same time, and its five rack system, with dual-rail Infiniband, entered production during 2010.

The emerging NCI collaboration took a most significant step forward late in 2008 with a decision by CSIRO to enter into partnership with ANU as the initial custodians of NCI—with CSIRO committing to its ongoing, strong level of co-investment in NCI by executing the Partner Service Agreement with ANU on 24 December 2008. QCIF and Pawsey (formerly iVEC) joined as minor partners shortly thereafter in 2009, followed in 2010 by Geoscience Australia and Intersect (the NSW university consortium), which was able to repurpose a substantial ARC Linkage Infrastructure Grant for 2009 (led by the University of Technology, Sydney) to take services in NCI from 2010–14, rather than procure a separate, standalone system for its consortium.

The year 2008–09 also saw the first steps taken to increase the diversity of computational resources available to Australian researchers, with the decision by the NCI Steering Committee to invest in two Specialised Facilities, in keeping with one of the goals of the NCRIS contract.  Following an open call for expressions of interest from research computing facility operators, the decision was taken to invest, alongside CSIRO, in each of the:

  • Specialised Facility in Bioinformatics, located at the University of Queensland—with institutional partners CSIRO, UQ, QCIF, QFAB, the State Government of Queensland and NCI, and
  • Specialised Facility in Imaging and Visualisation, located at Monash University (and known now as MASSIVE)—with institutional partners CSIRO, Monash University, the Australian Synchrotron, the State Government of Victoria, VPAC and NCI.

Contracts were signed with the lead agents for each facility in December 2009, with preliminary services available from mid-2010, and with a full production service made available to researchers through the NCI-funded share under its Merit Allocation Scheme from 2011–13, in the first instance.

The Second Phase

The second phase of NCI’s development commenced relatively early, with announcements in the Commonwealth Government Budget of May 2009 of the new Super Science initiatives that formed part of the Government’s economic stimulus package, overlapping the implementation of the first phase under the NCRIS agreement. The 2009–10 Budget contained four major announcements for e-infrastructure in Australia:

  1. An allocation of $97M to support the implementation of data, cloud and collaboration infrastructure, to be taken forward through ARCS;
  2. An allocation of $50M for the continuation of ANDS;
  3. An allocation of $80M for the establishment of high-performance computing infrastructure to support radio astronomy and Australia’s bid for the SKA—from which emerged what is now the Pawsey Centre, through the iVEC consortium in Western Australia;
  4. An allocation of $50M for the Climate HPC Centre Project through ANU—which ultimately has been implemented through the NCI Collaboration.

The first of these did not proceed in the form originally planned, with ARCS being discontinued in mid-2011 at the conclusion of its NCRIS funding, and with the implementation of the project being reconceptualised into the two projects that are today known as:

  • NeCTAR (National eResearch Collaboration Tools and Resources) Project, through the University of Melbourne ($47M), and
  • RDSI (Research Data Storage Infrastructure) Project ($50M), through the University of Queensland.

Each of the Super Science investments, including the Climate HPC Centre Project through ANU/NCI, was funded from the Education Investment Fund, one of Australia’s Nation Building Funds, with this choice bringing special and challenging constraints that required government funding be used solely for infrastructure procurements and developmental activities, precluding its use for operations or recurrent costs.

The project objectives were:

  • The establishment of a petascale HPC facility prioritised to the needs of research in climate change, earth system science, and national water management, while simultaneously supporting meritorious and high-impact research in all fields requiring access to capability-class computing;
  • The identification and support of data-intensive and flagship science applications aligned with other government research and infrastructure investments;
  • The development and implementation of an access model that would meet the priority use requirements, as well as providing open access on research merit to researchers at publicly-funded research organisations;
  • The construction of a purpose-built data centre to house the HPC facility, capable of being upgraded to accommodate usage for at least five years.

Implicit in these objectives, and in the strong constraint on the use of the funds, was the requirement that ANU, as the contract holder, demonstrate to the Commonwealth its capacity to meet the recurrent costs of a supercomputer of an appropriate scale before commencing construction of the data centre and initiating the system procurement. With all of the substantial recurrent costs having to be met through the co-investment of stakeholders, it took some time to form and cement the partnership, and to establish the business plan and access model, before moving ahead with the establishment of the contracted infrastructure. Slightly more than two years elapsed from the Budget announcement to the commencement of construction of the data centre and the release of the tender for the petascale system—with the major intermediate milestones being the execution of the Funding Agreement with the Commonwealth (April 2010) and the execution of the NCI Collaboration Agreement (July 2011).

The Collaboration Agreement lies at the heart of the NCI partnership: it sets out the framework and objectives for the collaboration, puts in place its governance, lays out the access model, and underpins NCI’s business model and planning. With the execution of the Collaboration Agreement by the initial partners—ANU, CSIRO and BoM—in July 2011, the University was able to demonstrate to the Australian Government that it had secured at least $8M per annum for 2012–15 (i.e., sufficient to mount the operations of a supercomputer of an appropriate scale), at which point the green light was given for the commencement of construction of the new data centre and the initiation of the public tender for the HPC system, both in August 2011. The process of building the Collaboration continued, and by early 2012 it had expanded to include Geoscience Australia, Intersect Australia (the NSW consortium), QCIF (the Queensland consortium), and a new consortium of six research-intensive universities (ANU, Adelaide, Monash, UNSW, Queensland, Sydney) whose participation in NCI was facilitated by a substantial ARC Linkage Infrastructure Grant led by ANU (on behalf of NCI), with approximately $11M per annum having been secured. This partnership, which includes four national organisations, together with the high leveraging of the Commonwealth’s infrastructure monies through substantial co-investment, are two of the most distinctive features of NCI.

The Collaboration Agreement, in setting the foundations for the implementation of this current phase of NCI, also formed the backdrop for an organisational change from the beginning of 2012, by which the University brought together the NCI Project Office and the ANU Supercomputing Facility into a single operating unit (known internally as NCI-ANUSF, and externally as NCI), operating within a governance model by which ANU governs the operations of NCI on the advice of the NCI Board, within the limits of its Statutes and policies.

With the Collaboration Agreement in place from July 2011, the data centre construction was initiated in August 2011 under the managing contractor, G. E. Shaw and Associates, and was completed in September 2012, with the building formally handed over to ANU in November of that year. Concurrently, the tender for the HPC system proceeded as a public tender from August 2011, closing in late October. The multi-stage evaluation process continued through to late April 2012, with the Board accepting the final evaluation report in May 2012 and recommending to the University that it proceed with the procurement of a 1.2 petaflop, distributed memory cluster from Fujitsu. The contract for the procurement was executed in mid-June 2012, leading to the delivery of the infrastructure from about September 2012 onwards, and ultimately to the debut of the system, now known as Raijin, on the Top500 list in November 2012 at rank 24, with a sustained LINPACK performance (Rmax) of 940 TFlops (for the 93 per cent of the system that had been implemented at the time). In the months that followed, the DDN filesystem was commissioned, and system software evaluated and configured, culminating in an early user service from late April/early May 2013 and a full user service from mid-June 2013. Also associated with the Fujitsu contract is a most significant collaboration framework, being implemented to take NCI and its partners into the future and, in particular, to deliver optimal value from the peak system procurement—through a program of work to optimise the implementation and performance of the Australian climate and earth system science modelling suite, ACCESS, on Raijin, and to investigate the implementation of today’s codes on the processor architectures likely to feature in next-generation systems.

From relatively early in the history of NCI (around 2009 onwards), it was apparent to the Steering Committee, and subsequently the Board, that the conceptualisation of government investments at the infrastructure layer (i.e., HPC, cloud, data) was orthogonal to the realisation of research outcomes through a solutions-based approach. Thus, NCI began its path towards the comprehensive and integrated framework that underpins its delivery of high-performance solutions. The initial activities, under NCI, in cloud computing and data-intensive computing, complementing its established role in high-performance computing, ramped up from 2009–10, with substantial upgrades to, and modernisation of, the storage infrastructure occurring in 2011–12 with an SGI solution.

Building on this framework, NCI’s Board took the strategic decision in 2011 that the NCI Collaboration, which had been established to support the operations of a petascale supercomputer, should broaden its scope to provide the comprehensive and integrated services that had long been envisaged. By that time, the RDSI and NeCTAR projects were underway, with both activities seeking proposals from national eResearch organisations to become nodes of the national storage infrastructure network and the national research cloud, respectively. NCI submitted comprehensive proposals to each of RDSI and NeCTAR, each built on strong research community engagement and with the goal of realising research outcomes through complementary high-performance solutions. The build-out of the integrated infrastructure solution was undertaken during 2013, with storage procurements from SGI and DDN, and with the establishment of a 3,200 core high-performance cloud from Dell (architected with a focus on data-intensive applications within the NeCTAR OpenStack Cloud Federation).

The story since 2013 is told incrementally in NCI’s annual reports, available online at http://nci.org.au/about-nci/annual-reports/. These provide updates on the Australian research outcomes and impacts enabled by the NCI platforms and its expert support team, and on the evolution of the infrastructure platform—with further augmentation of the storage (NetApp) in 2015, and a 40 per cent upgrade of NCI’s supercomputer capacity in 2016 through a 22,000 core Lenovo NeXtScale (Broadwell) system co-funded by a substantial allocation from the NCRIS Agility Fund and the NCI Collaboration, together with major storage upgrades funded from the same source.

Winding the clock forward to 2017, NCI has evolved into the national, high-end research computing service—one in the vanguard of international advanced computing, delivering solutions that encompass computational modelling and the needs of big data, that enable research of excellence and impact, that deliver national benefit, and that maintain Australia’s international competitiveness in research and innovation.

The partnership that underpins NCI has grown in strength since its formation, anchored by the Australian National University as NCI’s host, and three of Australia’s national science agencies: the Bureau of Meteorology; CSIRO, the national science agency; and Geoscience Australia, the national geoscience agency. Today, this partnership includes many of Australia’s research universities, or consortia thereof, and medical research institutes, and sustains two-thirds of the annual recurrent costs ($18 million in 2017), with contributions from the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) providing the remainder.

Through its tightly-coupled, high-performance computing and data platforms, overlaid with internationally renowned expertise (~60 staff) in computational science, data science and data management, NCI provides essential services that underpin the requirements of research and industry, today and into the future.

As of May 2017, the infrastructure platform comprises:

  • Supercomputer, Raijin — a hybrid Fujitsu Primergy (2012–13, Intel Xeon Sandy Bridge) and Lenovo NeXtScale (2016, Intel Xeon Broadwell) system of 84,656 cores in 4,416 compute nodes (together with a number of NVIDIA Tesla K80 and P100 GPUs, and Intel Xeon Phi Knights Landing nodes), with 300 terabytes of main memory, a hybrid FDR/EDR Mellanox Infiniband full fat-tree interconnect, and 8 petabytes (real) of high-performance (150 Gbyte/sec) operational storage — aggregated peak performance 2.1 petaflops
  • OpenStack Cloud — Dell, 3,200 cores (Intel Xeon Sandy Bridge), 25 TB of memory, 160 TB of SSD, 13 ceph nodes
  • Project and Collection Storage — totalling 22 (real) petabytes in three distinct Lustre filesystems of 50 Gbyte/sec (SGI), 70 Gbyte/sec (DDN) and 120 Gbyte/sec (NetApp) bandwidth — to be upgraded to 36 (real) petabytes in June/July 2017, with the replacement of the oldest of the filesystems with NetApp/Fujitsu and HPE storage.
  • Tape storage — comprising a dual-site archive of 2 x 12.3 petabytes capacity in Spectra libraries using LTO-5 technology, and HSM/Lustre of 2 x 18.2 petabytes capacity using Spectra libraries and IBM TS1140/50 technology drives.
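The quoted aggregate of 2.1 petaflops can be cross-checked with the standard peak-performance formula: cores × clock × double-precision FLOPs per cycle, summed over the system's partitions. The sketch below is a back-of-envelope illustration only: of the figures used, only the 22,000 Broadwell cores are quoted above; the Sandy Bridge core count and both clock rates are assumed illustrative values, not NCI specifications.

```python
# Back-of-envelope check of a hybrid cluster's aggregate theoretical peak.
# ASSUMPTIONS (not from the NCI specification): Sandy Bridge partition of
# 57,472 cores, and 2.6 GHz clocks for both partitions.

def peak_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak performance in teraflops."""
    return cores * clock_ghz * flops_per_cycle / 1000.0

# Sandy Bridge: AVX issues 8 double-precision FLOPs per cycle per core.
sandy = peak_tflops(57_472, 2.6, 8)
# Broadwell: AVX2 + FMA issues 16 double-precision FLOPs per cycle per core.
broadwell = peak_tflops(22_000, 2.6, 16)

total = sandy + broadwell
print(f"aggregate peak ~ {total / 1000:.1f} petaflops")  # prints "aggregate peak ~ 2.1 petaflops"
```

Under these assumed figures the two partitions contribute roughly 1.2 and 0.9 petaflops respectively, consistent with the 1.2 petaflop Fujitsu procurement and the 2.1 petaflop aggregate described in the text.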

Today, NCI plays a pivotal role in Australia’s research and innovation system, supporting the work of more than 5,000 researchers across more than 500 projects being undertaken in 35 universities, 5 national science agencies (including CSIRO, Bureau of Meteorology, and Geoscience Australia), 3 medical research institutes, and industry.  The breadth and depth of NCI’s involvement in Australian research and innovation can be seen in case studies associated with Australia’s national science and research priorities (http://nci.org.au/research-news/nci-today-case-studies/), and in the numerous research highlights (http://nci.org.au/research-news/research/).

R&D reliant on NCI spans the full spectrum—from fundamental research, through the strategic and applied, and on to industrial applications. Today, NCI services are both necessary and influential in enhancing the competitiveness and impact of outcomes in every field of science and technology. An increasing number of fields are now highly dependent on the fusion of “big compute” and “big data” that NCI provides—in weather and climate science, the earth sciences, earth observation and remote sensing, medical research, and astronomy, amongst others.

Within the university sector, NCI provides the essential high-performance computing and data foundation for more than 200 research projects, ARC and NH&MRC Centres of Excellence and Industry Hubs, and fellowships. Funding for these projects from the Australian Research Council and the National Health and Medical Research Council totals around $60 million per annum, or approximately $250 million over the lifetimes of these projects.

In the domain of the national science agencies, notably BoM, CSIRO and GA, NCI provides critical program-level support in earth system science by serving as the development platform of the Australian national weather and climate modelling suite, ACCESS. NCI is also a national hub for major national and international satellite earth observation collections (through the Australian Geoscience Data Cube), used in the earth, marine and environmental sciences, and in agriculture—for research, for informing policy development, and for the development of important information products for primary industry.

At the time of this update in May 2017, the Australian research sector is awaiting the release of the National Research Infrastructure Roadmap (2016–17) and its subsequent funding and implementation plan. In the realm of HPC, the draft Roadmap has highlighted the urgency of funding for the renewal of national HPC, and has acknowledged that peak computing facilities must “encompass the needs of big data (processing, analysis, data mining, machine learning), in addition to its traditional role of computational modelling and simulation … compris[ing] tightly-integrated, high-performance infrastructure able to handle the computational and data-intensive workflows of today’s research, together with expertise in computational science, data science and data management”—the directions in which NCI has evolved since 2012.

NCI looks forward to the next phase of its development in this environment, and the strengthening of its role in Australia’s national research and innovation system.
