National Computational Infrastructure

NCI

The NCI Years (2007 onwards)

The NCRIS investment framework for eResearch was progressively implemented from 2007 onwards, with the signing of funding agreements by the Commonwealth Government and the lead organisations for each of NCI (with ANU), ARCS (with VPAC), and ANDS (with Monash University).  Indeed, it can be argued that APAC spawned the initial activities of each, with:

  • NCI conceptualised as solely a supercomputing facility, built from the base of the successful APAC National Facility program;
  • ARCS inheriting the grid and collaboration program activities of APAC, together with a governance structure based around the original APAC partners; and
  • ANDS building data discovery, access and publication services from the early roots established during the APAC years.

NCI’s evolution has occurred in two phases that are not entirely independent, and in ways that differ significantly from the evolution of its antecedent.

The First Phase

The Funding Agreement that established NCI (known as the NCRIS NCI Project) was executed by ANU and the Commonwealth Government in June 2007 for an amount of $26M, with the objectives being to:

  • Procure/sustain a capability computing system commensurate with international standing;
  • Advise Government on further “specialised system” investments to provide for targeted application areas;
  • Establish/operate a merit-based open access allocation scheme for researchers at publicly-funded research organisations;
  • Provide support and expertise services; and
  • Maintain a supported strategic plan for national computational needs.

With regard to the major system acquisition, the goal was to procure a capability system whose performance would be comparable to that of the “Track 2” systems then being established in the USA by the National Science Foundation. Quite early, it became clear that, with only the funds available, this would be a difficult goal to realise without the injection of further investment/co-investment. And so began the process of building the partnership and governance model that exists today, under the leadership of a Steering Committee chaired by Emeritus Professor Mark Wainwright, a former Vice-Chancellor of UNSW, and through the office of Professor Robin Stanton, a Pro Vice-Chancellor at ANU and the ANU Delegate of the NCRIS NCI Contract.

The Steering Committee (the antecedent of today’s Board) was established in 2007 and expanded in 2008 to include, as institutional members, three national agencies (CSIRO, the national science agency; the Bureau of Meteorology; and Geoscience Australia) alongside ANU as the host organisation. These were the first steps towards the sustaining partnership that is now in place. With the sails set, ANU’s delivery on the goals of the contract was initiated with the appointment of the foundation Director of NCI, Professor Lindsay Botten, in 2008. As in the APAC era, all services in this first phase of NCI were provided through the ANU Supercomputing Facility, which was accountable to the NCI Director and to the Steering Committee.

Early in 2008, when ANU was about to tender for a new peak system (under the NCRIS agreement), the Bureau of Meteorology was planning to approach the market for a new operational weather forecasting system. With strong synergies apparent, it was agreed that BoM and ANU would issue a joint tender, with the goal of procuring interoperable systems that would facilitate research opportunities and collaboration in climate and weather science, and which also had the potential to enhance the service robustness for operational weather forecasting. This procurement took place under the governance of a Joint Steering Committee comprising BoM, ANU/NCI, and CSIRO representatives. The evaluations were completed by October 2008, by which time the world was in the tightening grip of the Global Financial Crisis (GFC), which spanned the entire period of the contract negotiations with the successful tenderer, Sun Microsystems.

A contract was executed by ANU and Sun in March 2009 for a 16 rack, 12,000 core, 140 TFlop Sun Constellation system known as Vayu, a distributed memory cluster based on Intel Xeon Nehalem technology with a QDR Infiniband interconnect. The exchange rate against the US dollar had dropped by approximately 25 per cent during contract negotiations, which affected the scale of the procurement. The first phase (one-eighth) of the Sun system at NCI was installed in September 2009, more than replacing the capability of the previous SGI Altix 3700 system, with subsequent upgrades in February and April of 2010 bringing the system to its full capacity. Concurrently, BoM completed its negotiations with Sun, and its five rack system, with dual-rail Infiniband, entered production during 2010.

The emerging NCI collaboration took a most significant step forward late in 2008 with a decision by CSIRO to enter into partnership with ANU as the initial custodians of NCI. CSIRO committed to its ongoing, strong level of co-investment in NCI by executing the Partner Service Agreement with ANU on 24 December 2008. QCIF and Pawsey (formerly iVEC) joined as minor partners shortly thereafter in 2009, followed in 2010 by Geoscience Australia and by Intersect (the NSW university consortium). Intersect was able to repurpose a substantial ARC Linkage Infrastructure Grant for 2009 (led by the University of Technology, Sydney) to take services in NCI from 2010–14, rather than procure a separate, standalone system for its consortium.

The year 2008–09 also saw the first steps taken to increase the diversity of computational resources available to Australian researchers, with the decision by the NCI Steering Committee to invest in two Specialised Facilities, in keeping with one of the goals of the NCRIS contract.  Following an open call for expressions of interest from research computing facility operators, the decision was taken to invest, alongside CSIRO, in each of the:

  • Specialised Facility in Bioinformatics, located at the University of Queensland—with institutional partners CSIRO, UQ, QCIF, QFAB, the State Government of Queensland and NCI, and
  • Specialised Facility in Imaging and Bioinformatics, located at Monash University (and known now as MASSIVE)—with institutional partners CSIRO, Monash University, the Australian Synchrotron, the State Government of Victoria, VPAC and NCI.

Contracts were signed with the lead agents for each facility in December 2009, with preliminary services available from mid-2010, and with a full production service made available to researchers through the NCI-funded share under its Merit Allocation Scheme from 2011–13, in the first instance.

The Second Phase

The second phase of NCI’s development commenced relatively early, overlapping the implementation of the first phase under the NCRIS agreement, with announcements in the Commonwealth Government Budget of May 2009 of the new Super Science initiatives that formed part of the Government’s economic stimulus package. The 2009–10 Budget contained four major announcements for e-infrastructure in Australia:

  1. An allocation of $97M to support the implementation of data, cloud and collaboration infrastructure, to be taken forward through ARCS;
  2. An allocation of $50M for the continuation of ANDS;
  3. An allocation of $80M for the establishment of high-performance computing infrastructure to support radioastronomy and Australia’s bid for the SKA, from which emerged what is now the Pawsey Centre, through the iVEC consortium in Western Australia;
  4. An allocation of $50M for the Climate HPC Centre Project through ANU—which ultimately has been implemented through the NCI Collaboration.

The first of these did not proceed in the form originally planned, with ARCS being discontinued in mid-2011 at the conclusion of its NCRIS funding, and with the implementation of the project being reconceptualised into the two projects that are today known as:

  • NeCTAR (National eResearch Collaboration Tools and Resources) Project, through the University of Melbourne ($47M), and
  • RDSI (Research Data Storage Infrastructure) Project ($50M), through the University of Queensland.

Each of the Super Science investments, including the Climate HPC Centre Project through ANU/NCI, was funded from the Education Investment Fund, one of Australia’s Nation Building Funds, with this choice bringing special and challenging constraints that required government funding be used solely for infrastructure procurements and developmental activities, precluding its use for operations or recurrent costs.

The project objectives were:

  • The establishment of a petascale HPC facility prioritised to the needs of research in climate change, earth system science and national water management, which simultaneously would support meritorious and high-impact research in all fields that required access to capability-class computing;
  • The identification and support of data-intensive and flagship science applications aligned with other government research and infrastructure investments;
  • The development and implementation of an access model that would meet the priority use requirements, as well as providing open access on research merit to researchers at publicly-funded research organisations; and
  • The construction of a purpose-built data centre to house the HPC facility, which could be upgraded to handle future usage for at least five years.

Implicit in these objectives, and in the strong constraint on the use of the funds, was that ANU, as the contract holder, had to demonstrate to the Commonwealth its capacity to meet the recurrent costs of a supercomputer (of an appropriate scale) before commencing the construction of the data centre and initiating the system procurement. With all the substantial recurrent costs having to be met through the co-investment of stakeholders, it took some time to form and cement the partnership, and to establish the business plan and access model, before moving ahead with the establishment of the contracted infrastructure. Slightly more than two years elapsed from the Budget announcement to the commencement of construction of the data centre and the release of the tender for the petascale system, with the major intermediate milestones being the execution of the Funding Agreement with the Commonwealth (April 2010) and the execution of the NCI Collaboration Agreement (July 2011).

The Collaboration Agreement lies at the heart of the NCI partnership: it sets out the framework and objectives for the collaboration, puts in place its governance, lays out the access model, and underpins NCI’s business model and planning. With the execution of the Collaboration Agreement by the initial partners, ANU, CSIRO and BoM, in July 2011, the University was able to demonstrate to the Australian Government that it had secured at least $8M per annum for 2012–15 (i.e., sufficient to mount the operations of a supercomputer of an appropriate scale), at which point the green light was given for the commencement of construction of the new data centre and the initiation of the public tender for the HPC system, both in August 2011. The process of building the Collaboration continued, and by early 2012 it had expanded to include Geoscience Australia, Intersect Australia (the NSW consortium), QCIF (the Queensland consortium), and a new consortium of six research-intensive universities (ANU, Adelaide, Monash, UNSW, Queensland, Sydney) whose participation in NCI was facilitated by a substantial ARC Linkage Infrastructure Grant led by ANU (on behalf of NCI), with approximately $11M per annum having been secured. This partnership, which includes four national organisations, and the high leveraging of the Commonwealth’s infrastructure monies through substantial co-investment are two of the most distinctive features of NCI.

The Collaboration Agreement, in setting the foundations for the implementation of this current phase of NCI, also formed the backdrop for an organisational change from the beginning of 2012, by which the University brought together the NCI Project Office and the ANU Supercomputing Facility into a single operating unit (known as NCI-ANUSF internally, and NCI externally). This unit operates within a governance model by which ANU governs the operations of NCI on the advice of the NCI Board, within the limits of its Statutes and policies.

With the Collaboration Agreement in place from July 2011, the data centre construction was initiated in August 2011 under the managing contractor, G. E. Shaw and Associates, and was completed in September 2012, with the building formally handed over to ANU in November of that year. Concurrently, the public tender for the HPC system opened in August 2011 and closed in late October. The multi-stage evaluation process continued until late April 2012, with the Board accepting the final evaluation report in May 2012 and recommending to the University that it proceed with the procurement of a 1.2 petaflop, distributed memory cluster from Fujitsu. The contract for the procurement was executed in mid-June 2012, leading to the delivery of the infrastructure from about September 2012 onwards, and ultimately to the debut of the system, now known as Raijin, on the Top500 list in November 2012 at rank 24, with a measured Linpack performance (Rmax) of 940 TFlops (for the 93 per cent of the system that had been implemented at the time). In the months that followed, the DDN filesystem was commissioned, and system software was evaluated and configured, culminating in an early user service from late April/early May 2013 and a full user service from mid-June 2013. Also associated with the Fujitsu contract is a most significant collaboration framework, which is being implemented to take NCI and its partners into the future and, in particular, to deliver optimal value for the peak system procurement. It comprises a program of work to optimise the implementation and performance of the Australian climate and earth system science modelling suite, ACCESS, on Raijin, and to investigate the implementation of today’s codes on the processor architectures that will likely be a feature of next-generation systems.

From relatively early in the history of NCI (from around 2009 onwards), it was apparent to the Steering Committee, and subsequently the Board, that the conceptualisation of government investments at the infrastructure layer (i.e., HPC, cloud, data) was orthogonal to the realisation of research outcomes through a solutions-based approach. Thus, NCI began its path towards the comprehensive and integrated framework that underpins its delivery of high-performance solutions. The initial activities under NCI in cloud computing and data-intensive computing, complementing its established role in high-performance computing, ramped up from 2009–10, with substantial upgrades to, and modernisation of, the storage infrastructure occurring in 2011–12 with an SGI solution.

Building on this framework, NCI’s Board took the strategic decision in 2011 that the NCI Collaboration, which had been established to support operations of a petascale supercomputer, should broaden its scope to provide the comprehensive and integrated services that had long been envisaged. By that time, the RDSI and NeCTAR projects were underway, with both seeking proposals from national eResearch organisations to become nodes of the national storage infrastructure network and the national research cloud, respectively. NCI submitted comprehensive proposals to each of RDSI and NeCTAR, each built on strong research community engagement, and with the goal of realising research outcomes through complementary high-performance solutions. The build-out of the integrated infrastructure solution was well underway by July 2013, with storage procurements from SGI and DDN in place, and with the establishment of a 3,200 core high-performance cloud from Dell (architected with a focus on data-intensive applications within the NeCTAR cloud federation) in progress.
