Raijin Maintenance – failed hardware component
Emergency Maintenance: Friday 7 Feb, 6-8pm
Date: Friday 7 February 2014, 6-8pm
Duration: 2 Hours
Purpose: Replacement of failing 84-disk enclosure
Affected: Raijin Lustre Filesystems (short, apps, images, home, system)
Dear NCI Users,
Emergency maintenance will be performed tonight, Friday 7 February, from 6pm to 8pm, to replace a failed hardware component.
NCI staff have raised a support case with the hardware vendor, and have been advised to replace the component from our onsite spare parts. The component is being replaced promptly to avoid further issues and to maintain data integrity.
The failure is unrelated to the degraded raijin:/short performance cases that were being investigated earlier this week.
Active jobs will be suspended starting 30 minutes before the maintenance window and will be resumed as soon as possible once the work and testing are complete.
NCI Systems Administrators will monitor the replacement component and Raijin to verify correct operation.
NCI Storage and Infrastructure