The HPC Senior Systems Administrator will work with other Advanced Computing Infrastructure team members to administer an environment of 4,000+ nodes (IBM, Lenovo), storage systems (DDN), supporting computers, networking fabrics, and other related systems and services.
- High level of Linux administration experience (RHEL6, RHEL7 or equivalent).
- In-depth knowledge of hardware systems and storage, and of InfiniBand/Omni-Path and their architecture.
- Experience managing high performance data storage systems.
- Performance monitoring and fine-tuning of HPC cluster systems.
- Administration experience with GPFS, Lustre, or other parallel file systems.
- Administration and management of xCAT, Bright Cluster Manager, or other HPC cluster manager software in a large-scale environment.
- Scripting knowledge to automate problem detection and patch and firmware management on large clusters.
- Demonstrated knowledge of clustered Linux systems, including securing systems, day-to-day troubleshooting, monitoring, support, and software packaging.
- Working within industry-wide best practices.
- Administering, configuring, and supporting HPC clusters, including systems with accelerators, and high-performance file systems and storage.
- Hardware installation, configuration, upgrades, and repairs; fault diagnosis and subsequent rectification of computer systems hardware.
- Automated system management tools (xCAT, Puppet).
- Installing, configuring, maintaining, and fine-tuning large computer cluster systems and software.
- Experience with HPC job management tools such as Slurm and Torque, and with MPI tools.
- Experience with system automation tools such as Ansible or Puppet.
- Supporting InfiniBand-based networks, Kerberos, and LDAP; networking technologies (IP addressing).
- Storage systems administration and implementation across OEMs such as DDN, IBM, and Huawei.
- Configuring network switches (Cisco); Layer 2 vs. Layer 3 networking.
- Architecture design and planning for large-scale high-performance cluster upgrades (hardware and software).
- Ability to adapt to changing priorities of tasks as specified by the customer.
- Education and experience: Bachelor of Computer Science (or equivalent) in a relevant discipline plus 10 years of experience, OR Master of Computer Science (or equivalent) in a relevant discipline plus 7 years of experience.
- Red Hat certification (RHCE and RHCSA) on Red Hat Enterprise Linux 6 or above is a must.