Linux Systems Engineer/Administrator
An East Tennessee R&D facility is seeking a Linux Systems Engineer to join its growing team. Candidates must be able to complete a Linux Questionnaire and pass a background/drug screen. This position is a multi-year contract.
Primary duties include:
- Design, install, configure, and maintain Linux systems, Storage, Backup, HPC clusters and application software in support of research needs.
- Provide consulting in the selection and purchase of hardware and software systems.
- Ensure the secure and effective operation of computing systems through compliance with ORNL procedures and IT Internal Operating Procedures.
- Troubleshoot system problems quickly and effectively.
- Work with other systems engineers and vendors to resolve hardware and software issues.
- Answer escalated helpline calls in addition to primary project work.
- Develop and document procedures for startup, execution, shutdown, backup, and recovery.
- Monitor systems performance.
- Install and configure software, both commercial packages and various open source packages.
- Maintain documentation/notes on software builds and installs.
- Work with scientific users to determine needs, and balance needs against cyber requirements to provide a solution that meets the requirements of both.
- Port, modify and write system management, logging and scientific tools.
- Support data storage systems and backup services.
- Assist users with use of computing systems.
- Promote operational efficiency and reliability through automation.
- Off-hours support during maintenance windows and participation in an on-call support rotation may be required.
Basic Requirements:
- Bachelor's degree in Computer Science or related technical subjects or equivalent combination of education and experience.
- A minimum of 3 years of experience managing UNIX/Linux Systems.
- Bachelor's degree in Computer Science or related technical subjects or equivalent combination of education and experience with 5 years of experience managing UNIX/Linux Systems.
- Experience with configuring and managing HPC scientific clusters.
- Strong knowledge of multiple operating systems.
- Experience with RHEL6 and RHEL7.
- Strong knowledge of open source web application services (Apache, Tomcat, JBoss, NodeJS, Ruby, etc.).
- Familiarity with version control systems such as Git, Subversion, Bazaar, etc.
- Knowledge of networking fundamentals including TCP/IP, traffic analysis, common protocols and network diagnostics.
- Experience with performance and diagnostic tools for benchmarking, analysis and tuning of systems, networking and storage.
- Experience with virtualization hypervisors (VMware, KVM, LXC) and cloud/IaaS environments (OpenStack, CloudStack).
- Experience with PBS and OpenMPI.
- Experience with CFEngine, Puppet, Ansible, and other configuration management systems.
- Experience with Nagios, CheckMK, Solarwinds, Ganglia, and other network and device monitoring systems.
- Experience with various storage architectures, filesystems and hardware (iSCSI, SAS, Panasas, Netapp, Lustre, LVM, XVM, ext3, ext4).
- Experience with various open source backup solutions (Bacula, BackupPC, Amanda).
- Previous experience working in a government, scientific, or other highly technical environment.
- Excellent interpersonal skills suitable for user support and ability to work well with peer system administrators.
- Moderate fluency in at least one scripting language such as Bash, Python, Ruby, Perl or equivalent.
- Demonstrated ability to work in a dynamic environment, translate user needs into actionable project plans, and see those plans through to execution while balancing short-term, high-priority tasks.
- Excellent written and verbal communication skills.
- Ability to work independently and demonstrated analytical and problem-solving skills.
- Demonstrated ability to balance complex research and security requirements.
- Background of contributing to open source projects or avocational endeavors such as hacker/maker spaces is desirable.
- Technical documentation skills, including ability to prepare simple documentation web pages.
- RHCT or RHCE Certification.