Company

Los Alamos National Laboratory

Address: Los Alamos, NM
Form of work: Full-Time
Category: Information Technology

Job description

What You Will Do
The HPC Environments (HPC-ENV) group invites applicants for a Scientist 2 or 3 position on the HPC Workload Management (WLM) Team. The team is responsible for HPC scheduling and resource management: its members interact with software vendors and the DOE/NNSA tri-labs, and they install, configure, and maintain the HPC job scheduling software and databases. They also help integrate scheduling software on novel HPC systems.
The HPC Division supports the Los Alamos National Laboratory mission by managing a world-class supercomputing center. We support stockpile stewardship for NNSA/DOE and accelerate scientific discovery for scientists. We integrate and support some of the world's largest supercomputers during an exciting time in computing with the focus on traditional large scale simulations, data science, artificial intelligence and machine learning.
HPC-ENV manages how users interact with the HPC systems at LANL which helps secure the nation and pushes the boundaries of science and innovation. Several teams within HPC-ENV are responsible for the broad range of HPC platforms, programming and runtime environments, software, application optimization and readiness, software engineering, user support & services for a large and diverse customer base. We provide support and services to many production platforms at a world-class computing facility to ensure customers can accomplish their research and mission at extreme scale.
Qualified candidates will have experience with HPC, Linux system administration, shell scripting, programming, HPC schedulers, workload management, and resource management.
This position will be filled at either the Scientist 2 or Scientist 3 level, depending on the skills of the selected candidate. Additional job responsibilities (outlined below) will be assigned if the candidate is hired at the higher level.
Scientist 2 ($99,200 - $164,100)
The successful candidate will perform the full spectrum of tasks, including but not limited to:
  • Configure and analyze scheduling software for dynamic HPC production workloads
  • Work with the primary scheduling software vendors regarding issues, tickets, bugs, CVEs
  • Analyze existing configurations and scientific workloads, recommending and implementing changes to increase system efficiency and minimize human waiting time
  • Perform internal tool development to automate repeatable Workload Management processes and procedures
  • Propose and implement solutions when presented with projects in our HPC environment
  • Participate in the weekly on-call rotations and support schedule by triaging tickets, working closely with other production HPC teams
  • Work independently and also interactively with other support team members
  • Communicate and collaborate frequently with customers, other cross-group and cross-division teams, and other HPC sites
  • Deliver the scheduling workload tool stack to HPC production system administrators and the run-time environment validation team

Scientist 3 ($119,200 - $201,100)
In addition to what was outlined at the lower level, at this level you will:
  • Lead technical efforts and projects in the area of Workload Management: integrating new systems, administering, configuring and tuning production systems, automating and enhancing test and production system pipelines, developing and enhancing tools for these tasks
  • Anticipate, experiment with and optimize workload and workflow efficiency
  • Work with LANL staff and system vendors to optimize the performance of DOE applications on future HPC systems
  • Work hand in hand with production system administration in determining the most difficult problems involving applications running on HPC systems
  • Work with scheduling and resource management vendors and Trilab counterparts


What You Need
Minimum Job Requirements:

Linux Expertise
Strong Linux knowledge and expertise as an administrator. Broad knowledge of administration of production Linux computer systems, utilities, and tools, including experience building, configuring, and administering production Linux computer systems. Experience with multiple Linux distributions.
Programming Skills
Moderate (at least 2 years) experience in programming in Python, Bash, and/or C/C++. Experienced in basic software engineering principles.
Scripting Skills
Demonstrated scripting experience in Bash, Lua, Python or similar scripting languages as well as experience with more advanced programming languages.
Strong Interpersonal and Communication Skills
Including demonstrated ability to work within a team environment and with customers. Strong interpersonal communication skills with the ability to work with groups of people of various levels of technical knowledge or understanding.
Additional Job Requirements for Scientist 3:
In addition to the requirements outlined above, qualification at the higher level requires:
Leadership
Experience as the technical lead on small or large technical projects
HPC Job Scheduling/Resource Management
Demonstrated experience administering an HPC job scheduler or resource manager (e.g., Slurm, Moab, LSF, PBS/Torque, Grid Engine, Flux, Dakota, Swift/T)
Message Passing Experience
Experience with the internals of an MPI implementation (e.g., Open MPI, MPICH), PMIx, or a similar parallel runtime programming model
HPC Computing Experience
Experience working in a production computing environment, preferably with HPC systems or at large scale. Working knowledge of networking concepts and practices.
Education/Experience at lower level
Position requires a Bachelor's degree in a STEM field from an accredited college or university and 4 years of related experience, typically with experience at a university or National Lab, or equivalent experience directly related to the occupation.
Education/Experience at higher level
Position requires a Master's degree in a STEM field from an accredited college or university and 6 years of relevant experience or an equivalent combination of education and experience directly related to the occupation.
Desired Qualifications:
Debugging
Demonstrated understanding of multiple OS and library components, tools and methods to triage interactive and batch jobs, especially parallelized and at-scale
Continuous Integration & Software Development
Experience with continuous integration tools such as GitLab CI/CD workflows
Linux Provisioning and Configuration Management
Experience with automating Linux system administration, provisioning and configuration management such as Ansible, CFEngine, Warewulf, Puppet, etc.
Linux Containers and Tools
Demonstrated experience with virtual machines, Linux containers, container orchestration (Kubernetes), and related concept implementations. Knowledge of how Linux namespaces and control groups interact.
HPC Environments & Infrastructure
Knowledge of high-performance computing systems, their environments, and supporting infrastructure. Knowledge of distributed systems, including system architectures, computer networks, software, and multi-tenant systems. Experience with networking and file systems in an HPC environment, with parallel file systems (Lustre, GPFS, etc.), and with data movement tools.
Work Location:
This position will be located in the beautiful town of Los Alamos, NM, with the potential for hybrid work from a location within a 2-hour ground commute of this location. Los Alamos is located in northern New Mexico between the Rio Grande and the eastern rim of the Valles Caldera, approximately 35 miles northwest of Santa Fe. Reporting onsite will be periodically required. Hybrid work is at the discretion of management and can change at any time with appropriate notice. New hires are eligible for an extensive relocation package intended to allow a smooth transition into your new lodging location or home.
Position commitment: Regular appointment employees are required to serve a period of continuous service in their current position in order to be eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the time required, they may only apply for Laboratory jobs with the documented approval of their Division Leader. The position commitment for this position is 1 year.
Note to Applicants:
Where You Will Work
Located in beautiful northern New Mexico, Los Alamos National Laboratory (LANL) is a multidisciplinary research institution engaged in strategic science on behalf of national security. Our generous benefits package includes:
  • PPO or High Deductible medical insurance with the same large nationwide network
  • Dental and vision insurance
  • Free basic life and disability insurance
  • Paid childbirth and parental leave
  • Award-winning 401(k) (6% matching plus 3.5% annually)
  • Learning opportunities and tuition assistance
  • Flexible schedules and time off (PTO and holidays)
  • Onsite gyms and wellness programs
  • Extensive relocation packages (outside a 50 mile radius)
Additional Details
Directive 206.2 - Employment with Triad requires a favorable decision by NNSA indicating employee is suitable under NNSA Supplemental Directive 206.2. Please note that this requirement applies only to citizens of the United States. Foreign nationals are subject to a similar requirement under DOE Order 142.3A.
Clearance: Q (Position will be cleared to this level). Selected applicants will be subject to a background investigation conducted by or on behalf of the Federal Government, and must meet eligibility requirements* for access to classified matter. This position requires a Q clearance, and obtaining such clearance requires US citizenship except in extremely rare circumstances. Dependent upon the position, additional authorization to access classified information may be required, which may or may not be available to dual citizens. Receipt of a Q clearance and additional access authorization ultimately is a decision of the Federal Government and not of Triad.
*Eligibility requirements: To obtain a clearance, an individual must be at least 18 years of age; U.S. citizenship is required except in very limited circumstances. See DOE Order 472.2 for additional information.
New-Employment Drug Test: The Laboratory requires successful applicants to complete a new-employment drug test and maintains a substance abuse policy that includes random drug testing. Although New Mexico and other states have legalized the use of marijuana, use and possession of marijuana remain illegal under federal law. A positive drug test for marijuana will result in termination of employment, even if the use was pre-offer.
Regular position: Term status Laboratory employees applying for regular-status positions are converted to regular status.
Internal Applicants: Regular appointment employees who have served the required period of continuous service in their current position are eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the required period of continuous service, they may only apply for Laboratory jobs with the documented approval of their Division Leader. Please refer to Policy P701 for applicant eligibility requirements.
Equal Opportunity: Los Alamos National Laboratory is an equal opportunity employer and supports a diverse and inclusive workforce. All employment practices are based on qualification and merit, without regard to race, color, national origin, ancestry, religion, age, sex, gender identity, sexual orientation or preference, marital status or spousal affiliation, physical or mental disability, medical conditions, pregnancy, status as a protected veteran, genetic information, or citizenship within the limits imposed by federal laws and regulations. The Laboratory is also committed to making our workplace accessible to individuals with disabilities and will provide reasonable accommodations, upon request, for individuals to participate in the application and hiring process. To request such an accommodation, please send an email to applyhelp@lanl.gov or call 1-505-665-4444 option 1.
Refer code: 7148693. Los Alamos National Laboratory - The previous day - 2023-12-17 00:28
