Machine Learning Programming & Runtime Environment Developer (Scientist 2/3)

Los Alamos National Laboratory
Los Alamos County, New Mexico, US
Aug 01, 2021
Sep 09, 2021
Employment Type
Full time
What You Will Do
This position will be filled at either the Scientist 2 or Scientist 3 level, depending on the skills of the selected candidate. Additional job responsibilities (outlined below) will be assigned if the candidate is hired at the higher level.
The HPC Design Group is searching for a strong contributor to help drive the future of mixed simulation/machine learning/artificial intelligence supercomputing environments. Los Alamos National Laboratory (LANL) is preparing for a recently announced 2023 "first of its kind" system utilizing Nvidia Grace processors and GPGPUs that will provide 10x performance improvements in training giant AI models. The individual selected will be responsible for developing the user environment for this new system and advancing the artificial intelligence and machine learning toolchains to deliver these advanced capabilities to LANL scientists and engineers.
Scientist 2 ($91,800 - $151,400)
The successful candidate will perform the full spectrum of UNIX/Linux computing environment administration. This includes, but is not limited to:
- Install, support and maintain software used on High Performance Computing (HPC) resources at the Los Alamos National Laboratory focusing on Machine Learning (ML) and Artificial Intelligence (AI) tool chains.
- Work across organizational HPC groups on tool installation, management, and testing
- Evaluate and recommend new software tools and packages for possible inclusion in LANL's HPC software stack
- Communicate with vendors and provide support for third-party packages
- Contribute to the development of custom tools, especially in the areas of software deployment and testing
- Interact with end-users and the HPC user support teams
- Provide tool education opportunities to users of LANL's HPC systems
Scientist 3 ($110,400 - $186,300)
In addition to the duties outlined above, the Scientist 3 will be required to:
- Be a technical lead within the Programming & Runtime Environments team
- Provide technical direction and help to continuously improve the state of HPC software support at LANL
- Represent LANL at workshops, conferences, and meetings with other HPC sites
- Identify and represent appropriate work
- Represent LANL across the DOE complex
- Use project assignments to further organizational goals
- Lead peer review of the work of others within the organization
- Enhance technical and professional expertise of junior staff through mentoring and training
What You Need
Minimum Job Requirements:
- Significant knowledge and expertise in a high-level programming language, such as C or C++
- Experience with building and deploying ML/AI tools, such as TensorFlow, PyTorch, etc.
- Significant knowledge and expertise with typical Linux build systems such as GNU Make and CMake
- Experience with scripting languages, such as Perl, Python and shell scripting
- Strong Linux background
- Effective interpersonal skills, including demonstrated ability to work within a team environment
- Strong oral and written communication skills
Additional Job Requirements for Scientist 3:
In addition to the requirements outlined above, qualification at the higher level requires:
- Experience acting as a technical lead for small or large team software development
- Deep technical understanding of a topic related to ML/AI, systems design and/or systems programming
- Experience developing and publishing state-of-the-art systems software including designing experiments to demonstrate the effectiveness of systems under important workloads, knowledge of relevant related works, technical writing, and managing publication venue deadlines
- Experience acting as a primary investigator on competitively funded research projects
Education/Experience at Scientist 2 level: Position requires a Bachelor's degree in a STEM field from an accredited college or university and 4 years of related experience.
Education/Experience at Scientist 3 level: Position requires a Master's degree in a STEM field from an accredited college or university and 6 years of relevant experience or an equivalent combination of education and experience directly related to the position.
Desired Qualifications:
- Extensive knowledge and practical experience at the advanced level in programming using languages such as Python, Java, C, and C++ (3-5 Years)
- Linux system administration experience including using tools such as Ansible
- Knowledge of and experience with multi-node, multi-GPU ML tools, such as Petastorm and Horovod
- Knowledge and experience working with HPC and/or ML/AI systems
- Knowledge and experience using or supporting data analytics software (Pandas, Spark, etc.) and/or scientific computing and mathematics libraries
- Experience with Linux compilers, such as PGI, Intel, Clang, and GNU
- Programming in a parallel computing environment with MPI, threads, or both
- Familiarity with concepts in program decomposition and parallel programming models
- Familiarity with container technology and tools such as Docker
- Experience with Spack or other package management tools
- Experience with tools and methods for optimization and debugging in a highly parallel environment
- Experience with virtual machine environments using tools such as KVM, VMware, or VirtualBox
Note to Applicants: Opportunities are provided to publish work on the software environment developed as part of this position. Possibilities range from lead authorship to co-authorship, depending on experience and interest.
Location: This position will be physically located in Los Alamos, New Mexico.
Position commitment: Regular appointment employees are required to serve a period of continuous service in their current position in order to be eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the time required, they may only apply for Laboratory jobs with the documented approval of their Division Leader. The position commitment for this position is 1 year.
Where You Will Work
Located in beautiful northern New Mexico, Los Alamos National Laboratory (LANL) is a multidisciplinary research institution engaged in strategic science on behalf of national security. Our generous benefits package includes:
- PPO or High Deductible medical insurance with the same large nationwide network
- Dental and vision insurance
- Free basic life and disability insurance
- Paid maternity and parental leave
- Award-winning 401(k) (6% matching plus 3.5% annually)
- Learning opportunities and tuition assistance
- Flexible schedules and time off (paid sick, vacation, and holidays)
- Onsite gyms and wellness programs
- Extensive relocation packages (outside a 50 mile radius)
Additional Details
Directive 206.2: Employment with Triad requires a favorable decision by NNSA indicating that the employee is suitable under NNSA Supplemental Directive 206.2. Please note that this requirement applies only to citizens of the United States. Foreign nationals are subject to a similar requirement under DOE Order 142.3A.
Clearance: Q (Position will be cleared to this level). Applicants selected will be subject to a Federal background investigation and must meet eligibility requirements for access to classified matter.
Eligibility requirements: To obtain a clearance, an individual must be at least 18 years of age; U.S. citizenship is required except in very limited circumstances. See DOE Order 472.2 for additional information.
New-Employment Drug Test: The Laboratory requires successful applicants to complete a new-employment drug test and maintains a substance abuse policy that includes random drug testing.
Regular position: Term-status Laboratory employees applying for regular-status positions are converted to regular status.
Internal Applicants: Regular appointment employees who have served the required period of continuous service in their current position are eligible to apply for posted jobs throughout the Laboratory. If an employee has not served the required period of continuous service, they may only apply for Laboratory jobs with the documented approval of their Division Leader. Please refer to Policy P701 for applicant eligibility requirements.
Equal Opportunity: Los Alamos National Laboratory is an equal opportunity employer and supports a diverse and inclusive workforce. All employment practices are based on qualification and merit, without regard to race, color, national origin, ancestry, religion, age, sex, gender identity, sexual orientation or preference, marital status or spousal affiliation, physical or mental disability, medical conditions, pregnancy, status as a protected veteran, genetic information, or citizenship within the limits imposed by federal laws and regulations. The Laboratory is also committed to making our workplace accessible to individuals with disabilities and will provide reasonable accommodations, upon request, for individuals to participate in the application and hiring process. To request such an accommodation, please send an email or call 1-505-665-4444, option 1.
Contact Details
Contact Name: Marquez, Yesenia Jasmin
Vacancy Name: IRC86211
Organization Name: HPC-DES/HPC Design
Req ID: IRC86211
Category: Science