High Performance Computing (HPC) System Engineer with Security Clearance

Scuttlebutt Services

2024-09-21 16:38:34

Job location: Annapolis Junction, Maryland, United States

Job type: Full-time

Job industry: I.T. & Communications

Job description

Annapolis Junction, MD - Salary Range 195k-225k (TS/SCI w/ Full Poly)

Job Brief

We have multiple openings for Computer/Systems Engineers in Annapolis Junction, MD. We are looking for High Performance Computing (HPC) designers and developers to join a highly skilled, high-performing agile team supporting a nationally significant, fast-paced program. The focus is on developing a range of streamlined, collaborative applications for cybersecurity and analytics that share data across agencies within the Intelligence Community (IC).

Responsibilities

Requirements Gathering: Confer with other computer, systems, and software engineers to analyze complex requirements; use design software tools; provide support using formal specifications, data flow diagrams, and other accepted design techniques; and use engineering principles to provide full systems lifecycle support for the growing HPC compute infrastructure
Software Development: Shape the design, development, and/or modification of HPC software solutions by analyzing system performance standards; conferring with users and computer/systems or software engineers; analyzing systems flow, data usage, and work processes; and investigating problem areas
Algorithms: Develop or implement algorithms to address HPC system performance and functional standards
Documentation: Review HPC software and system documentation and provide recommendations for improving existing documentation and software/system development process standards
Quality Control: Ensure quality control of all developed and modified HPC software and hardware

Requirements

Active TS/SCI clearance with full scope polygraph
Bachelor's degree in a STEM field or similar technical discipline
Knowledge and experience with HPC concepts to include cluster architecture, parallel file systems, and high-speed networking
Demonstrated ability to provision and configure HPC environments and components
Solid understanding of accelerated computing, scheduling, and I/O stacks
Broad and deep understanding of the issues that affect GPU performance, CPU performance, and scaling performance
Proficiency with:
Agile/Scrum software development methodologies and team collaboration
Linux (Red Hat/CentOS) including OS, CLI (Command Line Interface), system administration, networking, storage, and security
Writing Linux based scripts to facilitate application integration
Lightweight Directory Access Protocol (LDAP)
TCP/IP fundamentals
HPC workflows that use Message Passing Interface (MPI)
Languages, libraries and tools used in HPC (C++, C, modern Fortran, HIP, CUDA, Python, MPI, OpenMP, etc.)
Cluster configuration management tools such as Ansible, Puppet, and Salt
Unix cluster and node monitoring tools, including Node Health Check (NHC), Nagios, Grafana and Prometheus
Node.js and the NPM (Node Package Manager) ecosystem
Continuous integration and software CM (Configuration Management) processes/tools
Container technologies like Docker, Singularity, Shifter, Charliecloud
Generating and reviewing software/technical documentation
Understanding of Test Driven Development (TDD) and automation tools

Bonus Skills

A background in Signals Intelligence (SIGINT) is preferred
Experience working with information security teams to ensure cybersecurity compliance of multi-user systems
Knowledge of algorithms, methods, software libraries, and other tools commonly used in scientific computation
Experience with:
Bright Computing platform
Various MPI implementations (Intel MPI, Open MPI, MPICH)
Fast, multivendor, distributed cluster storage systems like Lustre, GPFS (General Parallel File System), and XFS for HPC workloads
Deep learning frameworks like PyTorch and TensorFlow
Software Defined Networking
Nvidia CUDA libraries and GPUs
Virtualization techniques, cloud platform solutions
MLPerf benchmarking
AI/ML coding
Apache NiFi
DevOps
AWS, Azure, or GCP platforms
