Software Development Engineer, DevOps (Top Secret Clearance)
Groundswell
2024-11-05 17:35:10
McLean, Virginia, United States
Job type: full-time
Job industry: I.T. & Communications
Job description
Who Are We?
Groundswell is a premier technology integrator resolutely committed to solving the most complex challenges facing federal agencies today. Our name, Groundswell, represents our commitment to be an unstoppable, seismic change in government. Ours is a small company culture with big company reach and results. Are you ready to be audacious, be bold and drive change at a rapid pace? Join us, where we'll make a greater impact together.
What You'll Do:
Groundswell is seeking an experienced Software Development Engineer or Site Reliability Engineer to automate, operate, and improve pioneering cloud-native software. You will collaborate with a team to deploy, operate, and support cloud-native technology and to ensure the reliability of the complete stack and the tools that deliver the software. The software you will work with is built using Cloud Native technologies (CNCF), on a foundation of Kubernetes. The ideal candidate has a keen interest in improving operational efficiency and works from the mindset that everything can be automated.
- Ensure reliability and availability of the software to meet desired SLAs, reduce operational load, and scale sustainably in alignment with business growth.
- Be a key member of the team of dedicated DevOps/SREs responsible for software engineering and operations, with an emphasis on reducing operational toil.
- Plan automation and improvements by following Scrum practices with two-week sprints.
- Work with the following tech stack: Cloud Native (Kubernetes, Istio, OPA, GoLang, Ruby/Groovy, ArgoCD, Jenkins, Prometheus, Grafana, and more)
- Own the safe change and reliability of customer environments, using Service Level Objective (SLO)-gated, multi-stage deployment automation. The mission is to improve platform reliability, observability, and overall customer satisfaction.
- Develop and launch effective Service Level Indicators (SLIs) to ensure that SLOs are achieved by building an extensible observability architecture, automating runbooks, and establishing new processes.
- Partner with other development teams to craft and implement a range of Site Reliability Engineering (SRE) standards for their respective services to meet. Define benchmarks and automation to qualify services to move to production environments.
Required Qualifications:
- Must be a US Citizen with active TS/SCI Clearance and eligibility for CI Poly.
- Bachelor's degree in Computer Science, Software Engineering, or a related technical discipline with a minimum of 4 years of relevant experience, or a Master's degree with 2 years.
- DevOps or SRE expertise in a distributed systems environment.
- Expertise with Kubernetes, AWS architecture and development, and the Linux operating system and networking.
- Experience operating and troubleshooting distributed systems in a public cloud.
- Experience with software development standard methodologies, such as code management, CI/CD, and testing.
- Proficiency with the GoLang programming language, or alternatively Python or Ruby.
- Excellent documentation skills, including producing detailed runbooks and process documentation.
- Can work independently, as well as collaborate with multi-functional teams.
- May require occasional travel (up to 5%).
Preferred Qualifications:
- Certified Kubernetes Administrator / Application Developer (CKA / CKAD) preferred
- AWS certification preferred