Site Reliability Engineer - Transport
Company: Leidos
Location: Honolulu
Posted on: April 4, 2026
Job Description:
The NMCI Service Management Integration and Transport (SMIT)
group at Leidos has an opening for a Site Reliability Engineer to
focus on the reliability, performance, and scalability of complex
distributed systems. Under the SMIT Contract, the Leidos team is
responsible for the core backbone for the Navy-Marine Corps
Intranet, including cybersecurity services, network operations,
network engineering, service desk, seat support services, and data
transport. The SRE will also develop and execute tests focused on
system resilience, performance under load, and failure scenarios.
They will work in tandem with other Site Reliability Engineers
(SREs) and development teams to create automated testing frameworks
that simulate real-world conditions and validate system behavior
under normal and stress conditions, ensuring our services are
resilient and meet established service level objectives (SLOs).
Your work will contribute to the development of robust and scalable
services that operate reliably in production. Your responsibilities
will include maintaining complex computer systems by writing code
to automate software releases, monitor systems, and detect and fix
problems before users even know there is an issue. You will use
these skills to improve site performance and overall reliability.
The SRE role is responsible for supporting, migrating, automating,
and optimizing software development and deployment processes and
infrastructure as code, and for contributing to the overall
maturity of the Site Reliability Engineering program.
Primary Responsibilities:
Work alongside the development and operations
teams to ensure speedy and reliable software deployments, monitor
systems, and improve overall reliability of the platform. In
addition, as you discover and document system bugs, you have the
motivation to go off and fix them yourself. Develop features
using the AI coding tool and repository of scripts to automate,
scale, test, and secure the cloud infrastructure and the pipelines.
Enhance performance monitoring of the various systems via Splunk or
other dashboard reporting tools. Identify performance bottlenecks
and optimize the performance of cloud infrastructure. Contribute to
continuing our SRE journey by suggesting ways to improve
engineering build, maintenance, automation and reliability across
the platform with SRE/DevOps tools and Infrastructure-as-Code.
Develop and code high-quality pipeline automation workflows, both
inside and outside the cloud platform, that are appropriate
for business and technology strategies. Develop and execute test
strategies that simulate real-world failure scenarios, including
network disruptions, hardware failures, and system overloads.
Create, script, and run performance tests to measure system
behavior under varying levels of load and traffic. Identify
bottlenecks, performance degradation, and areas for optimization.
Design, implement, and maintain automated test suites for
infrastructure and application components. Ensure that testing is
integrated into the CI/CD pipeline to validate system reliability
with every release. Build automated systems for continuous
performance testing, stress testing, and load testing. Work closely
with SREs, developers, and operations teams to define reliability
goals and develop appropriate testing strategies to validate those
goals. Ensure that new services and features undergo thorough
testing for performance, reliability, and failure recovery before
deployment to production. Validate that monitoring, logging, and
alerting mechanisms are functioning correctly by testing systems
under failure conditions. Ensure that Service Level Indicators
(SLIs) and Service Level Objectives (SLOs) are accurately measured
and tracked through automated testing frameworks. Resolve most
conflicts between timeline, budget, and scope independently but
escalate complex or consequential issues to senior
management.
Basic Qualifications:
Typically requires a Bachelor’s degree; however, 4 to 8 years of
prior relevant experience may be considered in lieu of a degree.
Must have an active DoD Secret security clearance and be able to
maintain it. Minimum of DoD 8570.01 IAT Level
II Certification required prior to onboarding and must maintain
certification while supporting the SMIT Contract. Must be able to
support program execution in classified environments and access
SIPRNet from an NMCI location on short notice (local travel). 5
years’ experience configuring Cisco routers, switches, and network
appliances. 5 years’ experience with routing protocols (i.e.,
OSPF/EIGRP/BGP). 5 years’ experience with L2 switching (e.g.,
VLANs, spanning tree, VTP). 5 years’ experience troubleshooting
complex routing and switching issues. Experience with multiple
vendor routing, switching or wireless product lines. Strong
understanding and in-depth knowledge of TCP/IP network/subnet
addressing. Supports network configuration/asset management
activities, manages configuration drift, and accurately creates or
modifies network documentation to reflect the as-is and/or to-be
environment. Ability to work independently or in a team environment
to resolve technical issues in a dynamic environment. Experience
with automated script design, coding, debugging, and maintenance
(using Bash, Python, etc.) preferred. Experience with CI/CD
toolsets (e.g., Jenkins, GitLab). Experience with
containerization (Docker) and container orchestration (Kubernetes).
Good command of Linux/Unix and command line knowledge. Experience
in application administration, configuration, and integration.
Familiarity with agile development methodologies. Skilled and
disciplined to work with a distributed team. Ability to work in a
highly collaborative, forward thinking, and innovation-driven
environment. Knowledge of Agile and DevSecOps/SRE concepts and best
practices, with a desire to grow that knowledge. Hands-on experience
with Atlassian products (Jira, Confluence, Bitbucket, etc.).
Experience creating Jira and/or Azure DevOps workflows, projects,
and custom configurations. Experience administering and maintaining
an SRE platform via Ansible playbooks (e.g., upgrading Jenkins).
Experience automating tasks with scripting languages like
PowerShell or Python. Experience integrating with and maintaining
third-party CI/CD tools like Jenkins and GitLab. Experience with
PaaS using Red Hat
OpenShift/Kubernetes and Docker containers. Experience with
commercial cloud infrastructure deployment environments such as AWS
and Azure. Experience with automated provisioning and configuration
tools like Terraform, CloudFormation, Chef, Puppet, Ansible, or
similar technologies. Working knowledge of the Risk Management
Framework (RMF) and DISA STIGs.
Preferred Qualifications:
Previous work experience providing support to the NGEN-NMCI program.
Experience with Infrastructure as Code (IaC) tools such as
Terraform, Ansible, or CloudFormation for automating test
environments.
If you're looking for comfort, keep scrolling. At
Leidos, we outthink, outbuild, and outpace the status quo — because
the mission demands it. We're not hiring followers. We're
recruiting the ones who disrupt, provoke, and refuse to fail. Step
10 is ancient history. We're already at step 30 — and moving faster
than anyone else dares.
Original Posting: March 10, 2026
For U.S. Positions: While subject to change based on business needs,
Leidos reasonably anticipates that this job requisition will remain
open for at least 3 days with an anticipated close date of no earlier
than 3 days after the original posting date as listed above.
Pay Range: $87,100.00 - $157,450.00
The Leidos pay range for
this job level is a general guideline only and not a guarantee of
compensation or salary. Additional factors considered in extending
an offer include (but are not limited to) responsibilities of the
job, education, experience, knowledge, skills, and abilities, as
well as internal equity, alignment with market data, applicable
bargaining agreement (if any), or other law.
Keywords: Leidos, Honolulu, Site Reliability Engineer - Transport, IT / Software / Systems, Honolulu, Hawaii