Manager, Site Reliability Engineering

Airlines Reporting Corporation
Arlington, VA 22201
Location: US-VA-Arlington | US-KY-Louisville | US-FL-Tampa
Job ID: 2021-2036
# of Openings: 1
Category: Technology

About The Role:

We are searching for a Manager, Site Reliability Engineering to join our team. In this role, you will oversee multiple concurrent large-scale projects, direct the allocation of resources, and set priorities for a team of employees and contractors who perform operational and release management support for the full product stack related to ARC products and services. You will drive releases and implementation of operational processes and non-functional requirements within a complex multi-platform environment. You will meet challenging development and support commitments and improve upon development processes. Additionally, you will grow, develop and mentor a team of site reliability engineers and align with software engineering managers. You will assess team/project needs, capacity management and competency to make prioritization changes. You will also provide deep technical input to products that align with ARC's strategy and work with peers to implement a comprehensive project-based planning process across all software groups and projects.

If you are ready to join a dynamic and forward-thinking organization, then we want to hear from you. Come innovate with us!

What You'll Get to Do:

  • Collaborate with product owners, business SMEs and software engineering teams to analyze the business needs and improve supportability, scalability and recovery for the engineered solution. Ensure that the overall technical solution is aligned with the business needs and the operational teams' methodologies.
  • Own the operational strategy for service delivery and service availability to reduce the mean time to recovery using automation. Develop methods for autonomous recovery and self-repairing systems. Ensure the solution is consistent with ARC architecture, design and development standards.
  • Direct and plan system releases and hotfixes. Develop methods that allow simplified triage following a set of checklists, runbooks and standard operating procedures. Make adjustments to adopt new methodologies that give the business increased awareness of, and attention to, non-functional requirements.
  • Align and improve software development delivery by providing operational improvements to non-functional requirements. This includes enhancements to improve service levels by leveraging key performance indicators consisting of monitoring, non-functional testing and availability reports. Provide a service-focused approach leveraging continuous process improvement. Define and oversee a strategy for chaos testing to improve system resiliency. Mentor site reliability engineers and software engineers.
  • Stay current with latest development tools, technology ideas, patterns and methodologies; share knowledge by clearly articulating results and ideas to key stakeholders and site reliability engineers. Grow in people development and enterprise influence.

You'll Bring These Qualifications:

  • Bachelor's Degree in Computer Science or related engineering field; or equivalent experience
  • ITIL 4 certification preferred (ITIL 4 Foundation, ITIL 4 Managing Professional or ITIL 4 Strategic Leader)
  • 7+ years of application programming across front-end user interfaces, server-side applications and database queries
  • 3+ years of people leadership, matrix resource management experience preferred
  • 5+ years of experience in an application or operational support role
  • 5+ years of experience with full cycle application development (full SDLC experience: design, development, delivery, etc.)
  • 3+ years with Agile, Scrum, DevOps, XP, and Continuous Integration and Continuous Delivery
  • 5+ years of experience implementing modern applications using:
    • Cloud-based solutions/technologies (AWS, Google, Azure). AWS developer environment including, but not limited to, Lambda, API Gateway, DynamoDB, S3, CloudWatch
    • Implementation of modern application and infrastructure design patterns, including micro-services and containers, disposable, reactive, stateless, ephemeral and distributed patterns
    • Open-source and related technologies including, but not limited to, Node.js, Datadog, React, Python and NoSQL database(s) such as DynamoDB
    • DevOps tools including, but not limited to, Terraform/CloudFormation, Jenkins (including pipelines), Git, GitLab, Jira, Confluence, Sonar, Nexus, and automated test and deployment tools
    • Experience with data lake concepts and design patterns (AWS S3, Parquet, Python, Node.js, Lambda, Hadoop, Spark, NoSQL MongoDB, DynamoDB, Athena, etc.)

You'll Also Bring These Professional Skills:

  • Proven ability to lead multiple resources through triage events and to communicate after-action reports
  • Proven ability to lead a group through operational improvements that reduce mean time to resolution
  • Experience leading, mentoring and guiding multidisciplinary (full-stack) technical teams
  • Continuous focus on operational improvements and automated recovery
  • Ability to discover and define non-functional requirements and to transform them into technical requirements and solution definition
  • Proven ability to influence technology strategy and best practices across peer and leadership groups to support an agile development culture
  • Outstanding communication skills (verbal and written) and ability to communicate with internal and external customers and all levels of management, including communicating technical information to nontechnical audiences
  • A strong intellectual curiosity to continually challenge what exists and explore what should be changed to best meet evolving business and market needs
  • A strong passion to support peers to help meet timelines on larger projects

What We Can Offer You:

  • Joining ARC means joining a team that is motivated, diverse, creative, collaborative and solutions-oriented. We think big, embrace challenges, and explore new ideas to lead the way for the travel industry.
  • Our employees value the hands-on learning and professional development opportunities that allow them to expand their skills and grow their career in new, dynamic ways.
  • We offer a highly competitive, comprehensive benefits package so you can worry less and focus on what truly matters.
  • By joining ARC, you will partner with top minds in the industry as we use data and technology to innovate how the world travels.

EOE M/F/D/V Females and Minorities Encouraged to Apply



Posted: 2021-04-28 Expires: 2021-05-29
