Lead SRE Engineer

Skills

AWS, Cloud Infrastructure, DevOps, Kubernetes, Monitoring, Performance Optimization, Terraform

Job Overview

As a Lead Site Reliability Engineer, you will define and implement SRE strategy, architecture, and roadmap aligned with business goals, and lead the design and deployment of containerized workloads and Infrastructure as Code (IaC) in regulated clouds. You will establish observability, monitoring, and alerting at scale; drive incident management, on-call rotations, and root cause analysis; and partner with security and compliance teams to meet regulatory requirements. You will also champion automation and operational excellence on secure, mission-critical platforms in a remote-first, flexible work environment.

Responsibilities
  • Define and implement SRE strategy, architecture, and roadmap aligned with business goals.
  • Lead the design and deployment of containerized workloads and Infrastructure as Code in regulated clouds.
  • Establish observability, monitoring, and alerting at scale.
  • Drive incident management, on-call rotations, and root cause analysis.
  • Partner with security and compliance teams to meet regulatory requirements.

Requirements & Qualifications
  • BS in Computer Science, Cybersecurity, Software Engineering, or equivalent.
  • 5+ years of experience in SRE, DevOps, and cloud infrastructure.
  • Expertise in Kubernetes, Terraform/Infrastructure as Code, and AWS cloud.
  • Experience in monitoring, alerting, and performance optimization.
  • Strong troubleshooting and incident management skills for distributed systems.

Job Type: Remote

Salary: Not Disclosed

Experience: Entry

Duration: 12 Months

