Senior Chaos Engineer Role

Skills

Automation scripting, Chaos Engineering, Distributed systems, Engineer, Observability Tools, Strategy, Swift, Testing

Join Goodnotes as our first Senior Chaos Engineer and help shape the future of resilience for our world-class digital productivity platform, used by millions globally. Lead the development and execution of a chaos engineering program from scratch, driving reliability through creative experimentation, collaboration, and continuous learning. This fully remote role offers the opportunity to define strategy, build tooling, and foster a culture of reliability across mobile and backend systems.

Job Overview
  • Define and implement Goodnotes' chaos engineering strategy, including tools, safety practices, and roadmap.
  • Design and execute fault injection experiments on mobile and backend systems to expose hidden risks.
  • Collaborate across engineering teams to identify critical user flows and stress-test system assumptions.
  • Build and scale automation tools, observability systems, and experiment tracking processes.
  • Document findings, communicate insights, and influence improvements to system reliability.

Key Responsibilities
  • Simulate real-world issues such as latency spikes, outages, and resource exhaustion.
  • Establish safety guardrails and blast radius controls for chaos experiments.
  • Facilitate resilience drills and chaos game days to build cross-team readiness.
  • Mentor and support the growth of the chaos engineering function, including future hiring.
  • Translate experimental learnings into actionable system improvements.

Required Skills & Qualifications
  • Proven experience in chaos engineering or fault injection in production-scale, distributed environments.
  • Strong knowledge of iOS platforms and mobile networking.
  • Advanced proficiency in Swift programming.
  • Understanding of resilience patterns (circuit breakers, bulkheads, timeouts, retries).
  • Experience with incident postmortems, war games, or reliability reviews.
  • Ability to build automation tools and scripts for chaos experiments.
  • Scientific mindset: hypothesis-driven, analytical, and curious about system behavior at the edge.
  • Excellent communication and documentation skills.
  • Collaborative approach and passion for mentoring others.
  • Openness to learning and developing new skills.

Job Type: Remote

Salary: Not Disclosed

Experience: Entry

Duration: 12 Months
