Site Reliability Engineer (Cloud Native)

Location: New York
Discipline: Data Science, Startups in AI
Job type: Permanent
Salary: Up to $200,000 per annum
Contact name: Fadi Jawish

Contact email:
Published: 8 months ago

Site Reliability Engineer (Cloud Native)

Location: New York (Hybrid: 3 days in office, 2 days from home)

Salary: $50,000 to $200,000 USD, plus an excellent benefits package including health/PEO-related benefits, a vacation bonus (take a real vacation and they give you up to $1k), Friday happy hours, a weekly bagel breakfast, relocation assistance if you want to work out of one of their offices over another (e.g. London, New York, Tokyo), and visa sponsorship and work authorization (H-1B, O-1, and similar).

Are you a forward-thinking engineer passionate about enhancing system reliability and performance? Do you thrive in dynamic environments and enjoy solving complex technical challenges? If so, we want you to join our team as a Site Reliability Engineer (SRE) in the heart of New York City!

Minimum Qualifications:

  • Educational Excellence: A Bachelor's degree in Computer Science, a related field, or equivalent practical experience.

  • Seasoned Software Engineer: A solid foundation with a minimum of 4 years of experience as a software engineer.

  • Coding Versatility: Proficiency in programming with one or more of the following languages: C, C++, Python, Go, Perl, or Ruby.

  • System Mastery: A deep understanding of Site Reliability Engineering, System Design, and Distributed Computing.

  • Technical Proficiency: Strong command of algorithms, data structures, Unix/Linux systems, IP networking, performance optimization, and troubleshooting application issues.

Preferred Experience:

  • Cloud Native Expertise: Experience in the cloud-native applications space, driving innovation in a rapidly evolving environment.

  • System Detective: Proven ability to troubleshoot and debug complex distributed systems, making you the go-to problem solver.

  • Code Connoisseur: Experience with code reviews, ensuring our systems are always in top-notch shape.

  • Communication Pro: Excellent communication skills, enabling you to collaborate effectively across teams.

Responsibilities:

  • Architect Reliability: Design and execute projects aimed at enhancing the reliability of our cutting-edge Toggle platform.

  • Product Reliability: Take charge of projects to improve the reliability of our products, ensuring they exceed user expectations.

  • Cloud Optimization: Streamline operational processes on Google Cloud Platform (GCP), reducing manual work and increasing efficiency.

  • SRE Prowess: Leverage Site Reliability Engineering (SRE) strategies to deliver infrastructure on demand, meeting the needs of our dynamic environment.

  • Automation Champion: Drive automation efforts to optimize existing systems, liberating your time for more strategic initiatives.

  • GitOps Guru: Implement GitOps strategies, contributing to our continuous improvement culture.

Apply Now and embark on a journey to enhance reliability in the heart of New York City!