Full-time

Site Reliability Engineer

Posted by RCS TECH • June 04, 2026

📍 Mexico, Mexico, Mexico

Description

What You’ll Do  
 Reliability & Operations 
 - Own availability, latency, and scalability across SaaS and AI systems  
 - Define and enforce SLOs, SLIs, and error budgets 
 - Participate in a global on-call rotation (~1 week every 4 weeks) 
 - Lead incident response and drive blameless postmortems with systemic fixes 
 Platform & Infrastructure  
 - Architect and operate on-premises and multi-region, multi-cloud environments
 - Manage large-scale Kubernetes workloads 
 - Build and evolve infrastructure using Terraform and Ansible 
 - Improve system resilience, fault isolation, and capacity planning 
 AI/ML & Automation  
 - Build and scale agentic AI systems for triage, anomaly detection, and self-healing 
 - Ensure reliability of model serving infrastructure 
 - Operate, optimize, and scale distributed systems
 What You Bring ...

Job Details

Location Mexico, Mexico
Job Type Full-time
Category IT / Computing / Software
Posted June 04, 2026
Deadline July 14, 2026
