Full-time

Site Reliability Engineer – GenAI Platform

Posted by Astra North Infoteck Inc. • March 23, 2026

📍 Mirabel, Quebec, Canada

Job Description
  • Experience: 8+ years as a Site Reliability Engineer or in a similar role, with hands-on experience supporting IaaS platforms and working knowledge of networking and systems engineering.

  • Roles and Responsibilities:

    • Operate, monitor, and maintain the infrastructure supporting GenAI applications (training, inference, feature store, data ingestion, model serving)

    • Design and build automation for core platform capabilities, reducing manual toil

    • Develop and maintain infrastructure-as-code (IaC) for provisioning and managing compute, storage, network, GPU clusters, and container orchestration (e.g., Kubernetes)

    • Define, monitor, and enforce SLOs, SLIs, SLAs, and error budgets, and build and maintain the supporting alerting and dashboards

  • Ready to Seal the Deal?

    Submit your application today and take the next step in your career with Astra North Infoteck Inc.
