About the Opportunity:
Shape the Future of Our Global Kubernetes Platform!
We are looking for a Cloud Infrastructure Engineer to take on AKS platform engineering and Day 2 operations within our Cloud Kubernetes Engineering team. This is an opportunity to play a contributing role in how we build, operate, and scale our Azure Kubernetes Service (AKS) platform globally.
If you have in-depth expertise in Kubernetes engineering and operations, cloud architecture, and Azure infrastructure, and you enjoy taking ownership, making an impact, solving complex challenges, and working in a place where you can design and troubleshoot, we want to hear from you.
Location: Anywhere in the United States and Canada.
Citizenship Requirement for US Candidates:
Must be a US citizen.
Ability to obtain US security clearance.
Citizenship Requirement for Canadian Candidates:
Ability to obtain or currently possess Canadian Protected B security clearance is mandatory.
Must have resided in Canada for 5+ years.
Must be a Canadian citizen and able to work on Government of Canada and US Government projects.
What You'll Do
Support AKS Platform Strategy & Operational Excellence
· Help with the technical implementation of our AKS platform, ensuring scalability, resiliency, and future-proofing
· Provide input into how our global Kubernetes platform operates, helping to refine, implement, and validate best practices, and contributing to decision-making
· Assist with and implement the enforcement of AKS platform standards
· Work with platform team members, security teams, infra engineering teams, and other stakeholders to align Kubernetes strategies with business needs
Kubernetes Technical Operations & Proactive Problem-Solving
· Take ownership of incident response, troubleshooting, and root cause analysis for AKS-related issues
· Provide AKS/Kubernetes support and subject-matter expertise to internal customers
· Help ensure AKS clusters are up to date, secure, and in sync, minimizing drift across environments
· Help identify potential scaling, security, or operational risks before they become issues, and proactively implement solutions
· Provide Day 2 support and improve the operational efficiency of the Kubernetes clusters
Automation & Infrastructure as Code (IaC) - Terraform
· Work on automation of cluster lifecycle management, including provisioning, scaling, upgrades, and decommissioning
· Improve and standardize Infrastructure as Code (IaC) implementations using Terraform, ArgoCD, and GitHub workflows
· Help with CI/CD pipelines and GitOps workflows to streamline Kubernetes deployments, working within the Cloud Kubernetes Engineering (CKE) team and with Release Engineering on integrations for internal customers
· Drive consistency in AKS configurations across all environments
Expertise, Communication & Cross-Team Collaboration
· Provide technical input within the Kubernetes team, supporting a culture of learning and innovation
· Be an effective communicator, able to articulate technical decisions and trade-offs to your team
· Collaborate with engineering teams to ensure AKS platform adoption and adherence to best practices
· Help us drive continuous improvement initiatives for our Kubernetes platform
· Assist with both greenfield onboarding of containerized applications and microservices, and lift-and-shifts
What You Bring:
· 5+ years of experience in cloud infrastructure design and implementation, with deep expertise in Azure cloud architecture
· 3+ years of hands-on, in-depth experience managing Kubernetes in dev/QA/prod, including deep troubleshooting of the platform and workloads, and Day 2 operations
· 3+ years of cloud and/or Kubernetes operations experience
· Good understanding of AKS, including cluster operations, networking, identity, and security best practices
· An ability to identify future challenges and proactively design and implement solutions to mitigate risks before they become issues is a real plus
· Good hands-on experience with containerization (Docker), Infrastructure as Code (Terraform, ArgoCD), and CI/CD automation
· Understanding of Kubernetes internals, operations, controllers, and networking (CNI) is required; experience with observability tools is a plus
· Proven experience in troubleshooting, root cause analysis, and proactive problem resolution in Kubernetes environments
· Experience defining and enforcing Kubernetes platform standards across multiple teams and clusters is also a plus
· Proactive problem solver – capable of anticipating challenges and addressing them before they impact operations
· Strong communicator – able to explain complex technical concepts in a clear, accessible way
What would make you really stand out (Experience Over Certifications):
While certifications are a plus, we value real-world experience more than credentials. That said, these certifications and skills may be beneficial:
· Certified Kubernetes Administrator (CKA)
· Certified Kubernetes Security Specialist (CKS)
· Certified Kubernetes Application Developer (CKAD)