Kubernetes Engineer

Stefanini Group
11 days ago
Full-time
On-site
Charlotte, North Carolina, United States
 
Job Description

Stefanini Group is looking for a Kubernetes Engineer for a globally recognized company! Interested applicants can click the Apply button or reach out to Micah Andres at (248) 386-7399 / Micah.Andres@Stefanini.com for faster processing. Thank you!

 

We are seeking an experienced Kubernetes Engineer/Administrator. This role focuses on managing and scaling our enterprise-grade Azure Kubernetes Service (AKS) infrastructure. You will be responsible for designing, implementing, and maintaining production Kubernetes clusters that support critical enterprise workloads across multiple Azure regions.

 

Primary Responsibilities

Azure Kubernetes Service (AKS) Management

  • Design, deploy, and manage enterprise-scale AKS clusters across multiple Azure regions
  • Implement and maintain private AKS clusters with advanced networking configurations
  • Configure and manage customer-managed encryption keys (CMK) for cluster disk encryption
  • Implement blue/green deployment strategies for zero-downtime cluster upgrades
  • Manage AKS cluster lifecycle including upgrades, node pool scaling, and disaster recovery
  • Optimize cluster performance, cost, and resource utilization
  • Implement AKS Fleet Manager for multi-cluster management and orchestration
  • Configure AKS Automatic for simplified cluster operations and auto-scaling
  • Manage AKS Managed Namespaces for improved multi-tenancy and resource isolation

 

Security & Compliance

  • Implement and maintain private networking architectures with Azure Private Endpoints
  • Configure and manage Workload Identity (OIDC) and user-assigned managed identities
  • Integrate Azure Policy for governance, compliance, and security enforcement
  • Implement Kubernetes RBAC and Azure RBAC integration
  • Manage secrets integration with Azure Key Vault using CSI drivers
  • Ensure secure communication between AKS and Azure PaaS services
  • Implement network policies and pod security standards

 

Service Mesh & Advanced Networking

  • Deploy and manage Linkerd service mesh for secure service-to-service communication
  • Implement mTLS between services with automatic certificate rotation
  • Configure traffic splitting, load balancing, and observability with Linkerd
  • Troubleshoot service mesh networking and performance issues
  • Integrate service mesh metrics with Azure Monitor

 

Infrastructure as Code (IaC)

  • Develop and maintain Terraform modules for AKS and supporting Azure infrastructure
  • Build reusable, production-ready Terraform patterns following Azure best practices
  • Implement infrastructure automation and GitOps workflows
  • Manage Terraform state, version control, and module lifecycle
  • Create and maintain comprehensive documentation for infrastructure patterns

 

GitOps & CI/CD

  • Design and implement GitOps workflows using ArgoCD for application deployments
  • Build and maintain CI/CD pipelines using GitHub Actions for Kubernetes workloads
  • Integrate AKS with Azure Container Registry (ACR) for container image management
  • Implement automated testing and validation for infrastructure and application changes
  • Manage deployment strategies (rolling updates, blue/green, canary)
  • Maintain GitHub Actions workflows for infrastructure provisioning and testing

 

Azure Platform Integration

  • Integrate AKS with Azure services including
  • Configure and maintain private endpoints for all Azure services
  • Implement VNet integration and subnet delegation patterns
  • Design and implement service connectivity across Azure regions

 

Monitoring, Observability & Operations

  • Implement comprehensive monitoring and alerting with Azure Monitor
  • Configure Log Analytics workspaces and integrate with AKS
  • Build dashboards and alerts for cluster health, performance, and security
  • Leverage Linkerd metrics and distributed tracing for service observability
  • Troubleshoot complex cluster, networking, and application issues
  • Conduct capacity planning and cost optimization
  • Participate in on-call rotation for production support
  • Perform post-incident analysis and implement preventive measures

 

Required Qualifications

Technical Skills - Azure & Kubernetes

  • 5+ years of hands-on Kubernetes experience in production environments
  • 2+ years of Azure Kubernetes Service (AKS) experience required
  • Strong Terraform expertise with proven ability to build reusable, production-ready modules
  • Deep understanding of Kubernetes architecture, networking, storage, and security
  • Experience with private AKS clusters and Azure Private Link/Private Endpoints
  • Proficiency with Azure networking: VNets, subnets, NSGs, private DNS zones, VNet peering
  • Strong understanding of Azure managed identities, Workload Identity, and RBAC
  • Experience with Azure Key Vault integration (CSI driver, disk encryption sets)
  • Hands-on experience with customer-managed encryption keys in Azure
  • Experience with Azure Container Registry including geo-replication and vulnerability scanning
  • Knowledge of AKS advanced features (Fleet Manager, AKS Automatic, Managed Namespaces) is a plus

 

Infrastructure as Code & Automation

  • Advanced Terraform skills with module development experience
  • Git version control and branching strategies (GitHub)
  • GitOps tools: ArgoCD
  • GitHub Actions for CI/CD pipelines
  • Infrastructure testing and validation practices

 

Platform & Tools

  • Azure CLI and Azure PowerShell
  • kubectl, helm, kustomize
  • Linux system administration
  • Scripting: Bash, Python, or PowerShell
  • Container technologies: Docker, containerd
  • GitHub workflows and Actions

 

Soft Skills

  • Strong analytical and troubleshooting abilities
  • Excellent documentation skills with focus on knowledge sharing
  • Collaborative team player with mentoring capabilities
  • Effective communication for both technical and business audiences
  • Self-motivated with ability to manage complex projects

 

Preferred Qualifications

Advanced Kubernetes & Cloud Skills

  • Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS)
  • Experience with Linkerd service mesh - deployment, configuration, and troubleshooting
  • Experience with AKS Fleet Manager for multi-cluster orchestration
  • Familiarity with AKS Automatic and managed namespace patterns
  • Experience with Kubernetes operators and Custom Resource Definitions (CRDs)
  • Service mesh implementations (Linkerd preferred; Istio, Open Service Mesh)
  • Advanced CNI configurations (Azure CNI, Calico, Cilium)
  • Multi-cluster management and federation
  • Experience with other cloud platforms (GCP GKE, AWS EKS) is a plus

 

Azure Certifications

  • Azure Solutions Architect Expert (AZ-305)
  • Azure Security Engineer Associate (AZ-500)
  • Azure Administrator Associate (AZ-104)

 

Platform Engineering Experience

  • Building internal developer platforms on Kubernetes
  • Policy-as-code implementation (Azure Policy, OPA, Kyverno)
  • Cost optimization and FinOps practices for Kubernetes
  • Chaos engineering and reliability testing
  • Multi-region disaster recovery patterns

