About Wellfit
Wellfit is the dental industry’s fintech platform, removing financial friction so patients, providers, employers, and payors can access better care. As a profitable healthcare fintech innovator, we’re transforming the patient journey and redefining what’s possible in dental
care.
About the Role
• We are seeking a Senior DevOps Engineer, Application Reliability to own and evolve the reliability, observability, and performance of our Azure-based application ecosystem.
• This role sits between platform engineering and application development, focusing on application-level reliability rather than traditional infrastructure SRE. You’ll partner closely with engineering teams to ensure our systems are observable, resilient, and safe to deploy—especially as we accelerate delivery through AI-assisted development and modern deployment practices.
• If you enjoy debugging real production issues, improving developer confidence through strong observability, and preventing incidents before they happen, this role is built for
you.What You’ll DoApplication Reliability & Observability
• Design and implement end-to-end observability for Azure App Services using Application Insights, Azure Monitor, and distributed tracing.
• Build and maintain actionable dashboards that surface application health, performance bottlenecks, and reliability risks.
• Analyze logs, metrics, and traces to proactively identify issues and reduce MTTD / MTTR.
• Partner with developers to instrument applications correctly and improve runtime visibility.
Azure App Service & Application Debugging
• Serve as a subject-matter expert on Azure App Service architecture, scaling, and performance tuning.
• Debug production issues at the application level, including C#, .NET, Angular, and SQL-related problems.
• Use tools such as Diagnose and Troubleshoot, Kudu, and App Service diagnostics to resolve complex issues.
Automation & Operational Excellence
• Automate monitoring, alerting, and common remediation workflows using PowerShell and Azure tooling.
• Reduce operational toil through better tooling, alert quality, and self-service diagnostics.
• Support safe daytime deployments, incremental rollouts, and environment stability.
Incident Response & Continuous Improvement
• Participate in on-call rotations and lead incident response for application reliability issues.
• Write clear, high-quality root cause analyses (RCAs) focused on prevention, not blame.
• Continuously improve reliability patterns as delivery velocity increases.
Collaboration & Enablement
• Act as a reliability partner to development teams—not a gatekeeper.
• Document best practices for monitoring, debugging, and deployment safety.
• Help lay the observability foundation required for AI-accelerated development workflows (BMAD).
What We’re Looking For
• Bachelor’s degree in Computer Science, IT, or equivalent experience.
• Azure Fundamentals (AZ-900) certification required.
• Proven experience in application-focused DevOps or SRE roles, with deep hands-on Azure work.
• Strong experience with Azure App Services, Application Insights, and Azure Monitor.
• Ability to read, understand, and debug .NET/C# and Angular code.
• Experience with incident response, on-call operations, and RCA writing.
• Comfort operating in fast-moving environments with increasing deployment velocity.
Bonus Experience
• Grafana, Prometheus, Datadog, or Dynatrace
• Azure Front Door, CDN, Function Apps, WebJobs
• Service Bus, Event Hub, or distributed messaging
• Additional Azure certifications
Why Wellfit
Real Impact: Your work directly supports a mission-driven healthcare fintech platform.
Hybrid Work: Dallas-based hybrid model (3 days/week in office).
Strong Benefits: Medical, dental, vision, generous PTO.
Competitive Compensation: Salary, bonus eligibility, and 401(k) matching.
Growth & Scale: Join a profitable startup entering a high-velocity growth phase.