Title: Senior Platform Engineer
Location: Remote
Type: Direct Hire
Role Overview
We are seeking a Senior Platform Engineer to help build, operate, and continuously improve a modern cloud platform supporting highly available and scalable applications. This role blends DevOps and Site Reliability Engineering principles, with a strong focus on reliability, automation, and operational maturity across multi-cloud environments.
The ideal candidate brings deep hands-on experience in AWS and Azure, thrives in complex distributed systems, and enjoys partnering with development teams to improve how software is built, deployed, and operated.
How You’ll Make an Impact
- Architect and evolve cloud environments that balance performance, resilience, security, and cost across AWS and Azure
- Own the reliability and day-to-day health of Kubernetes platforms, including cluster lifecycle management and workload optimization
- Advance software delivery practices by building and refining automated pipelines that enable fast, repeatable deployments
- Establish strong observability practices through metrics, logs, and alerts that enable proactive system monitoring
- Champion infrastructure automation and Infrastructure-as-Code to reduce manual effort and improve consistency
- Work closely with application teams to improve system reliability, deployment confidence, and operational workflows
- Contribute to incident management, on-call support, and disaster recovery readiness
- Embed security best practices into cloud platforms, networking, and access controls
What You Bring
- 10+ years of experience in platform engineering, SRE, DevOps, or cloud infrastructure roles
- Proven experience operating production workloads in both AWS and Azure environments
- Deep expertise managing Kubernetes at scale, including EKS and AKS
- Strong background designing and maintaining CI/CD workflows using platforms such as Azure DevOps, GitHub Actions, Jenkins, or similar tools
- Hands-on experience with monitoring and observability platforms, with Datadog preferred
- Strong scripting and automation skills using languages such as Python, Bash, or PowerShell
- Practical experience implementing Infrastructure-as-Code using tools like Terraform, Bicep, or CloudFormation
- Solid understanding of cloud networking, security, and distributed system architecture
- Ability to troubleshoot complex issues under pressure and communicate clearly during incidents
Preferred Attributes
- Experience improving platform reliability and operational maturity in high-growth or complex environments
- Strong collaborator who can bridge the gap between infrastructure and application teams
- Passion for automation, system resilience, and continuous improvement

