Undisclosed
Kuala Lumpur, Kuala Lumpur
Full-Time
1 week ago
Job Description
Location: Kuala Lumpur
About AirAsia MOVE
AirAsia MOVE is a leading ASEAN-focused budget online travel agency (OTA), part of the Capital A Group. We deliver customer-centric travel solutions by combining innovation with operational excellence. Our goal is to create seamless, reliable, and delightful journeys for travelers across the region.
About the Role
We’re looking for a Senior Site Reliability Engineer to help scale and stabilize our cloud infrastructure and reliability practices as we grow across multiple lines of business.
You’ll lead key initiatives around:
Cloud architecture modernization.
Multi-region reliability.
Observability and incident response.
Reducing toil through automation and self-service.
This is a hands-on technical role in which you’ll work across platform, SRE, and application teams to build scalable systems that are resilient, cost-aware, and developer-friendly.
What You’ll Do
Design and implement secure, scalable infrastructure on Google Cloud Platform (GCP).
Lead efforts to build and evolve MOVE’s GCP Landing Zone, including Shared VPC, org structure, IAM, and policy guardrails.
Build and improve multi-region architectures for high availability and disaster recovery.
Drive infrastructure automation using Terraform, CI/CD, and GitOps practices.
Improve observability across teams by standardizing monitoring, tracing, and alerting.
Collaborate on incident response and postmortems to reduce MTTR and build resilience.
Enforce tagging, FinOps controls, and security policies across GCP projects.
Contribute to platform engineering initiatives and developer self-service tools.
What We’re Looking For
5+ years in SRE, DevOps, or cloud infrastructure roles.
Solid experience with GCP or a similar cloud provider, along with Terraform and Kubernetes (GKE).
Strong hands-on experience in automation and multi-region architecture design.
Experience in networking (VPCs, NAT, PSC), IAM, and cloud-native security.
Proven ability to debug and support production systems under pressure.
Familiarity with monitoring and tracing tools such as Cloud Monitoring, OpenTelemetry, and SigNoz.
Exposure to AI or anomaly detection for alert tuning or reliability insights.
Clear communicator who works well with developers, product, and other infra teams.
The company offers various perks such as travel discounts, which include reduced rates for flights and access to e-coupon schemes.
The company invests in its employees through training programs, workshops, and skill development initiatives.
The company is known for its innovative culture and encourages employees to bring creative ideas to the table.