Site Reliability Engineer (Mandarin Speaker!) - Ampang Jaya

Hiredly X · Ampang Jaya · Full-time
The Site Reliability Engineer (SRE) ensures the reliability and performance of critical services, bridging development and operations. The role focuses on scalable infrastructure, SRE practices such as SLOs and SLIs, and reducing operational toil.

Collaborating with other teams to improve reliability and foster a culture of continuous learning is key.

Responsibilities:

  • Design and implement resilient system architectures for high availability and scalability.
  • Develop automation tools and scripts to improve operational efficiency.
  • Define, track, and analyze SLOs and SLIs for performance and reliability.
  • Conduct post-mortem analyses and implement improvements based on findings.
  • Collaborate on best practices for system reliability and incident management.
  • Troubleshoot and resolve database, network, and deployment issues.
  • Ensure issue resolution meets Service Level Agreements (SLAs).
  • Identify and address system performance bottlenecks with actionable recommendations.
  • Maintain documentation for processes and incident responses.

Requirements:

  • Proficiency in programming languages like Python, Golang, or Java.
  • Experience in system architecture with a focus on reliability and scalability.
  • Strong understanding of SRE principles (SLOs, SLIs, toil reduction).
  • Experience with cloud environments (AWS, Azure, Google Cloud).
  • Expertise in Linux system administration.
  • Problem-solving skills with a proactive approach to operational challenges.
  • Ability to work independently and collaborate in a team environment.
  • Able to speak, read, and write in Mandarin to support communication and collaboration across teams.

Preferred skills:

  • Familiarity with monitoring tools and performance optimisation.
  • Experience with system administration automation and scripting.
  • Knowledge of networking concepts and troubleshooting.
  • Hands-on experience with cloud platforms and services.
  • Familiarity with DevOps practices (CI/CD, infrastructure as code, containerisation).