This job is for a Site Reliability Engineer focusing on network systems. You might like this job because you’ll automate tasks, ensure services run smoothly with minimal downtime, and work with coding languages like Java or Python in a dynamic tech environment!
RM 5K - RM 10K
G Tower, Kuala Lumpur
Job Description:
• Ability to debug scripts and automate routine tasks on OS, network, database or application servers, with coding experience beyond simple scripts;
• Experience with DevOps processes and programming knowledge in at least one of the following languages: Java, Python, or Go;
• Scripting skills in at least one of the following: Shell, Terraform, Ansible, Chef or Puppet;
• Deep understanding of Unix/Linux operating systems, virtual machines, containers, container management systems, enterprise cloud platforms and data structures;
• Engage in and improve the lifecycle of services, from launch through to deployment, operation and optimization of reliability and user experience;
• Ensure services remain reliable once they are live by measuring and monitoring availability, latency, and overall system health. Practice sustainable incident response;
Site Reliability Engineer:
• The service reliability SLA is greater than or equal to 99.99% (annual downtime <= 52.56 minutes). No live-network accidents are caused by manual operations;
• The fault recovery duration (MTTR) meets the KPI requirements of the department. (The target for 2022 is less than 89 minutes, and it is updated every year.) The timely closure rate of major and higher alarms is >= 95%. Major and critical alarms should be handled promptly and cleared within 24 hours;
• The dual-cloud drill is 100% completed as required (once every half year), and the drill summary materials are archived as required;
• The average closure duration of change flows meets the annual KPI requirements of the department. (The target for 2022 is less than 4 days, and it is updated every year.);
• Provide on-call duty to handle daily alerts, work orders, upgrades, etc.;
• Other triggered tasks, such as OS patch upgrades and security hardening, are completed according to the planned project schedule.
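For reference, the 52.56-minute figure in the SLA item above follows directly from the 99.99% target. The sketch below is illustrative only and is not part of the employer's tooling: the function names are hypothetical, a 365-day year is assumed, and the alarm counts in the usage lines are made-up sample numbers.

```python
# Illustrative sketch only: function names and sample figures are assumptions,
# not taken from the employer's systems.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def annual_downtime_budget_minutes(availability_percent: float) -> float:
    """Maximum yearly downtime (in minutes) allowed by an availability target."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


def timely_closure_rate_percent(closed_on_time: int, total_alarms: int) -> float:
    """Share of alarms closed within the deadline, as a percentage."""
    if total_alarms == 0:
        return 100.0  # no alarms raised, so nothing was missed
    return 100.0 * closed_on_time / total_alarms


if __name__ == "__main__":
    budget = annual_downtime_budget_minutes(99.99)
    print(f"Downtime budget at 99.99%: {budget:.2f} minutes per year")

    # Hypothetical sample: 97 of 100 major alarms closed on time.
    rate = timely_closure_rate_percent(97, 100)
    print(f"Timely closure rate: {rate:.1f}% (target >= 95%)")
```

The same function shows how quickly the budget shrinks with each extra nine: a 99.9% target allows roughly ten times as much downtime per year as 99.99%.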
Requirement:
• Bachelor's degree or above in Computer Science or Electronics & Communication;
• Have in-depth knowledge of the SRE role and DevOps processes;
• Have strong observation and critical-thinking skills to handle business emergencies;
• Ability to adapt to a dynamic environment and apply problem-solving skills to resolve issues;
• Have excellent written and verbal communication skills;
Better to have
2 - 10 Years of Experience
Junior Executive
Cybersecurity / Network Security