Job Title: SRE (Site Reliability Engineer)
Job Location: Phoenix, AZ (Onsite).
Job Type: Contract
Job Requirements:
Roles & Responsibilities:
• Manage cloud infrastructure (Azure & AWS) for all environments.
• Drive infrastructure optimization & maintenance activities.
• Responsible for ensuring high availability of the platform.
• Implement DevOps best practices in logging, proactive monitoring, alerting, CI/CD, observability, etc., using new enterprise tools.
• Lead with an automation mindset: eliminate repetitive manual processes by developing the necessary tools.
• Handle production support, change management, and on-call rotation; ensure service restoration as per SLA.
• Partner with various supporting teams to ensure operational readiness.
• Guide scrum teams on cloud infrastructure design principles to help them design reliable & resilient solutions as per SLOs.
• Understand platform roadmap and contribute to the platform strategy initiatives.
• Demonstrate increased self-reliance to achieve team goals.
• Influence team members with creative changes and improvements, challenge the status quo, and demonstrate a willingness to take risks.
• Continuously identify opportunities to improve the efficiency of the team by analyzing existing workflows, driving the team to be more effective and productive, and delivering faster and stronger results.
• Mentor and guide junior team members to success within the team.
• Excellent communication & co-ordination skills.
• Proven ability to manage critical production issues and collaborate with teams to restore service.
Experience:
• Bachelor's degree in a related field preferred, or proven industry experience.
• 8 years of software engineering, Site Reliability, architecture experience.
• 5 years supporting a 24x7 global operations environment with on-call responsibilities for production support.
• 5 years in supporting Disaster Recovery procedures.
• 5 years in developing software applications using agile methodologies.
• 5 years in identifying application infrastructure risks and mitigation strategies, with the ability to work with a team to ensure risks are mitigated.