Roles and Responsibilities
- Design, build, maintain and scale monitoring services across multiple environments, including cloud and on-prem deployments.
- Plan, organize and manage staff and overall SRE operations to ensure the stability of customer infrastructure.
- Provide overall expertise across all network operations functions.
- Define, implement and enforce configuration management and change management policies and practices.
- Handle escalations and manage support across different levels. Provide tier-2 and tier-3 escalation support and act as the NOC point of contact for all inquiries from other departments.
- Ensure adherence to operational metrics.
- Manage tools, systems and procedures to ensure dynamic management of issues and customer management.
- Identify areas for process and efficiency improvement within the NOC; recommend prioritized enhancements and oversee the implementation.
- Ensure continual process improvement within the NOC, including but not limited to automation of NOC tasks and reporting, implementation of enterprise-wide monitoring initiatives, and routine administration tasks.
- Ensure that reports are accurate and delivered on time.
- Hire, develop, and retain highly responsive and customer-focused engineers to ensure the effective operation of the department.
- Ensure all members of assigned technical teams are effective and fully utilized to maintain high resource utilization.
- Evaluate the technical skills of the team and ensure there is an appropriate level of expertise.
- Provide procedural training to staff.
- Set performance objectives and conduct performance reviews with all team members.
Ideal Candidates Must Have
- More than 4 years of experience in building and managing high-performance 24/7 SRE teams.
- Monitoring tools: Grafana, Prometheus, Kibana, and the AWS, Azure and GCP cloud monitoring tools.
- Cloud knowledge and relevant certification.
- Good knowledge of Active Directory, DHCP, DNS, clustering, load balancing, anti-virus, backup procedures, Group Policy, disaster recovery and high availability using industry standards.
- Network services experience is desired.
- A bachelor’s or master’s degree in computer science, electrical engineering, telecommunication engineering, information technology or a related field.
- Fluent verbal and written English is a must.
- Excellent leadership qualities.
- Excellent skills in developing processes and procedures for client and in-house teams.
- Ability to interact with clients in a professional, articulate manner.