Job summary by JobGrid
Senior Site Reliability Engineer at Omilia: Remote, Philippines; Full-time. JobGrid adds normalized role facts, source context, and a path to the employer application page so candidates can compare the listing before applying.
- Location and workplace: Remote, Philippines
- Role classification: Full-time
- Source freshness: checked by JobGrid on 2026-06-04.
- Application path: candidates continue to the employer application page with non-personal referral tags.
We are looking for a Senior Site Reliability Engineer with cloud platform experience. This individual will be part of a team responsible for operating and maintaining production clusters and developing our observability solutions. They will collaborate with team members to develop automation strategies, build monitoring and alerting, and ensure overall platform reliability. Your goal will be to become an integral part of the team, treating every platform challenge as your own and solving it accordingly.
Responsibilities
- Ensure platform reliability and availability across production and pre-production environments through proactive monitoring, alerting, and automation.
- Act as first responder for incidents and contribute to problem management and root cause analysis.
- Support the development team's reliability efforts and help build a solid reliability culture within the development lifecycle.
- Develop troubleshooting documentation for production support resources.
- Collaborate with Engineering teams to develop effective runbooks and operational documentation, and to automate operational tasks.
- Collaborate with development and cloud engineering teams to embed reliability and performance into the software delivery lifecycle.
- Design, implement, and evolve observability solutions (metrics, logs, traces, dashboards) using tools such as Prometheus, Grafana, and ELK.
- Participate in on-call rotations and continuously improve alert quality and response processes.
- Champion a culture of reliability, performance, and continuous improvement across teams.