Hiring: Site Reliability Engineer (SRE), Remote
Job description
We’re seeking a site reliability engineer (SRE) to keep all user-facing services and other Miare production systems running smoothly. The ideal candidate is both a pragmatic operator and a software craftsman who applies sound engineering principles, operational discipline, and mature automation to our environments and the Miare codebase.
As an SRE you will:
- Design, build, and maintain our infrastructure so that it meets its agreed SLOs and enables future growth.
- Work in close collaboration with the engineering team to shape the future roadmap and establish strong operational readiness across teams.
- Make sure that every change is made through infrastructure (and configuration) as code, so that it is repeatable and prevents config drift.
- Own our SLIs: monitor every component, measure the metrics that matter, and design the metrics and rules evaluated in Prometheus.
- Collaborate with the engineering team to make sure that the deployment process is as boring as possible.
- Document incidents, do proper post-mortem investigations, and plan necessary tasks to improve reliability.
You might be a good fit if you:
- Can analyze systems: their edge cases, failure modes, behaviors, and specific implementations.
- Know your way around Linux and bash.
- Know what configuration management systems such as Ansible (the one we use) are for.
- Can comfortably write Python/Go/Bash scripts.
- Have a drive to deliver quickly and iterate fast.
- Can break down broadly defined projects and objectives into smaller, well-defined steps, and execute them.
If you have some of these skills and would like to learn the rest, send us your resume.
Projects you will work on:
- Improve our monitoring infrastructure by defining better metrics in our existing Prometheus & Grafana setup, and by adding tracing solutions to our toolbox.
- Plan, prepare for, and execute the migration of Miare from VMs to cloud-native deployments with Kubernetes.
- Collaborate with the engineering team on resolving architectural bottlenecks.
Terms of cooperation:
- Company location: near Punak.
- Flexible working hours.
- Talented colleagues and an interesting work environment.
- Timely pay.
- Free supplementary health insurance.
- Military service benefits.
Required skills
- SRE
- Kubernetes
- Grafana
- Python
- Golang
Minimum work experience
- Not important
Gender
- Not important
Military service status
- Educational exemption, permanent exemption, or service completed