Hiring: Site Reliability Engineer
Job Description
You’ll be responsible for monitoring the uptime and availability of all applications owned by SnappBox. The SRE role works cross-functionally with the DevOps/Tech and Security teams. This role also includes weekend shifts.
- Monitoring services
- Incident management
- Extending and improving the current monitoring systems
- Automating the current monitoring process
- Deploying services to, and making changes in, the production environment
- Communicating with other teams to resolve issues
- Troubleshooting system problems, including in production
Requirements:
- Solid experience with Grafana and Prometheus
- Familiar with log shipping/management tools (ELK Stack)
- Experience as a Java/Spring developer (more than half of total professional experience)
- Familiar with Docker containers
- Familiar with Kubernetes
- Strong TCP/IP knowledge
- Familiar with microservice architecture
- Solid experience with Linux (LPIC-2)
- Familiar with CI/CD
- Familiar with REST APIs
- Familiar with software engineering and development
Required skills
- Python
- CI/CD
- SRE
- reliability
Minimum work experience
- Less than three years
Gender
- Not important
Military service status
- Not important