Snapp Group

Founded in 2013 · Computers, IT, and Internet · More than 1,000 employees · careers.snappgroup.net/en

Hiring: Site Reliability Engineer

This job posting has expired

Job category: IT / DevOps / Server
Location: Tehran, Tehran
Employment type: Full-time
Minimum work experience: Three to six years
Salary: Negotiable

Job description

As a Site Reliability Engineer (SRE) at Snapp! Express, you will play a critical role in ensuring the availability, reliability, and performance of the systems that power our logistics platform. You will maintain and improve the infrastructure and tools that support our services, with a focus on monitoring, alerting, and automation. Your expertise in Grafana, Prometheus, Python, and Linux will be essential in driving operational excellence and enabling rapid incident response. Responsibilities may include designing and implementing continuous integration and delivery pipelines, managing cloud infrastructure, monitoring system performance, troubleshooting issues, and ensuring security and compliance. Strong communication skills and experience with Docker, Kubernetes, Git, and IaC (infrastructure as code), CaC (configuration as code), and CI/CD tools are often required for this role.

Qualifications:
• At least 3 years of experience as a DevOps/SRE Engineer or in a related role
• Strong experience with Linux system administration
• Strong experience with at least one programming or scripting language (e.g. Bash, Python)
• Experience with Kubernetes, Docker, and container orchestration
• Experience with MySQL or other relational databases
• Experience with infrastructure automation tools (e.g. Ansible, Chef, Puppet)
• Experience with Git and GitLab CI/CD
• Familiarity with monitoring tools such as Prometheus, Grafana, the ELK stack, or similar
• Excellent communication and problem-solving skills

Responsibilities:
• Monitor and maintain the health, performance, and availability of Snapp! Express systems using Grafana and Prometheus.
• Develop, implement, and maintain automated monitoring, alerting, and reporting solutions to proactively detect and resolve system issues.
• Collaborate with development and operations teams to identify and implement system improvements, including performance optimizations, capacity planning, and automation of repetitive tasks.
• Investigate and troubleshoot incidents, perform root cause analysis, and implement remediation actions.
• Create and maintain technical documentation, including runbooks and playbooks, to ensure effective knowledge sharing and incident resolution.
• Stay up-to-date with the latest industry trends and best practices in SRE, DevOps, and automation technologies, and proactively apply this knowledge to improve Snapp! Express systems.
• Design, implement, and maintain robust and reliable backup and disaster recovery solutions to protect critical systems and data.
• Design and implement infrastructure and tools to support software development, testing, deployment, and monitoring
• Maintain and improve existing infrastructure and tools
• Work with developers to ensure that applications are designed and deployed in a scalable, secure, and efficient manner
• Automate manual processes using tools such as Ansible, Bash, Python
• Deploy, manage, and monitor applications in a Kubernetes environment
• Manage and secure network traffic using iptables or similar tools
• Manage and monitor databases, specifically MySQL
• Manage and monitor application logs and metrics using tools such as Pro

Requirements:

• Bachelor's degree in computer science, engineering, or a related field.
• Strong experience in working with Grafana and Prometheus for monitoring and alerting in a production environment.
• Proficiency in Python scripting for automation tasks and system administration.
• Deep understanding of Linux-based operating systems, including performance tuning, troubleshooting, and security.
• Strong problem-solving skills and ability to analyze and resolve complex technical issues in a timely manner.
• Excellent communication and collaboration skills, with the ability to work effectively in a team-oriented environment.
• Prior experience as a Site Reliability Engineer or in a similar role, with a track record of improving system reliability, performance, and availability.

About the company

Snapp Group is known for well-recognized brands such as Snapp, SnappFood, SnappBox, SnappTrip, SnappStore, SnappSupply, SnappDoctor, SnappKitchen, SnappPay, SnappMarket, and SnappShop.
Snapp Group's remarkable achievements have made it one of the most successful businesses in Iran.
We are growing fast, and that means unlimited opportunities for you.
Join us and take part in an exciting journey at the heart of business development, as a member of an international team.

Required skills: SRE, CI/CD, Docker, GitLab
Gender: No preference
Military service status: Not important
Minimum education: Bachelor's degree
