We are seeking an experienced and proactive Senior Site Reliability Engineer (SRE) to join our team. As a Senior SRE, you will play a key role in enhancing the reliability, availability, and performance of our critical services and applications. You will partner with development teams to create solutions that scale efficiently and support a robust production environment.
Key Responsibilities:
• Design and implement reliable, scalable, and automated infrastructure solutions to support our applications and services.
• Collaborate with cross-functional teams to improve service reliability and performance; establish best practices in monitoring, incident response, and capacity planning.
• Develop and maintain CI/CD (Continuous Integration/Continuous Deployment) pipelines and automation to improve deployment processes and reduce downtime.
• Monitor system performance, identify bottlenecks or issues, and formulate strategies for remediation and scalability.
• Lead incident response efforts and perform root cause analysis following outages and incidents, implementing changes to prevent future occurrences.
• Mentor and guide junior SRE and engineering team members in best practices and continuous improvement initiatives.
• Document processes, incident reports, and operational runbooks to enhance knowledge sharing and training within the team.
Requirements:
• Strong experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
• Proficient in scripting and programming languages (e.g., Python, Go, Bash).
• Extensive experience with monitoring and logging solutions (e.g., Prometheus, Zabbix, Grafana, ELK Stack).
• Solid understanding of networking concepts, distributed systems, and microservices architecture.
• Proven track record in incident management, root cause analysis, and performance tuning.
• Excellent communication and collaboration skills to work effectively across teams.
Preferred Qualifications:
• Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation).
• Familiarity with service mesh implementations (e.g., Istio, Linkerd).
• Knowledge of security best practices in cloud environments and applications.
• Contributions to open-source projects or active participation in the SRE/DevOps community.
Benefits:
• Competitive salary and performance-based bonuses.
• Flexible working hours and remote work opportunities.
• Comprehensive health, dental, and vision insurance.
• A collaborative and innovative work environment that values growth and development.
About the Company:
Ertebat Farda began operating in 1389 in the Iranian calendar (2010/2011). Drawing on the latest global developments in the financial industry and a deep understanding of banking and e-commerce in Iran, Ertebat Farda designs and develops products that can have a significant impact on the digital lives of Iranians. With more than 150 colleagues, Ertebat Farda has, alongside its own growth, contributed to the growth of infrastructure and to the transformation of the country's financial services. To that end, by planning for the future and following growing global trends, it has begun executive and operational initiatives to support this infrastructure.