We seek an experienced engineer to join our team and help us fulfill our mission by delivering a rich feature set, ensuring high availability, and maintaining stellar performance. In this role, you will be responsible for designing and developing the monitoring tools and platforms that keep our infrastructure stable and performant.
Responsibilities
Maintain and configure monitoring services to ensure reliability and uptime;
Implement monitoring strategies to track the health and performance of systems and services;
Troubleshoot and resolve issues within the monitoring platforms;
Optimize and enhance existing monitoring tools for better performance and scalability;
Collaborate with other technical teams to integrate monitoring solutions across the infrastructure;
Develop both backend and frontend components of monitoring solutions;
Implement automated processes for data protection, disaster recovery, and failover procedures;
Develop, implement, and maintain procedures to measure and track service performance and quality;
Document problems, define solutions, prioritize issues, and assess the impact of problems;
Requirements
At least 2 years of work experience as a Software Engineer, SRE, or in a related position;
Proven experience in backend development, preferably using Go, Python, and Flask;
Strong knowledge of Linux system management and administration;
Experience with performance tuning and optimization for high-traffic systems;
Experience with at least one logging stack, preferably ELK (Elasticsearch, Logstash, Kibana);
Experience with CI/CD pipelines and infrastructure as code (IaC), using tools like GitLab and Ansible;
Knowledge of microservices architecture, containerization, and orchestration tools like Docker Swarm and Kubernetes;
Familiarity with open-source services such as HAProxy, MySQL, Redis, and Memcached;
Self-motivated, proactive, and capable of multi-tasking in a collaborative environment;
Excellent problem-solving mindset and the ability to diagnose complex technical issues;
Detail-oriented and able to manage multiple projects and meet deadlines;
Excellent communication and collaboration skills, essential for effectively working with and supporting team members;
Preferred Qualifications
Experience in software design and architecture, with a strong understanding of data structures and algorithms;
Familiarity with design patterns, with the ability to create scalable and maintainable architectures;
Familiarity with frontend development and knowledge of modern JavaScript frameworks;
Experience with operating system networking principles (DNS, routing, firewalls, etc.);
Knowledge of cloud platform development tools like OpenStack;
Prior experience in a similar role within a cloud-based environment;
Experience applying artificial intelligence to anomaly detection;
If you are a skilled developer with experience in system operations and infrastructure management, and you are eager to work in a dynamic and challenging environment, we'd love to hear from you!
About the Company
At Digikala, a company operating in e-commerce, we are pursuing the dream of "a smile for all of Iran." To that end, by drawing on the latest technologies and continuously developing technology-based services, we pursue our values of customer focus, passion for excellence, teamwork, and results orientation.
The Digikala group gives us the opportunity to work alongside people with diverse specialties in a single organization. In addition, given Digikala's rapid growth, we can grow and develop by facing challenges and by taking part in a variety of development and training programs.