Key Responsibilities:
• Design, implement, and optimize data ingestion pipelines using Apache NiFi to handle sources such as CSV files and RDBMS, converting data to formats such as Parquet.
• Configure and manage a Spark standalone cluster for efficient data processing.
• Set up and maintain a MinIO cluster for object storage, including raw and processed buckets.
• Orchestrate end-to-end data workflows using Apache Airflow.
• Monitor system performance, logs, and health across nodes using built-in tools and optional monitoring services; ensure high availability and quick issue resolution.
• Work cross-functionally with other stakeholders to align infrastructure with business needs, including documentation and knowledge sharing.
• Develop and maintain ETL pipelines using PySpark and Python.
• Write complex SQL queries as needed (SQL proficiency is required).
Required Qualifications:
• Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field (Master's preferred).
• 3+ years of experience in data engineering, infrastructure, or operations roles, with a focus on building and maintaining data pipelines and systems.
• Proven hands-on experience with Apache NiFi for data ingestion and ETL processes.
• Strong expertise in Apache Spark (standalone or clustered) for distributed data processing.
• Proficiency with object storage solutions like MinIO (or S3-compatible systems) and database management using SQL Server and Oracle.
• Experience with workflow orchestration tools such as Apache Airflow.
• Solid understanding of data formats (e.g., Parquet, CSV), data flows, and optimization techniques for performance and scalability.
• Knowledge of monitoring, logging, and troubleshooting in data environments.
Parsian E-Commerce Company is affiliated with Parsian Bank and operates in software services, e-commerce, card services, payment gateways, and card reader devices.