Hiring: AI Specialist

Job Description

Position: Distributed Systems Engineer

Requirements

  • Must-Have:
    • Deep knowledge of LLM inference (transformers, forward passes, KV cache, prefill/decode phases)
    • Proficiency in Python (and ideally Rust, Go, or C++)
    • Experience with GPU programming (CUDA) and inference tooling (vLLM, Hugging Face, or similar inference engines)
    • Experience with distributed systems and high-performance networking
  • Strongly Preferred:
    • Experience with decentralized systems, P2P networks, or blockchain-adjacent infrastructure
    • Familiarity with model parallelism techniques (pipeline parallelism, tensor parallelism)
    • Knowledge of compression algorithms for tensors/activations
    • Experience with real-time routing, service mesh, or orchestration systems (Kubernetes, Ray, etc.)
    • Strong systems thinking and performance engineering mindset

Key Responsibilities

  • Design & Implement the Core Architecture
    • Model partitioning into executable blocks/layers across distributed nodes 
    • Latency-aware dynamic routing using real-time telemetry (latency, queue, GPU load, trust scores, etc.) 
    • Adaptive layer-based replication with cost-benefit logic 
    • Observable metrics pipeline (p95 latency, bottlenecks, failure rates, audit overhead) 
  • Build Production-Grade Features
    • Request gateway, route planner, aggregator 
    • Operator dashboards for visibility into routes, nodes, and blocks 
    • Benchmarking framework against naive pipelines 
  • Technical Direction
    • Start with smaller open-source models and scale to large models 
    • Define MVP success metrics and iterate rapidly 

Required Skills

  • Artificial intelligence
  • Python
  • PyTorch

Minimum Work Experience

  • Three to six years

Gender

  • No preference

Military Service Status

  • No preference

Employment Type:

Full-time

Posting Date:

1405/03/11 (Solar Hijri calendar)