
Machine Learning Infrastructure Engineer [UAE Based]

Sorry, looks like this job is no longer open 😔

Company: AI71
Job location: London, UK
Salary: Undisclosed
Hosted by: Adzuna

Job details

Job Title: ML Infrastructure Senior Engineer
Location: Abu Dhabi, United Arab Emirates [Full relocation package provided]

Job Overview

We are seeking a skilled ML Infrastructure Engineer to join our growing AI/ML platform team. This role is ideal for someone who is passionate about large-scale machine learning systems and has hands-on experience deploying LLMs/SLMs with advanced inference engines such as vLLM. You will play a critical role in designing, deploying, optimizing, and managing ML models and the infrastructure around them, for inference, fine-tuning, and continued pre-training.

Key Responsibilities

· Deploy large or small language models (LLMs/SLMs) using inference engines (e.g., vLLM, Triton).
· Collaborate with research and data science teams to fine-tune models or build automated fine-tuning pipelines.
· Extend inference-level capabilities by integrating advanced features such as multi-modality, real-time inference, model quantization, and tool calling.
· Evaluate and recommend optimal hardware configurations (GPU, CPU, RAM) based on model size and workload patterns.
· Build, test, and optimize LLM inference for consistent model deployment.
· Implement and maintain infrastructure-as-code to manage scalable, secure, and elastic cloud-based ML environments.
· Ensure seamless orchestration of the MLOps lifecycle, including experiment tracking, model registry, deployment automation, and monitoring.
· Manage the ML model lifecycle on AWS (preferred) or other cloud platforms.
· Understand LLM architecture fundamentals to design efficient scalability strategies for both inference and fine-tuning.

Required Skills

Core Skills:
· Proven experience deploying LLMs or SLMs using inference engines such as vLLM, TGI, or similar.
· Experience fine-tuning language models or creating automated pipelines for model training and evaluation.
· Deep understanding of LLM architecture fundamentals (e.g., attention mechanisms, transformer layers) and how they influence infrastructure scalability and optimization.
· Strong understanding of hardware-resource alignment for ML inference and training.

Technical Proficiency:
· Programming experience in Python and C/C++, especially for inference optimization.
· Solid understanding of the end-to-end MLOps lifecycle and related tools.
· Experience with containerization, image building, and deployment (e.g., Docker; Kubernetes optional).

Cloud & Infrastructure:
· Hands-on experience with AWS services for ML workloads (SageMaker, EC2, EKS, etc.) or equivalent services in Azure/GCP.
· Ability to manage cloud infrastructure for high availability, scalability, and cost efficiency.

Nice-to-Have
· Experience with ML orchestration platforms such as MLflow, SageMaker Pipelines, Kubeflow, or similar.
· Familiarity with model quantization, pruning, or other performance-optimization techniques.
· Exposure to distributed training frameworks such as Unsloth, DeepSpeed, Accelerate, or FSDP.
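The hardware-sizing responsibility above (matching GPU/CPU/RAM to model size and workload patterns) is often approached with a back-of-envelope VRAM estimate: model weights plus KV cache plus headroom. A minimal sketch — the function name, parameters, and the 20% overhead factor are illustrative assumptions, not anything specified in this posting:

```python
# Back-of-envelope GPU memory estimate for serving a transformer LM.
# All names and the default overhead factor are illustrative assumptions
# for sizing discussions, not an official formula.

def estimate_serving_vram_gb(
    n_params: float,        # parameter count, e.g. 7e9 for a 7B model
    weight_bytes: float,    # 2.0 for fp16/bf16, 1.0 for int8, 0.5 for 4-bit
    n_layers: int,          # transformer layers
    n_kv_heads: int,        # KV heads (fewer than query heads under GQA/MQA)
    head_dim: int,          # dimension per attention head
    max_tokens: int,        # tokens resident in the KV cache (batch * seq len)
    kv_bytes: float = 2.0,  # KV cache is commonly kept in fp16
    overhead: float = 1.2,  # ~20% headroom for activations, CUDA context, etc.
) -> float:
    weights = n_params * weight_bytes
    # K and V tensors per layer, per token: 2 * n_kv_heads * head_dim elements
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * kv_bytes * max_tokens
    return (weights + kv_cache) * overhead / 1e9

# A Llama-2-7B-like shape (32 layers, 32 KV heads, head_dim 128) in fp16,
# with an 8 x 4096-token KV-cache budget:
print(round(estimate_serving_vram_gb(7e9, 2.0, 32, 32, 128, 8 * 4096), 1))  # → 37.4
```

Even weights alone come to ~14 GB for a 7B model in fp16, which is why the quantization and KV-cache budgeting skills listed above tend to drive the GPU choice as much as raw parameter count does.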