ML Engineer - Model Training Infrastructure

CIeNET

Taipei City
Permanent
Full-time

2 days ago

What You'll Do / Responsibilities

We are seeking a highly skilled ML Engineer with expertise in Natural Language Processing (NLP), Transformers, and Large Language Models (LLMs) to join our team. This role will focus on building and optimizing scalable model training infrastructure in cloud environments, with an emphasis on cloud technologies, cloud-based GPUs, and MLOps pipelines. The ideal candidate will have a strong foundation in developing and deploying ML models, particularly in the LLM domain, and experience with cutting-edge frameworks and cloud-native tools.

Responsibilities:
- Design, implement, and optimize scalable machine learning training pipelines tailored for Transformers and LLM-related technologies.
- Fine-tune and deploy state-of-the-art Transformer models and LLMs (e.g., GPT, Llama, NVIDIA NeMo) in cloud-based MLOps pipelines.
- Utilize cloud-based GPU resources for efficient model training and experimentation.
- Collaborate with cloud engineers to design robust MLOps pipelines and deploy trained models into production environments.
- Monitor, maintain, and optimize model performance in production, ensuring reliability and scalability.
- Research and integrate the latest advancements in related technologies into model training workflows.
- Contribute to the continuous improvement of the ML infrastructure, focusing on automation and reproducibility.

Required Qualifications
- Bachelor's or Master's degree in Computer Science, Machine Learning, Data Science, or a related field.
- Strong understanding of machine learning techniques, with a focus on deep learning, NLP, and transformer-based architectures.
- Proficiency in PyTorch or TensorFlow, with hands-on experience in fine-tuning AI models.
- Experience in designing and executing machine learning model training workflows, including data preparation, feature engineering, hyperparameter tuning, and performance evaluation.
- Familiarity with MLOps principles, tools, and frameworks, including CI/CD pipelines for ML models.
- Strong programming skills in Python (required); familiarity with other languages such as Java is a plus.
- Excellent problem-solving, analytical, and communication skills.

Preferred Qualifications
- Experience with cloud computing technologies, particularly GCP and K8s.
- Experience with distributed and large-scale model training.
- Experience with Linux and Docker.

Salary
Negotiable
