ML Engineer - Model Training Infrastructure
CIeNET
- Taipei City
- Permanent
- Full-time
- Design, implement, and optimize scalable machine learning training pipelines tailored for Transformers and LLM-related technologies.
- Fine-tune and deploy state-of-the-art models, including transformers and LLMs (e.g., GPT, Llama, NVIDIA NeMo), for various cloud-based MLOps pipelines.
- Utilize cloud-based GPU resources for efficient model training and experimentation.
- Collaborate with cloud engineers to design robust MLOps pipelines and deploy trained models into production environments.
- Monitor, maintain, and optimize model performance in production, ensuring reliability and scalability.
- Research and integrate the latest advancements in related technologies into model training workflows.
- Contribute to the continuous improvement of the ML infrastructure, focusing on automation and reproducibility.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Machine Learning, Data Science, or a related field.
- Strong understanding of machine learning techniques, with a focus on deep learning, NLP, and transformer-based architectures.
- Proficiency in PyTorch or TensorFlow, with hands-on experience in fine-tuning AI models.
- Experience in designing and executing machine learning model training workflows, including data preparation, feature engineering, hyper-parameter tuning, and performance evaluation.
- Familiarity with MLOps principles, tools, and frameworks, including CI/CD pipelines for ML models.
- Strong programming skills in Python (required); familiarity with other languages such as Java is a plus.
- Excellent problem-solving, analytical, and communication skills.
Preferred Qualifications
- Experience in cloud computing technologies, particularly GCP and K8s, is a plus.
- Experience with distributed and large-scale model training.
- Experience with Linux and Docker.
Salary
Negotiable