MLOps Engineer
Full-Time / Office / Gangnam
Key Responsibilities
- Design, build, and operate cluster architecture to serve machine learning models and establish scalable MLOps infrastructure
- Develop robust ML pipelines and supporting infrastructure to enable end-to-end ML system deployment
- Build and enhance monitoring systems to improve model performance and automate infrastructure operations
- Research and implement distributed processing frameworks applicable to production services
Requirements
- Strong analytical mindset and proactive communication skills to solve complex technical challenges
- Highly self-motivated, with a strong sense of ownership and accountability for assigned responsibilities
- Proficiency in Python or Go for system and infrastructure development
- Solid understanding of operating systems, networks, and databases
- Hands-on experience with GPU-based infrastructure, including development and performance optimization using GPU-accelerated frameworks
- Practical experience deploying and maintaining ML models in cloud environments (e.g., AWS, GCP)
- Familiarity with containerized architectures and orchestration tools such as Kubernetes
- Experience building and maintaining CI/CD pipelines using tools such as GitHub Actions
Preferred Qualifications
- Experience with AutoML platforms or building custom ML pipelines
- Knowledge of model optimization for efficient inference
- Hands-on experience with Infrastructure as Code (IaC) tools like Helm or Terraform
- Familiarity with Grafana and Prometheus for monitoring infrastructure and model performance
- Experience with multi-cloud environments, including inter-cloud connectivity or data migration between AWS and GCP