Our Company

Changing the world through digital experiences is what Adobe's all about. We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences! We're passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.

We're on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!

About Adobe's Sensei Platform Group

Sensei Platform has an exciting and challenging mission: enable Adobe Sensei to design, develop, operate, and scale robust cloud solutions for ML workflows. We are the partner of choice for the business because we make the lives of our engineers and researchers better through streamlined workflows that leverage the latest technologies, frameworks, and services. Sensei Platform provides hosting, operations, security, and architectural support to Adobe Sensei's growing suite of solution infrastructure and clusters, frameworks, SDKs, and CI/CD solutions for ML data processing and transformation, ML training, and ML inference.

To provide its customers with compelling products powered by ML/AI, Adobe Sensei needs to design, implement, and maintain highly available and responsive cloud solutions for both data and control planes. Sensei Foundation requires a dynamic research and development engineer with a deep design and implementation background and exposure to ML requirements. The engineer should be capable of delivering a scalable and reliable suite of integrated applications and tooling, and will be responsible for a variety of architectural, technical, operational, consultation, and support activities.
It is critically important to the company that the solutions offer the highest levels of reliability, uptime, performance, and scale.

Position Summary

We are seeking an experienced Senior MLOps Engineer with a Kubernetes background to join our dynamic team. The engineer will be responsible for bridging the gap between development, operations, and data science teams to ensure smooth deployment and operation of machine learning models in production environments. The ideal candidate will have a strong background in managing Kubernetes clusters at scale and in deploying, maintaining, and optimizing infrastructure for performance and reliability. Responsibilities include designing, implementing, and supporting Kubernetes clusters, automating deployment processes, monitoring system health, troubleshooting issues, and collaborating with cross-functional teams to ensure the smooth operation of our cloud-native infrastructure. The candidate should possess excellent communication skills, a proactive attitude, and a passion for staying updated with emerging technologies in the Kubernetes ecosystem. This role offers an exciting opportunity to contribute to the growth and success of our organization in a rapidly evolving cloud computing landscape.

Responsibilities

- Architect, deploy, and maintain Kubernetes clusters according to best practices and organizational requirements.
- Manage containerized applications using Kubernetes, including pod scheduling, scaling, updating, and rolling deployments.
- Develop and maintain automation scripts and tools for provisioning, configuring, and managing Kubernetes infrastructure, leveraging infrastructure-as-code principles.
- Collaborate with data scientists and software engineers to design, develop, and deploy machine learning models in production environments.
- Collaborate with cross-functional teams, including developers, DevOps engineers, and system administrators, to support application deployment and integration with Kubernetes.
- Document processes, configurations, and troubleshooting procedures.
- Build and maintain scalable, reliable, and efficient machine learning pipelines for data ingestion, model training, evaluation, and deployment.
- Implement monitoring, logging, and alerting systems to ensure the health and performance of deployed models.
- Develop and maintain infrastructure as code (IaC) using tools like Terraform or CloudFormation to automate the provisioning and configuration of cloud resources.
- Stay updated with the latest Kubernetes developments, best practices, and emerging technologies to continuously improve the organization's Kubernetes infrastructure and practices.
- Implement disaster recovery strategies and high availability configurations to ensure business continuity and resilience of Kubernetes environments.
- Implement security best practices and compliance standards for handling sensitive data in production environments.
- Provide guidance, training, and support to junior team members and stakeholders on Kubernetes concepts, best practices, and usage.

Requirements

- B.Tech / M.Tech degree in Computer Science from a premier institute.
- 9 to 14 years of experience in software development and operations.
- Excellent computer science fundamentals and a good understanding of architecture, design, and performance.
- Hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud, including services like EC2, S3, Lambda, Kubernetes, and managed AI/ML services.
- Experience with containerization technologies (e.g., Docker) and container orchestration platforms (e.g., Kubernetes).
- Familiarity with version control systems (e.g., Git), CI/CD pipelines (e.g., Jenkins, GitLab CI/CD), and configuration management tools (e.g., Ansible, Puppet).
- Good knowledge of the cloud security domain.
- Proficiency in Java/Python and Shell.
- Hands-on experience writing code that is reliable and maintainable.
- Ability to work independently, with strong problem-solving skills.
- Good understanding of Kubernetes and knowledge of product life cycles and associated issues.
- Technical depth in operating systems, computer architecture, and OS internals.
- Technical depth in cloud computing and in the architecture and design of cloud platforms and services.

Good To Have

- Foundational knowledge of Machine Learning and Artificial Intelligence.
- Experience with the ML lifecycle, AI ethics, and ML frameworks such as TensorFlow, Caffe, Torch, and similar frameworks.

Adobe is proud to be an Equal Employment Opportunity and affirmative action employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.

Adobe aims to make Adobe.com accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email accommodations@adobe.com or call (408) 536-3015.

Adobe values a free and open marketplace for all employees and has policies in place to ensure that we do not enter into illegal agreements with other companies to not recruit or hire each other's employees.