Engineering Manager - Distributed Systems - Cloud

Company: Apple
Location: Cork, County Cork, Ireland
Department: Software and Services
Posted on: 2023-11-24 03:57
Summary
Posted: Nov 14, 2023
Role Number: 200520001

Do you love crafting elegant solutions to highly complex challenges? Can you intrinsically see the importance of every detail? At Apple, our compute team is responsible for designing and building the foundational pieces of our data center software. In this role, you will collaborate with engineers across Apple to build and deploy forward-looking high-performance batch cloud systems that support Apple’s research and development.

To better support Apple’s operations in Europe and Israel, the team is looking to expand engineering and support in the EMEIA region. The distributed systems engineering manager will be responsible for designing and implementing software for a variety of scalable, reliable, and secure distributed computing systems, with a strong focus on control plane software components such as scheduling, resource management, and API design. The engineer in this role will also lead a cross-functional engineering team in EMEIA that will work with the US team to develop features across the stack. At the same time, the EMEIA team (like the one in Cupertino) will engage with platform customers and support platform issues in the local time zone.

Key Qualifications
- Strong understanding of concurrency, parallelism, and distributed systems concepts.
- In-depth knowledge of algorithms and distributed system architectures.
- Experience with measuring, analyzing, and optimizing performance.
- Familiarity with all aspects of software development, from architecture to deployment and maintenance.
- Experience developing and managing a large-scale production system.
- Fluency in Golang, Python, or similar languages in a systems or distributed systems context.
- Quick at learning and contributing to new code bases.
- Customer-focused thinking and strong problem-solving skills, with attention to detail.
- Able to thrive and make progress while the core of the team is in a different location or time zone.
- Highly organized, creative, motivated, and passionate about achieving results.
- Excellent written and oral communication skills.
- 5+ years of experience in related software development (or comparable academic experience).
- 2+ years of experience leading an engineering team.

Description
The compute organization runs a multi-region, large-scale, in-house-developed batch platform that empowers Apple’s R&D around the world. To keep supporting its scale and growing user base, the compute organization is starting a new engineering team in EMEIA that will work on the platform while being closer (in space and time) to some of its customers. The distributed systems engineering manager will lead this new team and will also:
- Actively participate in the design and development of control plane components (scheduling, resource management, APIs, high-availability) for a large-scale multi-site cloud batch platform.
- Deliver essential new features in platforms leveraging the platform runtime, storage, and networking capabilities.
- Write and review code, generate and review design documentation.
- Participate in software qualifications and rollouts to production clusters.
- Participate in a business-hours rotation where engineers respond to platform issues for same-day resolution.
- Work with a wide range of software and hardware engineering teams across Apple to support their workflows or integrate their technology into our platform.

Previous experience with HPC systems and software (e.g., Slurm, MPI, InfiniBand, Lustre, or equivalent) is highly preferred. Familiarity with systems commonly present in cloud control planes, such as Raft, BoltDB, MySQL, ZooKeeper, or etcd, is also preferred.

Education & Experience
- MS in Computer Science or related field (or equivalent work experience)
- Highly preferred: PhD or meaningful publications in related fields.

Additional Requirements