Site Reliability Engineer

Company: AdoTube
Location: Bangalore
Commitment: Full time
Posted on: 2023-09-08 06:14
Our Company

Changing the world through digital experiences is what Adobe’s all about. We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.

We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!

At Adobe, you will be immersed in an exceptional work environment that is recognized around the world. Adobe has been listed among Fortune’s “100 Best Companies to Work For” for 20 consecutive years. You will also be surrounded by colleagues who are committed to helping each other grow. If you’re looking to make an impact, Adobe's the place for you. Discover what our employees are saying about their career experiences on the Adobe Life blog and explore the meaningful benefits we offer.

Job Description

Role Summary

Digital Experience (DX) (https://www.adobe.com/experience-cloud.html) is a USD 4B+ business serving the needs of enterprise businesses, including 95%+ of Fortune 500 organizations.

Adobe Marketo Engage, within Adobe DX, is the world’s largest marketing automation platform: a single solution that lets enterprises attract, segment, and nurture customers, from discovery to biggest fan. It lets enterprises engage effectively across various surfaces and touchpoints.

Adobe needs a Site Reliability Engineer (SRE) who knows how to balance going fast and going big with operating safely. Our mission is to progress, protect, and provide for the software and systems behind all Adobe Marketo services, with an ever-watchful eye on their availability, latency, performance, and capacity. SRE is an engineering mindset focused on building highly reliable systems and eliminating manual work through automation.

We hire people from both systems and software backgrounds. Strong candidates will have experience with both. The engineer role within SRE is at the heart of fulfilling SRE’s mission: build a highly reliable, scalable, and measurable customer experience for the continued growth of Adobe’s infrastructure.

Do you have an intimate understanding of the operational challenges of running services at scale, and are you committed to overcoming those challenges with software instead of manpower? If so, we would be excited to talk to you!

What you'll do

This is an individual contributor position.
Expectations are as follows:

- Turn high-touch manual processes into fully automatic solutions, and maintain and improve existing automations.
- Write software layers, scripts, deployment frameworks, tracers, monitors, and self-healing/auto-remediation tools, and automate processes.
- Develop and maintain tools, scripts, and infrastructure for monitoring, deployment, and configuration management.
- Implement and maintain robust monitoring, alerting, and logging systems to enable proactive identification and resolution of issues.
- Use your automation and orchestration expertise to automate wherever and whenever possible while eliminating technical debt.
- Find optimizations and other efficiencies to enhance system reliability and performance.
- Build and maintain a resilient and highly available infrastructure by monitoring, analyzing, and resolving complex production incidents.
- When issues arise despite self-healing and automation, take part in troubleshooting and root-cause analysis across the stack: hardware, software, database, network, and so on.
- Develop incident response procedures, conduct post-incident reviews, and implement preventive measures to minimize the risk of future incidents.
- Collaborate with and guide development, operations, and QA teams to drive the adoption of SRE principles throughout the organization.
- Articulate the technical characteristics of your services and the dependencies between services.
- Collaborate effectively with cross-functional teams, including developers, system administrators, and product owners, to ensure smooth operation and continuous improvement of systems and applications.
- Share your knowledge and expertise through documentation, training sessions, and mentorship to empower other team members.
- Participate in an on-call rotation, ensuring that our systems remain reliable for those who depend on them.

What you need to succeed

- A Bachelor's or Master's degree in computer science engineering or a related field
- 5+ years of proven experience as a Site Reliability Engineer or in similar roles
- Experience designing for and dealing with a large production environment
- Experience in modern development practices (microservices architectures, REST interfaces, etc.)
- Experience in designing and implementing highly available, scalable, and fault-tolerant systems and applications
- Proficiency in programming and scripting languages (e.g., Java, Python, Go, Shell scripting)
- Hands-on experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and containerization technologies (e.g., Docker, Kubernetes)
- Experience with designing, deploying, and maintaining monitoring solutions such as Splunk, Prometheus, New Relic, Grafana, etc.
- Experience with any CI/CD tooling: Jenkins, Spinnaker, GitLab runners, Azure DevOps, etc.
- Recent large-scale experience with configuration management and infrastructure-as-code tools (e.g., Salt, Terraform, VMware, Ansible, Chef, or Puppet)
- Troubleshooting and system engineering exposure in Linux production environments
- Strong troubleshooting and problem-solving skills, with a data-driven and analytical approach
- Excellent communication skills, both verbal and written, with the ability to collaborate effectively with diverse teams

Adobe is an equal opportunity employer. We welcome and encourage diversity in the workplace regardless of race, gender, religion, age, sexual orientation, gender identity, disability or veteran status.

Adobe is proud to be an Equal Employment Opportunity and affirmative action employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.

Adobe aims to make Adobe.com accessible to any and all users.
If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email accommodations@adobe.com or call (408) 536-3015.

Adobe values a free and open marketplace for all employees and has policies in place to ensure that we do not enter into illegal agreements with other companies to not recruit or hire each other’s employees.