Senior Manager, Site Reliability Engineering

Company: Podium

Location: Remote, US

Posted on: 2023-04-20 22:14

Podium exists to help local businesses win. Using Podium, local businesses can simplify the way they communicate with their customers, from collecting payments to facilitating online reviews to launching marketing campaigns, and much more. Our work and focus on helping local businesses thrive has been recognized across the industry, including Forbes’ Next Billion Dollar Startups, Forbes’ Cloud 100, the Inc. 5000, and Fast Company’s World’s Most Innovative Companies.

We look for people who are curious, creative, and willing to do the work to be a little better every day. We also embody our company values in all that we do, which always starts with being Customer Obsessed, followed by Be a Founder, Zero Drama, and Enjoy the Ride. Does that sound like you?

As a Senior Site Reliability Manager at Podium, you will be responsible for ensuring the reliability, scalability, and performance of our platform. You will lead a team of site reliability engineers to build and maintain the tools and systems behind a highly available, fault-tolerant infrastructure that can handle millions of users. You will work closely with our development, operations, and security teams to ensure that our platform is highly available, secure, and performs optimally at all times.

What you will be doing:
- Work with the following technologies: Kubernetes, Helm, Docker, AWS, Terraform, Datadog, Prometheus, Ansible, StrongDM, Python, Go, Ruby, GitLab, and GitLab CI.
- Lead a team of site reliability engineers to design, build, and maintain highly available and fault-tolerant infrastructure.
- Ensure the reliability, scalability, and performance of our platform by implementing monitoring, alerting, and incident response processes.
- Collaborate with development, operations, and security teams to design and implement highly available and secure systems.
- Define and track objectives and key results to ensure that our platform meets the needs of our growing user base.
- Continuously improve our infrastructure and processes to meet the needs of our growing business.
- Participate in the hiring and development of site reliability engineers to build a highly skilled and effective team.
- Help build a diverse team while promoting a collaborative and inclusive environment.

What we hope you have:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in site reliability engineering, DevOps, or a related field.
- 3+ years of experience managing a team of site reliability engineers.
- Experience with cloud infrastructure such as AWS, GCP, or Azure.
- Strong experience with Linux systems administration and scripting.
- Strong experience with monitoring and alerting tools such as Prometheus, Grafana, or similar.
- Excellent communication and collaboration skills.
- Ability to thrive in a fast-paced, dynamic startup environment.

BENEFITS
- Open and transparent culture. Check out this video to see what it’s like to work at Podium
- Life insurance, long- and short-term disability coverage
- Paid maternity and paternity leave
- Fertility benefits
- Generous vacation time, plus three 4-day summer holiday weekends
- Excellent medical, dental, and vision benefits
- 401(k) plan
- Bi-annual swag drops with cool Podium gear and apparel
- A stellar HQ (Utah) gym with local professional coaches and classes offered
- Onsite HQ (Utah) child care center, subsidized for employees
- Additional benefits for fully remote employees

Podium is an equal opportunity employer. Podium provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, national origin, sexual orientation, gender identity or expression, age, disability, genetic information, marital status, or veteran status.
