Who are we?

Equinix is the world’s digital infrastructure company®, operating over 250 data centers across the globe. Digital leaders harness Equinix's trusted platform to bring together and interconnect foundational infrastructure at software speed. Equinix enables organizations to access all the right places, partners and possibilities to scale with agility, speed the launch of digital services, deliver world-class experiences and multiply their value, while supporting their sustainability goals.

Our culture is based on collaboration and the growth and development of our teams. We hire hardworking people who thrive on solving challenging problems and give them opportunities to hone new skills and try new approaches, as we grow our product portfolio with new software and network architecture solutions. We embrace diversity in thought and contribution and are committed to providing an equitable work environment that is foundational to our core values as a company and is vital to our success.

Staff Engineer, Product Software

Job Summary

We are looking for a Staff Engineer for the engineering team responsible for the operations of DCIM (Data Center Infrastructure Monitoring), a large, globally distributed telemetry system that gathers IoT data, performs real-time stream processing, analytics, alerting and monitoring, and makes the data available for further business use. This particular role focuses on production operations, monitoring, first-line incident response and analysis, and system maintenance.

DCIM is a key Equinix product: a highly scalable, globally distributed software system built on a modern technology stack, including cloud, microservices architecture, real-time and near-real-time stream processing, and an API-first approach, with a focus on product quality and processing SLAs. The vast amount of data flowing through DCIM, originating from 200+ data centers (and growing) spread worldwide, makes it unique, interesting and challenging.
This role can make you part of a world-class product engineering team that is paving the way for enterprises to consume ever-increasing cloud services.

Responsibilities

- Monitors the live production system and ensures its smooth operation
- Is the first line of support for production incidents: investigates, analyzes and fixes them
- Identifies and troubleshoots availability and performance issues at multiple layers of deployment
- Provides RCA (Root Cause Analysis) for major incidents
- Provides status reports
- Works with Product Owners and Software Engineers to propose improvements in the system’s stability, reliability and availability
- Participates in operations improvements along with Site Reliability Engineers
- Develops automation scripts and configuration for operations of multiple distributed modules
- Creates mechanisms that enable rapid recovery, repair and cleanup of faulty migrations and incidents
- Coordinates release management and production deployments of new features and modules
- Builds a knowledge base for smoother operations in the future
- Understands, reads, and reviews requirements
- Cross-technical integration
- Defines and implements operational excellence best practices
- Keeps abreast of new developments to help define the necessary changes to practice

Qualifications

Experience

BS in computer science or equivalent with 6+ years, or MS in computer science or equivalent with 5+ years, or PhD in computer science or equivalent with 4+ years of hands-on professional DevOps experience building, deploying, and maintaining customer-facing applications at scale in an innovative engineering environment.
Software Engineering

You have knowledge of:
- an Agile/Scrum SDLC
- how to operate with CI/CD pipelines
- the components of a CI/CD pipeline
- deployment best practices/strategies
- the twelve-factor app methodology (nice to have)

Technologies

You have experience with many of the following technologies (installation, operation, monitoring, performance tuning):
- Java ecosystem, Shell scripting, Python
- Kafka, Zookeeper, Hadoop, HDFS
- Databases: Redis, Postgres, Influx, Cassandra, Mongo, Orient
- Docker, Kubernetes, Rancher
- Prometheus, Grafana
- Flink, Spark, NiFi (nice to have)

Architecture of the infrastructure

- You have experience in building and running production systems utilizing microservices and distributed systems architecture at scale
- You have a background in workloads running on hybrid-cloud systems with at least one of the leading public cloud platforms (AWS/Azure/GCP)

Container and Machine Deployment

You have working experience with containers and orchestrators:
- You know how to build and operate Docker containers: architecture, construction and optimization
- You have experience defining and managing applications that run on orchestration platforms (a plus)
- You have knowledge of the IaaS configuration process: how to effectively create and maintain public cloud resources (nice to have)

Configuration Management and automation

- Knowledge of Ansible: creating, maintaining, and running Ansible playbooks (tasks, handlers, conditionals, loops and registers)
- You have experience describing infrastructure configuration as code with HashiCorp Terraform, using different providers

API Gateway Engineering (nice to have)

You have knowledge of:
- common API concepts and standards, as well as aspects of data storage, service status and session handling
- how to architect an API Management system for high availability, resilience, and recovery
- how to engineer API policies and standards for security and standardization across an enterprise
- how to deploy, configure, tune, and monitor API Gateways

Other skills and attitude

- A sense of ownership and pride in your performance and its impact on the company’s success
- Critical thinking and problem-solving skills
- Team player
- Good time-management skills
- Great interpersonal and communication skills
- Excellent English (spoken and written)

Why is it worth joining us?

You will be working on great products with access to the newest technologies. We offer stable employment with a competitive, above-market salary and benefits such as an annual bonus or a lunch card. Moreover, you will be eligible to participate in our employee stock unit purchase programs. Most of all, you will have the opportunity to help create a unique atmosphere and company culture based on the Magic of Equinix. A modern office space and a Warsaw Spire view await you; however, you will also be allowed to work from home according to company policy. If you do not live in Warsaw but are open to joining us, we will support you with a relocation package to make the move easier.

Equinix is committed to ensuring that our employment process is open to all individuals, including those with a disability. If you are a qualified candidate and need assistance or an accommodation, please let us know by completing this form.

Equinix is an Equal Employment Opportunity and, in the U.S., an Affirmative Action employer.
All qualified applicants will receive consideration for employment without regard to unlawful consideration of race, color, religion, creed, national or ethnic origin, ancestry, place of birth, citizenship, sex, pregnancy / childbirth or related medical conditions, sexual orientation, gender identity or expression, marital or domestic partnership status, age, veteran or military status, physical or mental disability, medical condition, genetic information, political / organizational affiliation, status as a victim or family member of a victim of crime or abuse, or any other status protected by applicable law.