Principal Site Reliability Engineer

5 days ago


Lima Metropolitana, Perú Myworkdayjobs Full-time

Join to apply for the Principal Site Reliability Engineer role at Groupon.

Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date, we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms uniquely committed to helping local businesses succeed on a performance basis.

Groupon is on a radical journey to transform our business with a relentless pursuit of results. Even with thousands of employees spread across multiple continents, we still maintain a culture that inspires innovation, rewards risk-taking, and celebrates success. The impact here can be immediate due to our scale and the speed of our transformation. We're a "best of both worlds" kind of company. We're big enough to have the resources and scale, but small enough that a single person has a surprising amount of autonomy and can make a meaningful impact.

Role Overview:

Are you ready to take your expertise to the next level and make a meaningful impact on the reliability and scalability of mission-critical systems? As a Principal Site Reliability Engineer (SRE Level V/VI), you will play a central role in ensuring the performance, availability, and resilience of our platforms. In this position, you will go beyond maintaining systems by leading initiatives that redefine operational excellence. You will collaborate with diverse teams to implement cutting-edge technologies and best practices, foster a culture of reliability, and mentor others in their growth as engineers. This is an exceptional opportunity for someone passionate about solving complex challenges and shaping the future of platform reliability in a high-impact role.

Key Responsibilities:
  • Architect and maintain fault-tolerant systems, ensuring uptime SLAs of 99.9% or higher.
  • Drive automation in infrastructure management and deployment using Terraform, Ansible, Kubernetes, and similar tools.
  • Create and optimize CI/CD pipelines to ensure reliable, secure, and efficient software delivery.
  • Build and enhance comprehensive observability solutions, including monitoring, logging, and alerting systems using Prometheus, Grafana, and the ELK stack.
  • Collaborate with stakeholders to define and achieve SLIs, SLOs, and error budgets aligned with business needs.
  • Lead incident response during on-call rotations, ensuring rapid resolution and root cause analysis for critical issues.
  • Design and execute performance testing, capacity planning, and scalability strategies for evolving workloads.
  • Proactively identify and resolve bottlenecks, increasing system performance and developer efficiency.
  • Mentor junior engineers, fostering a collaborative and growth-oriented team environment.
  • Guide architectural decisions that drive innovation and enhance system reliability.
Qualifications:
  • 10+ years in systems engineering, including at least 5 years in SRE or DevOps roles.
  • Expertise in cloud platforms (GCP, AWS) and container orchestration (Kubernetes, Docker).
  • Proficiency in programming and scripting languages like Python, Go, and Bash.
  • Advanced knowledge of Infrastructure as Code (IaC) tools such as Terraform and Ansible.
  • Deep understanding of networking, DNS, load balancing, and security principles.
  • Proven track record of managing high-availability systems in demanding environments.
  • Exceptional analytical and problem-solving skills.
Preferred Qualifications:
  • Certifications in cloud or container technologies (e.g., AWS/GCP/Azure, Kubernetes CKA).
  • Experience in industries like eCommerce, FinTech, or SaaS.
  • Familiarity with Agile development processes and frameworks.
What We Offer:
  • The opportunity to work with cutting-edge technologies in a transformative environment.
  • A collaborative and innovative work culture that values your expertise and contributions.
  • Professional growth and leadership development pathways tailored to your aspirations.
  • A chance to leave a lasting impact by shaping the future of reliable and scalable systems.

Join us to push the boundaries of platform reliability and drive meaningful change in a fast-evolving digital world.

Groupon's purpose is to build strong communities through thriving small businesses. To learn more about the world's largest local e-commerce marketplace, click here. You can also find out more about us in the latest Groupon news and learn about our DEI approach. If all of this sounds like a great fit for you, click apply and join us on a mission to become the ultimate destination for local experiences and services.

Beware of Recruitment Fraud: Groupon follows a merit-based recruitment process without charging job seekers any fees. We've noticed an increase in recruitment fraud, including fake job postings and fraudulent interviews and job offers aimed at stealing personal information or money. Be cautious of individuals falsely representing Groupon's Talent Acquisition team with fake job offers. If you encounter any suspicious job offers or interview calls demanding money, recognize these as scams. Groupon is not responsible for losses from such dealings. For legitimate job openings (and a sneak peek into life at Groupon), always check our official career website at grouponcareers.com.

Seniority level
  • Not Applicable
Employment type
  • Full-time
Job function
  • Engineering and Information Technology
Industries
  • Technology, Information and Internet