Site Reliability Engineer
7 days ago
**The Role** We are a group of passionate engineers who have built the largest private Medicare marketplace in the United States. We focus on the continuous improvement of our systems and culture. We improve and maintain a platform that provides the best possible experience to shop for insurance plans, and allows our insurance carriers to be confident that their products are accurately and impartially represented. Our goal is to provide the best possible experience to customers while they review and enroll in health plans using our self-service platform. These plans include Medicare, Marketplace, Dental, and Vision plans. We enable users to shop for plans based on their individual needs (doctors, networks, prescriptions, healthcare usage, premiums, etc.) so they can find and enroll in the best plan. We have spent years growing and fostering a DevOps culture by bridging the divide between our Software and Infrastructure Engineering departments. We operate in a complex, multi-tenant, hybrid cloud and on-premises infrastructure that includes both Windows and Linux environments. We strive for security, reliability, and automation in line with DevOps and Site Reliability Engineering principles.
**Responsibilities**

- Write and maintain architectural, stakeholder, and policy documentation
- Look for new ways to improve our processes and the quality of our infrastructure
- Look for new ways to increase our delivery velocity, leveraging various functional disciplines
- Look for new ways to remediate production incidents more quickly and safely
- Participate in department Communities of Practice, and collaborate with other teams and departments on best practices and implementation strategy
- Adhere to and advocate for best practices, including Infrastructure as Code, monitoring, high availability, disaster recovery, security, and DevOps methodologies
- Create or improve SLIs, SLOs, and SLAs
- Contribute to capacity planning; advise and consult with teams who will be load/stress testing
- Keep up with industry innovations, recommending new tools or practices when appropriate
- Provide timely assistance and remediation solutions during critical situations and production incidents
- Document and share "lessons learned" from production, including blameless postmortems and root cause analyses

**Requirements**: 3+ years of hands-on experience with a majority of the following technologies, along with a willingness to become proficient in the remaining areas:

- Cloud platforms, preferably Azure
- Configuration management and provisioning tools like Ansible, Salt, and Terraform
- Windows and Linux servers, Active Directory
- VMware
- Web servers, including IIS and NGINX
- Secrets management with Vault, Azure Key Vault, or similar systems
- Database server infrastructure like Microsoft SQL Server and PostgreSQL
- Application performance monitoring with tools like Application Insights or New Relic
- Infrastructure monitoring with tools like Sensu, Zabbix, or Nagios
- CI/CD with tools like GitHub Actions, TeamCity, Octopus Deploy, or Concourse
- Log aggregation tools like Sumo Logic or Splunk

Our hiring processes ensure equal conditions, objectivity, and non-discrimination.
-
Site Reliability Engineer
7 days ago
Lima, Perú Careers at SunDevs Full-time **Job description**: As a Site Reliability Engineer at SunDevs, you will collaborate with other senior software engineers and Platform Engineers to design and develop highly available, scalable, secure, and maintainable cloud systems and platforms that solve big challenges. You will provide advice and guidance to our engineers in...
-
Principal Site Reliability Engineer
1 week ago
Lima, Perú Groupon Full-time Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms...
-
Senior Site Reliability Engineer
4 days ago
Lima Metropolitan Area, Perú OpenLoop Full-time OpenLoop is looking for a Senior Site Reliability Engineer to join our team in Lima, Peru. **About the Role** **Cross-Functional Collaboration** Partner with engineering teams to improve system reliability and deployment practices. Engage with teams on SRE guidelines and best practices for automation and infrastructure. Work with security teams to implement secure,...
-
Site Reliability Engineer
1 day ago
Lima, Perú Rappi Full-time Rappi is looking for a seasoned engineer who wants to join us in implementing one of the most innovative methodologies for stability and reliability: Chaos Engineering. **What you'll do**: **What we expect from you: Skills needed**: As one of the most challenging tech platforms in LATAM, some key aspects make it a unique place for your personal and...
-
Systems Reliability Engineer
6 days ago
Lima, Perú Scotiabank Full-time Hello! We appreciate and value your interest in continuing to grow within the Scotiabank Group. We are looking for talent who will bring their knowledge and experience to the position and, above all, OPTIMISM. **Purpose**: As a member of the Global Systems Reliability team, the Global System Reliability Engineer (SRE) will work in collaboration...
-
Reliability Mechanical Engineer
2 weeks ago
Lima, Perú Hunt Consolidated, Inc. Full-time **ROLES AND RESPONSIBILITIES**: - Monitoring and calculation of reliability KPIs (RAM, MTBF, etc.). - Analyze predictive alerts from machine learning software (for Rotating and Mechanical assets). - Identify threats and opportunities for Plant production and manage them in the MTO (Mitigate Threats and Opportunities) process. - Analyze data and perform reliability...
-
Site Reliability Engineer
7 days ago
Lima Metropolitan Area, Perú Nearsure Full-time Explore the Nearsure experience. Join our close-knit LATAM remote team: Connect through fun activities like coffee breaks, tech talks, and games with your teammates and management. Say goodbye to micromanagement: We champion autonomy, open communication, and respect for diversity as our core values. Your well-being matters: Our People Care team is here from day...
-
Network Site Engineer
7 days ago
Lima, Perú Tech Source Managed Services Full-time **Role Description** This is a part-time on-site role for a Network Support Engineer located in Peru. The Network Support Engineer will be responsible for network administration, network engineering, technical support, troubleshooting, and network security. **Qualifications** - Network Administration and Network Engineering skills - Technical Support and...
-
Senior Site Reliability Engineer
1 week ago
Lima, Perú Groupon Full-time Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms...
-
Databricks Administrator and Site Reliability
7 days ago
Lima, Perú DIGITALHUB SAC Full-time **DIGITALHUB** is a Peruvian outsourcing company for **BPO and IT services.** Our vision is a future in which every person can find the best job and where our partners can discover the best of Latin American talent. On this occasion, we are looking for a **"Databricks Administrator and Site Reliability Engineer"**...