Senior Systems Reliability Engineer - Lima - Scotiabank
Posted 2 weeks ago
Requisition ID: 227737
Thank you for your interest in joining Scotiabank Perú; we appreciate your application. We are looking for talented people who want to grow and achieve our organization's objectives. We wish you every success in this process.
**Senior Systems Reliability Engineer**
- Business Line: Operations & Technology
- Unit: SRO
- Level: 7.2
- Contract Type: Indefinite term
**Mission**:
**What do we expect from you?**
- Degree in Computer Science, Engineering, or equivalent experience
- 6 years’ experience in IT
- 2-3 years of professional coding experience in one or more of C, C++, or Java would be an asset.
- Mastery of one or more scripting languages or automation tools (e.g. Bash, Python, Ansible) would be an asset.
- ITIL V3 Foundation Certificate in IT Service Management (ITSM)
- Experience with ITSM tools (ServiceNow a plus) and a strong understanding of SRE and service management principles
- Well-rounded, broad knowledge of OS platforms (Linux/UNIX), networking, web systems, and IT operations
- Experience working with large-scale distributed systems; understanding of SOA or microservices architecture; experience with Jenkins, Bamboo, or other CI tools; advanced experience with GCP/AWS services.
- Understanding of serverless architecture (Lambda) and IaaS, data structures, algorithms, best practices, and containerization using Docker or similar
**What challenges will you face?**
- Develop and implement system reliability strategies aligned with corporate objectives and SRE principles. Craft and execute the technical implementation of comprehensive strategies aimed at improving the reliability and performance of critical systems. Collaborate with various teams to integrate these strategies into daily operations, continuously refining them based on performance metrics and evolving business needs.
- Proactively manage high-priority incidents and problems (classified as 411/911). Take a senior role in managing the technical resolution of major incidents and problems impacting the organization. Coordinate with relevant stakeholders, ensure timely resolution, and conduct thorough post-mortem analysis to understand root causes, implement corrective measures to prevent recurrence, and ensure accurate documentation in the ServiceNow platform.
- Define, monitor, and optimize Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Establish and manage SLOs and SLIs to maintain high service quality. Define these metrics, monitor performance closely, and adjust as needed to optimize system performance. Ensure service delivery consistently meets or exceeds defined standards using data and trend analysis to guide improvements.
- Oversee key performance indicators (KPIs) such as Mean Time to Recovery (MTTR), the timely closure of problem tickets, playbooks published and the impact of changes.
- Guarantee efficient and effective operations within respective areas. Ensure technical operations are conducted efficiently and effectively, adhering to established business controls and regulatory requirements. Address operational risks and ensure compliance with anti-money laundering and counter-terrorism financing regulations.
- Design and develop advanced data analysis and visualizations. Create sophisticated data analyses and visualizations that provide actionable insights into system performance and reliability. Leverage data to identify trends, understand system behaviors, and drive continuous improvements, informing strategic decisions and optimizing business performance.
- Provide essential information to support areas regarding major incidents or root cause analysis reports. Ensure key stakeholders receive timely and accurate information about major incidents and root cause analysis. Prepare technical incident reports, participate in Major Problem Review (MPR) sessions, and review post-mortem reports to ensure compliance with regulatory requirements.
- Regularly update Playbooks with detailed information. Maintain and update Playbooks with comprehensive details on system architecture, functionalities, availability schedules, and escalation procedures. Ensure updates are reflected in the Application Portfolio Management (APM) system.
- Attend and contribute to key committees such as Risks (NIRA), Architecture (ARB), Demand Management, Operational Readiness (OR), and Digital Planning Week. Participate in these committees to align system reliability strategies with broader organizational goals. Provide technical insights and recommendations to support decision-making and strategic planning.
- Actively participate in daily escalation sessions with GTEP. Engage in daily escalation sessions to address recent impacts and coordinate responses. Collaborate with global and regional teams to provide updates, discuss ongoing issues, and ensure effective resolution of critical incidents.
- Participate in regional SRO committees to establish and ex