Job description Posted 06 May 2022

Senior Data Ops / DevOps Engineer

6-month contract

Location: Stevenage / Remote

Pay rate: up to £750 per day via umbrella company, inside IR35

 

 

The mission of the Data Science and Data Engineering (DSDE) organization within GSK Pharmaceuticals R&D is to get the right data, to the right people, at the right time. The Data Framework and Ops organization ensures we can do this efficiently, reliably, transparently, and at scale through the creation of a leading-edge, cloud-native data services framework. We focus heavily on developer experience, strong semantic abstractions for the data ecosystem, professional operations, aggressive automation, and transparency of operations and cost.

Delivering the right data to the right people at the right time requires the design and implementation of data flows and data products that leverage internal and external data assets and tools to drive discovery and development; this is a key objective for the Data Science and Data Engineering (DSDE) team within GSK's Pharmaceutical R&D organisation. There are five key drivers for this approach, closely aligned with GSK's corporate priorities of Innovation, Performance and Trust: 

 

Automation of end-to-end data flows: faster, more reliable ingestion of high-throughput data in genetics, genomics and multi-omics, to extract value from investments in new technology (instrument to analysis-ready data in <12h) 

Enabling governance by design of external and internal data: engineered, practical solutions for controlled use and monitoring 

Innovative disease-specific and domain-expert-specific data products: enabling computational scientists and their research unit collaborators to reach key insights faster, leading to shorter biopharmaceutical development cycles. 

Supporting end-to-end code traceability and data provenance: increasing assurance of data integrity through automation and integration 

Improving engineering efficiency: extensible, reusable, scalable, updateable, maintainable, virtualized and traceable data and code, driven by data engineering innovation and better resource utilization. 

 

We are looking for an experienced Sr. Data Ops Engineer to join our growing Data Ops team. The Sr. Data Ops Engineer is a highly technical individual contributor, building modern, cloud-native, DevOps-first systems for standardizing and templatizing biomedical and scientific data engineering, with demonstrable experience across the following areas: 

 

  • Deliver declarative components for common data ingestion, transformation and publishing techniques 
  • Define and implement data governance aligned to modern standards 
  • Establish scalable, automated processes for data engineering teams across GSK 
  • Act as a thought leader and partner to wider DSDE data engineering teams, advising on implementation and best practices 
  • Cloud Infrastructure-as-Code 
  • Define Service and Flow orchestration 
  • Data as a configurable resource (including configuration-driven access to scientific data modelling tools) 
  • Observability (monitoring, alerting, logging, tracing, ...) 
  • Enable quality engineering through KPIs, code coverage and quality checks 
  • Standardise GitOps/declarative software development lifecycle 
  • Audit as a service 

 

Sr. Data Ops Engineers take full ownership of delivering high-performing, high-impact biomedical and scientific data ops products and services, from a description of a pattern that customer Data Engineers are trying to use all the way through to final delivery (and ongoing monitoring and operations) of a templated project and all associated automation. They are standard-bearers for software engineering and quality coding practices within the team and are expected to mentor more junior engineers; they may even coordinate the work of more junior engineers on a large project. They devise useful metrics to ensure their services meet customer demand and have an impact, and they iterate in an agile fashion to deliver and improve on those metrics. 

A successful Sr. Data Ops Engineer develops expertise in the types of data and tools leveraged in the biomedical and scientific data engineering space, and has the following skills and experience (with significant depth in one or more of these areas): 

 

  • Demonstrable experience deploying robust, modularised, container-based solutions to production (ideally GCP) and leveraging the Cloud Native Computing Foundation (CNCF) ecosystem 
  • Significant depth in DevOps principles and tools (e.g. GitOps, Jenkins, CircleCI, Azure DevOps, ...), and how to integrate these tools with other productivity tools (e.g. Jira, Slack, Microsoft Teams) to build a comprehensive workflow 
  • Programming in Python, Scala or Go 
  • Embedding agile software engineering practices (task/issue management, testing, documentation, software development lifecycle, source control) 
  • Leveraging major cloud providers, both via Kubernetes and via vendor-specific services 
  • Authentication and Authorization flows and associated technologies (e.g. OAuth2 + JWT) 
  • Common distributed data tools (e.g. Spark, Hive) 

 

The DSDE team is built on the principles of ownership, accountability, continuous development, and collaboration. We hire for the long term, and we're motivated to make this a great place to work. Our leaders will be committed to your career and development from day one. 


 

Basic Qualifications:

 

  • Master's in Computer Science with a focus in Data Engineering, DataOps, DevOps, MLOps, Software Engineering, etc., plus 5 years of job experience (or PhD plus 3 years of job experience) 
  • Deep experience with DevOps tools and concepts (e.g. Jira, GitLab / Jenkins / CircleCI / Azure DevOps / …)  
  • Excellent command of common distributed data tools in a production setting (Spark, Kafka, etc.) 
  • Experience with specialized data architecture (e.g. optimizing physical layout for access patterns, including Bloom filters, optimizing against self-describing formats such as ORC or Parquet, etc.) 
  • Experience with search / indexing systems (e.g. Elasticsearch) 
  • Deep expertise with agile development in Python, Scala, Go, and/or C++ 
  • Experience building reusable components on top of the CNCF ecosystem including Kubernetes 
  • Metrics-first mindset 
  • Experience mentoring junior engineers into deep technical expertise 

 

Preferred Qualifications:

 

If you have the following characteristics, it would be a plus:

  • Experience with agile software development 
  • Experience building and designing a DevOps-first way of working 
  • Demonstrated experience building reusable components on top of the CNCF ecosystem including Kubernetes (or similar ecosystem)

 

Why GSK?

 

Our values and expectations are at the heart of everything we do and form an important part of our culture.

These include Patient focus, Transparency, Respect, Integrity along with Courage, Accountability, Development, and Teamwork. As GSK focuses on our values and expectations and a culture of innovation, performance, and trust, the successful candidate will demonstrate the following capabilities:

  • Operating at pace and agile decision making – using evidence and applying judgement to balance pace, rigour and risk.
  • Committed to delivering high-quality results, overcoming challenges, focusing on what matters, and executing.
  • Continuously looking for opportunities to learn, build skills and share learning.
  • Sustaining energy and wellbeing
  • Building strong relationships and collaboration, and holding honest and open conversations.
  • Budgeting and cost consciousness