Senior Data Framework Engineer
Pay Rate: £700 - £715 per day Inside IR35 - Umbrella
Duration: 6 months
We are looking for a skilled and experienced Sr. Data Platform Engineer to join our growing team. Sr. Data Platform Engineers take full ownership of delivering high-performing, high-impact data framework products and services, from the description of a problem that customer Data Engineers are trying to solve all the way through to final delivery (and ongoing monitoring and operations).
Delivering the right data to the right people at the right time is a key objective for the Data Science and Data Engineering (DSDE) team within GSK's Pharmaceutical R&D organization. Achieving it requires the design and implementation of data flows and data products that leverage internal and external data assets and tools to drive discovery and development.
There are five key drivers for this approach, which are closely aligned with GSK's corporate priorities of Innovation, Performance and Trust:
· Automation of end-to-end data flows: Faster and more reliable ingestion of high-throughput data in genetics, genomics and multi-omics, to extract value from investments in new technology (instrument to analysis-ready data in <12h)
· Enabling governance by design of external and internal data: Engineered, practical solutions for controlled use and monitoring
· Innovative disease-specific and domain-expert-specific data products: Enabling computational scientists and their research unit collaborators to reach key insights faster, leading to shorter biopharmaceutical development cycles
· Supporting e2e code traceability and data provenance: Increasing assurance of data integrity through automation and integration
· Improving engineering efficiency: Extensible, reusable, scalable, updateable, maintainable, virtualized and traceable data and code, driven by data engineering innovation and better resource utilization
The Data Framework team builds and manages (in partnership with Tech) reusable components and architectures designed to make it both fast and easy to build robust, scalable, production-grade data products and services in the challenging biomedical data space.
Additional responsibilities include:
· Partner with Tech where modifications to underlying tools (e.g. infrastructure as code, Cloud Ops, DevOps, logging / alerting) are needed to serve new use-cases, and to ensure operations are planned
· Write fantastic code along with the proper unit, functional, and integration tests for code and services to ensure quality. Mentor more junior engineers in these skills.
· Stay up to date with developments in the open-source community around data engineering, data science, and similar tooling.
· Spot opportunities to test out new tooling for internal use cases, as well as opportunities to contribute back to the community.
The DSDE team is built on the principles of ownership, accountability, continuous development, and collaboration. We hire for the long term, and we're motivated to make this a great place to work. Our leaders will be committed to your career and development from day one.
We are looking for professionals with these required skills to achieve our goals:
· Master’s in Computer Science with a focus in Data Engineering, DataOps, DevOps, MLOps, Software Engineering, etc., plus 5 years of job experience (or a PhD or Bachelor’s degree in Computer Science plus 3-8 years of job experience)
· Experience with common distributed data tools in a production setting (Spark, Kafka, Hive, Presto, etc.)
· Experience with specialized data architecture (e.g. data lake, lake house, data fabric, data mesh, optimizing physical layout for access patterns)
· Experience with public cloud providers such as AWS, Azure and GCP
· Experience with search / indexing systems (e.g. Elasticsearch)
· Practical experience with agile software development and DevOps-forward ways of working
If you have the following characteristics, it would be a plus:
· Experience designing and building a DevOps-first way of working
· Demonstrated excellence writing production Python, Java, Scala, Go, and/or C#/C++
· Demonstrated experience building reusable components on top of the CNCF ecosystem including platforms like Kubernetes (or similar ecosystem)
· Metrics-first mindset