Job description Posted 05 January 2021

Data Scientist: Python, ARIMA, LSTM, Machine Learning


Role is initially working from home / GSK Stevenage


About the Role

Medicinal Science & Technology (MST), a 3000+ person organization in GSK Research & Development (R&D), is accountable for ensuring that GSK has access to technology that amplifies our science, providing new insights that improve probability of success, speed and/or efficiency of medicine discovery and development.


Within MST the ‘Portfolio, Project & Resource Management’ group is accountable for technical project management for GSK’s prospective medicines, MST-related portfolio analyses, and MST resource forecasting. We are looking for a high-performing data scientist, operating at the interface of science and business, to improve the sophistication of our analyses and yield greater insight into the way in which we forecast and deploy resource to support the portfolio.


This is a hands-on position where you will be empowered to be curious, creative, and ambitious. While the business context and initial business questions will be defined for you, you will be expected to proactively liaise with other data scientists in R&D and Tech to quickly build an understanding of relevant datasets, both business-related and scientific, and to understand their limitations and utility. You will then be expected to answer the initial business questions and rapidly begin to identify and test subsequent hypotheses yourself.


As a Data Scientist we’d like you to be able to:

  • Demonstrate experience in manipulating data in a big data environment – experience of Hive, SQL, and visualization software such as Spotfire or Tableau is a must.
  • Demonstrate forecasting experience with classical ARIMA and LSTM approaches.
  • Demonstrate excellent knowledge of forecast data.
  • Curate large multiparametric data sets, and deploy algorithms to extract information and derive insights.
  • Design, develop and implement analytical solutions using a variety of commercial and open source tools (common tools include Python, R, TensorFlow).
  • Develop and embed automated processes for predictive model validation, deployment, and implementation.
  • Identify opportunities to apply Machine Learning and Artificial Intelligence to build, test, and validate predictive models.
  • Effectively explain technical concepts to senior managers/stakeholders who have no understanding of data science as a discipline.
  • Explore how to visualise data to business owners in a way that ‘speaks to the questions’ they are trying to answer.
  • Use these data visualisations to positively influence decision making and challenge the way business owners traditionally think.



It would be fantastic if you have:

  • A higher degree in Engineering, Statistics, Data Science, Applied Mathematics, Computer Science, Physics, Computational Biology, Computational Chemistry or related quantitative field.
  • Deep experience of Extraction, Transformation and Load (ETL) processes in a big data environment (Hive, Impala, SQL).
  • Expert understanding of a programming language such as Python or R.
  • Experience with at least one Deep Learning framework such as TensorFlow, Keras, or PyTorch.
  • Excellent verbal communication skills.
  • Ability to work autonomously and collaboratively as part of a team to both teach and learn every day.