- Hadoop, Cloud (GCP), CDC, Kafka Streaming, Microservices, Spark, React, Java
Key Responsibilities (not limited to):
- Design and implement strategic near-real-time infrastructure
- Tier 1 Service / DR (Disaster Recovery)
- Review existing infrastructure and recommend infrastructure and tooling enhancements
- GCP service setup
- Data Distribution
As a key member of the technical team alongside Engineers, Data Scientists and Data Users, you will be expected to define and contribute at a high level to many aspects of our collaborative Agile development process:
- Experience with most of the following technologies: Apache Hadoop, Scala, Apache Spark, Spark Streaming, YARN, Kafka, Hive, HBase, Presto, Python, ETL frameworks, MapReduce, SQL, RESTful services.
- Sound working knowledge of Unix/Linux platforms
- Experience in end-to-end automation
- Experience of managing a team
- Hands-on experience building data pipelines using Hadoop components (Sqoop, Hive, Pig, Spark, Spark SQL).
- Experience with time-series/analytics databases such as Elasticsearch
- Experience with industry-standard version control tools (Git, GitHub), automated deployment tools, and requirements management in JIRA
- Exposure to Agile project methodology as well as other methodologies (such as Kanban)
- Understanding of big data modelling using relational and non-relational techniques
- Coordination between Onsite and Offshore
- Experience debugging code issues and reporting the identified differences to the development team and architects
- Understanding or experience of Cloud design patterns
- 8+ years of professional experience, all or most of it in Big Data
- 4+ years of programming experience in Java, Scala, and Spark.
- 2+ years of programming experience with distributed stream processing, e.g. Apache Kafka
- 2+ years of experience with CDC
- 2+ years of Agile and DevOps experience
- Proficient in SQL and relational database design.
- Elasticsearch experience (Elasticsearch/Logstash/Kibana, etc.)
- Project planning.
- Google Cloud Platform or another cloud vendor