I am looking to speak with Senior Data Engineers for our client based in Dublin 1. Our client is at the forefront of healthcare innovation, empowering their business partners and team members to improve healthcare delivery every day. As a Data Engineer with our client, you will research cutting-edge big data tools and design innovative solutions to industry problems that only a Data Engineer can solve!
You will join a newly formed team focused on developing cutting-edge big data analytics platforms that support advanced machine learning models to improve healthcare outcomes for our client’s members. You will be responsible for integrating multiple complex data sources into our client’s data platform using a mix of platforms (Kubernetes, Hadoop) and data applications (Spark, Hive, HBase, Airflow).
- Design and build data pipelines (mostly in Spark) to process terabytes of data
- Orchestrate data tasks in Airflow to run on Kubernetes/Hadoop for ingesting, processing and cleaning data
- Create Docker images for various applications and deploy them on Kubernetes
- Design and build best-in-class processes to clean and standardize data
- Troubleshoot production issues in our client’s Elastic environment
- Tune and optimize data processes
- Work on proofs of concept for Big Data and Data Science
- Model high-volume datasets to maximize performance for our client’s BI & Data Science team
- Create real-time analytics pipelines using Kafka / Spark Streaming
Hands-on experience with the following technologies:
- Minimum 5-8 years’ experience
- Demonstrated experience developing processes in Spark
- Significant experience writing complex SQL queries
- Considerable experience building ETL/data pipelines
- At least 1 year’s exposure to Kubernetes and Linux containers (e.g. Docker)
- Related/complementary open source software platforms and languages (e.g. Scala, Python, Java, Linux)
Exposure to the following technologies:
- Hive /HBase / Presto
- Jenkins / Travis
- Cloud technologies: Amazon AWS or Microsoft Azure
- Previous experience with relational (RDBMS) and non-relational databases
- Analytical and problem-solving skills applied to Big Data environments and distributed processing
- Experience working on projects using Agile/Scrum methodologies
- Exposure to DevOps methodology
- Data warehousing principles, architecture and implementation in large environments