We are seeking an experienced Lead Data Engineer to join a high-performing team focused on combating financial crime through advanced analytics and scalable data platforms. In this role, you will design, build, and evolve a resilient Databricks + AWS lakehouse environment that powers fraud detection and anti-money laundering (AML) solutions for financial institutions.
You will take ownership of end-to-end data engineering delivery, from architecture and ingestion through optimisation, governance, and operational support, ensuring secure, compliant, and high-quality data solutions. This is a 12-month daily-rate contract (with potential extension), based in Dublin with a hybrid working model (3 days onsite).
Key Responsibilities:
Own the end-to-end design, build, optimisation, and support of scalable Spark/PySpark data pipelines on Databricks (batch and streaming).
Define and enforce lakehouse architecture standards (Medallion: bronze/silver/gold), including schema governance, lineage, quality SLAs, and cost controls.
Architect secure and compliant AWS-based data infrastructure (S3, IAM, KMS, Glue, Lake Formation, EC2/EKS, Lambda, Step Functions, CloudWatch, Secrets Manager).
Implement and standardise orchestration frameworks (Airflow, Databricks Workflows, Step Functions) with strong observability and idempotent design patterns.
Develop secure ingestion pipelines using Apache NiFi, APIs, SFTP/FTPS, and other enterprise integration methods.
Embed data quality frameworks including anomaly detection, reconciliation, data contracts, SLIs/SLOs, and monitoring.
Implement metadata and lineage capabilities (Unity Catalog, Glue, OpenLineage) to support audit and regulatory transparency.
Drive CI/CD best practices for data engineering (infrastructure as code, automated testing, Git workflows, environment promotion).
Optimise Spark workloads for performance and cost (partitioning, Delta Lake optimisation, caching, cluster tuning).
Partner with data science, compliance, and product teams to translate detection and analytical requirements into robust, production-ready data models.
Lead technical design reviews, mentor engineers, and promote engineering best practices.
Support incident response, root cause analysis, and continuous platform improvement.
Requirements:
10+ years of data engineering experience, with demonstrated technical leadership.
Expert-level SQL and strong hands-on experience with Databricks, Spark/PySpark, and Python.
Proven production experience building and optimising large-scale Spark pipelines and Delta Lake architectures.
Strong AWS ecosystem expertise (S3 architecture, IAM least privilege, Glue, Lake Formation, encryption, networking).
Experience with orchestration tools such as Airflow, Databricks Workflows, and/or Step Functions.
Strong understanding of data governance, lineage, metadata management, and regulatory compliance (PII/PCI handling, retention controls).
Experience implementing CI/CD and infrastructure as code (Terraform or CloudFormation).
Solid knowledge of performance optimisation and cost management strategies in cloud environments.
Strong stakeholder engagement skills and ability to lead design sessions across technical and non-technical audiences.
Experience in Financial Services (preferred) or other regulated industries such as Telecom or Insurance.
Bachelor's or Master's degree in Computer Science, Engineering, Data Engineering, or related discipline (preferred).
