Company Overview
Our client is a global leader in financial and technology-driven services, delivering innovative data solutions to some of the world's largest institutional clients. They are seeking a Test Automation Engineer with a strong focus on Databricks and modern data platforms to ensure the quality, performance, and reliability of their data solutions.
Role Overview
As a Data & Databricks Test Automation Engineer, you will be responsible for developing and implementing automated testing frameworks across Databricks-based data platforms. You'll collaborate closely with data engineering teams to validate pipelines, ensure data quality, and strengthen the integrity of a modern Lakehouse architecture.
Key Responsibilities
Databricks Testing
Design and implement automated testing for Databricks notebooks and workflows.
Develop test frameworks for Delta Lake tables and ACID transactions (see the sketch after this list).
Build automated validation processes for structured streaming pipelines.
Validate Delta Live Tables implementations.
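To give a flavour of this work, a minimal pytest sketch for Delta Lake validation might look like the following. It assumes a local delta-spark installation; the table path, schema, and data are purely illustrative:

```python
# Illustrative only: a minimal pytest check that a Delta write-then-append
# round trip preserves row counts. Schema and data are hypothetical.
import pytest
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    builder = (
        SparkSession.builder.appName("delta-tests")
        .master("local[2]")
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    )
    session = configure_spark_with_delta_pip(builder).getOrCreate()
    yield session
    session.stop()


def test_delta_append_preserves_rows(spark, tmp_path):
    path = str(tmp_path / "trades")
    first = spark.createDataFrame([(1, "AAPL"), (2, "MSFT")], ["id", "symbol"])
    first.write.format("delta").mode("overwrite").save(path)

    second = spark.createDataFrame([(3, "GOOG")], ["id", "symbol"])
    second.write.format("delta").mode("append").save(path)

    # Each write commits as an atomic Delta transaction, so a reader
    # sees either the old snapshot or all three rows, never a partial write.
    assert spark.read.format("delta").load(path).count() == 3
```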
Data Pipeline Testing
Automate testing for ETL/ELT processes within Databricks.
Implement Spark job testing and performance validation.
Create and manage test cases for data ingestion and transformation workflows (an illustrative unit test follows this list).
Test Unity Catalog configurations, access controls, and governance models.
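As an illustration of transformation testing, the sketch below unit-tests a single pipeline step against a small in-memory DataFrame. The add_notional transform is hypothetical, standing in for whatever functions a real project would import from its own codebase:

```python
# Illustrative only: unit-testing one transformation step in isolation.
from pyspark.sql import DataFrame, SparkSession
import pyspark.sql.functions as F


def add_notional(df: DataFrame) -> DataFrame:
    """Hypothetical transform: notional = price * quantity."""
    return df.withColumn("notional", F.col("price") * F.col("quantity"))


def test_add_notional():
    spark = SparkSession.builder.master("local[2]").getOrCreate()
    source = spark.createDataFrame(
        [("AAPL", 190.0, 10), ("MSFT", 400.0, 2)],
        ["symbol", "price", "quantity"],
    )
    # Collect to the driver and check the computed column exactly.
    result = {r["symbol"]: r["notional"] for r in add_notional(source).collect()}
    assert result == {"AAPL": 1900.0, "MSFT": 800.0}
    spark.stop()
```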
Quality Assurance
Design and execute data quality test strategies and reconciliation processes (sketched below).
Implement performance testing for large-scale Spark jobs.
Ensure compliance with internal data governance and quality standards.
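A reconciliation check of the kind described above can be quite simple; the sketch below compares row counts and a column total between a source and a target DataFrame. The reconcile helper and the amount column are illustrative, not a prescribed interface:

```python
# Illustrative only: source-to-target reconciliation on row count
# and a column total. Column and table names are placeholders.
from pyspark.sql import DataFrame
import pyspark.sql.functions as F


def reconcile(source: DataFrame, target: DataFrame, amount_col: str) -> None:
    src_count, tgt_count = source.count(), target.count()
    assert src_count == tgt_count, f"row counts differ: {src_count} vs {tgt_count}"

    src_sum = source.agg(F.sum(amount_col)).first()[0]
    tgt_sum = target.agg(F.sum(amount_col)).first()[0]
    assert src_sum == tgt_sum, f"{amount_col} totals differ: {src_sum} vs {tgt_sum}"
```

On Databricks this might be invoked as reconcile(spark.table("raw.trades"), spark.table("curated.trades"), "notional"), with the table names again being placeholders.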
Monitoring & Reporting
Develop test monitoring frameworks and dashboards.
Automate quality reporting and produce actionable test metrics (see the sketch after this list).
Maintain clear test documentation and version control across projects.
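One lightweight way to feed such dashboards is to append each run's metrics to a Delta table that BI tools can query. The sketch below assumes a qa.test_run_metrics table and a simple schema, both of which are assumptions for illustration:

```python
# Illustrative only: persist per-run test metrics to a Delta table for
# dashboarding. The qa.test_run_metrics table and schema are assumed.
from datetime import datetime, timezone
from pyspark.sql import SparkSession


def record_run(spark: SparkSession, suite: str, passed: int, failed: int) -> None:
    row = [(suite, passed, failed, datetime.now(timezone.utc))]
    df = spark.createDataFrame(row, ["suite", "passed", "failed", "run_at"])
    # Append so the table accumulates history for trend reporting.
    df.write.format("delta").mode("append").saveAsTable("qa.test_run_metrics")
```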
Qualifications
Education
Bachelor's degree in Computer Science, Data Science, Engineering, or a related field.
Certifications in Databricks or data testing tools are advantageous.
Technical Skills
2+ years of hands-on experience with Databricks and Apache Spark.
Strong programming experience with Python (PySpark) and SQL.
Exposure to data testing frameworks and tools.
Familiarity with AWS services (S3, Glue, Lambda) or similar cloud platforms.
Knowledge of Delta Lake and Lakehouse architectures.
Proficiency with version control systems such as Git.
Additional Skills
Strong analytical and problem-solving mindset.
Experience in large-scale data processing environments.
Understanding of data governance, compliance, and data quality best practices.
Previous experience working within Agile or DevOps teams.
Platform Knowledge
Databricks workspace and notebook development.
Delta Lake and Delta Live Tables.
Unity Catalog testing for governance and permissions.
Spark optimization and performance analysis.
