PySpark Data Engineer | up to £450/day Inside IR35 | Remote with occasional London travel

We are seeking a PySpark Data Engineer to support the development of a modern, scalable data lake for a new strategic programme. This is a greenfield initiative to replace fragmented legacy reporting solutions, offering the opportunity to shape a long-term, high-impact platform from the ground up.

Key Responsibilities:
* Design, build, and maintain scalable data pipelines using PySpark 3/4 and Python 3.
* Contribute to the creation of a unified data lake following medallion architecture principles.
* Leverage Databricks and Delta Lake (Parquet format) for efficient, reliable data processing.
* Apply BDD testing practices using Python Behave and ensure code quality with Python Coverage.
* Collaborate with cross-functional teams and participate in Agile delivery workflows.
* Manage configurations and workflows using YAML, Git, and Azure DevOps.

Required Skills & Experience:
* Proven expertise in PySpark 3/4 and Python 3 for large-scale data engineering.
* Hands-on experience with Databricks, Delta Lake, and medallion architecture.
* Familiarity with Python Behave for Behaviour-Driven Development.
* Strong understanding of YAML, code-quality tooling (e.g. Python Coverage), and CI/CD pipelines.
* Knowledge of Azure DevOps and Git best practices.
* Active SC clearance is essential; applicants without it cannot be considered.

Contract Details:
* 6-month initial contract with long-term extension potential (multi-year programme).
* Inside IR35.

This is an excellent opportunity to join a high-profile programme at its inception and help build a critical data platform. If you are a mission-driven engineer with a passion for scalable data solutions and secure environments, we'd love to hear from you.
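
For candidates wanting a flavour of the day-to-day work, below is a minimal, illustrative sketch of the kind of medallion-architecture pipeline step the role involves: promoting raw bronze data to a cleansed silver Delta table. It assumes a Delta-enabled Spark session (e.g. on Databricks); the paths, table layout, and column names are hypothetical, not part of the actual codebase.

```python
# Illustrative bronze-to-silver step (hypothetical paths and columns).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a configured session already exists; locally you would
# need the delta-spark package configured on the session.
spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Read raw (bronze) events stored as a Delta table (Parquet under the hood).
bronze = spark.read.format("delta").load("/lake/bronze/events")  # hypothetical path

# Typical silver-layer cleansing: deduplicate, enforce types, drop invalid rows.
silver = (
    bronze
    .dropDuplicates(["event_id"])                        # hypothetical key column
    .withColumn("event_ts", F.to_timestamp("event_ts"))  # string -> timestamp
    .filter(F.col("event_id").isNotNull())
)

# Write the cleansed data back as the silver Delta table.
silver.write.format("delta").mode("overwrite").save("/lake/silver/events")
```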
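Likewise, the BDD workflow mentioned above pairs Gherkin feature files with Python Behave step definitions. Here is a small, self-contained skeleton of that pattern; the scenario wording, file names, and the in-memory `dedupe` helper (a stand-in for a real PySpark job) are all hypothetical.

```gherkin
# features/dedup.feature (hypothetical scenario)
Feature: Silver-layer deduplication
  Scenario: Duplicate events are removed
    Given a bronze dataset with duplicate event ids
    When the silver transformation runs
    Then each event id appears exactly once
```

```python
# features/steps/dedup_steps.py (hypothetical step definitions)
from behave import given, when, then

def dedupe(rows, key):
    """Keep the first occurrence of each key; stand-in for the real job."""
    seen, out = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

@given("a bronze dataset with duplicate event ids")
def step_given_duplicates(context):
    context.rows = [{"event_id": 1}, {"event_id": 1}, {"event_id": 2}]

@when("the silver transformation runs")
def step_run_transform(context):
    context.result = dedupe(context.rows, "event_id")

@then("each event id appears exactly once")
def step_assert_unique(context):
    ids = [r["event_id"] for r in context.result]
    assert len(ids) == len(set(ids))
```

Running `behave` from the project root executes the scenario; in practice the same structure wraps real PySpark transformations, with Python Coverage measuring test coverage in the CI/CD pipeline.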