Role Overview:
- Job Title: Principal Data Engineering Lead
- Location: Germany (Remote work is possible)
- Contract Duration: 6 Months
- Rate: 450 EUR/day
- Experience: 10 to 15 years
Key Responsibilities:
- Technical Leadership: Provide guidance and hands-on oversight in developing and deploying data-intensive applications using Microsoft Azure Synapse, specifically working with Spark and Scala.
- Data Pipeline Optimization: Design and optimize data pipelines using the Medallion architecture in Azure Synapse pipelines.
- Data Ingestion & Quality: Manage data ingestion processes and ensure data quality using tools like Deequ, including validation and error management.
- Configuration Management: Manage per-zone configuration settings through JSON files, focusing on the ApplicationConfig and TableConfig classes.
- Collaboration: Work closely with data scientists, analysts, and cross-functional teams to ensure alignment with business and marketing strategies.
- Logging and Auditing: Ensure proper logging, auditing, and error tracking with tools like Azure Log Analytics and KQL queries.
- Testing and Validation: Implement unit testing with tools such as ScalaTest and maintain rigorous data quality checks.
Required Technical Skills & Experience:
- Azure Synapse Spark Runtime: Deep experience deploying and managing data processing applications in Microsoft Azure, specifically on the Synapse Spark runtime.
- Spark and Scala Programming: Expertise in using Apache Spark, particularly with Scala for building scalable, high-performance data processing systems.
- Data Engineering Frameworks: Experience with architectures such as Medallion for designing data pipelines across multiple zones.
- Data Quality Tools: Knowledge of data validation and quality assurance tools like Deequ, and error handling within large-scale data systems.
- Cross-functional Collaboration: Ability to work with data scientists and analysts, ensuring alignment across teams.
- Log Management & Auditing: Familiarity with Azure Log Analytics and proficiency in querying with KQL (Kusto Query Language).
- Unit Testing & Validation: Experience with testing frameworks such as ScalaTest for ensuring reliable, high-quality data pipelines.
Additional Qualifications:
- Strong understanding of data privacy, data security, and scaling machine learning workloads in cloud environments.
- Experience working at enterprise scale, particularly in designing data architectures that span multiple business units and regions.
This role requires an expert-level understanding of both technical and collaborative aspects of data engineering in a cloud environment, with a focus on the Microsoft Azure ecosystem.