Job Title: Senior Dataflow Engineer
Job Location: Kraków, Poland
Job Type: Hybrid (2 days office / 3 days remote)
Experience: 5-10 years
Job Description:
We are looking for a skilled Dataflow Engineer to join our team and play a key role in designing, building, and maintaining robust data pipelines and workflows. In this role, you will work closely with data architects, data analysts, and other engineers to ensure that data flows seamlessly across systems and is processed efficiently at scale.
Key Responsibilities:
1. Design and Build Data Pipelines: Develop and implement efficient data pipelines for collecting, transforming, and storing data across different platforms.
2. Data Integration: Integrate data from a variety of sources, including cloud platforms (e.g., Google Cloud), databases (SQL/NoSQL), APIs, and external services.
3. Optimize Data Flow: Ensure optimal performance of data flows by troubleshooting and fine-tuning existing pipelines.
4. Data Transformation: Implement ETL (Extract, Transform, Load) processes, transforming raw data into usable formats for analytics and reporting.
5. Collaboration: Work with cross-functional teams (data engineering, operations) to understand data needs and implement solutions; provide application support on an as-needed basis during weekends or outside office hours.
6. Automation & Scalability: Build scalable, automated workflows to handle large volumes of data with high reliability and low latency.
7. Monitoring & Maintenance: Set up monitoring and alerting systems for data pipelines to ensure minimal downtime and maximum performance.
8. Documentation: Document data flows, pipeline configurations, and processing logic to ensure maintainability and transparency.
Required Qualifications:
1. Experience: 4+ years of experience as a Dataflow Engineer, Data Engineer, or in a similar role working with large datasets and distributed systems.
2. Technical Skills:
- Proficiency in programming languages such as Python and Java.
- Hands-on experience with data pipeline orchestration tools (e.g., Google Dataflow).
- Experience with cloud-based data platforms such as Google Cloud (BigQuery, Dataflow).
- Strong experience with ETL frameworks and tools, as well as real-time data streaming and processing.
- Familiarity with data formats such as JSON and Parquet.
- Knowledge of SQL and NoSQL databases, along with data governance, data quality, and security best practices.
3. Problem-Solving: Ability to troubleshoot complex data integration and processing issues.
4. Communication Skills: Strong written and verbal communication skills to collaborate with technical and non-technical stakeholders.