Design and Build Data Pipelines: Develop scalable and efficient data pipelines for ingesting, transforming, and loading data from multiple sources, ensuring high data quality and timeliness.
Data Integration: Work with internal and external data sources, including APIs, financial data systems, and cloud-based data repositories.
Cloud Data Engineering: Utilize cloud technologies such as AWS, Azure, or Google Cloud to create flexible, cost-effective data storage and processing solutions. Experience with Snowflake or similar cloud data platforms is a plus.
Collaborate with Data Science and Analytics Teams: Work closely with data scientists and business analysts to understand data needs and ensure the availability of high-quality, clean, and relevant data for their work.
Data Warehousing and Storage: Manage and optimize data storage solutions (including data lakes, warehouses, and databases) to ensure efficient data access and retrieval for financial analysis and reporting.
ETL Development: Design and maintain ETL workflows that transform raw data into analysis-ready formats, and ensure seamless integration between diverse data sources and business intelligence tools.
Monitoring and Maintenance: Ensure the smooth operation of data pipelines and data systems by setting up automated monitoring, performing regular maintenance, and troubleshooting issues as they arise.
Data Governance and Compliance: Apply best practices for data governance, ensuring compliance with data security regulations (e.g., GDPR) and maintaining data privacy standards for sensitive financial data.
Automation and Optimization: Automate data ingestion and processing tasks, and optimize pipeline performance to handle large data volumes efficiently.
Required Skills and Qualifications:
Bachelor’s degree in Computer Science, Engineering, Finance, or a related field, or equivalent experience.
10 years of experience in Data Engineering, with a focus on building data pipelines and working with large, complex datasets.
Strong proficiency in SQL and experience with relational and NoSQL databases (e.g., PostgreSQL, MongoDB, Cassandra).
Solid experience with Python, Java, or Scala for building data processing applications and automation scripts.
Experience with cloud platforms such as AWS, Azure, or Google Cloud Platform, and familiarity with cloud-native data storage solutions (e.g., S3, Redshift, BigQuery, Snowflake).
Knowledge of data integration, streaming, and orchestration tools such as Apache NiFi, Apache Kafka, Apache Airflow, or similar.
Familiarity with big data processing frameworks such as Apache Spark or Hadoop for distributed data processing.
Experience with data warehousing concepts and platforms; hands-on work with Snowflake, Redshift, or BigQuery preferred.
Knowledge of financial data systems, including market data and transactional systems, is a plus.
Strong analytical and troubleshooting skills, with an ability to work independently and deliver high-quality results under tight deadlines.
Preferred Qualifications:
Experience with machine learning pipelines and their integration into data engineering workflows.
Exposure to DevOps practices and containerization and orchestration tools such as Docker and Kubernetes.
Familiarity with data security and compliance requirements specific to the financial services industry.
Certification in data engineering tools or cloud platforms (e.g., AWS Certified Data Analytics or Google Cloud Professional Data Engineer) is a plus.