Job Title: Jupyter Notebooks
Location: London
Work model: Hybrid
Key Responsibilities
Develop and maintain reproducible data analysis workflows and visualizations in Jupyter Notebooks.
Utilize libraries such as Pandas, NumPy, Matplotlib, Seaborn, and Plotly for data manipulation and visualization.
Collaborate with cross-functional teams to extract, clean, and analyze data for business insights.
Build and train machine learning models using frameworks like scikit-learn, TensorFlow, or PyTorch.
Integrate data from multiple sources (e.g., databases, APIs, flat files) and ensure consistency across notebooks.
Create dynamic reports and interactive dashboards from Jupyter Notebooks (e.g., with Voilà, Jupyter Dashboards, or Bokeh).
Debug and optimize Jupyter Notebook workflows for performance and scalability.
Document and version-control notebooks using Git or similar tools.
Stay current with emerging data science tools and best practices to enhance productivity.
Required Skills and Qualifications
Proficiency in Python programming and familiarity with Jupyter Notebook extensions.
Hands-on experience with data analysis libraries (e.g., Pandas, NumPy) and visualization tools (e.g., Matplotlib, Seaborn, Plotly).
Strong understanding of data science methodologies, statistics, and machine learning algorithms.
Experience with relational databases (e.g., PostgreSQL) and querying data with SQL.
Familiarity with version control tools like Git and cloud platforms (e.g., AWS, Google Cloud, or Azure).
Strong communication skills for presenting findings in a clear, reproducible manner.
Ability to debug and troubleshoot issues within Jupyter Notebooks and related code.
Preferred Qualifications
Experience building interactive dashboards with Voilà, Streamlit, or Dash.
Familiarity with notebook infrastructure tools like JupyterHub, Binder, or Kubernetes for Notebooks.
Knowledge of big data tools and frameworks (e.g., Spark, Dask).
Background in scientific computing with libraries such as SciPy or SymPy.
Experience integrating Jupyter Notebooks into CI/CD pipelines for data workflows.
Certifications in Data Science, Machine Learning, or related fields.