Job Overview:
We are looking for someone with a basic to intermediate understanding of SQL, Python, Pandas, and NumPy. Knowledge of advanced concepts is a plus but not a strict requirement.
Key Responsibilities:
- Extract, transform, analyze, manipulate, and restructure data using SQL and Python
- Clean, filter, transform, and reshape data using Pandas
- Perform numerical computations and vectorized operations using NumPy
- Create scripts to automate data loading and preprocessing tasks
- Collaborate with team members to understand data needs and deliver insights
- Prepare and present basic analytical reports or summaries
__________________________________________________________________________________
Core Skills – Must Have:
=>SQL (Basic to Intermediate):
- Core SQL Clauses (Must Know)
Proficiency in using:
SELECT, WHERE, GROUP BY, ORDER BY, LIMIT/OFFSET, TOP, HAVING, and the different types of operators.
- Joins (Very Important)
Must have hands-on knowledge of:
INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN
Should understand how improper join conditions can lead to duplicate data.
- Functions (Must Know)
Built-in functions: Aggregate, Analytical, String, Date functions
User Defined Functions (UDFs)
- Filtering Techniques (Must Know)
Use of:
BETWEEN, IN, LIKE, IS NULL
- Subqueries & Common Table Expressions (CTEs) (Must Know)
- Window Functions (Good to Have)
Should know how they can be used with analytical and aggregate functions.
- Basic Schema Design (Must Know)
Understanding of: Tables and relationships, Primary and foreign keys, Data types
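For illustration only, here is a minimal sketch of the duplicate-data point under Joins. It assumes two hypothetical tables, orders and contacts, in an in-memory SQLite database. The table names, columns, and values are made up for this example and are not part of the role's actual data.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE TABLE contacts (customer_id INTEGER, email TEXT);
    INSERT INTO orders VALUES (1, 10, 100.0), (2, 20, 50.0);
    INSERT INTO contacts VALUES
        (10, 'a@example.com'), (10, 'b@example.com'), (20, 'c@example.com');
""")

# Joining on a key that is not unique in contacts: customer 10 has two
# contact rows, so order 1 appears twice and its amount is double counted.
inflated_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN contacts c ON c.customer_id = o.customer_id
""").fetchone()[0]

# One possible fix: reduce contacts to one row per customer before joining.
correct_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN (SELECT DISTINCT customer_id FROM contacts) c
      ON c.customer_id = o.customer_id
""").fetchone()[0]

print(inflated_total, correct_total)
conn.close()
```

The first total is inflated because the join multiplies order rows by the number of matching contact rows. Checking whether the join key is unique on at least one side is a simple habit that avoids this.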
=>Python (Basic to Intermediate, Very Important):
- Data types: lists, dictionaries, tuples, sets
- Loops, list comprehensions, and conditional statements
- Functions, lambda expressions, and exception handling
- File I/O operations (working with CSV, Excel, JSON, and text files)
- Basics of modular programming (importing custom Python files/modules)
- Using virtual environments and package management (pip)
- Familiarity with Jupyter notebooks and script-based workflows
=>Python's Pandas (Basic to Mid-Level, Very Important):
- Reading data from CSV, Excel, JSON, and databases
- Creating, indexing, slicing, and filtering DataFrames
- Applying groupby(), pivot(), and pivot_table() for aggregation and reshaping
- Handling missing data: fillna(), dropna(), isnull()
- Merging, joining, and concatenating DataFrames
- Applying custom functions using apply() and lambda
- Data type conversions and column renaming
- Working with datetime fields and string operations
- Writing data back to file or database
=>Python's NumPy (Basic to Mid-Level, Very Important):
- Creating and reshaping arrays (reshape, flatten, transpose)
- Element-wise operations and broadcasting
- Indexing, slicing, and conditional selection
- Aggregations: mean, sum, max, min, std
- Random number generation (np.random)
- Stacking and splitting arrays (vstack, hstack, split)
- Basic linear algebra: dot product, matrix multiplication (optional)
Last and most important: how to write secure and optimized code and queries using Python and SQL.
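As one hedged example of what "secure" can mean for queries, the sketch below contrasts string-built SQL with a parameterized query using Python's built-in sqlite3 module. The users table and the function names find_users_unsafe and find_users_safe are hypothetical and chosen only for this illustration.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "analyst")],
)


def find_users_unsafe(name):
    # Anti-pattern: the input is pasted into the SQL text, so crafted
    # input can change the meaning of the query (SQL injection).
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_users_safe(name):
    # The ? placeholder sends the value separately from the SQL text,
    # so it is always treated as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
leaked = find_users_unsafe(malicious)   # matches every row
blocked = find_users_safe(malicious)    # matches no row
print(len(leaked), len(blocked))
```

With the crafted input, the string-built query returns every user, while the parameterized version returns nothing because no user has that literal name.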
This job has now closed