Senior Data Engineer (Azure, PySpark, SQL)
Overview
We are seeking a highly skilled Senior Data Engineer with 5+ years of experience in building scalable data solutions. The ideal candidate will have strong expertise in SQL, PySpark, ETL processes, Data Lakes, and the Azure data ecosystem.
Key Responsibilities
- Design, develop, and maintain robust data pipelines using Python/PySpark
- Build and optimize ETL workflows for large-scale data processing
- Work with Azure services such as:
  - Azure Blob Storage
  - Azure Data Lake
  - Azure Data Factory
  - Azure Synapse Analytics
- Ensure high performance and reliability of data systems
- Collaborate with cross-functional teams to understand data requirements
- Implement best practices for data governance, security, and quality
- Use version control tools like Git and manage tasks via Azure DevOps or Jira
- Provide technical guidance and mentorship to junior team members
- Drive design decisions and contribute to data architecture planning
Required Skills & Experience
- 5+ years of experience with SQL
- 4+ years of hands-on experience in building data pipelines using Python/PySpark
- 4+ years of experience with the Azure ETL stack
- Strong understanding of:
  - Data modeling
  - Distributed computing
  - Data warehousing concepts
- Experience with code versioning tools (Git)
- Familiarity with Agile tools such as Azure DevOps or Jira
Qualifications
- Bachelor’s degree: B.Sc / BCA / B.Tech / B.E (any specialization)
- Strong verbal and written communication skills
- Reliable internet connection (for remote/hybrid roles)
- Azure certifications are a plus
Preferred (Bonus) Skills
- Experience with real-time data processing
- Knowledge of CI/CD pipelines for data engineering
- Exposure to big data technologies beyond Azure ecosystem