Dubai, United Arab Emirates
3 days ago
Data Warehouse Engineer - Intern
In a world of possibilities, pursue one with endless opportunities. Imagine Next!

 

At Parsons, you can imagine a career where you thrive, work with exceptional people, and be yourself. Guided by our leadership vision of valuing people, embracing agility, and fostering growth, we cultivate an innovative culture that empowers you to achieve your full potential. Unleash your talent and redefine what’s possible.

 

Job Description:

Position Overview

Parsons is seeking a high-potential Data Engineer Graduate Intern to join our Technology and Innovation team. This role is designed for candidates with strong analytical foundations and an interest in building scalable, enterprise-grade data platforms that support operational, engineering, and executive decision-making.

As an intern, you will contribute to the design, development, and optimization of cloud-based data pipelines and analytics platforms, primarily within the Microsoft Azure ecosystem. You will work alongside experienced data engineers, architects, and product teams on real delivery programs, gaining exposure to enterprise data standards, governance, and DevOps practices.

Key Responsibilities

Data Processing

Support development of batch and streaming data pipelines using Python and distributed processing frameworks such as Apache Spark (Databricks), Hadoop, or Apache Beam to process large datasets efficiently.
Assist in processing and transforming structured and semi-structured data at scale.

ETL/ELT Implementation

Contribute to the design and implementation of ETL/ELT workflows for data integration and transformation using Azure Data Factory, Databricks, or equivalent tools.
Support data ingestion from multiple sources (databases, APIs, files, cloud storage).

Cloud Integration & Platform (Microsoft Azure)

Work with Azure-native data services, including Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage (ADLS Gen2), and Azure Databricks.
Support secure configuration of cloud resources, access controls, and data storage.

Database Management

Query and manage relational databases (Azure SQL, SQL Server, PostgreSQL, MySQL, Oracle) and NoSQL databases (MongoDB, Cassandra, DynamoDB).
Support analytics and reporting use cases using modern data warehouse / lakehouse architectures.

Data Warehousing

Support the development and optimization of modern data warehouse solutions such as Databricks, Snowflake, Redshift, or BigQuery.

Pipeline Orchestration

Assist with workflow orchestration using tools such as Azure Data Factory pipelines, Apache Airflow, Prefect, or Luigi.
Support scheduling, monitoring, and failure handling of data pipelines.

Big Data Tools

Work with distributed data systems and storage solutions such as HDFS or cloud-native equivalents.

DevOps, Quality & Optimization

Collaborate using Git-based workflows (Azure DevOps Repos or GitHub) for code versioning and management.
Support data quality checks, performance tuning, and query optimization.
Diagnose and resolve performance issues in data systems and optimize database queries.
Assist with documentation of data pipelines, schemas, and system design.

Technical Requirements

Programming
- Proficiency in Python and experience with scripting languages for automation
- Solid understanding of SQL for data querying and transformation

Data Processing Frameworks
- Hands-on experience with Apache Spark, Hadoop, or Apache Beam
- Understanding of ETL/ELT concepts and data pipeline design

Database and Querying
- Strong understanding of SQL
- Experience with relational databases (PostgreSQL, MySQL, Oracle)
- Experience with NoSQL databases (MongoDB, Cassandra, DynamoDB)

Cloud Platforms
- Familiarity with Microsoft Azure data services: Azure Data Factory, Azure Synapse Analytics, Azure Data Lake, Azure Databricks
- Familiarity with AWS (S3, Redshift, Glue) is beneficial
- Awareness of Azure security and identity concepts (RBAC, managed identities) is advantageous

Data Warehousing
- Experience with Databricks, Snowflake, Redshift, or BigQuery

Data Pipelines and Orchestration
- Knowledge of tools such as Apache Airflow, Prefect, or Luigi

Big Data Tools
- Experience with distributed data systems and storage solutions such as HDFS

Version Control
- Proficiency with Git for code versioning and collaboration

Preferred Qualifications

Exposure to Azure DevOps or GitHub Actions
Familiarity with Agile / Scrum delivery environments
Interest in enterprise analytics, cloud platforms, and data governance
Awareness of data privacy and governance regulations (e.g., GDPR, CCPA) and experience implementing data quality checks and access controls
Practical exposure to building and optimizing scalable data pipelines for batch and real-time processing
Familiarity with diagnosing and resolving performance issues in data systems

Note: Multi-cloud exposure (AWS / GCP) is beneficial but not required. The primary environment is Microsoft Azure.

Certifications (optional but valuable):
AWS Certified Data Analytics – Specialty
Google Professional Data Engineer
Microsoft Azure Data Engineer Associate
Databricks Certified Data Engineer Associate

Soft Skills

Problem-Solving: Ability to troubleshoot complex data and system issues independently.
Communication: Collaborate with data analysts, scientists, and engineers to understand data needs and deliver solutions.
Documentation: Document data workflows, system designs, and troubleshooting procedures effectively.
Team Collaboration: Experience working in cross-functional teams using Agile or similar methodologies.

Education

Bachelor’s degree (or final-year student) in Computer Science, Data Engineering, Information Systems, Engineering, or a related field.
Relevant projects, internships, or practical experience may substitute for formal education.

Learning Opportunities

Hands-on experience building data pipelines in a Microsoft Azure enterprise environment.
Exposure to lakehouse architectures, analytics platforms, and cloud security practices.
Practical experience with Databricks, Azure Data Factory, and Synapse.
Mentorship from senior data engineers and architects working on live programs.
Insight into how data engineering supports large-scale infrastructure, engineering, and program delivery.

Duration

Internship duration: 3 to 6 months, with the possibility of extension.

Parsons is an equal opportunity employer committed to representation at all job levels regardless of race, color, religion, sex (including pregnancy), national origin, age, disability, or genetic information.

We truly invest in and care about our employees’ wellbeing and provide endless growth opportunities, as the sky is the limit, so aim for the stars! Imagine next and join the Parsons quest: APPLY TODAY!

Parsons is aware of fraudulent recruitment practices. To learn more about recruitment fraud and how to report it, please refer to https://www.parsons.com/fraudulent-recruitment/.
