At Parsons, you can imagine a career where you thrive, work with exceptional people, and be yourself. Guided by our leadership vision of valuing people, embracing agility, and fostering growth, we cultivate an innovative culture that empowers you to achieve your full potential. Unleash your talent and redefine what’s possible.
Job Description:
Position Overview
Parsons is seeking a high-potential Data Engineer Graduate Intern to join our Technology and Innovation team. This role is designed for candidates with strong analytical foundations and an interest in building scalable, enterprise-grade data platforms that support operational, engineering, and executive decision-making.
As an intern, you will contribute to the design, development, and optimization of cloud-based data pipelines and analytics platforms, primarily within the Microsoft Azure ecosystem. You will work alongside experienced data engineers, architects, and product teams on real delivery programs, gaining exposure to enterprise data standards, governance, and DevOps practices.
Key Responsibilities
Data Processing
- Work with frameworks such as Apache Spark, Hadoop, or Apache Beam to process large datasets efficiently
- Support development of batch and streaming data pipelines using Python and distributed processing frameworks such as Apache Spark (Databricks)
- Assist in processing and transforming structured and semi-structured data at scale (see the sketch below)
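To give a flavor of this work, here is a minimal PySpark batch-transform sketch; the storage paths and column names are hypothetical placeholders, not taken from any real Parsons system:

```python
# Minimal PySpark batch rollup; paths and columns are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-usage-rollup").getOrCreate()

# Read semi-structured JSON events from the (hypothetical) raw zone
events = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/events/")

# Aggregate event counts per project per day
daily = (
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date", "project_id")
    .agg(F.count("*").alias("event_count"))
)

# Write the result as date-partitioned Parquet to the curated zone
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "abfss://curated@examplelake.dfs.core.windows.net/daily_usage/"
)
```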
ETL/ELT Implementation
- Assist in designing and implementing ETL/ELT processes for data integration and transformation, using Azure Data Factory, Databricks, or equivalent tools
- Support data ingestion from multiple sources (databases, APIs, files, cloud storage)
Cloud Integration & Platform (Microsoft Azure)
- Work with Azure-native data services, including Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage (ADLS Gen2), and Azure Databricks
- Utilize cloud services such as Azure (Data Factory, Synapse, Data Lake), AWS (S3, Redshift, Glue), or Google Cloud Platform (BigQuery, Dataflow) for data storage and processing
- Support secure configuration of cloud resources, access controls, and data storage (see the sketch below)
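As an illustration of working with Azure storage, here is a minimal upload sketch using the azure-storage-blob SDK; the account, container, and blob path are hypothetical, and real code would authenticate via a managed identity or Key Vault rather than a hard-coded connection string:

```python
# Minimal azure-storage-blob upload sketch; names and credentials are
# hypothetical placeholders for illustration only.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client(container="raw", blob="events/2024-01-01.json")

with open("events.json", "rb") as data:
    blob.upload_blob(data, overwrite=True)  # upload the local file to Blob/ADLS
```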
Database Management
- Query and manage relational databases (Azure SQL, SQL Server, PostgreSQL, MySQL, Oracle) and NoSQL databases (MongoDB, Cassandra, DynamoDB); a small example follows below
- Support analytics and reporting use cases using modern data warehouse / lakehouse architectures
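For the querying side, here is a small self-contained sketch using Python’s stdlib sqlite3 module as a stand-in for the relational engines named above; the schema and data are invented:

```python
# Minimal relational query sketch; sqlite3 stands in for Azure SQL/PostgreSQL,
# and the schema and data are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_readings (site TEXT, reading REAL)")
conn.executemany(
    "INSERT INTO sensor_readings VALUES (?, ?)",
    [("north", 1.2), ("north", 3.4), ("south", 2.0)],
)

# Aggregate per site: the kind of reporting query this role supports
for site, avg_reading in conn.execute(
    "SELECT site, AVG(reading) FROM sensor_readings GROUP BY site ORDER BY site"
):
    print(site, round(avg_reading, 2))
```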
Data Warehousing
- Support the development and optimization of modern data warehouse solutions such as Databricks, Snowflake, Redshift, or BigQuery
Pipeline Orchestration
- Assist with workflow orchestration using tools such as Azure Data Factory pipelines, Apache Airflow, Prefect, or Luigi (where applicable); a minimal DAG sketch follows below
- Support scheduling, monitoring, and failure handling of data pipelines
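As a sketch of the orchestration pattern, here is a minimal Apache Airflow DAG (Airflow 2.4+); the DAG id, schedule, and task bodies are placeholders, not a real Parsons pipeline:

```python
# Minimal Airflow DAG sketch; ids, schedule, and task bodies are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull source data")     # stand-in for a real ingestion step

def transform():
    print("clean and aggregate")  # stand-in for a real transform step

with DAG(
    dag_id="example_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # run once per day
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    extract_task >> transform_task  # transform runs only after extract succeeds
```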
Big Data Tools
- Work with distributed data systems and storage solutions such as HDFS or cloud-native equivalents
Version Control
- Collaborate with the team using Git for code versioning and management
Debugging and Optimization
- Diagnose and resolve performance issues in data systems and optimize database queries
DevOps, Quality & Optimization
- Collaborate using Git-based workflows (Azure DevOps Repos or GitHub)
- Support data quality checks, performance tuning, and query optimization (see the sketch below)
- Assist with documentation of data pipelines, schemas, and system design
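To illustrate the data-quality responsibility, here is a minimal pandas-based check; the column names and rules are hypothetical:

```python
# Minimal pandas data-quality check; column names and rules are hypothetical.
import pandas as pd

def check_quality(df: pd.DataFrame) -> list[str]:
    """Return human-readable descriptions of any failed checks."""
    problems = []
    if df["project_id"].isna().any():
        problems.append("null project_id values found")
    if df.duplicated(subset=["project_id", "event_date"]).any():
        problems.append("duplicate (project_id, event_date) rows found")
    return problems

sample = pd.DataFrame(
    {"project_id": ["A", "A", None], "event_date": ["d1", "d1", "d2"]}
)
print(check_quality(sample))  # both checks fail on this sample
```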
Technical Requirements
Programming
- Proficiency in Python
- Experience with scripting languages for automation
- Solid understanding of SQL for data querying and transformation

Data Processing Frameworks
- Hands-on experience with Apache Spark, Hadoop, or Apache Beam
- Familiarity with ETL/ELT concepts and data pipeline design

Database and Querying
- Strong understanding of SQL
- Experience with relational databases (PostgreSQL, MySQL, Oracle)
- Experience with NoSQL databases (MongoDB, Cassandra, DynamoDB)

Cloud Platforms
- Familiarity with Microsoft Azure data services: Azure Data Factory, Azure Synapse Analytics, Azure Data Lake, Azure Databricks
- Familiarity with AWS (S3, Redshift, Glue) is a plus
- Awareness of Azure security and identity concepts (RBAC, managed identities) is advantageous

Data Warehousing
- Experience with Databricks, Snowflake, Redshift, or BigQuery
Data Pipelines and Orchestration
- Knowledge of tools like Apache Airflow, Prefect, or Luigi
Big Data Tools
- Experience with distributed data systems and storage solutions like HDFS
Version Control
- Proficiency with Git for code versioning and collaboration
Preferred Qualifications
- Exposure to Azure DevOps or GitHub Actions
- Familiarity with Agile / Scrum delivery environments
- Interest in enterprise analytics, cloud platforms, and data governance
- Awareness of data privacy and governance principles (e.g., GDPR concepts)
- Experience: practical exposure to building and optimizing scalable data pipelines, including batch and real-time data processing
- Debugging: familiarity with diagnosing and resolving performance issues in data systems
- Data Governance: understanding of data privacy regulations (e.g., GDPR, CCPA) and experience implementing data quality checks and access controls

Note: Multi-cloud exposure (AWS / GCP) is beneficial but not required. The primary environment is Microsoft Azure.

Certifications (Optional but Valuable)
- AWS Certified Data Analytics – Specialty
- Google Professional Data Engineer
- Microsoft Azure Data Engineer Associate
- Databricks Certified Data Engineer Associate

Soft Skills
- Problem-Solving: Ability to troubleshoot complex data and system issues independently
- Communication: Collaborate with data analysts, scientists, and engineers to understand data needs and deliver solutions
- Documentation: Document data workflows, system designs, and troubleshooting procedures effectively
- Team Collaboration: Experience working in cross-functional teams using Agile or similar methodologies

Education
- Bachelor’s degree (or final-year student) in Computer Science, Data Engineering, Information Systems, Engineering, or a related field
- Relevant projects, internships, or practical experience may substitute for formal education

Learning Opportunities
- Hands-on experience building data pipelines in a Microsoft Azure enterprise environment
- Exposure to lakehouse architectures, analytics platforms, and cloud security practices
- Practical experience with Databricks, Azure Data Factory, and Synapse
- Mentorship from senior data engineers and architects working on live programs
- Insight into how data engineering supports large-scale infrastructure, engineering, and program delivery

Duration
Internship duration: 3 to 6 months, with the possibility of extension.

Parsons is committed to equal representation at all job levels regardless of race, color, religion, sex (including pregnancy), national origin, age, disability, or genetic information.

We truly invest in and care about our employees’ wellbeing and provide endless growth opportunities as the sky is the limit, so aim for the stars! Imagine next and join the Parsons quest. APPLY TODAY!
Parsons is aware of fraudulent recruitment practices. To learn more about recruitment fraud and how to report it, please refer to https://www.parsons.com/fraudulent-recruitment/.