Falls Church, Virginia, United States of America

AI DATA ENGINEER SENIOR


Own your opportunity to turn data into measurable outcomes for our customers’ most complex challenges. As an AI Data Engineer Senior at GDIT, you’ll power innovation to drive mission impact and grow your expertise to power your career forward.

MEANINGFUL WORK AND PERSONAL IMPACT
As an AI Data Engineer Senior, the work you’ll do at GDIT will directly support the mission of GSA. We are seeking a highly skilled and motivated Senior AI Data Engineer with a proven track record of building scalable data platforms and pipelines and demonstrated experience incorporating Generative AI into data engineering workflows. The ideal candidate will have deep expertise in Databricks data engineering capabilities, including Delta Lake, data pipelines, and Unity Catalog, combined with innovative use of GenAI for enhancing data quality, metadata generation, and workflow automation. You will work collaboratively with data scientists, AI engineers, and analytics teams to design and implement robust data infrastructure that powers AI/ML initiatives. Additionally, you will play a key role in establishing data engineering best practices and mentoring team members in modern data platform technologies.


WHAT YOU’LL NEED TO SUCCEED
Bring your expertise and drive for innovation to GDIT. The AI Data Engineer Senior must have:
● Education: Bachelor of Science
● Experience: 5+ years of related experience
● Technical skills: Databricks Data Engineering, Delta Lake, GenAI-Enhanced Workflows, Python, PySpark, AWS

● Responsibilities:

○ Design, build, and maintain scalable data pipelines and ETL/ELT workflows using Databricks and PySpark for AI/ML and analytics workloads
○ Leverage Databricks core data capabilities, including Delta Lake, Delta Live Tables, and Databricks Workflows, to create reliable, high-performance data platforms
○ Implement GenAI-enhanced data workflows for automated metadata generation, data cataloging, data quality validation, and intelligent data profiling
○ Utilize LLMs to generate documentation, create data dictionaries, and automate schema inference and data lineage tracking
○ Design and implement medallion architecture (Bronze, Silver, Gold layers) following data lakehouse best practices
○ Collaborate with data architects to establish data modeling standards, governance policies, and data quality frameworks
○ Integrate AWS data services (S3, Glue, Kinesis, MSK, Redshift) with Databricks to build end-to-end data solutions
○ Integrate with Unity Catalog or other enterprise data catalogs and access management tools for data governance, access control, and data asset management across the platform
○ Optimize data pipeline performance through partitioning strategies, caching, and query optimization techniques
○ Establish DataOps and MLOps practices, including version control, CI/CD for data pipelines, and automated testing
○ Create reusable data transformation frameworks and libraries to accelerate data pipeline development
○ Collaborate with AI/ML teams to prepare, curate, and serve high-quality datasets for model training and inference
○ Implement real-time and batch data processing architectures to support diverse analytics and AI use cases
○ Stay current with emerging data engineering technologies, GenAI capabilities, and Databricks platform enhancements
○ Document data architectures, pipeline designs, and operational procedures for knowledge sharing and compliance

● Required Skills:

○ 5+ years of proven experience as a Data Engineer with a focus on building large-scale data platforms and pipelines
○ 3+ years of hands-on experience with the Databricks platform, specifically its data engineering features (Delta Lake, DLT, Workflows, Unity Catalog)
○ 2+ years of experience incorporating Generative AI into data engineering workflows (metadata generation, data quality, documentation)
○ 5+ years of strong proficiency in Python and PySpark for distributed data processing
○ 3+ years of experience with AWS data services (S3, Glue, Lambda, Kinesis, Redshift, Athena)
○ Deep understanding of data lakehouse architecture, Delta Lake ACID transactions, and time travel capabilities
○ Proven experience with SQL optimization, data modeling, and dimensional modeling techniques
○ Strong knowledge of data orchestration tools and workflow management (Airflow, Databricks Workflows)
○ Experience implementing data quality frameworks and validation rules at scale
○ Understanding of data governance, data lineage, and metadata management principles
○ Excellent problem-solving skills with the ability to debug complex data pipeline issues
○ Strong communication skills to collaborate with data scientists, analysts, and business stakeholders
○ Experience working in Agile environments with version control (Git) and CI/CD practices


GDIT IS YOUR PLACE
At GDIT, the mission is our purpose, and our people are at the center of everything we do.
● Growth: AI-powered career tool that identifies career steps and learning opportunities
● Support: An internal mobility team focused on helping you achieve your career goals
● Rewards: Comprehensive benefits and wellness packages, 401K with company match, and competitive pay and paid time off
● Community: Award-winning culture of innovation and a military-friendly workplace

OWN YOUR OPPORTUNITY
Explore a career in data science and engineering at GDIT and you’ll find endless opportunities to grow alongside colleagues who share your determination to solve complex data challenges.
