Bangalore / Chennai, India
Senior Data Engineer
We are seeking a highly skilled and experienced Senior Data Engineer with deep expertise in Databricks to join our Digital Capital team. The ideal candidate will have over 6 years of experience working in Databricks to drive operational excellence across the platform; to develop, optimize, and maintain data pipelines; and will have a solid foundation in traditional enterprise data warehousing. You will play a critical role in building and maintaining our next-generation data platform, ensuring data quality, reliability, and accessibility for various analytical and operational needs across CDM Smith.

Key Responsibilities:
• Databricks Platform: Act as a subject matter expert for the Databricks platform within the Digital Capital team, providing technical guidance, best practices, and innovative solutions.
• Databricks Workflows and Orchestration: Design and implement complex data pipelines using Azure Data Factory or Qlik Replicate.
• End-to-End Data Pipeline Development: Design, develop, and implement highly scalable and efficient ETL/ELT processes using Databricks notebooks (Python/Spark or SQL) and other Databricks-native tools.
• Delta Lake Expertise: Utilize Delta Lake for building reliable data lake architecture, implementing ACID transactions, schema enforcement, time travel, and optimizing data storage for performance.
• Spark Optimization: Optimize Spark jobs and queries for performance and cost efficiency within the Databricks environment. Demonstrate a deep understanding of Spark architecture, partitioning, caching, and shuffle operations.
• Data Governance and Security: Implement and enforce data governance policies, access controls, and security measures within the Databricks environment using Unity Catalog and other Databricks security features.
• Collaborative Development: Work closely with data scientists, data analysts, and business stakeholders to understand data requirements and translate them into Databricks-based data solutions.
• Monitoring and Troubleshooting: Establish and maintain monitoring, alerting, and logging for Databricks jobs and clusters, proactively identifying and resolving data pipeline issues.
• Code Quality and Best Practices: Champion best practices for Databricks development, including version control (Git), code reviews, testing frameworks, and documentation.
• Performance Tuning: Continuously identify and implement performance improvements for existing Databricks data pipelines and data models.
• Cloud Integration: Integrate Databricks with other cloud services (e.g., Azure Data Lake Storage Gen2, Azure Synapse Analytics, Azure Key Vault) for a seamless data ecosystem.
• Traditional Data Warehousing & SQL: Design, develop, and maintain schemas and ETL processes for traditional enterprise data warehouses. Demonstrate expert-level proficiency in SQL for complex data manipulation, querying, and optimization within relational database systems.