Chennai, IND
20 hours ago
Officer - Big Data Engineer - C11 - Hybrid - Chennai
Responsible for designing, developing, and optimizing data processing solutions using a combination of Big Data technologies. The role focuses on building scalable, efficient data pipelines that handle large datasets and enable both batch and real-time data streaming and processing.

Responsibilities:
> Develop Spark applications using Scala or Python (PySpark) for data transformation, aggregation, and analysis.
> Develop and maintain Kafka-based data pipelines: design Kafka Streams applications, set up Kafka clusters, and ensure efficient data flow.
> Create and optimize Spark applications using Scala and PySpark: use these languages to process large datasets and implement data transformations and aggregations.
> Integrate Kafka with Spark for real-time processing: build systems that ingest real-time data from Kafka and process it using Spark Streaming or Structured Streaming.
> Collaborate with data teams: work with data engineers, data scientists, and DevOps to design and implement data solutions.
> Tune and optimize Spark and Kafka clusters: ensure high performance, scalability, and efficiency of data processing workflows.
> Write clean, functional, and optimized code: adhere to coding standards and best practices.
> Troubleshoot and resolve issues: identify and address problems in Kafka and Spark applications.
> Maintain documentation: create and maintain documentation for Kafka configurations, Spark jobs, and other processes.
> Stay updated on technology trends: continuously learn and apply new advancements in functional programming, big data, and related technologies.

Proficiency in:
> **Hadoop** ecosystem big data tech stack (HDFS, YARN, MapReduce, Hive, Impala).
> **Spark (Scala, Python)** for data processing and analysis.
> Kafka for real-time data ingestion and processing.
> ETL processes and data ingestion tools.
> Deep hands-on expertise in PySpark, Scala, and Kafka.

Programming Languages:
> Scala, Python, or Java for developing Spark applications.
> SQL for data querying and analysis.

Other Skills:
> Data warehousing concepts.
> Linux/Unix operating systems.
> Problem-solving and analytical skills.
> Version control systems.

------------------------------------------------------
**Job Family Group:** Technology
------------------------------------------------------
**Job Family:** Applications Development
------------------------------------------------------
**Time Type:** Full time
------------------------------------------------------
**Most Relevant Skills** Please see the requirements listed above.
------------------------------------------------------
**Other Relevant Skills** For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
_Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law._

_If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi_ (https://www.citigroup.com/citi/accessibility/application-accessibility.htm).

_View Citi’s EEO Policy Statement_ (https://www.citigroup.com/global/eeo-aa-policy) _and the Know Your Rights poster_ (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).

Citi is an equal opportunity and affirmative action employer. Minority/Female/Veteran/Individuals with Disabilities/Sexual Orientation/Gender Identity.