Job Description
· We are looking for a skilled Big Data QA Engineer to join our team and ensure the quality and reliability of our data processing pipelines.
· This role involves designing and executing both automated and manual tests for batch workflows, validating data integrity across transformation stages, and maintaining a robust Python-based test automation framework.
· The ideal candidate will have hands-on experience with AWS services, big data formats, and a strong understanding of testing methodologies in a data-centric environment.
Preferred Skills and Experience:
· Design, develop, and execute automated and manual batch test cases for data processing pipelines.
· Validate data integrity and transformation logic across multiple stages of batch workflows.
· Monitor and troubleshoot batch job failures, performance issues, and data mismatches.
· Maintain and enhance the Python-based test automation suite and framework.
· Evaluate system performance, reproduce issues, and work with development teams to resolve identified problems.
· Collaborate effectively with peers and management to improve application quality and testing processes.
· Minimum 4 years of experience with a focus on batch testing and data validation.
· Strong proficiency in Python for test automation and scripting.
· Hands-on experience with AWS services such as S3, Lambda, EMR, and Athena.
· Experience with data formats like JSON, Parquet, and CSV, and tools like Pandas and PySpark.
· Familiarity with version control systems (e.g., Git) and CI/CD tools (e.g., Jenkins, GitHub Actions).
· Solid understanding of software testing methodologies, including regression, integration, and system testing.
· Ability to analyze logs and metrics to identify root causes of failures.
· Experience working in an Agile environment.
· Excellent verbal and written communication skills, and the ability to collaborate with cross-functional teams.