Noida, IND
19 hours ago
AI Data Engineer
**Introduction**

At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let’s talk.

IBM Consulting is IBM’s consulting and global professional services business, with market-leading capabilities in business and technology transformation. With deep expertise in many industries, we offer strategy, experience, technology, and operations services to many of the most innovative and valuable companies in the world. Our people are focused on accelerating our clients’ businesses through the power of collaboration. We believe in the power of technology responsibly used to help people, partners and the planet.

Within IBM Consulting, Asset Engineering Services is a group that builds software products and repeatable solutions to accompany and support multiple consulting and services engagements across different clients.

Asset Engineering is seeking an AI Data Engineer to build intelligent, scalable data pipelines that power advanced AI/ML agents. This role is central to transforming raw infrastructure and observability data into high-quality, ML-ready features that support agentic AI, RAG pipelines, and real-time incident reasoning.

**Your role and responsibilities**

* Design and implement AI/ML data pipelines for structured, semi-structured, and unstructured data ingestion from logs, IaC templates (Terraform/CloudFormation), metrics, traces, and incidents.
* Build ETL/ELT workflows to power feature extraction for ML models and Retrieval-Augmented Generation (RAG) systems.
* Construct vector embedding pipelines for document chunking and ingestion into vector databases (e.g., FAISS, Weaviate, Qdrant) to support LLM-based RAG flows.
* Engineer data inputs for root cause analysis, event correlation, and incident classification in AIOps contexts using observability platforms (Prometheus, ELK, Instana).
* Parse and model infrastructure-as-code templates (Terraform, CloudFormation) to extract compliance metadata and enforce policy alignment.
* Build streaming or batch pipelines to process telemetry and service health signals for downstream AI agents.
* Integrate with LangGraph, CrewAI, and LLM workflows, ensuring data availability and quality for inference tasks.
* Maintain data lakes, feature stores, and versioned datasets for reproducibility and auditing.
* Work closely with ML Engineers to optimize data pipelines for latency, scalability, and explainability.

**Required technical and professional expertise**

* Total years of experience: 8+ years, with at least 5 years of relevant experience.
* Expertise in data engineering for AI/ML use cases, including feature engineering, embedding generation, and training data prep.
* Minimum of 5 years working as a DBA or on data pipelines, and 1 year of experience as an AI Data Engineer.
* Experience with Python, Spark, or Java for data processing.
* Familiarity with RAG architectures and vector database ingestion workflows.
* Strong knowledge of SQL, Parquet/ORC/Avro, and schema evolution best practices.
* Hands-on experience with data orchestration tools (Airflow, Prefect, Dagster).
* Proficiency in data lake and warehouse technologies (e.g., S3, BigQuery, Delta Lake, Snowflake).
* Familiarity with observability data sources (APM, logging, metrics) and ITSM systems like ServiceNow.
* Understanding of infrastructure metadata, IaC artifacts, and compliance rules extraction.

**Preferred technical and professional experience**

* Knowledge of vector databases (FAISS, Pinecone, Qdrant) and embedding models (e.g., BERT, OpenAI, Hugging Face).
* Experience integrating with agentic frameworks (LangChain, LangGraph, CrewAI).
* Experience with stream processing frameworks (Kafka, Flink, Kinesis).
* Exposure to cloud AI/ML services on AWS, GCP, or Azure.
* Understanding of OpenAPI 3.0, Flask APIs, and integration into AI microservices.

IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.