Intern 2026: AI Inference Optimization Engineer
IBM
**Introduction**
IBM Research takes responsibility for technology and its role in society. Working in IBM Research means you'll join a team that invents what's next in computing, always choosing the big, urgent, and mind-bending work that endures and shapes generations. Our passion for discovery and excitement for defining the future of tech build our strong culture around solving problems for clients and seeing the real-world impact you can make.
IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
**Your role and responsibilities**
As a software engineer with IBM Research, you'll bridge the gap between groundbreaking AI research and practical software solutions. Collaborating with top researchers and developers, your mission is to bring AI and Hybrid Cloud advancements into IBM products. You'll construct the AI platform technology stack, building the software components that optimize specialized AI hardware and leverage new software paradigms with AI agents.
Key Duties
* Apply techniques in AI model development and training, and perform foundation model inference and deployment using containerized programming paradigms
* Integrate innovative LLMs of various model architectures, including hybrid Mixture-of-Experts models, by leveraging and contributing to leading open-source AI libraries and frameworks such as PyTorch, TensorFlow, vLLM, Hugging Face Transformers, and TRL
* Enhance data handling and pre-processing techniques using open source libraries for Natural Language Processing (NLP) tasks
* Design and execute performance evaluation and benchmarking using both simulation-based and empirically measured techniques
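As an illustration of the kind of latency/throughput benchmarking this last duty involves, here is a minimal sketch in plain Python; the `benchmark` helper and its parameters are hypothetical, not part of any IBM tooling:

```python
import statistics
import time

def benchmark(fn, *args, warmup=3, iters=20):
    """Measure per-call latency (seconds) of fn(*args). Hypothetical helper."""
    for _ in range(warmup):           # warm caches before timing
        fn(*args)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()   # high-resolution monotonic clock
        fn(*args)
        samples.append(time.perf_counter() - start)
    mean = statistics.fmean(samples)
    return {
        "p50_s": statistics.median(samples),
        "mean_s": mean,
        "throughput_per_s": 1.0 / mean,
    }

# Example: time a stand-in "inference" function
stats = benchmark(lambda n: sum(range(n)), 100_000)
```

In a real serving context the stand-in lambda would be replaced by a model-inference call, and percentiles beyond the median (p95, p99) are usually tracked as well.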
**Required technical and professional expertise**
* Student enrolled in a Master's or Ph.D. program in Computer Science or a related field
* Strong programming skills in languages such as Python, Java, or C/C++
* Strong proficiency in software engineering principles, with a focus on scalable and maintainable code, particularly for AI or machine learning systems
* Understanding of various machine learning algorithms and their applications
* Knowledge of model serving frameworks like vLLM, TensorFlow Serving, or TorchServe and experience with ML frameworks such as TensorFlow, PyTorch, Keras, Scikit-Learn
* Proficiency in using version control systems like Git for collaborative development
* Proven track record of contributing to open-source projects, preferably in AI-related domains
**Preferred technical and professional experience**
* Proficiency in designing, training, and validating machine learning models, particularly in the domain of Natural Language Processing (NLP), using libraries like PyTorch and Hugging Face Transformers
* Experience in implementing and fine-tuning pre-trained models for specific use cases
* Expertise in containerization technologies such as Docker and familiarity with container orchestration platforms like Kubernetes for managing and scaling AI applications
* Ability to deploy AI models for inference, ensuring low latency and high throughput
* Skills in hyperparameter tuning techniques to optimize model performance
* Experience working with GraphQL and its implications for LLMs
* Understanding of model compression and quantization methods to improve inference speed and reduce memory footprint
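To give a concrete sense of the quantization methods the last bullet refers to, here is a minimal, self-contained sketch of symmetric int8 weight quantization in plain Python; the function names and the example weights are illustrative, not taken from any framework:

```python
def quantize_int8(values):
    """Symmetric int8 quantization: map floats onto the integer range [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # guard against all-zero input
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Recover approximate floats from int8 codes and the stored scale."""
    return [q * scale for q in quantized]

# Toy "weights" standing in for a model tensor
weights = [0.5, -1.2, 0.03, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each weight is stored as a single signed byte plus one shared float scale, which is the core of the memory-footprint reduction; production frameworks apply the same idea per tensor or per channel, often with calibration to limit the rounding error visible in `restored`.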
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.