PRIMARY FUNCTION
The Data and AI Scientist Intern supports the Data Science and AI team in developing, testing, and validating AI-enabled analytics workflows, with a strong emphasis on Large Language Model (LLM) based applications. Working under close supervision, the intern contributes to activities such as prompt design, retrieval-augmented techniques, model interaction, and evaluation of AI outputs, as well as other applied AI and analytics tasks as needed. The role supports internal experimentation and prototyping in notebook-based environments while following established quality, privacy, and responsible AI guidelines. This position emphasizes hands-on learning, technical problem-solving, and exposure to applied AI systems in a regulated environment.
ESSENTIAL DUTIES AND RESPONSIBILITIES
Assist in designing, testing, and refining prompts for Large Language Model (LLM)–enabled tools across multiple internal use cases. Support development and evaluation of retrieval-augmented generation (RAG) and other retrieval-based approaches by testing document selection, context quality, and response accuracy. Perform structured evaluation and quality assurance of AI outputs, including accuracy checks, consistency reviews, and adherence to internal guidelines. Support internal experimentation and prototyping of AI-enabled workflows in notebook-based environments (e.g., Python, SQL, Databricks) under close supervision. Document prompts, evaluation results, methodologies, and findings to support knowledge sharing and reproducibility. Follow established responsible AI, data privacy, and security standards when working with sensitive or regulated information.
QUALIFICATIONS
EDUCATION: Master’s degree in Computer Science, Data Science, Artificial Intelligence, Engineering, or a related quantitative field required. PhD in a relevant discipline is preferred.
EXPERIENCE: Prior academic, research, or professional experience involving AI, machine learning, or data science required. Experience working with healthcare data or healthcare-related AI use cases is preferred.
KNOWLEDGE, SKILLS AND ABILITIES
Strong foundational knowledge of artificial intelligence, machine learning, and data science concepts, typically gained through graduate-level coursework or doctoral research. Demonstrated understanding of Large Language Models (LLMs), including prompt design, model behavior, limitations, and evaluation techniques. Familiarity with retrieval-augmented generation (RAG) concepts, embedding-based retrieval, or other techniques for grounding model responses in source material. Proficiency in Python and experience working in notebook-based analytical environments; familiarity with SQL and cloud-based platforms is preferred. Ability to design structured evaluations, analyze AI outputs critically, and identify patterns, errors, or opportunities for improvement. Strong written communication skills for documenting technical work, experiments, and findings clearly and concisely. Ability to work independently on defined tasks while collaborating effectively within a technical team under close supervision. Awareness of responsible AI principles, data privacy, and security considerations, particularly in regulated or healthcare environments.
TYPICAL WORKING CONDITIONS
Full-time remote. This role must be U.S.-based. May involve working with sensitive or regulated data in compliance with privacy, security, and responsible AI policies.
PERFORMANCE REQUIREMENTS
Adhere to all organizational information security policies and protect all sensitive information, including but not limited to ePHI and PHI (Protected Health Information), in accordance with organizational policy and Federal, State, and local regulations.
The foregoing description is not intended to be, and should not be construed as, an exhaustive list of all responsibilities, skills, efforts, or working conditions associated with the job. It is intended to be an accurate reflection of the general nature and level of the job.