Develop NLP Models for Medical Data Extraction
Develop, evaluate and deploy ML and deep learning NLP models to perform field extraction on radiology reports.

Jay Jha

Description
Radiology reports follow a standard format with key sections that medical specialists must complete when analyzing patient scans. This Build Project focuses on developing a machine learning model that can automatically segment radiology reports into these standard sections, such as title, indication, technique, comparisons, findings, and impression. You'll take on the role of a Machine Learning Engineer, applying natural language processing techniques to build predictive models that extract these sections from unstructured text. You'll learn to represent text as embeddings, visualize data, and implement MLOps to automate training, evaluation and deployment of these machine learning models. These skills—ranging from NLP to MLOps—are in high demand across industries, equipping you with practical experience relevant to entry-level roles in AI-driven healthcare, data science, and machine learning engineering. As automation and AI continue to reshape the workforce, expertise in building, deploying, and optimizing machine learning models is essential for addressing real-world challenges in healthcare and beyond.
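To make the segmentation task concrete, here is a minimal rule-based sketch in Python. It is only a baseline for illustration, not the machine learning approach this project builds. It assumes each section starts on its own line with an uppercase header followed by a colon (for example `FINDINGS:`), and that any text before the first header is the title. The function name `segment_report` and the sample report are hypothetical.

```python
import re

# Assumption: sections are introduced by "HEADER:" at the start of a line.
# Real reports vary widely, which is why the project uses learned models.
HEADER_RE = re.compile(
    r"^\s*(indication|technique|comparisons?|findings|impression)\s*:\s*",
    re.IGNORECASE | re.MULTILINE,
)


def segment_report(text):
    """Split a report into a dict mapping section name to section text."""
    sections = {}
    matches = list(HEADER_RE.finditer(text))

    # Treat everything before the first recognized header as the title.
    first_start = matches[0].start() if matches else len(text)
    title = text[:first_start].strip()
    if title:
        sections["title"] = title

    for i, match in enumerate(matches):
        # A section runs until the next header, or to the end of the report.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        name = match.group(1).lower()
        if name == "comparison":
            name = "comparisons"  # normalize singular and plural headers
        sections[name] = text[match.end():end].strip()
    return sections


# Hypothetical report text, written for illustration only.
sample_report = """CT CHEST WITHOUT CONTRAST
INDICATION: Cough for three weeks.
TECHNIQUE: Axial images of the chest were obtained.
COMPARISON: None.
FINDINGS: The lungs are clear. No pleural effusion.
IMPRESSION: No acute cardiopulmonary abnormality."""

sections = segment_report(sample_report)
for name, body in sections.items():
    print(f"{name}: {body}")
```

A baseline like this breaks as soon as headers are missing, misspelled, or embedded in running text, which is the gap that embedding-based and transformer-based models are meant to close.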
What you will learn
- Understand different embedding techniques.
- Build traditional and transformer-based NLP models.
- Build workflow orchestration pipelines.
- Use Docker to deploy machine learning models.
Prerequisites
- Intermediate Python skills: you should know how to debug and run Python code and have experience with Python data structures such as lists, strings, and dictionaries.
- Knowledge of GitHub for version control
- Experience using machine learning libraries: scikit-learn, Hugging Face, PyTorch
- Experience developing and testing NLP models
- Familiarity with Docker for deploying and managing containerized applications
Apply Now!
Ready to start this exciting project? Submit your application today and begin your journey with Build!
About the Fellow

Jay Jha


Jay is a Computer Science Build Fellow at The Build Fellowship, where he works with students leading projects in applied artificial intelligence and natural language processing. Jay is an AI Engineer at Alexandria Technology, where he focuses on developing systems for financial document analysis. Jay has over 6 years of experience in the machine learning field. His career spans work in healthcare AI, substance use research, and financial NLP. He has contributed to multiple patents in generative AI for radiology. He holds a Master's degree in Computer Science with a focus on artificial intelligence. A fun fact about Jay is that he gets a new personality every week based on whatever podcast he's into.