Medical Text Classification using Large Language Models
Natural Language Processing in Cancer:
Extracting Diagnostic Insights from Pathology Reports
PROJECT SPONSOR: National AI Campus
Project Liaisons:
Yimeng He and Xiuzhen Huang
Description:
This project focuses on leveraging natural language processing (NLP) techniques to extract critical information from pathology reports.
Participants will gain hands-on experience with a wide range of text classification methods, from traditional TF-IDF analysis to cutting-edge large language models. The Cancer Genome Atlas (TCGA) pathology report corpus used in this project offers a unique opportunity for the development of advanced NLP technologies that can ultimately enhance patient diagnosis, treatment selection, and many other aspects of cancer care.
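As a rough illustration of the "traditional TF-IDF" end of that range, the sketch below builds TF-IDF vectors by hand and assigns a report to the class with the most similar centroid. It is a minimal, standard-library-only example and not the project's actual notebook code: the four training snippets are invented for illustration (they are not drawn from the TCGA corpus), and the function names (`fit_idf`, `tfidf_vector`, `train_centroids`, `predict`) are hypothetical. The nearest-centroid classifier is used here only as a simple stand-in for whichever model the notebooks use.

```python
import math
import re
from collections import Counter, defaultdict

# Invented toy snippets for illustration only; NOT taken from the TCGA corpus.
TRAIN_TEXTS = [
    "invasive ductal carcinoma of the breast with lymph node metastasis",
    "breast tissue showing ductal carcinoma in situ",
    "adenocarcinoma of the lung, right upper lobe",
    "lung biopsy with squamous cell carcinoma of the bronchus",
]
TRAIN_LABELS = ["breast", "breast", "lung", "lung"]


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(documents):
    """Compute a smoothed inverse document frequency for every training term."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Return a sparse TF-IDF vector (term -> weight); unseen terms are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(texts, labels, idf):
    """Average the TF-IDF vectors of each class into one centroid per label."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for text, label in zip(texts, labels):
        for term, weight in tfidf_vector(text, idf).items():
            sums[label][term] += weight
    return {
        label: {term: w / counts[label] for term, w in vec.items()}
        for label, vec in sums.items()
    }


def predict(text, idf, centroids):
    """Return the label whose centroid is most similar to the text."""
    vec = tfidf_vector(text, idf)
    return max(sorted(centroids), key=lambda label: cosine(vec, centroids[label]))


idf = fit_idf(TRAIN_TEXTS)
centroids = train_centroids(TRAIN_TEXTS, TRAIN_LABELS, idf)
print(predict("ductal carcinoma identified in breast specimen", idf, centroids))
```

Terms shared by every class, such as "carcinoma", receive the lowest IDF weight, so site-specific words like "ductal" or "bronchus" dominate the similarity score. With a real corpus, a library vectorizer and a trained classifier would normally replace these hand-written pieces.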
Objective:
Fall 2025:
- To develop an understanding of the technology used to build LLMs.
- We created multiple Jupyter notebooks to process data, optimize classification accuracy, and apply NLP to technically worded data.
Spring 2026:
- To develop and implement new ideas that build on what we learned last semester, this time using the cancer pathology datasets. Details TBA.
Team Layout:
| ROLE | NAME | EMAIL |
| Advisor | Yuqing Zhu | yzhu14@calstatela.edu |
| Team Lead | Kenia Sanchez-Macario | ksanch183@calstatela.edu |
| Coder | Christopher Gonzales | cgonza238@calstatela.edu |
| Coder | Rocio Hernandez | rherna168@calstatela.edu |
| Coder | Joseph Howerton | |
| Coder | Yvan Kemsseu Yobeu | ykemsse2@calstatela.edu |
| Coder | Haonan Ma | hma4@calstatela.edu |
| Coder | Steven Magana | smagan26@calstatela.edu |
| Coder | Alan Mai | amai15@calstatela.edu |
| Coder | Georgina Mateo | gmateo2@calstatela.edu |
| Coder | Laura Rodriguez Zea | lrodri161@calstatela.edu |
| Coder | Sean Santos | ssanto40@calstatela.edu |
Project Stack:
| AI (LLM & NLP) | WHOLE TEAM |
Meetings:
| MEETING | DATE | TIME |
| Weekly Team Meeting | Fridays | 9:45AM - 11:00AM |
| Biweekly Liaison Meeting | Fridays | 9:00AM - 9:45AM |