Medical Text Classification using Large Language Models


Natural Language Processing in Cancer:
Extracting Diagnostic Insights from Pathology Reports

PROJECT SPONSOR: National AI Campus


Project Liaisons: 
Yimeng He and Xiuzhen Huang

Description: 

This project focuses on leveraging natural language processing (NLP) techniques to extract critical information from pathology reports.
Participants will gain hands-on experience with a wide range of text classification methods, from traditional TF-IDF analysis to cutting-edge large language models. The Cancer Genome Atlas (TCGA) pathology report corpus used in this project offers a unique opportunity for the development of advanced NLP technologies that can ultimately enhance patient diagnosis, treatment selection, and many other aspects of cancer care.
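As a rough illustration of the traditional end of that range, the following is a minimal sketch of TF-IDF text classification using only the Python standard library. It is not the team's notebook code. The four toy report snippets, the label names, and every function name here (`fit_idf`, `tfidf_vector`, `classify`, and so on) are assumptions made for the example and are not taken from the TCGA corpus. The sketch weights each word by how rare it is across the training texts, then assigns a new text the label of the most similar training text by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical toy snippets, invented for illustration (not TCGA data).
TOY_REPORTS = [
    ("invasive ductal carcinoma of the breast", "breast"),
    ("breast tissue with lobular carcinoma", "breast"),
    ("adenocarcinoma of the lung upper lobe", "lung"),
    ("lung squamous cell carcinoma in lower lobe", "lung"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1.0
        for term, count in doc_freq.items()
    }


def tfidf_vector(text, idf):
    """Return a sparse {term: weight} TF-IDF vector; unseen terms are ignored."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: (count / total) * idf[term] for term, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, train_vectors, train_labels, idf):
    """Label a text with the label of its most similar training document."""
    query = tfidf_vector(text, idf)
    scores = [cosine(query, vec) for vec in train_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return train_labels[best]


texts = [text for text, _ in TOY_REPORTS]
labels = [label for _, label in TOY_REPORTS]
idf = fit_idf(texts)
vectors = [tfidf_vector(text, idf) for text in texts]

print(classify("ductal carcinoma in breast tissue", vectors, labels, idf))
print(classify("squamous cell carcinoma lung", vectors, labels, idf))
```

A nearest-neighbor rule is used here only because it keeps the example short. In practice, TF-IDF features are more often fed to a trained classifier such as logistic regression.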

Objective:

Fall 2025:
- To develop an understanding of the technology used in building LLMs.
- We created multiple Jupyter notebooks to process data, optimize classification accuracy, and apply NLP to technically worded data.

Spring 2026:
- To develop and implement new ideas that build on last semester's work, this time using the cancer pathology datasets. Details TBA.


Team Layout: 
ROLE         NAME                     EMAIL
Advisor      Yuqing Zhu               yzhu14@calstatela.edu
Team Lead    Kenia Sanchez-Macario    ksanch183@calstatela.edu
Coder        Christopher Gonzales     cgonza238@calstatela.edu
Coder        Rocio Hernandez          rherna168@calstatela.edu
Coder        Joseph Howerton
Coder        Yvan Kemsseu Yobeu       ykemsse2@calstatela.edu
Coder        Haonan Ma                hma4@calstatela.edu
Coder        Steven Magana            smagan26@calstatela.edu
Coder        Alan Mai                 amai15@calstatela.edu
Coder        Georgina Mateo           gmateo2@calstatela.edu
Coder        Laura Rodriguez Zea      lrodri161@calstatela.edu
Coder        Sean Santos              ssanto40@calstatela.edu

Project Stack:

AI (LLM & NLP)    WHOLE TEAM

Meetings: 
MEETING                     DATE       TIME
Weekly Team Meeting         Fridays    9:45AM - 11:00AM
Biweekly Liaison Meeting    Fridays    9:00AM - 9:45AM