Self-Attention Mechanism of ChatGPT

AI systems have been studied and applied in small-scale applications for decades. ChatGPT, however, has taken the world by storm with its almost human-like replies and fast processing of human input. This breakthrough in AI technology, along with its accessibility to the general public, has opened the door for similar systems to be developed. Our team has focused its attention on Andrej Karpathy's small-scale GPT program and turned our efforts toward analyzing the core learning mechanism of ChatGPT. Because ChatGPT is a fairly new application of the GPT language model, this project dives into the core learning mechanism, called self-attention, and breaks down the process step by step.

The goal of this project is to explore self-attention using a small-scale, character-level dataset to replicate the format of various texts. The primary deliverable will be an expository report on how self-attention works.
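To make the mechanism concrete, here is a minimal sketch of scaled dot-product self-attention in the spirit of Karpathy's small GPT. This is an illustrative NumPy version, not the project's actual code: the weight matrices, dimensions, and function names below are our own assumptions. Each token's embedding is projected into a query, key, and value; the query–key dot products (scaled and softmax-normalized) give each token a weighting over the values of the tokens it may attend to, and a causal mask keeps each position from looking at future tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv, causal=True):
    # x: (T, C) -- a sequence of T token embeddings of size C.
    # Wq, Wk, Wv: (C, H) projection matrices (hypothetical names).
    q, k, v = x @ Wq, x @ Wk, x @ Wv          # queries, keys, values: (T, H)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (T, T) pairwise affinities
    if causal:
        # Mask out positions above the diagonal so token t only
        # attends to tokens 0..t (no peeking at the future).
        future = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v                        # (T, H) attention output

# Toy example: 4 tokens, embedding size 8, head size 16.
rng = np.random.default_rng(0)
T, C, H = 4, 8, 16
x = rng.normal(size=(T, C))
Wq, Wk, Wv = (rng.normal(size=(C, H)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 16): one attended vector per input token
```

In the character-level setting the project uses, each of the T positions would hold the embedding of one character, and the causal mask is what lets the model be trained to predict the next character from the ones before it.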

Student Team
  • Fardeen Abir
  • Nathan Campos
  • Anthony Edeza
  • Jose Flores De Santiago
  • Edward Kim
  • Kenneth Lieu
  • Kevin Mateo
  • Michael Nguyen
  • Roberto Reyes
  • Maggie Yang
Project Sponsor
Computer Science
Project Liaisons
  • Russ Abbott
Faculty Advisors
  • Yuqing Zhu