Traffic Prediction Using Deep Neural Networks

Project Background

The City of Los Angeles is well known for its diverse cultures and vibrant nightlife. However, it is also infamous for its traffic, especially during rush hour. To address this problem, the City of LA, the Data Science Federation, and LADOT have collaborated with CSULA. The project is one of the steps the City of Los Angeles is taking in preparation for the 2028 Olympics, when we presume traffic will get worse. Beyond the Olympics, this project is one of many that will assist the Los Angeles Department of Transportation (LADOT) in achieving Vision Zero, which aims to end all traffic injuries and deaths by 2025. The objectives of this project are to build an advanced machine learning model, the Graph Recurrent Neural Network (GRNN), that can predict traffic flow from time-series data; to analyze and visualize the resulting predictions; and to design a data preprocessing pipeline that city engineers can use to process any traffic data.

Challenges and Limitations

Like any interesting problem, this one comes with its own limitations and challenges. We faced the challenge of designing and implementing multiple techniques that can reasonably fill in 80% of the missing data. We also faced various big data challenges, both in data preprocessing and in training the GRNN model. As a result, our results are limited by our available computational resources and by the amount of actual data that we have.

Design Principles

The GRNN is a type of recurrent neural network that can train and make inferences on spatio-temporal data. This makes the model a good fit for the project, which involves both geographical and time-series data. The approach is also recent: it was published in November 2018 by researchers from Shanghai Jiao Tong University. The researchers overhauled the traditional road network representation by introducing the Linkage Network, which reduces both the redundancy of the traditional road network and the computational resources required.
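To make the idea of a Linkage Network more concrete, here is a minimal sketch of turning a road network into a graph whose nodes are road segments. It assumes that a "linkage" exists from one segment to another whenever the first segment ends at the intersection where the second begins. That definition, the function name build_linkage_network, and the toy segment data are illustrative assumptions and are not taken from the paper or from the project code.

```python
from collections import defaultdict


def build_linkage_network(segments):
    """Build a segment-to-segment adjacency list.

    segments maps a segment id to (start_intersection, end_intersection).
    Assumption for this sketch: segment a links to segment b when a ends
    at the intersection where b starts.
    """
    # Index segments by the intersection they start from.
    starts_at = defaultdict(list)
    for seg_id, (start, _end) in segments.items():
        starts_at[start].append(seg_id)

    # For each segment, list the segments traffic could continue onto.
    linkage = {}
    for seg_id, (_start, end) in segments.items():
        linkage[seg_id] = list(starts_at.get(end, []))
    return linkage


# Hypothetical toy network with intersections A, B, C, and D.
segments = {
    "s1": ("A", "B"),
    "s2": ("B", "C"),
    "s3": ("B", "D"),
    "s4": ("C", "D"),
}
linkage = build_linkage_network(segments)
print(linkage)
```

In this representation each road segment carries its own speed time series, so the graph handed to the model describes how traffic can move from one segment to the next rather than how intersections are connected.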

Following the GRNN research paper, we designed a data-agnostic preprocessing pipeline that can model any traffic data as a Linkage Network. The pipeline includes, but is not limited to, pre-cleaning the dataset, filtering and selecting the required features, filling in missing data for any time step, transforming the road network into a graph (the Linkage Network), computing on both CPU and GPU, and shaping the processed data to fit the input dimensions of the GRNN model.
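The final shaping step can be illustrated with a short sketch. It assumes the cleaned data is a matrix of speeds with one row per time step and one column per road segment, and that the model consumes fixed-length input windows with the next time step as the target. The exact tensor layout the GRNN expects is not stated here, so the shapes, the function name make_windows, and the parameters seq_len and horizon are assumptions for illustration only.

```python
import numpy as np


def make_windows(speeds, seq_len, horizon=1):
    """Slice a (num_steps, num_segments) array into training samples.

    Returns X with shape (num_samples, seq_len, num_segments) and
    y with shape (num_samples, num_segments), where y is the row that
    lies `horizon` steps after the end of each input window.
    """
    speeds = np.asarray(speeds, dtype=np.float32)
    num_samples = speeds.shape[0] - seq_len - horizon + 1
    if num_samples <= 0:
        raise ValueError("not enough time steps for the requested window")
    X = np.stack([speeds[i:i + seq_len] for i in range(num_samples)])
    y = np.stack([speeds[i + seq_len + horizon - 1] for i in range(num_samples)])
    return X, y


# Hypothetical data: 8 time steps for 3 road segments.
speeds = np.arange(24, dtype=np.float32).reshape(8, 3)
X, y = make_windows(speeds, seq_len=4)
print(X.shape, y.shape)
```

Arrays shaped this way can be converted to tensors and batched for a recurrent model; a different model layout would only require reordering the axes.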

Tools & Methods

To meet our requirements, we used various data analytics technologies and techniques. The major technologies include the GRNN, PyTorch, Scikit-Learn, NumPy, Pandas, Dask, Matplotlib, and ArcGIS. In short, we adapted the GRNN code to predict the average speed of traffic in the City of Los Angeles. PyTorch, Scikit-Learn, NumPy, and Pandas are the underlying libraries used to implement the GRNN. We used Dask because it allows distributed processing of dataframes. Matplotlib and ArcGIS are the main visualization tools we used to plot and verify the model and its predictions. One example of an advanced technique we used is K-Nearest Neighbors missing-data imputation, which imputes missing speed data for any given time step.
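As an illustration of K-Nearest Neighbors imputation, the sketch below fills each missing speed with the average of the same road segment's speed at the k most similar time steps. It is a hand-written NumPy version for readability, not the project's actual code. The choice of rows as time steps, the distance measure (mean squared difference over the segments observed in both rows), and the function name knn_impute are assumptions. Scikit-Learn provides a library implementation of the same idea as sklearn.impute.KNNImputer.

```python
import numpy as np


def knn_impute(data, k=2):
    """Fill NaNs in a (num_steps, num_segments) array of speeds.

    Each missing entry is replaced by the mean of its column over the
    k rows closest to the incomplete row, where closeness is the mean
    squared difference over columns observed in both rows.
    """
    data = np.asarray(data, dtype=float)
    filled = data.copy()
    observed = ~np.isnan(data)

    for i in np.where(~observed.all(axis=1))[0]:
        # Compare row i with every row on the columns both have observed.
        shared = observed & observed[i]
        diff = np.where(shared, data - data[i], 0.0)
        counts = shared.sum(axis=1)
        dist = np.full(len(data), np.inf)
        valid = counts > 0
        dist[valid] = (diff[valid] ** 2).sum(axis=1) / counts[valid]
        dist[i] = np.inf  # a row is never its own neighbor

        for j in np.where(~observed[i])[0]:
            # Only rows that actually observed column j can donate a value.
            candidates = np.where(observed[:, j] & np.isfinite(dist))[0]
            if candidates.size == 0:
                continue  # no usable neighbor, leave the gap as NaN
            nearest = candidates[np.argsort(dist[candidates])[:k]]
            filled[i, j] = data[nearest, j].mean()
    return filled


# Hypothetical speeds (mph): 4 time steps for 3 road segments.
speeds = np.array([
    [30.0, 40.0, 50.0],
    [31.0, np.nan, 51.0],
    [60.0, 70.0, 80.0],
    [29.0, 39.0, 49.0],
])
filled = knn_impute(speeds, k=2)
print(filled)
```

In this toy example the missing value is filled from the two time steps whose observed speeds are closest to the incomplete row, and the free-flowing third row is ignored.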

Results and Conclusion

In this project, we have successfully (1) designed and implemented a data preprocessing pipeline that creates the Linkage Network for the City of Los Angeles and reasonably imputes missing data for any time series, (2) trained and tested our GRNN model to predict traffic speeds on the streets of the Los Angeles Financial District, and (3) verified the GRNN model and its output using various visualizations. Although our model was only tested on the Financial District, the approach can be applied to any road network in LA, where we expect it to yield fairly accurate predictions.

Our Team

We are a proud team and are grateful to have such a fantastic opportunity to work on this project.

Project Lead: Javier Hernandez
Architecture Lead: Daniel Caceres
Data Optimization Lead: Russell Carter
Documentation & Data Visualization Lead: Hue Ngo
Quality Assurance Lead: Vrezh Khalatyan
Presentation Lead: Gracie Zamora
Assistant Advisers: Mohammad Vahedi & Luis Fisher

Student Team
  • Daniel Caceres
  • Javier Hernandez
  • Vrezh Khalatyan
  • Hue Ngo
  • Grecia Zamora