Document Tag Parser & Box.com/eDefender Integration
Document Tag Parser
Background
The attorneys and paralegals who work for the Santa Barbara Public Defender review a large volume of documents daily. To keep them organized, they rename each evidence document according to the sequences of numbers stamped on its first and last pages, known as Bates stamps. This requires staff to open and check each document manually. Cases often have twenty or more documents, so the process is time-consuming at scale. Our objective was to automate it so that large batches of documents can be renamed at once, in a fraction of the time the manual process takes.
Description
Our solution is a desktop Python application. It uses pytesseract, a Python wrapper for the Tesseract OCR engine, to convert image documents to text, so the application can read each document and then rename it automatically. A graphical user interface lets the user choose the folder to process and the folder where the renamed files are written. The application is multi-threaded to reduce overall processing time. Please refer to our poster and documentation for sample images of the application.
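The read-then-rename flow described above can be illustrated with a minimal Python sketch. It is not the application's actual source code. The stamp pattern (an optional uppercase prefix followed by four or more digits), the `first-last` naming scheme, and the helper names `find_bates_stamp`, `new_name`, `plan_renames`, and `ocr_image` are all assumptions made for illustration. Only `pytesseract.image_to_string` is a real library call.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List, Optional, Tuple

# Assumed stamp shape: optional uppercase prefix, optional "-" or "_", 4+ digits.
BATES_PATTERN = re.compile(r"\b(?:[A-Z]{1,6}[-_]?)?\d{4,}\b")


def ocr_image(image_path: Path) -> str:
    """Convert one page image to text with Tesseract."""
    # Imported lazily so the pure helpers below work without Tesseract installed.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def find_bates_stamp(page_text: str) -> Optional[str]:
    """Return the last stamp-like token on a page, or None if there is none.

    Taking the last match is a guess: stamps usually sit at the page bottom.
    """
    matches = BATES_PATTERN.findall(page_text)
    return matches[-1] if matches else None


def new_name(first_text: str, last_text: str, suffix: str) -> Optional[str]:
    """Build a 'first-last' file name (hypothetical naming scheme)."""
    first = find_bates_stamp(first_text)
    last = find_bates_stamp(last_text)
    if first is None or last is None:
        return None  # leave the file for manual review
    return f"{first}-{last}{suffix}"


def plan_renames(
    docs: Iterable[Path],
    read_pages: Callable[[Path], Tuple[str, str]],
    max_workers: int = 4,
) -> List[Tuple[Path, Optional[Path]]]:
    """Compute (old path, new path) pairs using a pool of worker threads.

    read_pages(path) must return the text of the first and last page.
    """

    def plan_one(path: Path) -> Tuple[Path, Optional[Path]]:
        first_text, last_text = read_pages(path)
        name = new_name(first_text, last_text, path.suffix)
        return path, (path.with_name(name) if name else None)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(plan_one, docs))
```

The sketch only plans the renames. A caller would apply each pair with `Path.rename` after checking that the target name is not already taken. Passing `read_pages` as a parameter keeps the OCR step separate from the naming logic, so the naming logic can be checked without Tesseract installed.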
Box.com and eDefender Integration
Background
Our sponsor, the Santa Barbara Public Defender, has transitioned to a fully paperless case management system. To support this transition, they are using Box.com as their cloud storage provider. This requires moving over 50 terabytes of data to the cloud, and their data needs are increasing exponentially each year. As part of this process, our team's objective was to assist the Santa Barbara Public Defender with integrating the Box cloud provider with their case management system, eDefender. Our role was to provide a way for transcriptions of case discovery media (video and audio) to be created and then uploaded to the cloud. This would allow attorneys and paralegals to review evidence more efficiently. Due to the complexity of this transition, contracts with Box and Azure are still pending and must be finalized before this project can be completed. After discussing the options with our sponsor, we agreed to put this part of the project on hold to assist with a separate task, the Document Tag Parser described above.
Description
The design is based on an application originally created by another CSULA team in 2020. A Box Skill application is a program that runs every time a file is uploaded to the Box cloud service. The program is a Box Skill application written in Node.js and deployed as a serverless instance in AWS. It uses Azure Video Analyzer to transcribe audio as well as video. The video analyzer also provides timestamps, using facial recognition to mark when individuals appear throughout the video. Every time a video or audio file is uploaded to Box, the application is triggered to handle the processing, and the transcript is returned to Box and integrated into its existing user interface. Finally, an alert system will be added to the Box Skill application to notify attorneys of new evidence and the corresponding transcriptions.
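The upload-triggered flow above can be sketched as a small handler. The team's skill is written in Node.js; this sketch uses Python only to illustrate the control flow. The event shape (`file_id`, `file_name`), the extension list, and the `transcribe` and `notify` callables are hypothetical stand-ins. None of them is the actual Box Skills payload or the Azure Video Analyzer API.

```python
from pathlib import PurePosixPath
from typing import Callable, Optional

# Assumed list of media extensions; the real skill's list is not given here.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mov", ".avi"}


def is_media_file(file_name: str) -> bool:
    """Return True when the uploaded file looks like audio or video."""
    return PurePosixPath(file_name).suffix.lower() in MEDIA_EXTENSIONS


def handle_upload(
    event: dict,
    transcribe: Callable[[str], str],
    notify: Callable[[str], None],
) -> Optional[dict]:
    """Handle one upload event (hypothetical shape, not Box's real payload).

    transcribe(file_id) stands in for the call to the transcription service.
    notify(file_id) stands in for the planned alert to attorneys.
    """
    if not is_media_file(event["file_name"]):
        return None  # documents and images are ignored by this skill

    transcript = transcribe(event["file_id"])
    notify(event["file_id"])
    # In the real skill, the transcript would be written back to Box here.
    return {"file_id": event["file_id"], "transcript": transcript}
```

Injecting `transcribe` and `notify` as parameters keeps the handler independent of any cloud SDK, which fits a serverless deployment where the same function body runs once per upload.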
Role | Name | Email |
---|---|---|
Faculty Advisor | Jungsoo Lim | jlim34@calstatela.edu |
Project Lead | Daniel Guevara-Dominguez | dgueva20@calstatela.edu |
Document Lead | Jesica Lopez De Leon | jlopezd3@calstatela.edu |
Customer Liaison/Requirements Lead | Luke Williams | lwillia@calstatela.edu |
Architecture/Design Lead | Sergio Tapia | Stapia11@calstatela.edu |
UI Lead | Shaocheng Shi | sshi5@calstatela.edu |
Backend Lead | Chuang Huang | chuang11@calstatela.edu |
Database Schema Lead | Marco De La Torre | mdelat23@calstatela.edu |
QA/QC Lead | Joshua Cabrera | jcabre83@calstatela.edu |
Demo Lead | Raul Gallegos | rgalle17@calstatela.edu |
Presentation Lead | Dang Le | dle18@calstatela.edu |
Meeting Schedule:
- Weekly meeting with the advisor: Thursday 6:00 PM
- Weekly team meeting: Friday 10:00 AM
- Meeting with liaison: Friday 9:00 AM
Link to Presentation Recording: https://cosantabarbara.box.com/s/i4fwfgu5x30pcm9by0gauiuse8e7jzz7
Team Members:
- Joshua Cabrera
- Marco De La Torre
- Raul Gallegos
- Daniel Guevara-Dominguez
- Chuang Huang
- Dang Le
- Jesica Lopez De Leon
- Shaocheng Shi
- Sergio Tapia
- Luke Williams
Project Files:
- Box and Box Skill Set up - Updated User Manual
- Document Tag Parser Installer
- Document Tag Parser Source Code
- Expo Presentation Recording
- Fall 2021 Presentation
- Final Presentation Slides - DTP & Box/eDefender Integration
- PD Discovery Overview
- Project Poster
- Project Report: Box.com/eDefender Integration Document Tag Parser
- REST+API
- SDD V2 Spring 2022 - DTP & Box/eDefender
- SRS V2 Spring 2022 - DTP & Box/eDefender
- Software Design Document (SDD) Version 1.1
- Software Requirements Specification (SRS) Version 1.1