This project assists individuals with visual impairments in their daily activities. A mobile application uses the phone's camera to detect objects and estimate their distances, then leverages a Large Language Model (LLM) to interpret the output of the YOLO (You Only Look Once) object detection model and the ZoeDepth depth estimation model. The result is converted into audio guidance that helps users navigate their environment more effectively.
The main components of this project include:
- A mobile application written in Flutter
- A YOLO object detection model
- A depth estimation model (ZoeD_N)
- A llama2 LLM
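
The sketch below shows, under stated assumptions, how these components could fit together: YOLO detections and ZoeDepth distances are merged into a text description of the scene, which is then turned into spoken guidance. This is not the project's actual implementation: the `ultralytics` package with a `yolov8n.pt` checkpoint stands in for the project's YOLO model, ZoeD_N is loaded from the `isl-org/ZoeDepth` torch hub entry point, `pyttsx3` is assumed for text-to-speech, the llama2 call is reduced to a placeholder function, and `frame.jpg` is a hypothetical input frame.

```python
# A minimal sketch, not the repository's implementation -- see assumptions above.
import torch
import pyttsx3                      # assumed text-to-speech backend
from PIL import Image
from ultralytics import YOLO        # stands in for the project's YOLO model


def describe_frame(image_path: str) -> str:
    """Detect objects, estimate their distances and build a text scene summary."""
    img = Image.open(image_path).convert("RGB")

    # Object detection (hypothetical checkpoint name).
    detections = YOLO("yolov8n.pt")(img)[0]

    # Metric depth estimation with ZoeD_N from the isl-org/ZoeDepth hub entry point.
    zoe = torch.hub.load("isl-org/ZoeDepth", "ZoeD_N", pretrained=True).eval()
    depth = zoe.infer_pil(img)      # numpy array of per-pixel depth in metres

    parts = []
    for box in detections.boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
        label = detections.names[int(box.cls)]
        # Sample the depth map at the box centre as a rough distance estimate.
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
        parts.append(f"{label} about {float(depth[cy, cx]):.1f} m ahead")
    return "; ".join(parts) if parts else "no objects detected"


def guidance_from_llm(scene: str) -> str:
    """Placeholder for the llama2 step that turns the scene summary into guidance."""
    return f"Scene: {scene}. Keep to the clear space ahead."


if __name__ == "__main__":
    speech = guidance_from_llm(describe_frame("frame.jpg"))  # hypothetical frame path
    engine = pyttsx3.init()
    engine.say(speech)
    engine.runAndWait()
```

Sampling the depth map at the box centre is only a rough distance proxy; the scripts under `pipeline_scripts/` may combine the detections and depth map differently.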
Requirements:
- Python 3
- Flutter
- Jupyter Notebook
Installation:
- Clone the repository:
  `git clone https://github.com/emre-bl/EVI-AI.git`
- Install the Python dependencies:
  `pip install -r requirements.txt`
- Go to the mobile app's directory:
  `cd pipeline_scripts/mobile_app`
- Run the commands below to build and install the APK:
  `flutter clean`
  `flutter pub get`
  `flutter build apk`
  `adb install path/to/your/app-release.apk`
- To run as the server: start app.py first, then server.py, and finally user.py (see the launcher sketch below).
- To run as the client: build the APK as described above and run it on the phone.
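
As a convenience, the hypothetical launcher below starts the three server-side scripts in the order given above. It assumes app.py, server.py and user.py sit in the repository root and stay running once started; adjust the paths and the delay to match the actual layout.

```python
# Hypothetical helper, not part of the repository: launches the server-side
# scripts in the documented order (app.py -> server.py -> user.py).
import subprocess
import sys
import time

SCRIPTS = ["app.py", "server.py", "user.py"]  # order taken from the instructions above

processes = []
try:
    for script in SCRIPTS:
        print(f"starting {script} ...")
        processes.append(subprocess.Popen([sys.executable, script]))
        time.sleep(2)   # assumed to be enough time for each component to come up
    for proc in processes:
        proc.wait()     # keep the launcher alive while the components run
except KeyboardInterrupt:
    for proc in processes:
        proc.terminate()
```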