LLaVaVision

A simple "Be My Eyes" web app with a llama.cpp/llava backend, created in about an hour using ChatGPT, Copilot, and some minor help from me, @lxe. It describes what it sees using the SkunkworksAI BakLLaVA-1 model via llama.cpp and narrates the text using the Web Speech API.

Inspired by Fuzzy-Search/realtime-bakllava.

Getting Started

You will need a machine with about 5 GB of RAM/VRAM for the q4_k version.

Set up the llama.cpp server

(Optional) Install the CUDA toolkit:

sudo apt install nvidia-cuda-toolkit

Build llama.cpp (build instructions for various platforms are in the llama.cpp build documentation):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON # Remove the flag if CUDA is unavailable
cmake --build . --config Release

Download the models from ggml_bakllava-1:

wget https://huggingface.co/mys/ggml_bakllava-1/resolve/main/mmproj-model-f16.gguf
wget https://huggingface.co/mys/ggml_bakllava-1/resolve/main/ggml-model-q4_k.gguf # Choose another quant if preferred
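
A download that fails partway, or that returns an error page instead of the model, only shows up later when the server refuses to load the file. As a rough sanity check, the sketch below tests whether each file starts with the four ASCII bytes `GGUF` that open every GGUF file. The helper names `is_gguf` and `check_models` are hypothetical and not part of this repository or llama.cpp.

```shell
# Sketch: confirm that downloaded model files look like GGUF files.
# GGUF files begin with the four ASCII bytes "GGUF"; an HTML error page
# or an empty download will not.

# is_gguf FILE -> exit status 0 if FILE exists and starts with "GGUF".
is_gguf() {
  [ -f "$1" ] && [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# check_models FILE... -> print one status line per file;
# return 1 if any file is missing or lacks the GGUF magic bytes.
check_models() {
  status=0
  for f in "$@"; do
    if is_gguf "$f"; then
      echo "ok: $f"
    else
      echo "bad or missing: $f"
      status=1
    fi
  done
  return "$status"
}

# Usage (file names taken from the wget commands above):
#   check_models mmproj-model-f16.gguf ggml-model-q4_k.gguf
```

This only inspects the header, so it will not catch a file that was truncated after the first few bytes; comparing the file size against the one listed on the model page covers that case.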

Start the server (server options are detailed in the llama.cpp server documentation):

./bin/server -m ggml-model-q4_k.gguf --mmproj mmproj-model-f16.gguf -ngl 35 -ts 100,0 # For GPU-only, single GPU
# ./bin/server -m ggml-model-q4_k.gguf --mmproj mmproj-model-f16.gguf # For CPU
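
If the same setup script is used on machines with and without a GPU, the choice between the two commands above can be made automatically. The following is a minimal sketch, assuming that the presence of `nvidia-smi` on the `PATH` is a good enough sign that a CUDA GPU is usable and that llama.cpp was built with the CUDA flag. The function name `llava_server_cmd` is hypothetical; the flags are the ones shown above.

```shell
# Sketch: print the server command suited to this machine.
# Assumption: finding nvidia-smi on PATH means a CUDA GPU is available
# and llama.cpp was built with -DLLAMA_CUBLAS=ON.

# llava_server_cmd [MODEL] [MMPROJ] -> print the command line to run.
llava_server_cmd() {
  model="${1:-ggml-model-q4_k.gguf}"
  mmproj="${2:-mmproj-model-f16.gguf}"
  if command -v nvidia-smi >/dev/null 2>&1; then
    # GPU-only, single GPU (same flags as the GPU command above)
    echo "./bin/server -m $model --mmproj $mmproj -ngl 35 -ts 100,0"
  else
    # CPU fallback
    echo "./bin/server -m $model --mmproj $mmproj"
  fi
}

# Usage, from the llama.cpp build directory:
#   $(llava_server_cmd)
```

The function only prints the command rather than running it, so the result can be inspected before the server is started.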

Launch LLaVaVision

Clone and set up the environment:

git clone https://github.com/lxe/llavavision
cd llavavision
python3 -m venv venv
. ./venv/bin/activate
pip install -r requirements.txt

Create self-signed certificates and start the server. HTTPS is required for mobile video functionality, because mobile browsers only allow camera access on pages served over a secure connection:

openssl req -newkey rsa:4096 -x509 -sha256 -days 365 -nodes -out cert.pem -keyout key.pem
flask run --host=0.0.0.0 --key key.pem --cert cert.pem --debug
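
The `openssl req` command above prompts interactively for the certificate's subject fields. For scripted setups, a non-interactive variant can pass the subject with the `-subj` option instead. This is a minimal sketch: the helper names `make_dev_cert` and `cert_subject` are hypothetical, and the default common name `localhost` is an assumption chosen for illustration.

```shell
# Sketch: create the same self-signed certificate without prompts.
# Assumption: a throwaway development certificate; the common name is
# arbitrary because the browser will warn about self-signed certs anyway.

# make_dev_cert [COMMON_NAME] -> write key.pem and cert.pem to the
# current directory, using COMMON_NAME (default: localhost) as subject.
make_dev_cert() {
  cn="${1:-localhost}"
  openssl req -newkey rsa:4096 -x509 -sha256 -days 365 -nodes \
    -subj "/CN=$cn" -out cert.pem -keyout key.pem 2>/dev/null
}

# cert_subject FILE -> print the subject line of a PEM certificate.
cert_subject() {
  openssl x509 -in "$1" -noout -subject
}

# Usage:
#   make_dev_cert
#   cert_subject cert.pem
#   flask run --host=0.0.0.0 --key key.pem --cert cert.pem --debug
```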

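Access https://your-machine-ip:5000 from your mobile device, replacing your-machine-ip with the machine's address on your network. The browser will warn about the self-signed certificate. Optionally, start a local tunnel with ngrok or localtunnel: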

npx localtunnel --local-https --allow-invalid-cert --port 5000

Acknowledgements and Inspiration
