I've successfully used my RTX 3080 Ti with Stable Diffusion, Fooocus, and Stable Cascade, so my system is ready to work with the GPU. Arch Linux.
```
$ ./run.sh --model 7b --with-cuda
...
 ✔ Container llama-gpt-llama-gpt-ui-1             Recreated  0.3s
 ✔ Container llama-gpt-llama-gpt-api-cuda-ggml-1  Recreated  0.3s
Attaching to llama-gpt-api-cuda-ggml-1, llama-gpt-ui-1
llama-gpt-ui-1  | [INFO  wait] --------------------------------------------------------
llama-gpt-ui-1  | [INFO  wait] docker-compose-wait 2.12.1
llama-gpt-ui-1  | [INFO  wait] ---------------------------
llama-gpt-ui-1  | [DEBUG wait] Starting with configuration:
llama-gpt-ui-1  | [DEBUG wait]  - Hosts to be waiting for: [llama-gpt-api-cuda-ggml:8000]
llama-gpt-ui-1  | [DEBUG wait]  - Paths to be waiting for: []
llama-gpt-ui-1  | [DEBUG wait]  - Timeout before failure: 3600 seconds
llama-gpt-ui-1  | [DEBUG wait]  - TCP connection timeout before retry: 5 seconds
llama-gpt-ui-1  | [DEBUG wait]  - Sleeping time before checking for hosts/paths availability: 0 seconds
llama-gpt-ui-1  | [DEBUG wait]  - Sleeping time once all hosts/paths are available: 0 seconds
llama-gpt-ui-1  | [DEBUG wait]  - Sleeping time between retries: 1 seconds
llama-gpt-ui-1  | [DEBUG wait] --------------------------------------------------------
llama-gpt-ui-1  | [INFO  wait] Checking availability of host [llama-gpt-api-cuda-ggml:8000]
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
Gracefully stopping... (press Ctrl+C again to force)

$ nvidia-smi
Sat Feb 24 17:49:50 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   36C    P8              10W / 125W |     10MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                             GPU Memory |
|        ID   ID                                                              Usage      |
|=======================================================================================|
|    0   N/A  N/A      2044      G   /usr/lib/Xorg                                 4MiB |
+---------------------------------------------------------------------------------------+
```
What should I check?
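A minimal way to check whether Docker itself can use the GPU, independent of llama-gpt (the CUDA image tag below is only an example; any recent one should do):

```
docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi
```

If this fails with the same "could not select device driver" error, the problem is in the Docker/NVIDIA runtime setup rather than in llama-gpt.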
```
nvidia-ctk runtime configure
systemctl restart docker
```
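For completeness, the typical full invocation looks like this (the `--runtime=docker` flag and `sudo` are assumptions based on the usual NVIDIA Container Toolkit setup, not part of the original answer):

```
# Register the NVIDIA runtime in /etc/docker/daemon.json, then restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```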
This worked for me, thanks! Now I'm at "forward compatibility was attempted on non supported HW"
```
llama-gpt-api-cuda-ggml-1  |
llama-gpt-api-cuda-ggml-1  | /models/llama-2-7b-chat.bin model found.
llama-gpt-api-cuda-ggml-1  | make: *** No rule to make target 'build'. Stop.
llama-gpt-api-cuda-ggml-1  | Initializing server with:
llama-gpt-api-cuda-ggml-1  | Batch size: 2096
llama-gpt-api-cuda-ggml-1  | Number of CPU threads: 24
llama-gpt-api-cuda-ggml-1  | Number of GPU layers: 10
llama-gpt-api-cuda-ggml-1  | Context window: 4096
llama-gpt-api-cuda-ggml-1  | CUDA error 804 at /tmp/pip-install-7rxfzzup/llama-cpp-python_c62cf07cbfa449a7b268f9102316d6db/vendor/llama.cpp/ggml-cuda.cu:4883: forward compatibility was attempted on non supported HW
```
Setup:

```
System:    Host: TowerPC  Kernel: 6.1.0-18-amd64  arch: x86_64  bits: 64
           Desktop: KDE Plasma v: 5.27.5  Distro: Debian GNU/Linux 12 (bookworm)
Machine:   Type: Desktop  Mobo: ASUSTeK  model: PRIME X299-A II  v: Rev 1.xx  serial:
           BIOS: American Megatrends  v: 0702  date: 06/10/2020
CPU:       Info: 12-core  model: Intel Core i9-10920X  bits: 64  type: MT MCP  cache: L2: 12 MiB
           Speed (MHz): avg: 1201  min/max: 1200/4600:4800:4700
           cores: 1: 1200  2: 1201  3: 1200  4: 1200  5: 1200  6: 1200  7: 1206  8: 1200
                  9: 1200  10: 1232  11: 1200  12: 1200  13: 1200  14: 1200  15: 1200
                  16: 1200  17: 1200  18: 1200  19: 1200  20: 1200  21: 1200  22: 1200
                  23: 1200  24: 1200
Graphics:  Device-1: NVIDIA GA104 [GeForce RTX 3060 Ti Lite Hash Rate]  driver: nvidia
           v: 525.147.05
```
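For what it's worth, CUDA error 804 usually means the CUDA runtime inside the container is newer than what the host driver supports, and NVIDIA's forward-compatibility path only applies to data-center GPUs, not GeForce cards. A quick way to compare versions on the host:

```
# Host driver version, and the highest CUDA version it advertises
nvidia-smi --query-gpu=driver_version --format=csv,noheader
nvidia-smi | grep "CUDA Version"
```

Driver 525.x tops out at CUDA 12.0, so if the container was built against a newer CUDA runtime, this error is expected; upgrading the host driver is the usual fix.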
I'm seeing this same error on a 1080 Ti, but when issuing nvidia-smi, the result is:
```
$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 550.54
```
I followed the instructions from NVIDIA's CUDA page here to install the CUDA drivers, but I'm guessing there's a driver mismatch somewhere.
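For anyone who wants to avoid a reboot, reloading the NVIDIA kernel modules is a commonly suggested way to clear this mismatch (untested on this setup, and it will fail while something like Xorg still holds the GPU):

```
# Unload the stale NVIDIA kernel modules, then load the freshly installed one
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
sudo modprobe nvidia
```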
Edit: Miraculously, a reboot didn't break my system, and I now see similar output from the nvidia-smi command. However, I don't have an nvidia-ctk command that I can run, so I remain stuck(ish).
Edit 2: A quick search indicated I needed to install the NVIDIA Container Toolkit and restart Docker. The errors have now gone away.
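For anyone else on Debian, the steps were roughly the following (this sketch assumes NVIDIA's apt repository is already configured per their Container Toolkit install guide):

```
# Install the toolkit, register the runtime with Docker, and restart Docker
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```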
I was able to bring up docker-compose-cuda-ggml.yml using docker compose -f docker-compose-cuda-ggml.yml up -d; however, the other CUDA compose file (gguf) did not work for me. A way to surface its error is sketched below.
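Running the gguf compose in the foreground (no -d) should show why it fails; the filename below is assumed by analogy with the ggml one and may differ in your checkout:

```
# Run attached so startup errors are printed to the terminal
docker compose -f docker-compose-cuda-gguf.yml up
# or, if it's already running detached, tail the logs:
docker compose -f docker-compose-cuda-gguf.yml logs -f
```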