GPU benchmarking with Llama.cpp
After adding a GPU and configuring my setup, I wanted to benchmark my graphics card. I used Llama.cpp and compiled it to leverage an NVIDIA GPU. Here, I summarize the steps I followed.

Hardware Used

OS: Ubuntu 24.04 LTS (Official page)
GPU: NVIDIA RTX 3060 (affiliate link)
CPU: AMD Ryzen 7 5700G (affiliate link)
RAM: 52 GB
Storage: Samsung SSD 990 EVO 1TB (affiliate link)

Installing the NVIDIA CUDA Toolkit

To compile llama.cpp, you need to install the NVIDIA CUDA Toolkit. The process is straightforward: just follow the well-documented guide. ...
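For reference, here is a minimal sketch of what the build and benchmark steps can look like. The CMake flag, the llama-bench options, and the model path are assumptions based on current llama.cpp documentation and may differ between versions:

```bash
# Install the CUDA toolkit from Ubuntu's repositories
# (the guide on NVIDIA's site describes the official repo as an alternative)
sudo apt install nvidia-cuda-toolkit

# Build llama.cpp with CUDA support enabled
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Benchmark a GGUF model, offloading all layers to the GPU (-ngl 99)
./build/bin/llama-bench -m ./models/model.gguf -ngl 99
```

llama-bench reports prompt-processing and token-generation throughput in tokens per second, which is the number I compare across configurations.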
Install Ollama and OpenWebUI on Ubuntu 24.04 with an NVIDIA RTX 3060 GPU
As part of a personal project, I equipped myself with an NVIDIA GPU (an RTX 3060) to run LLMs properly on my own machine. To easily use different models, I rely on OpenWebUI (with Ollama). Since the installation can be a bit of an adventure, I’m summarizing the steps here.

Configuration Used

On my PC, I have:

OS: Ubuntu 24.04 LTS (Official page)
GPU: NVIDIA RTX 3060 (affiliate link)
CPU: AMD Ryzen 7 5700G (affiliate link)
RAM: 52 GB
Storage: Samsung SSD 990 EVO 1TB (affiliate link)

This setup runs 14B models comfortably (around thirty tokens per second). ...
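As a rough sketch, the core of the setup comes down to a few commands. These follow the install instructions published by Ollama and Open WebUI at the time of writing, so treat the exact flags, image tag, and model name as assumptions:

```bash
# Install Ollama (official install script); it runs as a systemd service afterwards
curl -fsSL https://ollama.com/install.sh | sh

# Pull and test a model to confirm the GPU is being used
ollama run llama3.1:8b "Hello"

# Run OpenWebUI in Docker, pointing it at the Ollama instance on the host
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

With the port mapping above, OpenWebUI is then reachable in the browser at http://localhost:3000, where you can pick any model already pulled into Ollama.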