Experimenting with Lllama 3 via Ollama
I saw that Meta released the Llama 3 AI model, and people seem excited about it, so I decided to give it a try.
I don’t have much experience running open-source AI models, and I didn’t see a lot of documentation about how to run them. I tinkered with it for a few hours and got Llama 3 working with Ollama, so I wanted to share my instructions.
Provisioning a cloud server with a GPU ๐︎
To run this experiment, I provisioned the following server on Scaleway:
- Server instance type: GPU-3070-S
- OS: Ubuntu Focal
- Disk size: 100 GB (needed because the model is large)
To SSH in, I ran the following command with port forwarding because I’ll need access to the web interface that will run on the server’s localhost interface.
TARGET_IP='51.159.184.186' # Change to your server's IP.
REMOTE_PORT='8080'
LOCAL_PORT='8080'
# SSH in and port-forward a port to access the Open-WebUI web interface.
ssh "${TARGET_IP}" -L "${REMOTE_PORT}:localhost:${LOCAL_PORT}"
Install CUDA ๐︎
First, install CUDA to enable Ollama to use the GPU:
sudo apt-get install linux-headers-$(uname -r) && \
sudo apt-key del 7fa2af80 && \
echo "deb [signed-by=/usr/share/keyrings/cudatools.gpg] https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /" | sudo tee /etc/apt/sources.list.d/cuda-ubuntu2204-x86_64.list && \
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin && \
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
sudo apt-get update && \
sudo apt-get install -y cuda-toolkit nvidia-container-toolkit ca-certificates curl
Install docker ๐︎
Next, install Docker so that you can run ollama under the Open-WebUI web interface for Ollama:
sudo install -m 0755 -d /etc/apt/keyrings && \
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc && \
sudo chmod a+r /etc/apt/keyrings/docker.asc && \
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null && \
sudo apt-get update && \
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin && \
sudo usermod -aG docker "${USER}" && \
newgrp docker
To test everything is working, run the following command:
docker run hello-world
Start Ollama and Open-WebUI ๐︎
I adapted the standard Open-WebUI Docker Compose file to make one for Ollama, which you can download and run with the following command:
wget https://mtlynch.io/notes/ollama-llama3/docker-compose.yml && \
docker-compose up
Once the server is up and running, visit the following URL in your browser:
You’ll first see a page prompting for a login. Click “Sign up.”
Then enter any details. You don’t really need a valid email, as far as I can tell.
From here, you need to download a model to use. Click the settings button:
I don’t know the differences between the models, but Llama 3 is the newest one that just came out a few days ago, so I decided to try that. It says on ollama.com that llama3:70b is optimized for chatbot use cases, so I initially went with that one, but it was incredibly slow. I switched to llama3 and that performed decently:
It’s going to sit at 100% for a while, but it’s not done until you see a popup announcing the model is fully downloaded.
Once that’s downloaded, close the settings dialog and select llama3:latest from the dropdown:
From there, you can start playing with Llama 3. Here’s me having a conversation with Llama 3 as it pretends to be Nathan Fielder:
Read My Book

I'm writing a book of simple techniques to help developers improve their writing.
My book will teach you how to:
- Create clear and pleasant software tutorials
- Attract readers and customers through blogging
- Write effective emails
- Minimize pain in writing design documents
Be the first to know when I post cool stuff
Subscribe to get my latest posts by email.





