Ollama is a free, open-source tool that lets you run large language models such as Llama 3 on your own computer, even on modest hardware. Under the hood it uses llama.cpp, an open-source library that makes LLMs run efficiently on local machines with minimal hardware demands. Ollama also includes a package-manager-style workflow, so you can download and start an LLM quickly with a single command.
Running Llama 3 locally with Ollama step by step
The first step is to install Ollama, which is available for the three major operating systems, with the Windows version currently in preview.
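On Linux, for instance, the whole install is a single shell command (macOS and Windows users instead download an installer from ollama.com):
curl -fsSL https://ollama.com/install.sh | sh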
After installation, simply open your terminal. The command to activate Ollama is consistent across all supported platforms.
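For Llama 3 it looks like this (the bare llama3 tag typically resolves to the 8B instruct variant; larger sizes have their own tags):
ollama run llama3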
Please allow a few minutes for the model to download and load, after which you can begin interacting with it!
Ollama API
If you’re considering incorporating Ollama into your projects, it provides its own API as well as one compatible with OpenAI. These APIs facilitate the automatic loading of a locally stored LLM into memory, executing the inference, and then unloading it after a specified period.
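As a quick sketch, the OpenAI-compatible endpoint accepts the familiar chat-completions payload (this assumes the llama3 model has already been pulled):
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{ "role": "user", "content": "what is artificial intelligence?" }]
  }'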
However, you must first download the models you wish to utilize via the command line before you can operate the model through the API.
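For example, to fetch Llama 3 ahead of time:
ollama pull llama3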
Ollama also offers official SDKs for JavaScript and Python. For a basic text generation task via the API, simply set the model parameter to the model you want. For detailed instructions, refer to the official API documentation.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "what is artificial intelligence?"
}'
Here’s the method to perform a chat generation inference using the API.
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "what is artificial intelligence?" }
  ]
}'
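If you'd rather use the SDKs than raw HTTP, here is a minimal sketch with the Python package (assuming pip install ollama, a running Ollama server, and a pulled llama3 model; the exact shape of the response object may differ slightly between SDK versions):
# pip install ollama
import ollama

# Chat-style inference against the local Ollama server (localhost:11434 by default).
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "what is artificial intelligence?"}],
)

# The assistant's reply is nested under message -> content.
print(response["message"]["content"])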
How to set up Ollama with WebUI?
Start by downloading and installing Docker Desktop on your computer to set up a stable environment for running containerized applications. After installation, open your terminal and execute the following command to pull the latest Ollama image from Docker Hub and start it as a container:
docker run -d \
  --name ollama \
  -p 11434:11434 \
  -v ollama_volume:/root/.ollama \
  ollama/ollama:latest
This command pulls the latest version of the Ollama image, complete with all the libraries and dependencies needed to run the model, and starts a container from it:
- docker run: This initiates the creation and startup of a new Docker container.
- -d: Enables detached mode, allowing the container to operate in the background of your terminal.
- --name ollama: Assigns the name "ollama" to the container, which simplifies future references to it via Docker commands.
- -p 11434:11434: Maps port 11434 on the container to port 11434 on the host system, facilitating interaction with the application inside the container through the specified host system’s port.
- -v ollama_volume:/root/.ollama: Attaches a volume named “ollama_volume” to /root/.ollama within the container for persistent storage, ensuring that data remains intact across container restarts and recreations. Docker will automatically create “ollama_volume” if it doesn’t already exist.
- ollama/ollama:latest: Specifies the container image, using the “latest” version of the “ollama/ollama” image from a Docker registry such as Docker Hub.
Next, verify that the container is running and that the Ollama server responds by entering these commands in your terminal:
$ docker ps
aa492e7068d7 ollama/ollama:latest “/bin/ollama serve” 9 seconds ago Up 8 seconds 0.0.0.0:11434->11434/tcp ollama
$ curl localhost:11434
Ollama is running
Proceed by cloning the official repository of Ollama WebUI:
git clone https://github.com/ollama-webui/ollama-webui
cd ollama-webui
Next, open the Compose file to view the YAML configuration:
version: '3.6'

services:
  ollama:
    volumes:
      - ollama:/root/.ollama
    # Uncomment below to expose Ollama API outside the container stack
    # ports:
    #   - 11434:11434
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:latest

  ollama-webui:
    build:
      context: .
      args:
        OLLAMA_API_BASE_URL: '/ollama/api'
      dockerfile: Dockerfile
    image: ollama-webui:latest
    container_name: ollama-webui
    depends_on:
      - ollama
    ports:
      - 3000:8080
    environment:
      - "OLLAMA_API_BASE_URL=http://ollama:11434/api"
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama: {}
Before bringing up the stack, stop the standalone Ollama container you started earlier so it doesn't conflict with the one defined in the Compose file: open the Docker Dashboard, go to Containers, and stop the "ollama" container (or run docker stop ollama from your terminal).
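With that out of the way, a typical way to bring the stack up is (this assumes the Docker Compose v2 plugin; older installs use the standalone docker-compose binary):
docker compose up -d --build
Once both containers are running, open the WebUI at http://localhost:3000 (the host port mapped to the container's port 8080 in the Compose file), or click the mapped port next to the ollama-webui container in the Docker Dashboard.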
You've now set up Ollama with its WebUI in just a couple of minutes, with no complex pod deployment required. With this setup, you can access a wide range of features and functionalities through the WebUI: ask Ollama to generate creative text such as poems, code, scripts, musical pieces, emails, and letters, translate text between languages, or get coding assistance and suggestions.