
Installation
First, visit the official Ollama website (ollama.com) and download the Windows installer.
Next, run the installer and follow the prompts to install the app on your machine.
After installation, verify that it succeeded by running the following command -
ollama --version
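If the installation succeeded, this prints the installed version. The output looks like the line below (the version number shown is just an example) -
ollama version is 0.3.12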
Downloading models
Use the command below to download and run a model -
ollama run gemma2
If the model is not already on your system, this command automatically downloads it and then runs it.
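If you prefer to download a model without immediately starting an interactive session, you can use the pull command instead -
ollama pull gemma2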
Browse the full list of available models in the Ollama library at ollama.com/library.
Optionally, you can change the path where models are stored on your computer.
Set an environment variable named OLLAMA_MODELS to the desired location, such as D:\Ollama\Models.
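For example, you can set the variable from a Command Prompt with the built-in setx command (the path here is the illustrative one from above); restart Ollama afterward so it picks up the change -
setx OLLAMA_MODELS "D:\Ollama\Models"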
Listing models
To view all the models currently available on your system, use the following command -
ollama list
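The output lists each model along with its ID, size, and when it was last modified. It looks something like this (the values shown are purely illustrative) -
NAME            ID              SIZE      MODIFIED
gemma2:latest   ff02c3702f32    5.4 GB    2 days ago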
Enabling GPU
Leveraging a GPU can dramatically speed up model inference.
Ensure your GPU supports the required compute capability by referring to the Ollama GPU documentation.
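On recent NVIDIA drivers, one quick way to check your card's compute capability is the query below (older drivers may not support the compute_cap field) -
nvidia-smi --query-gpu=name,compute_cap --format=csv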
Now, add an environment variable named CUDA_VISIBLE_DEVICES and set its value to the UUID of your GPU.
If you have an NVIDIA GPU, you can get the UUID with the following command -
nvidia-smi -L
This displays the UUID, which looks like GPU-4978c8cd-047e-258e-69d5-4731be9bg377.
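You can then set the variable with setx, substituting the UUID reported for your own card (the value below reuses the illustrative UUID from above); restart Ollama for the change to take effect -
setx CUDA_VISIBLE_DEVICES GPU-4978c8cd-047e-258e-69d5-4731be9bg377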
If you are using an AMD GPU, follow the steps in the documentation to enable it for Ollama.
With Ollama running on your Windows machine, you can access powerful language models locally while keeping your data private. Make sure your hardware is capable enough to get the best performance out of it.