Ollama

Installation

First, visit the official website at https://ollama.com to download the installer.

Next, follow the prompts to install the app on your machine.

After installation, verify its success by running the following command -

ollama --version
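
If the installation succeeded, this prints the installed version. The exact output format below is an assumption and the version number will differ on your machine -

ollama version is 0.1.32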

Downloading models

Use the command below to download and run a model -

ollama run gemma2

If the model is not already downloaded, this command will automatically download it to your system and then run it.
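
If you only want to download a model without starting an interactive session, Ollama also provides a pull command -

ollama pull gemma2

You can also pass a one-off prompt directly, for example: ollama run gemma2 "Why is the sky blue?"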

Browse the full list of available models at https://ollama.com/library.

Optionally, you can change the path where the models are stored on your computer.

To do so, set an environment variable named OLLAMA_MODELS to the desired directory, for example D:\Ollama\Models.
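
On Windows, one way to set this persistently is with the setx command; the path below is just the example value from above -

setx OLLAMA_MODELS "D:\Ollama\Models"

Open a new terminal (and restart Ollama) for the change to take effect.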

Listing models

To view all the models currently available on your system, use the following command -

ollama list
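
If you later want to free up disk space, a downloaded model can be removed with the rm command -

ollama rm gemma2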

Enabling GPU

Leveraging a GPU can dramatically enhance the performance and speed of model inference.

Ensure your GPU meets the minimum compute requirements by referring to Ollama's GPU documentation for instructions.

Now, add an environment variable named CUDA_VISIBLE_DEVICES and set its value to the GUID of the GPU in your system.

If you have an NVIDIA GPU, you can get the GUID by using the following command -

nvidia-smi -L

It should display a GUID like GPU-4978c8cd-047e-258e-69d5-4731be9bg377.
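
As with OLLAMA_MODELS, one way to set this variable persistently on Windows is with setx; the GUID below is just the example from above -

setx CUDA_VISIBLE_DEVICES "GPU-4978c8cd-047e-258e-69d5-4731be9bg377"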

If you are using an AMD GPU, follow the steps in the documentation to enable it for Ollama.

With Ollama running on your Windows machine, you can access powerful language models locally, keeping your data private. Make sure your machine is capable enough to run them at a usable speed.