Ollama
Ollama allows you to run open-source large language models, such as Llama 2, locally.
Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.
It optimizes setup and configuration details, including GPU usage.
For a complete list of supported models and model variants, see the Ollama model library.
Setup
First, follow these instructions to set up and run a local Ollama instance:
- Download and install Ollama onto the available supported platforms (including Windows Subsystem for Linux).
- Fetch an available LLM model via `ollama pull <name-of-model>`.
  - View a list of available models via the model library.
  - e.g., `ollama pull llama3`
- This will download the default tagged version of the model. Typically, the default points to the latest, smallest-sized parameter model.
  - On Mac, the models will be downloaded to `~/.ollama/models`.
  - On Linux (or WSL), the models will be stored at `/usr/share/ollama/.ollama/models`.
- Specify the exact version of the model of interest as in `ollama pull vicuna:13b-v1.5-16k-q4_0` (view the various tags for the Vicuna model in this instance).
- To view all pulled models, use `ollama list`.
- To chat directly with a model from the command line, use `ollama run <name-of-model>`.
- View the Ollama documentation for more commands, or run `ollama help` in the terminal to see the available commands.
Usage
You can see a full list of supported parameters on the API reference page.
If you are using a LLaMA chat model (e.g., `ollama pull llama3`), then you can use the `ChatOllama` interface. This includes special tokens for the system message and user input.
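For example, a minimal sketch of chatting through `ChatOllama` (assuming `llama3` has already been pulled and the local Ollama server is running) could look like this:

```python
from langchain_community.chat_models import ChatOllama
from langchain_core.messages import HumanMessage, SystemMessage

# Assumes `ollama pull llama3` has been run and Ollama is serving locally.
chat = ChatOllama(model="llama3")

messages = [
    SystemMessage(content="You are a concise assistant."),
    HumanMessage(content="Tell me a joke"),
]

# ChatOllama formats these messages with the model's chat template,
# including the special tokens for the system message and user input.
response = chat.invoke(messages)
print(response.content)
```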
Interacting with Models
Here are a few ways to interact with pulled local models:

directly in the terminal
- All of your local models are automatically served on `localhost:11434`.
- Run `ollama run <name-of-model>` to start interacting via the command line directly.
via an API
Send an `application/json` request to the API endpoint of Ollama to interact.
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
```
See the Ollama API documentation for all endpoints.
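If you prefer to call the endpoint from Python rather than curl, a minimal sketch using the `requests` library (assuming the default `localhost:11434` address and a pulled `llama3` model) might look like:

```python
import requests

# Call Ollama's /api/generate endpoint directly.
# Assumes Ollama is serving on the default localhost:11434 and `llama3` is pulled.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Why is the sky blue?",
        "stream": False,  # return one JSON object instead of a token stream
    },
)
print(response.json()["response"])
```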
via LangChain
See a typical basic example of using an Ollama model in your LangChain application.
```python
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")
llm.invoke("Tell me a joke")
```
"Here's one:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!\n\nHope that made you smile! Do you want to hear another one?"
To stream tokens, use the `.stream(...)` method:

```python
query = "Tell me a joke"

for chunks in llm.stream(query):
    print(chunks)
```
Sure, here's one: Why don't scientists trust atoms? Because they make up everything! I hope you found that amusing! Do you want to hear another one?
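If you are working in an async context, the analogous `.astream(...)` method from the Runnable interface can be used; a minimal sketch:

```python
import asyncio

from langchain_community.llms import Ollama

llm = Ollama(model="llama3")


async def main():
    # Stream tokens asynchronously; each chunk is a piece of the generated text.
    async for chunk in llm.astream("Tell me a joke"):
        print(chunk, end="", flush=True)


asyncio.run(main())
```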
To learn more about the LangChain Expression Language (LCEL) and the available methods on an LLM, see the LCEL Interface documentation.
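For instance, a minimal sketch of composing the local model into an LCEL chain (the prompt wording here is illustrative):

```python
from langchain_community.llms import Ollama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

# Compose a prompt, the local model, and an output parser into a single chain.
prompt = PromptTemplate.from_template("Tell me a short joke about {topic}")
llm = Ollama(model="llama3")
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"topic": "bears"}))
```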
Multi-modal
Ollama has support for multi-modal LLMs, such as bakllava and llava.
ollama pull bakllava
Be sure to update Ollama so that you have the most recent version with multi-modal support.
```python
from langchain_community.llms import Ollama

bakllava = Ollama(model="bakllava")
```
```python
import base64
from io import BytesIO

from IPython.display import HTML, display
from PIL import Image


def convert_to_base64(pil_image):
    """
    Convert PIL images to Base64 encoded strings

    :param pil_image: PIL image
    :return: Re-sized Base64 string
    """
    buffered = BytesIO()
    pil_image.save(buffered, format="JPEG")  # You can change the format if needed
    img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
    return img_str


def plt_img_base64(img_base64):
    """
    Display base64 encoded string as image

    :param img_base64: Base64 string
    """
    # Create an HTML img tag with the base64 string as the source
    image_html = f'<img src="data:image/jpeg;base64,{img_base64}" />'
    # Display the image by rendering the HTML
    display(HTML(image_html))


file_path = "../../../static/img/ollama_example_img.jpg"
pil_image = Image.open(file_path)

image_b64 = convert_to_base64(pil_image)
plt_img_base64(image_b64)
```
```python
# Bind the base64-encoded image so it is sent to the model with the prompt.
llm_with_image_context = bakllava.bind(images=[image_b64])
llm_with_image_context.invoke("What is the dollar based gross retention rate:")
```
'90%'