What is Ollama?

Ollama is an open-source tool that lets you run large language models (LLMs) directly on your computer. With Ollama, you can download, run, and manage LLMs locally without relying on cloud services.

Key Features

  • Run Locally: No internet connection needed; models run on your device, ensuring privacy.
  • Wide Model Support: Supports Llama 3.1, Phi 3, Mistral, Gemma 2, and more.
  • Modelfile System: Packages model weights, configuration, and data (think of it as a Dockerfile for LLMs).
  • Customization: Modify and fine-tune models for specific tasks.
  • Easy Setup: Simple installation and a user-friendly interface.

How to Use Ollama

  1. Download and Install: Visit ollama.com and install it (macOS, Windows, or Linux).

  2. Get a Model: Run the command to download a model:

    ollama run llama3.2
    

    This downloads the model (a one-time step if it isn't already on your machine) and starts an interactive session. Larger models require more RAM and CPU power.

  3. Interact with the Model: Use the command line or a web interface to ask questions or enter prompts.

Useful Commands

  • ollama list - Shows downloaded models.
  • ollama run mistral - Switch to a different model.
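  • ollama pull llama3.2 - Download a model without starting a session.
  • ollama rm mistral - Delete a downloaded model.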

Shortcuts

  • Ctrl + C - Stop current output.
  • /? - View available commands.
  • /clear - Clear the screen.
  • /bye - Close the model and exit Ollama.

Using Ollama in Applications

Ollama exposes a REST API on localhost so applications can talk to local models. Start the server with the command below:

ollama serve

This starts the Ollama server on port 11434; the API is then reachable at http://localhost:11434.
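
You can also exercise the API without any Ollama-specific library. Here is a minimal sketch using Python's requests package (an assumption: requests is installed via pip install requests, and llama3.2 has already been downloaded):

# Call the Ollama REST API directly over HTTP
# (assumes the server from `ollama serve` is running locally)
import requests

payload = {
    "model": "llama3.2",
    "prompt": "Where are we heading with AI?",
    "stream": False,  # return one JSON object instead of a token stream
}
response = requests.post("http://localhost:11434/api/generate", json=payload)
print(response.json()["response"])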

Here is an example of how to use the Ollama API in a Python application (install the official client first with pip install ollama):

# Script to demonstrate basic usage of the Ollama API
import ollama

# Initialize Ollama client
client = ollama.Client()

# Set the LLM model to use
model = "llama3.2"
# Define the prompt to send to the model
prompt = "Where are we heading with AI?"

# Generate a response using the specified model and prompt
response = client.generate(model=model, prompt=prompt)
# Print the text response from the model
print(f"Response: {response.response}")

Building Custom Models with Modelfile

Use a Modelfile to create a package for your model. It bundles the model's weights, configuration, data, and other parameters. Here's a simple example and the commands to create a custom model:

Create a Modelfile:

# Base model - Using Meta's Llama 3.2 as the foundation model
FROM llama3.2

# Set the temperature parameter to control the randomness of responses
# Higher values (e.g., 1) make output more creative but less predictable
# Lower values (e.g., 0.1) make output more deterministic and focused
PARAMETER temperature 1

# Define the system prompt that establishes the AI's role and behavior
# This text will guide how the model responds in all conversations
SYSTEM """
You are a customer care agent. You are talking to a customer. Greet the customer with "Hello there! How can I help you today?"
"""

Run the command below to create the model:

ollama create SupportAI -f Modelfile


Once the custom model is created, you can run it in the terminal (ollama run SupportAI) or call it from your applications like any other model.
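
For example, here is a brief sketch calling the custom model from Python (assuming SupportAI was created as above and the server is running):

# Use the custom model exactly like a built-in one
import ollama

response = ollama.generate(model="SupportAI", prompt="My order hasn't arrived yet.")
print(response["response"])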

Conclusion

Ollama is a powerful tool that brings large language models to your local machine, offering privacy, flexibility, and control over how you use AI. Whether you’re exploring different models, fine-tuning them for specific tasks, or integrating them into applications, Ollama makes it simple and efficient.

With its easy setup, offline capabilities, and API support, it’s an excellent choice for developers, researchers, and AI enthusiasts. If you’re looking for a way to run LLMs without relying on cloud services, give Ollama a try! 🚀