Ollama Configuration
This guide explains how to configure and use Ollama with HummerBot AI for local model serving.
Setting up Ollama URL in Model Selector
HummerBot AI allows you to connect to a local Ollama instance through the model selector. To configure this:
- Open the model selector dropdown in the AI service interface
- Click the settings icon (gear symbol) next to the "Model" label
- In the Ollama Settings dialog, enter your Ollama server URL (typically http://localhost:11434)
- Click "Save" to store the URL in your browser's local storage
- The Ollama models will now be available in the "Offline Models" section of the model selector
The Ollama URL is stored in your browser's local storage under the key "ollama-url" and defaults to http://localhost:11434 if not set.
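To confirm that the saved URL points at a running Ollama server, you can query its model list from a terminal. Ollama's /api/tags endpoint returns the locally installed models; how HummerBot itself populates the dropdown is an implementation detail, so treat this only as a connectivity check:
curl http://localhost:11434/api/tags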
Installing and Running Ollama
Before configuring HummerBot to use Ollama, you need to install and run Ollama on your system:
- Download Ollama from https://ollama.ai
- Follow the installation instructions for your operating system
- Start the Ollama service:
ollama serve
- Verify that Ollama is running by checking that you can access the API at http://localhost:11434
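A quick way to do this from the command line; the root endpoint of a running server replies with a short "Ollama is running" message:
curl http://localhost:11434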
Pulling and Downloading Ollama Models
Ollama provides access to a wide range of open-source models. To pull and download models:
- Open your terminal or command prompt
- Use the ollama pull command to download a model:
ollama pull llama3
- You can also pull other models like:
ollama pull mistral
ollama pull phi3
ollama pull gemma2
- Check available models with:
ollama list
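If you pulled a model only to try it out, you can remove it again to free disk space:
ollama rm mistral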
Understanding Model Differences
Different Ollama models have varying capabilities, performance characteristics, and resource requirements:
- Llama models (llama3, llama3.1, llama3.2, etc.): Meta's family of models with good general-purpose performance. Larger versions like 70B parameter models offer better reasoning but require more resources.
- Mistral models (mistral, mistral-openorca): Known for strong reasoning and coding capabilities, with open-source variants available.
- Phi models (phi3, phi3.5): Microsoft's lightweight models that perform well on reasoning tasks despite smaller size.
- Gemma models (gemma, gemma2): Google's open models optimized for efficiency and performance.
- Code-specific models (codellama, starcoder, deepseek-coder): Optimized for programming tasks and understanding code.
- Specialized models (llava for vision tasks, nomic-embed-text for embeddings): Designed for specific use cases.
Model Parameters and Requirements
Models have different parameter counts that affect performance:
- Parameter count: Models range from 1B to 70B+ parameters
- Models with more parameters need more GPU memory and compute; inference is slower, but output quality is generally better
- Models with fewer parameters run faster and use less memory, but may be less accurate or capable
- GPU requirements: Models with >7B parameters typically need modern GPUs with 8GB+ VRAM for reasonable performance
- CPU fallback: If no GPU is available, models will run on CPU but significantly slower
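Parameter size is usually chosen with the model tag at pull time. For example, the Llama 3.1 family is published in several sizes (the exact tags available depend on the current Ollama model library):
# 8B model: a few GB to download, fits on a consumer GPU or runs on CPU
ollama pull llama3.1:8b
# 70B model: a much larger download that needs substantially more memory
ollama pull llama3.1:70b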
Model Capabilities
Different models have varying capabilities:
- Tool support: Not all models support function calling or tool use. Models like llama3.1 and newer mistral models have better tool integration
- Accuracy: Larger models and newer architectures generally provide better accuracy, though this varies by task
- Context length: Models support different maximum context lengths (2K to 128K tokens), affecting how much text they can process at once
- Specialized tasks: Some models excel at coding, others at creative writing, and others at factual accuracy
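To see what a locally installed model supports, the ollama show command prints its metadata (architecture, parameter count, context length) together with its default parameters and system prompt:
ollama show llama3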
Serving Models with Ollama
Once you've downloaded models, Ollama automatically serves them. To test a model:
- Run a simple query:
ollama run llama3
- Or use the API directly:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
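For multi-turn conversations, Ollama also exposes a chat endpoint that accepts a list of messages instead of a single prompt; chat-style clients typically use this form:
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ],
  "stream": false
}'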
Using Ollama with HummerBot AI
After configuring the Ollama URL in HummerBot:
- The model selector will display Ollama models under the "Offline Models" section
- Select your preferred model from the dropdown
- Start using the local model for AI tasks
- Local models offer privacy benefits as your data doesn't leave your machine
Note: The model selector groups models by category (online/offline) and provider. Ollama models appear in the "Offline Models" section when properly configured.
Building Custom Models
In addition to the pre-built models in the Ollama library, you can create custom models by downloading model weights from other sources, such as Hugging Face, and building your own model files.
Downloading Models from Hugging Face
You can use models from Hugging Face with Ollama by creating a model file that references the Hugging Face model:
- Find a compatible model on Hugging Face (look for GGUF files, or models that can be converted to GGUF)
- Download the model file to your machine
- Create a Modelfile that references the downloaded model, as in the sketch below
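As a minimal sketch, assuming you have downloaded a GGUF file from Hugging Face into your working directory (the filename below is a placeholder), the Modelfile's FROM line can point directly at the local file; the Modelfile format itself is described in the next sections:
# Modelfile referencing a locally downloaded GGUF file (placeholder filename)
FROM ./my-model.Q4_K_M.gguf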
Creating a Modelfile
A Modelfile is a text file that defines how Ollama should build your custom model. Here's the basic structure:
FROM <base_model>
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_predict 2048
SYSTEM "You are a helpful AI assistant."
ADAPTER /path/to/adapter/model
LICENSE Apache 2.0
Key components of a Modelfile:
- FROM: Specifies the base model to use (e.g., llama3, mistral, or a model you've downloaded)
- TEMPLATE: Defines how to format prompts for the model
- PARAMETER: Sets model parameters such as temperature, top_p, and num_predict (the maximum number of tokens to generate)
- SYSTEM: Sets a system prompt that will be used with every request
- ADAPTER: Optional path to a LoRA adapter to modify the model's behavior
- LICENSE: Optional license information
Example Modelfile
Here's an example of a custom model based on Llama3:
FROM llama3
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_predict 4096
SYSTEM "You are a coding assistant that specializes in JavaScript and Python."
Building Your Custom Model
To build your custom model:
- Create your Modelfile with the desired configuration
- Run the following command in the same directory as your Modelfile:
ollama create my-custom-model -f Modelfile
- Your custom model will now be available in Ollama as my-custom-model
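Once created, the custom model behaves like any other local model: you can chat with it from the terminal, and it should also appear in the list that clients such as HummerBot's model selector retrieve from the server:
# Chat with the custom model interactively
ollama run my-custom-model
# Confirm it shows up in the local model list
ollama list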
Useful Resources
- Ollama Model Library - Official collection of models
- Hugging Face - Large collection of open-source models
- Ollama Documentation - Official Modelfile documentation
- GGUF Converter - Tool to convert models to GGUF format for Ollama
Exposing Local Ollama Instance
If you need to access your local Ollama instance from other devices or over the internet:
Using Tunnels
Using ngrok:
- Install ngrok from https://ngrok.com
- Start your Ollama service
- Create a tunnel:
ngrok http 11434
- Use the provided public URL in HummerBot's Ollama configuration
Using Cloudflare Tunnel:
- Install Cloudflare's cloudflared utility
- Authenticate:
cloudflared tunnel login
- Create and route a tunnel:
cloudflared tunnel --url http://localhost:11434
Using Port Forwarding
- Access your router's configuration page
- Set up port forwarding for port 11434 to your local machine
- Use your public IP address with the port to access Ollama externally
- Note: This method requires a static IP or dynamic DNS service
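Note also that Ollama binds to 127.0.0.1 by default, so forwarding port 11434 alone is usually not enough; the server has to listen on a reachable interface, typically via the OLLAMA_HOST environment variable (how you set it depends on whether Ollama runs in a terminal or as a system service):
# Make Ollama listen on all interfaces instead of loopback only
OLLAMA_HOST=0.0.0.0 ollama serve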
Security Considerations
When exposing your local Ollama instance:
- Use authentication if possible
- Limit access to trusted networks
- Consider using VPN instead of direct port exposure
- Regularly update Ollama to the latest version
- Be mindful of the computational resources required when running models
Troubleshooting
- If HummerBot can't connect to Ollama, verify that the Ollama service is running
- Check that the URL is correctly formatted (including the protocol and port)
- Ensure firewall settings allow connections to port 11434
- For remote access, verify that your tunnel or port forwarding is properly configured
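If the service is running but a browser-based client still cannot reach it, two quick checks from a terminal may help. The OLLAMA_ORIGINS step is only needed if the browser reports a CORS error; whether HummerBot requires it depends on how it issues requests, so treat it as a possibility rather than a given:
# Confirm the API answers locally and lists your models
curl http://localhost:11434/api/tags
# If the browser blocks requests with a CORS error, allow the origin and restart Ollama
OLLAMA_ORIGINS="*" ollama serve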