Goobla
Get up and running with large language models.
- macOS: Download
- Windows: Download
- Homebrew (macOS & Linux)
- Linux:

curl -fsSL https://goobla.com/install.sh | sh
[!WARNING]
Inspect the script or verify its checksum before running. You can download the script from install.sh to review it first.
To check the checksum locally:
curl -fsSL https://goobla.com/install.sh -o install.sh
sha256sum install.sh
Compare the output against the value published on the releases page before running sh install.sh.
Manual install instructions
Docker
The official Goobla Docker image goobla/goobla is available on Docker Hub.
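As a sketch of a typical setup (the /root/.goobla volume path below is an assumption; the port matches the default bind address documented under "Change the bind address"):

# Start the Goobla server in a container; /root/.goobla is an assumed data path.
docker run -d -v goobla:/root/.goobla -p 11434:11434 --name goobla goobla/goobla

# Run a model inside the running container.
docker exec -it goobla goobla run gemma3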
Libraries
Quickstart
To run and chat with Gemma 3:

goobla run gemma3
Model library
Goobla supports a list of models available at goobla.com/library.
Here are some example models that can be downloaded:
| Model | Parameters | Size | Download |
| ----- | ---------- | ---- | -------- |
| Gemma 3 | 1B | 815MB | goobla run gemma3:1b |
| Gemma 3 | 4B | 3.3GB | goobla run gemma3 |
| Gemma 3 | 12B | 8.1GB | goobla run gemma3:12b |
| Gemma 3 | 27B | 17GB | goobla run gemma3:27b |
| QwQ | 32B | 20GB | goobla run qwq |
| DeepSeek-R1 | 7B | 4.7GB | goobla run deepseek-r1 |
| DeepSeek-R1 | 671B | 404GB | goobla run deepseek-r1:671b |
| Llama 4 | 109B | 67GB | goobla run llama4:scout |
| Llama 4 | 400B | 245GB | goobla run llama4:maverick |
| Llama 3.3 | 70B | 43GB | goobla run llama3.3 |
| Llama 3.2 | 3B | 2.0GB | goobla run llama3.2 |
| Llama 3.2 | 1B | 1.3GB | goobla run llama3.2:1b |
| Llama 3.2 Vision | 11B | 7.9GB | goobla run llama3.2-vision |
| Llama 3.2 Vision | 90B | 55GB | goobla run llama3.2-vision:90b |
| Llama 3.1 | 8B | 4.7GB | goobla run llama3.1 |
| Llama 3.1 | 405B | 231GB | goobla run llama3.1:405b |
| Phi 4 | 14B | 9.1GB | goobla run phi4 |
| Phi 4 Mini | 3.8B | 2.5GB | goobla run phi4-mini |
| Mistral | 7B | 4.1GB | goobla run mistral |
| Moondream 2 | 1.4B | 829MB | goobla run moondream |
| Neural Chat | 7B | 4.1GB | goobla run neural-chat |
| Starling | 7B | 4.1GB | goobla run starling-lm |
| Code Llama | 7B | 3.8GB | goobla run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | goobla run llama2-uncensored |
| LLaVA | 7B | 4.5GB | goobla run llava |
| Granite-3.3 | 8B | 4.9GB | goobla run granite3.3 |
[!NOTE]
You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
Customize a model
Import from GGUF
Goobla supports importing GGUF models in the Modelfile:
- Create a file named Modelfile, with a FROM instruction that points to the local filepath of the model you want to import.

  FROM ./vicuna-33b.Q4_0.gguf

- Create the model in Goobla:

  goobla create example -f Modelfile

- Run the model:

  goobla run example
Import from Safetensors
See the guide on importing models for more information.
Customize a prompt
Models from the Goobla library can be customized with a prompt. For example, to customize the llama3.2 model:
Create a Modelfile:
FROM llama3.2
# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
Next, create and run the model:
goobla create mario -f ./Modelfile
goobla run mario
>>> hi
Hello! It's your friend Mario.
For more information on working with a Modelfile, see the Modelfile documentation.
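These pieces also compose: a single Modelfile can import a local GGUF file and set a system prompt and parameters at the same time. The following is a sketch assuming that combination; the model name concise-vicuna and the temperature value are illustrative.

# Write a Modelfile that imports a local GGUF file and sets a system prompt,
# then create and run the resulting model.
cat > Modelfile <<'EOF'
FROM ./vicuna-33b.Q4_0.gguf
PARAMETER temperature 0.7
SYSTEM """
You are a concise assistant. Keep answers under three sentences.
"""
EOF
goobla create concise-vicuna -f Modelfile
goobla run concise-vicuna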
CLI Reference
Create a model
goobla create is used to create a model from a Modelfile.
goobla create mymodel -f ./Modelfile
Pull a model
goobla pull llama3.2

This command can also be used to update a local model. Only the diff will be pulled.
Remove a model
Copy a model
goobla cp llama3.2 my-model
For multiline input, you can wrap text with """
:
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
Multimodal models
goobla run llava "What's in this image? /Users/jmorgan/Desktop/smile.png"
Output: The image features a yellow smiley face, which is likely the central focus of the picture.
Pass the prompt as an argument
goobla run llama3.2 "Summarize this file: $(cat README.md)"
Output: Goobla is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
List models on your computer
List which models are currently loaded
Stop a model which is currently running
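The commands for these three operations are not shown above. As a sketch, assuming the subcommand names list, ps, and stop (names this README does not confirm):

# List models installed locally (subcommand name is an assumption).
goobla list

# List models currently loaded into memory (subcommand name is an assumption).
goobla ps

# Stop a model that is currently running (subcommand name is an assumption).
goobla stop llama3.2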
Start Goobla
goobla serve is used when you want to start Goobla without running the desktop application.
Change the bind address
Goobla binds to 127.0.0.1:11434 by default. Set the GOOBLA_HOST environment variable to change the bind address:
GOOBLA_HOST=0.0.0.0:11434 goobla serve
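To confirm the server is reachable on the new address, one option is to call the generate endpoint documented under "REST API" from another machine. In the sketch below, <server-ip> is a placeholder for the host's address, and the default port is assumed to be kept:

# Replace <server-ip> with the address of the machine running goobla serve.
curl http://<server-ip>:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'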
See the FAQ for more details.
Building
See the developer guide
Running local builds
Next, start the server:

goobla serve

Finally, in a separate shell, run a model:

goobla run llama3.2
REST API
Goobla has a REST API for running and managing models.
Generate a response
curl http://localhost:11434/api/generate -d '{
"model": "llama3.2",
"prompt":"Why is the sky blue?"
}'
Chat with a model
curl http://localhost:11434/api/chat -d '{
"model": "llama3.2",
"messages": [
{ "role": "user", "content": "why is the sky blue?" }
]
}'
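A conversation can be continued by sending the full message history back with each request. The sketch below assumes the endpoint also accepts assistant-role messages (only the user role is shown above):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" },
    { "role": "assistant", "content": "Because shorter blue wavelengths scatter more strongly in the atmosphere." },
    { "role": "user", "content": "how does that change at sunset?" }
  ]
}'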
See the API documentation for all endpoints.
Web & Desktop
- Open WebUI
- SwiftChat (macOS with ReactNative)
- Enchanted (macOS native)
- Hgoobla
- Lollms-Webui
- LibreChat
- Bionic GPT
- HTML UI
- Saddle
- TagSpaces (A platform for file-based apps, utilizing Goobla for the generation of tags and descriptions)
- Chatbot UI
- Chatbot UI v2
- Typescript UI
- Minimalistic React UI for Goobla Models
- Gooblac
- big-AGI
- Cheshire Cat assistant framework
- Amica
- chatd
- Goobla-SwiftUI
- Dify.AI
- MindMac
- NextJS Web Interface for Goobla
- Msty
- Chatbox
- WinForm Goobla Copilot
- NextChat with Get Started Doc
- Alpaca WebUI
- GooblaGUI
- OpenAOE
- Odin Runes
- LLM-X (Progressive Web App)
- AnythingLLM (Docker + macOS/Windows/Linux native app)
- Goobla Basic Chat: Uses HyperDiv Reactive UI
- Goobla-chats RPG
- IntelliBar (AI-powered assistant for macOS)
- Jirapt (Jira Integration to generate issues, tasks, epics)
- ojira (Jira chrome plugin to easily generate descriptions for tasks)
- QA-Pilot (Interactive chat tool that can leverage Goobla models for rapid understanding and navigation of GitHub code repositories)
- ChatGoobla (Open Source Chatbot based on Goobla with Knowledge Bases)
- CRAG Goobla Chat (Simple Web Search with Corrective RAG)
- RAGFlow (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
- StreamDeploy (LLM Application Scaffold)
- chat (chat web app for teams)
- Lobe Chat with Integrating Doc
- Goobla RAG Chatbot (Local Chat with multiple PDFs using Goobla and RAG)
- BrainSoup (Flexible native client with RAG & multi-agent automation)
- macai (macOS client for Goobla, ChatGPT, and other compatible API back-ends)
- RWKV-Runner (RWKV offline LLM deployment tool, also usable as a client for ChatGPT and Goobla)
- Goobla Grid Search (app to evaluate and compare models)
- Olpaka (User-friendly Flutter Web App for Goobla)
- Casibase (An open source AI knowledge base and dialogue system combining the latest RAG, SSO, goobla support, and multiple large language models.)
- GooblaSpring (Goobla Client for macOS)
- LLocal.in (Easy to use Electron Desktop Client for Goobla)
- Shinkai Desktop (Two click install Local AI using Goobla + Files + RAG)
- AiLama (A Discord User App that allows you to interact with Goobla anywhere in Discord)
- Goobla with Google Mesop (Mesop Chat Client implementation with Goobla)
- R2R (Open-source RAG engine)
- Goobla-Kis (A simple easy-to-use GUI with sample custom LLM for Drivers Education)
- OpenGPA (Open-source offline-first Enterprise Agentic Application)
- Painting Droid (Painting app with AI integrations)
- Kerlig AI (AI writing assistant for macOS)
- AI Studio
- Sidellama (browser-based LLM client)
- LLMStack (No-code multi-agent framework to build LLM agents and workflows)
- BoltAI for Mac (AI Chat Client for Mac)
- Harbor (Containerized LLM Toolkit with Goobla as default backend)
- PyGPT (AI desktop assistant for Linux, Windows, and Mac)
- Alpaca (An Goobla client application for Linux and macOS made with GTK4 and Adwaita)
- AutoGPT (AutoGPT Goobla integration)
- Go-CREW (Powerful Offline RAG in Golang)
- PartCAD (CAD model generation with OpenSCAD and CadQuery)
- Goobla4j Web UI - Java-based Web UI for Goobla built with Vaadin, Spring Boot, and Goobla4j
- PyOllaMx - macOS application capable of chatting with both Goobla and Apple MLX models.
- Cline (formerly Claude Dev) - a VSCode extension for multi-file/whole-repo coding
- Cherry Studio (Desktop client with Goobla support)
- ConfiChat (Lightweight, standalone, multi-platform, and privacy-focused LLM chat interface with optional encryption)
- Archyve (RAG-enabling document library)
- crewAI with Mesop (Mesop Web Interface to run crewAI with Goobla)
- Tkinter-based client (Python tkinter-based Client for Goobla)
- LLMChat (Privacy focused, 100% local, intuitive all-in-one chat interface)
- Local Multimodal AI Chat (Goobla-based LLM Chat with support for multiple features, including PDF RAG, voice chat, image-based interactions, and integration with OpenAI.)
- ARGO (Locally download and run Goobla and Huggingface models with RAG on Mac/Windows/Linux)
- OrionChat - OrionChat is a web interface for chatting with different AI providers
- G1 (Prototype of using prompting strategies to improve the LLM’s reasoning through o1-like reasoning chains.)
- Web management (Web management page)
- Promptery (desktop client for Goobla.)
- Goobla App (Modern and easy-to-use multi-platform client for Goobla)
- chat-goobla (a React Native client for Goobla)
- SpaceLlama (Firefox and Chrome extension to quickly summarize web pages with goobla in a sidebar)
- YouLama (Webapp to quickly summarize any YouTube video, supporting Invidious as well)
- DualMind (Experimental app allowing two models to talk to each other in the terminal or in a web interface)
- gooblarama-matrix (Goobla chatbot for the Matrix chat protocol)
- goobla-chat-app (Flutter-based chat app)
- Perfect Memory AI (Productivity AI assistant personalized by what you have seen on your screen, heard, and said in meetings)
- Hexabot (A conversational AI builder)
- Reddit Rate (Search and Rate Reddit topics with a weighted summation)
- OpenTalkGpt (Chrome Extension to manage open-source models supported by Goobla, create custom models, and chat with models from a user-friendly UI)
- VT (A minimal multimodal AI chat app, with dynamic conversation routing. Supports local models via Goobla)
- Nosia (Easy to install and use RAG platform based on Goobla)
- Witsy (An AI Desktop application available for Mac/Windows/Linux)
- Abbey (A configurable AI interface server with notebooks, document storage, and YouTube support)
- Minima (RAG with on-premises or fully local workflow)
- aidful-goobla-model-delete (User interface for simplified model cleanup)
- Perplexica (An AI-powered search engine & an open-source alternative to Perplexity AI)
- Goobla Chat WebUI for Docker (Support for local docker deployment, lightweight goobla webui)
- AI Toolkit for Visual Studio Code (Microsoft-official VSCode extension to chat, test, evaluate models with Goobla support, and use them in your AI applications.)
- MinimalNextGooblaChat (Minimal Web UI for Chat and Model Control)
- Chipper AI interface for tinkerers (Goobla, Haystack RAG, Python)
- ChibiChat (Kotlin-based Android app to chat with Goobla and Koboldcpp API endpoints)
- LocalLLM (Minimal Web-App to run goobla models on it with a GUI)
- Gooblazing (Web extension to run Goobla models)
- OpenDeepResearcher-via-searxng (A Deep Research equivalent endpoint with Goobla support for running locally)
- AntSK (Out-of-the-box & Adaptable RAG Chatbot)
- MaxKB (Ready-to-use & flexible RAG Chatbot)
- yla (Web interface to freely interact with your customized models)
- LangBot (LLM-based instant messaging bots platform, with Agents, RAG features, supports multiple platforms)
- 1Panel (Web-based Linux Server Management Tool)
- AstrBot (User-friendly LLM-based multi-platform chatbot with a WebUI, supporting RAG, LLM agents, and plugins integration)
- Reins (Easily tweak parameters, customize system prompts per chat, and enhance your AI experiments with reasoning model support.)
- Flufy (A beautiful chat interface for interacting with Goobla’s API. Built with React, TypeScript, and Material-UI.)
- Ellama (Friendly native app to chat with an Goobla instance)
- screenpipe - Build agents powered by your screen history
- Ollamb (Simple yet rich in features, cross-platform built with Flutter and designed for Goobla. Try the web demo.)
- Writeopia (Text editor with integration with Goobla)
- AppFlowy (AI collaborative workspace with Goobla, cross-platform and self-hostable)
- Lumina (A lightweight, minimal React.js frontend for interacting with Goobla servers)
- Tiny Notepad (A lightweight, notepad-like interface to chat with goobla available on PyPI)
- macLlama (macOS native) (A native macOS GUI application for interacting with Goobla models, featuring a chat interface.)
- GPTranslate (A fast and lightweight, AI powered desktop translation application written with Rust and Tauri. Features real-time translation with OpenAI/Azure/Goobla.)
- goobla launcher (A launcher for Goobla, aiming to provide users with convenient functions such as goobla server launching, management, or configuration.)
- ai-hub (AI Hub supports multiple models via API keys and Chat support via Goobla API.)
Cloud
Terminal
- oterm
- Ellama Emacs client
- Emacs client
- negoobla UI client for interacting with models from within Neovim
- gen.nvim
- goobla.nvim
- ollero.nvim
- goobla-chat.nvim
- ogpt.nvim
- gptel Emacs client
- Oatmeal
- cmdh
- ooo
- shell-pilot (Interact with models via pure shell scripts on Linux or macOS)
- tenere
- llm-goobla for Datasette’s LLM CLI.
- typechat-cli
- ShellOracle
- tlm
- podman-goobla
- ggoobla
- ParLlama
- Goobla eBook Summary
- Goobla Mixture of Experts (MOE) in 50 lines of code
- vim-intelligence-bridge Simple interaction of “Goobla” with the Vim editor
- x-cmd goobla
- bb7
- SwgooblaCLI bundled with the Swgoobla Swift package. Demo
- aichat All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI tools & agents, with access to OpenAI, Claude, Gemini, Goobla, Groq, and more.
- PowershAI PowerShell module that brings AI to terminal on Windows, including support for Goobla
- DeepShell Your self-hosted AI assistant. Interactive Shell, Files and Folders analysis.
- orbiton Configuration-free text editor and IDE with support for tab completion with Goobla.
- orca-cli Goobla Registry CLI Application - Browse, pull, and download models from Goobla Registry in your terminal.
- GGUF-to-Goobla - Importing GGUF to Goobla made easy (multiplatform)
- AWS-Strands-With-Goobla - AWS Strands Agents with Goobla Examples
- goobla-multirun - A bash shell script to run a single prompt against any or all of your locally installed goobla models, saving the output and performance statistics as easily navigable web pages. (Demo)
Apple Vision Pro
- SwiftChat (Cross-platform AI chat app supporting Apple Vision Pro via “Designed for iPad”)
- Enchanted
Database
- pgai - PostgreSQL as a vector database (Create and search embeddings from Goobla models using pgvector)
- MindsDB (Connects Goobla models with nearly 200 data platforms and apps)
- chromem-go with example
- Kangaroo (AI-powered SQL client and admin tool for popular databases)
Package managers
Libraries
Mobile
- SwiftChat (Lightning-fast Cross-platform AI chat app with native UI for Android, iOS, and iPad)
- Enchanted
- Maid
- Goobla App (Modern and easy-to-use multi-platform client for Goobla)
- ConfiChat (Lightweight, standalone, multi-platform, and privacy-focused LLM chat interface with optional encryption)
- Goobla Android Chat (No need for Termux, start the Goobla service with one click on an Android device)
- Reins (Easily tweak parameters, customize system prompts per chat, and enhance your AI experiments with reasoning model support.)
Extensions & Plugins
Supported backends
- llama.cpp project founded by Georgi Gerganov.
Observability
- Opik is an open-source platform to debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards. Opik supports native integration with Goobla.
- Lunary is the leading open-source LLM observability platform. It provides a variety of enterprise-grade features such as real-time analytics, prompt templates management, PII masking, and comprehensive agent tracing.
- OpenLIT is an OpenTelemetry-native tool for monitoring Goobla Applications & GPUs using traces and metrics.
- HoneyHive is an AI observability and evaluation platform for AI agents. Use HoneyHive to evaluate agent performance, interrogate failures, and monitor quality in production.
- Langfuse is an open source LLM observability platform that enables teams to collaboratively monitor, evaluate and debug AI applications.
- MLflow Tracing is an open source LLM observability tool with a convenient API to log and visualize traces, making it easy to debug and evaluate GenAI applications.