LocalMOD - Run AI Models Locally On Your Own Computer

EVERYTHING IN ONE APP

A Lot Of Power.
Kept Very Simple.

LocalMOD does a lot, but it never gets in your way. You install one app and you get a full local AI setup. Chat, agentic tasks, models, search, files, images and an API server all live in the same clean window. Here is what you can do with it.

Local Chat

Talk to AI models right on your computer. You get streaming replies, full chat history, system prompts, a thinking mode toggle and a model choice for each chat. You can edit, copy, regenerate and delete any message.

Agent Mode

Turn on agent mode and the AI can do real work on your machine. It can run terminal commands, read, write, edit and delete files, create folders and more. It is built for local developer workflows where you want agentic tasks, not just chat.

Local GGUF Models

Import any GGUF model from a file on your disk, or pull one straight from a Hugging Face URL or model ID. LocalMOD loads it and runs it through the bundled llama.cpp runtime with no setup hell.

Cloud Models

Add any cloud model you want from OpenAI, Anthropic, OpenRouter or your own custom provider. Paste your key and use big remote models in the same window as your local ones. Mix and match per chat.

Reference Files And RAG

Add files and text to a simple reference system. Drop in notes, docs, code snippets or research and let the AI use them as context while you chat. A clean local RAG setup with no extra services to run.

Web Search

Let the AI search the web while you chat. Ask about fresh topics and get answers that pull in real results instead of stale model knowledge. It stays simple and stays in the same chat box.

Image Generation

Make images straight from chat using cloud models. Type what you want and get a picture back. This one is for cloud models only so your local runtime stays light and fast for text.

Built In API Server

Flip one switch and LocalMOD runs an OpenAI compatible API server. Other apps can connect to your local models. It can keep running in the background, even after you close the window.

Model Benchmarking

Test one model or compare two side by side. LocalMOD checks speed, latency, basic reasoning, coding output and memory use so you know which model actually runs well on your machine.

Download Manager

Grab models with a real download manager. You can pause, resume, cancel and dismiss long downloads. You never have to babysit a giant file or restart from zero when something drops.

Settings And Context Limit

Set the context limit and tune the app the way you like. Change behavior, control how much of the conversation the model keeps in context and keep things running in a way that fits your computer and your work.
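
To make the idea of a context limit concrete, here is a minimal sketch of one common way to fit a chat into a fixed budget: keep the system prompt and drop the oldest messages first. This is an illustration only, not LocalMOD's actual code. The names `estimate_tokens` and `trim_to_context` are hypothetical, and counting one token per word is a rough assumption, since real tokenizers count differently.

```python
def estimate_tokens(text):
    """Very rough token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def trim_to_context(messages, context_limit):
    """Keep the first system prompt (if any) plus the newest messages that fit the limit."""
    system = [m for m in messages if m["role"] == "system"][:1]
    others = [m for m in messages if m["role"] != "system"]
    budget = context_limit - sum(estimate_tokens(m["content"]) for m in system)

    kept = []
    for message in reversed(others):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        budget -= cost
    return system + kept[::-1]  # restore chronological order


history = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "one two three four"},
    {"role": "assistant", "content": "five six"},
    {"role": "user", "content": "seven eight nine"},
]
trimmed = trim_to_context(history, context_limit=8)
```

A larger limit lets the model see more of the conversation but uses more memory and time per reply, which is why the right value depends on your computer.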

WHY LOCALMOD EXISTS

Local AI Is Great.
The Setup Is Not.

Local AI is powerful. The problem is that getting it to work is a mess for most people. You end up downloading random binaries, editing config by hand, running terminal commands and guessing which port a server is on. You manage model folders by yourself and you deal with crashes when a model fails to load. Then you have to figure out which model is fast, which is slow, which is good and which is just broken.

LocalMOD was made to fix that. You install one app. You add models. You chat with them. You test them. You turn on an API server for your other apps. That is the whole idea. One clean window that feels like a normal app, with deeper control sitting right there when you need it.

And because the whole thing is open source under the MIT license, you are never stuck. Developers can fork the project and build their own local AI runtime on top of it. Normal users get a tool that just works. Nobody pays a cent and nobody gets locked in.

BUILT ON SOLID GROUND

The Tech Stack.

LocalMOD is fast and light because it is built on tools that are made for the job. The desktop shell is Tauri. The brain is Rust. The interface is Svelte. The local model engine is llama.cpp. Here is the full set.

Rust - The backend and the API server. Fast, safe and rock solid.

Tauri 2 - The desktop shell. Small, native and quick to start.

Svelte 5 - The frontend. Clean, snappy and easy on the eyes.

SvelteKit - Page routing for chats, models, settings and more.

TypeScript - Type safe glue across the whole frontend.

Tailwind CSS - Styling that stays tidy as the app grows.

SQLite - Local storage for chats, models and settings.

Axum - The web layer that powers the API server.

llama.cpp - The bundled runtime that runs your local models.

YOUR OWN API SERVER

Turn Your Computer
Into An AI Server.

LocalMOD ships with an OpenAI compatible API server. Open the settings page, turn it on and your local models are ready for any app that speaks the OpenAI API format. You can set a port, pick who is allowed to connect, copy the URL and switch API key auth on or off.

The server runs as its own binary, so it can keep working after you close the desktop window. Run it on a home machine, a LAN box or a VPS. Point your tools at it and you are done.

localmod api

# list your models
GET  /v1/models

# send a chat message
POST /v1/chat/completions

{
  "model": "Your Model Name",
  "messages": [
    { "role": "user",
      "content": "Hello" }
  ],
  "stream": false
}
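
As a rough sketch, the chat request above could be sent from Python using only the standard library. The address `http://127.0.0.1:8080` is a placeholder, so use the URL and port shown on your settings page. Sending the key as a bearer token and reading the reply from `choices[0].message.content` follow the usual OpenAI conventions and are assumptions here. The helper names `build_chat_request` and `send_chat` are made up for this example.

```python
import json
import urllib.request

# Placeholder address (assumption): copy the real URL from the LocalMOD settings page.
BASE_URL = "http://127.0.0.1:8080"


def build_chat_request(model, user_text, api_key=None, base_url=BASE_URL):
    """Build, without sending, a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the key is passed as an OpenAI style bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send_chat(request, timeout=60):
    """Send the request and return the text of the first reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the response uses the OpenAI chat completion shape.
    return body["choices"][0]["message"]["content"]


# Usage, with the API server switched on:
#   request = build_chat_request("Your Model Name", "Hello")
#   print(send_chat(request))
```

Building the request separately from sending it makes it easy to inspect the URL, headers and JSON body before anything goes over the network.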

GET GOING IN MINUTES

How It Works.

There is no long setup and no terminal needed. The bundled runtime ships inside the installer, so you go from download to chatting in four simple steps.

01

Download The Installer

Grab the Windows setup file from GitHub. It is one installer that carries the app and the bundled llama.cpp runtime inside it.

02

Install The App

Run the setup and let it install. No separate llama.cpp download and no manual DLL hunting. Everything you need lands in one place.

03

Add A Model

Import a GGUF file, pull one from Hugging Face or add a cloud model. Pick the one you want and load it in a click.

04

Start Chatting

Open a chat and talk to your model. Turn on web search, add reference files or flip on the API server whenever you like.

COMMON QUESTIONS

Good To Know.

Is LocalMOD really free?

Yes. The whole app is free and fully open source under the MIT license. There is no paid tier, no trial and no account needed. You download it and you use everything.

Does it run fully offline?

Chat with a local GGUF model runs on your computer with no internet. Cloud models, web search, image generation and pulling models from Hugging Face need a connection because they reach out to remote services.

Do I need to download llama.cpp myself?

No. The installer carries the llama.cpp runtime inside it. You install the app and local models just work. There is nothing extra to set up by hand.

Which models can I use?

For local use, any GGUF model imported from a file or pulled from a Hugging Face URL or model ID. For cloud, you can add any provider you want, such as OpenAI, Anthropic, OpenRouter or your own custom endpoint.

Can other apps connect to it?

Yes. Turn on the built in API server and any OpenAI compatible client can connect through the standard endpoints. It can keep running in the background after you close the window.

What platforms are supported?

Windows 10 and Windows 11 right now. Linux and macOS support is planned. Since the code is open, anyone can help push those builds along.

Get LocalMOD.
It Is Yours.
