ForeLLM reads your RAM, CPU, and GPU, scores hundreds of models by fit and speed, and tells you which will run well locally. TUI, CLI, REST API, and an optional desktop GUI.
Three steps from hardware to running models.
1) Detect hardware: Reads RAM, CPU cores, GPU backend, and VRAM so recommendations match your machine.
2) Score models: Ranks by quality, speed, fit, and context so the list is useful, not generic.
3) Run: Use top picks with Ollama, llama.cpp, or MLX; copy the command and go.
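The scoring idea behind steps 1 and 2 can be sketched in a few lines: detect available memory, estimate each model's footprint, and rank by how well it fits. This is a minimal illustration, not ForeLLM's actual algorithm; the model names, quantization bits, and the 20% overhead factor below are made-up assumptions.

```python
import os

def detect_ram_gb(default: float = 16.0) -> float:
    """Total physical RAM in GiB via POSIX sysconf; falls back where unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return default  # e.g. on Windows, where sysconf does not exist

def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone: parameter count times bits per weight, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1024**3

def fit_score(params_billion: float, bits_per_weight: float, ram_gb: float) -> float:
    """1.0 when the weights (plus ~20% overhead for KV cache etc.) fit, else scaled down."""
    needed = weight_footprint_gb(params_billion, bits_per_weight) * 1.2
    return min(1.0, ram_gb / needed)

ram = detect_ram_gb()
candidates = [          # (name, params in billions, quantization bits): illustrative only
    ("8b-q4", 8, 4),
    ("70b-q4", 70, 4),
    ("7b-q8", 7, 8),
]
ranked = sorted(candidates, key=lambda m: fit_score(m[1], m[2], ram), reverse=True)
for name, p, b in ranked:
    print(f"{name}: ~{weight_footprint_gb(p, b):.1f} GiB weights, fit={fit_score(p, b, ram):.2f}")
```

A real scorer would also weigh benchmark quality, expected tokens/sec on the detected backend, and context length, as the step list above describes.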
What you get out of the box.
forellm serve for integrations and automation.
Copy the command for your platform and run it.
Scoop (Windows): scoop install forellm
Homebrew (macOS): brew install forellm
Docker (from the repo root): docker build -t forellm . && docker run --rm -it forellm
  Run the CLI in the container: docker run --rm forellm fit --json
Installer script: curl -fsSL https://raw.githubusercontent.com/emireln/forellm/main/install.sh | sh
From source (requires Rust): git clone https://github.com/emireln/forellm.git && cd forellm && cargo build --release
  Binary: target/release/forellm