LLM-Manager: Orchestrating Ollama and Llama.cpp with Pure Bash

LLM-Manager is a lightweight, modular Bash suite for managing local and remote inference engines on Linux and WSL2, with a dual JSON/interactive interface.

When I started experimenting with large language models (LLMs) to build an on-premises retrieval-augmented generation (RAG) application, I hit a massive roadblock: environment fragmentation.

Managing multiple inference engines like Ollama and Llama.cpp meant memorizing different command-line flags, environment variables, and configurations. Once my frontend and backend prototypes were ready for testing, I realized I was spending too much time manually starting, stopping, loading, and unloading models.

I looked online for solutions. Most people suggested complex Python scripts, heavy Docker setups, n8n workflows, or complicated web dashboards.

I didn't want the bloat. I wanted something lightweight that executed commands as if I were doing them manually, but with zero cognitive load.
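To make that idea concrete, here is a minimal sketch of what "one verb, many engines" could look like in Bash. It is not LLM-Manager's actual code: the function name `build_cmd`, the engine labels `ollama` and `llamacpp`, and the `LLAMA_PORT` variable are all hypothetical, chosen for illustration. The function only prints the command it would run, so the mapping is easy to inspect before anything is executed.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: translate a generic action into each engine's own CLI.
# build_cmd only prints the command line; it never executes it.

build_cmd() {
  local engine="$1" action="$2" model="$3"

  case "$engine:$action" in
    # Ollama loads a model with `run` and unloads it with `stop`.
    ollama:start)   printf 'ollama run %s\n' "$model" ;;
    ollama:stop)    printf 'ollama stop %s\n' "$model" ;;
    # llama.cpp serves a GGUF file through llama-server.
    # LLAMA_PORT is an assumed variable name; 8080 is used when it is unset.
    llamacpp:start) printf 'llama-server -m %s --port %s\n' "$model" "${LLAMA_PORT:-8080}" ;;
    # Assumption: the server is stopped by matching its process name.
    llamacpp:stop)  printf 'pkill -f llama-server\n' ;;
    *)
      printf 'unsupported: %s %s\n' "$engine" "$action" >&2
      return 1
      ;;
  esac
}
```

Calling `build_cmd ollama stop llama3` prints the Ollama command, while `build_cmd llamacpp start /models/m.gguf` prints the llama-server one. Printing instead of executing keeps the sketch safe to try, and a real wrapper could pass the result to the shell once the mapping looks right.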

Related reading

What Is Ollama? The Complete Guide to Running LLMs Locally in 2026

Setting Up a Local AI Coding Agent with Ollama and Aider

Build a Unified AI Gateway with LiteLLM and Ollama

Stop Your LLM From Getting Owned

Getting Started: Run Your First Local LLM in 5 Minutes

Llamafile vs vLLM: Two Ways to Serve a Local Model, and When Each Makes Sense
