Systems·April 23, 2026·9 min

Self-Hosted AI Automation With n8n: A Practical Setup

A practical self-hosted AI automation setup with n8n: webhooks, model calls, review gates, workflow logs, and the parts I keep outside SaaS.

Most AI automation tutorials assume you're happy routing everything through someone else's API, paying per workflow execution, and trusting a hosted service with your data.

I wasn't. I wanted my automation stack self-hosted, inspectable, and running on models I control. n8n turned out to be the right tool for that, but the setup path was rougher than the docs suggest.

This post covers what I actually did to get n8n running with local and cloud AI models, where the common setup guides fall short, and the decisions that mattered most.

Why n8n and not the alternatives

I looked at Make, Zapier, and a few purpose-built AI platforms before landing on n8n. The reasons:

Self-hostable. n8n runs on my hardware. No per-execution billing, no vendor lock-in on my workflow definitions, no surprise pricing changes.

Visual workflow editor with real code escape hatches. The visual editor handles 80% of what I need. When it doesn't, I drop into JavaScript or Python nodes. I don't have to choose between "no-code" and "write everything from scratch."

AI agent nodes built in. n8n's AI Agent node supports memory, tools, and multi-step reasoning. It's not a bolt-on. The agent infrastructure is part of the platform, including sub-agent delegation for complex tasks.

Fair-code license. Not fully open source, but the source is visible and you can self-host freely. The tradeoff: you can't resell n8n as a service without a commercial license. For personal and internal use, it's unrestricted.

The main alternatives: Make and Zapier are SaaS-only with per-execution pricing. Dify is closer to n8n but leans toward RAG app builders more than general workflow automation. AutoGPT and similar agent frameworks are interesting but lack the workflow editor and integration library that n8n provides out of the box.

What you need before you start

A VPS or local machine with at least 2GB RAM. Docker and Docker Compose installed. Basic command-line comfort. A domain name if you want HTTPS (recommended but not required for local testing).

If you're planning to run local models through Ollama alongside n8n, budget more RAM. A 7B parameter model needs about 8GB. A 13B model needs roughly 16GB. You can run n8n itself on very little; the model inference is what eats resources.

Setting up n8n with Docker Compose

The official docs walk you through a basic Docker setup. Here's what they don't emphasize enough: use PostgreSQL, not SQLite, for anything beyond testing.

SQLite works for experimenting. It allows only one writer at a time, so it copes badly with concurrent writes, which happen more often than you'd expect when AI agents are making decisions and triggering workflows in parallel: expect lock contention and failed executions, and treat corruption as a real risk. PostgreSQL is the production choice.

Create a directory and a docker-compose.yml:

version: '3.8'

services:
  n8n:
    image: n8nio/n8n:latest
    restart: always
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=${N8N_HOST}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://${N8N_HOST}/
      - GENERIC_TIMEZONE=${TIMEZONE}
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=${POSTGRES_DB}
      - DB_POSTGRESDB_USER=${POSTGRES_USER}
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      - postgres

  postgres:
    image: postgres:15
    restart: always
    environment:
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  n8n_data:
  postgres_data:

And a .env file in the same directory:

N8N_HOST=n8n.yourdomain.com
TIMEZONE=America/Los_Angeles
N8N_ENCRYPTION_KEY=generate_a_random_32_char_key_here
POSTGRES_DB=n8n
POSTGRES_USER=n8n
POSTGRES_PASSWORD=use_a_strong_password
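
The N8N_ENCRYPTION_KEY placeholder needs a real random value. Here is a minimal sketch using Python's standard secrets module; the function name is my own, and 16 random bytes are hex-encoded to get the 32 characters the placeholder asks for.

```python
import secrets


def generate_encryption_key(num_bytes: int = 16) -> str:
    """Return a random hex string; 16 bytes encode to 32 hex characters."""
    return secrets.token_hex(num_bytes)


# Print a line that can be pasted into the .env file.
key = generate_encryption_key()
print(f"N8N_ENCRYPTION_KEY={key}")
```

Keep a copy of the key somewhere safe. n8n uses it to encrypt stored credentials, so a restored database is not much use without the matching key.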

Then start it:

docker compose up -d

n8n will be available at http://localhost:5678 or your configured domain. The first visit prompts you to create an admin account.
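
If you script the deployment, it helps to wait until n8n actually answers before moving on. The sketch below polls a URL until it returns HTTP 200. The function names and the retry numbers are mine, and the /healthz path in the usage note is my understanding of n8n's health endpoint, so check it against your version.

```python
import http.client
import time
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any request failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def wait_until_healthy(url, attempts=10, delay=3.0, probe=is_healthy, sleep=time.sleep):
    """Call the probe until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

Usage would look like wait_until_healthy("http://localhost:5678/healthz"). The probe and sleep parameters exist only so the loop can be tested without a running container.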

For HTTPS, put Caddy or Nginx in front as a reverse proxy. Caddy is simpler: it handles TLS certificates automatically. Nginx gives you more control if you need it.

Adding local models with Ollama

This is where most guides stop at "add an Ollama container to your compose file." That works, but there are a few things worth knowing upfront.

Add Ollama to your docker-compose.yml:

  ollama:
    image: ollama/ollama:latest
    restart: always
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    # Uncomment for GPU support:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

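# Add ollama_data under the existing top-level volumes: key.
# A second volumes: block in the same file would be a duplicate key.
volumes:
  n8n_data:
  postgres_data:
  ollama_data: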

After starting Ollama, pull the models you want:

docker compose exec ollama ollama pull llama3.1
docker compose exec ollama ollama pull qwen2.5

In n8n, configure an Ollama credentials entry pointing to http://ollama:11434 (using the Docker service name, not localhost). Then you can use Ollama models in any AI node.
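
Before wiring the model into a workflow, a quick smoke test against Ollama's HTTP API confirms the model responds. This is a sketch using only the standard library; the helper names are mine. It calls the /api/generate endpoint with streaming turned off and reads the response field.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the 'response' field."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

From the host, generate("http://localhost:11434", "llama3.1", "Reply with one word: ready") uses the published port. From inside the n8n container the base URL would be http://ollama:11434 instead. The generous timeout is deliberate, since CPU-only inference is slow.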

The gotcha: if you're running n8n and Ollama on the same machine without a GPU, inference will be slow. Slow enough that workflow timeouts become a real issue. Either run Ollama on a machine with a GPU, use a cloud GPU provider, or stick to small models (7B parameters or fewer) for local inference and route heavier tasks to cloud APIs.

The n8n AI Starter Kit

n8n publishes a self-hosted AI starter kit that bundles n8n, Ollama, Qdrant (vector database), and PostgreSQL in one Docker Compose configuration. If you want to skip the manual setup, this is the fastest path to a working AI automation stack.

The starter kit is good for getting something running quickly. The tradeoff is that it's a pre-configured bundle. When you need to customize model choices, swap Qdrant for a different vector store, or adjust resource limits, you'll be editing their compose file rather than building from scratch.

I started with the starter kit and then rebuilt piece by piece once I understood what each component was doing. That's probably the right progression for most people: start with the kit, learn the pieces, then customize.

Building your first AI workflow

The simplest useful AI workflow I built: receive a webhook, classify the incoming content with an LLM, route it based on the classification, and store the result.

Here's the structure:

1. Webhook node receives the incoming request.
2. AI Agent node classifies the content (using a local or cloud model).
3. Switch node routes based on the classification output.
4. Action nodes execute based on the route (save to database, send notification, create task).
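
The classify-then-route part of that structure can be sketched outside n8n in a few lines. This is an illustration, not how the Switch node is implemented: the labels, the fallback name, and the handler functions are all hypothetical. The point is that free-form model output gets normalized to a known label before anything routes on it, and anything unrecognized goes to a fallback.

```python
from typing import Callable, Dict

# Hypothetical labels; use whatever categories your workflow needs.
ALLOWED_LABELS = ("support", "sales", "spam")
FALLBACK_LABEL = "needs_review"


def normalize_label(raw_output: str) -> str:
    """Map model output onto a known label, or the fallback if it does not match exactly."""
    cleaned = raw_output.strip().lower().strip(" .\"'`")
    return cleaned if cleaned in ALLOWED_LABELS else FALLBACK_LABEL


def route(
    raw_output: str,
    handlers: Dict[str, Callable[[str], str]],
    fallback: Callable[[str], str],
) -> str:
    """Pick the handler for the normalized label, like a Switch node with a default branch."""
    label = normalize_label(raw_output)
    handler = handlers.get(label, fallback)
    return handler(label)
```

Matching is strict on purpose. A reply like "I think this is sales" lands in the fallback instead of being guessed at, which keeps surprising outputs visible.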

The AI Agent node in n8n is more than a prompt wrapper. It supports memory (conversation context persists across turns), tools (the agent can call other n8n nodes or external APIs), and sub-agent delegation (one agent can hand off to a specialist).

For the classification step, I use a local model through Ollama. It's fast, it's free per token, and classification doesn't need frontier reasoning. For the action steps that need higher-quality output (drafting responses, making judgment calls), I route to a cloud model.

This is the same model-routing principle I wrote about in the cost breakdown post: put the cheap, fast model on the routine work and the expensive model on the work that actually benefits from it.

Where things break

The setup is the easy part. The hard part is building workflows that hold up over time.

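Timeouts. AI model calls are slow compared to traditional API calls. If you're using Ollama without a GPU, a single inference call can take 30 seconds to two minutes. Default timeouts along the path are often too short for this. Raise the execution timeout in your n8n settings (on a self-hosted instance this can also be set with the EXECUTIONS_TIMEOUT environment variable), and check the timeout on any reverse proxy sitting in front of n8n.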

Error handling. AI outputs are non-deterministic. The same input can produce different classifications, different formats, different error modes. Your Switch node needs a fallback branch. Your AI Agent node needs retry logic. And you need to log outputs somewhere inspectable so you can find out why a workflow started producing garbage.
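
The retry-and-log idea looks roughly like this in plain Python. It is a sketch under my own naming, not n8n's retry mechanism: call the model, validate the output, log every rejected attempt, back off exponentially, and return None so the caller can take the fallback branch.

```python
import logging
import time

logger = logging.getLogger("workflow.retries")


def call_with_retries(call_model, is_valid, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call the model until is_valid accepts the output; return None if it never does."""
    for attempt in range(max_attempts):
        try:
            output = call_model()
        except Exception as error:
            # Log the failure so a later investigation has something to read.
            logger.warning("attempt %d raised: %s", attempt + 1, error)
        else:
            if is_valid(output):
                return output
            logger.warning("attempt %d returned invalid output: %r", attempt + 1, output)
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** attempt)
    return None
```

Returning None instead of raising keeps the decision with the caller, which mirrors a workflow where the fallback branch handles anything the model could not classify cleanly.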

Rate limits. Cloud API providers rate-limit. n8n's built-in retry handles transient failures, but if you're running parallel workflows that all hit the same provider, you'll hit limits faster than you expect. Consider queuing or rate-limiting at the workflow level.
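
For a single script or a single worker, rate-limiting can be as simple as enforcing a minimum gap between calls. The class below is a sketch with names of my own choosing. It is not thread-safe and does not coordinate across parallel executions, which would need a shared queue or store.

```python
import time


class MinIntervalLimiter:
    """Allow at most one call per `interval` seconds by sleeping when needed."""

    def __init__(self, interval: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # No call has happened yet.

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds spent waiting."""
        now = self._clock()
        waited = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            waited = self._next_allowed - now
            self._sleep(waited)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return waited
```

Calling limiter.wait() right before each provider request spaces the requests out. The clock and sleep parameters are injectable so the behavior can be checked without real delays.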

Credential management. n8n stores credentials encrypted in its database. This is fine for most setups, but if you're running multiple environments (dev, staging, production), you need a strategy for credential rotation and environment-specific configuration. Don't hardcode API keys in workflow nodes.
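
The same rule applies to any helper scripts that sit around the workflows: read secrets from the environment and fail loudly when one is missing. A small sketch, with a function name and error class I made up for the example:

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_env(name: str, environ=os.environ) -> str:
    """Return the named environment variable, or raise if it is unset or blank."""
    value = environ.get(name, "").strip()
    if not value:
        raise MissingCredentialError(f"environment variable {name} is not set")
    return value
```

Each environment (dev, staging, production) then supplies its own values, and a forgotten variable stops the script at startup instead of surfacing as a confusing authentication error later.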

Version control. n8n workflows live in its database, not in files by default. Export them regularly using n8n export:workflow and commit the JSON to git. When a workflow breaks and you need to roll back, you'll be glad you did.
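
One export file containing every workflow makes for noisy diffs. The sketch below splits an export into one file per workflow with sorted keys so git diffs stay readable. It assumes the export is a JSON array of objects that each carry name and id fields, so check that against what your n8n version writes. The function names are mine.

```python
import json
import re
from pathlib import Path


def slugify(name: str) -> str:
    """Turn a workflow name into a filesystem-friendly slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return slug or "workflow"


def split_export(export_file: Path, target_dir: Path) -> list:
    """Write each workflow in the export to its own JSON file; return the paths."""
    workflows = json.loads(export_file.read_text(encoding="utf-8"))
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for workflow in workflows:
        # Assumed fields: 'name' and 'id' on every exported workflow.
        filename = f"{slugify(workflow['name'])}-{workflow['id']}.json"
        path = target_dir / filename
        path.write_text(
            json.dumps(workflow, indent=2, sort_keys=True) + "\n", encoding="utf-8"
        )
        written.append(path)
    return written
```

Including the id in the filename keeps two workflows with the same name from overwriting each other.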

What I'd do differently next time

I'd start with the AI starter kit and customize from there instead of building the compose file from scratch. The incremental learning from modifying a working system beats building one from zero.

I'd set up PostgreSQL, proper error handling, and workflow exports to git on day one instead of adding them after things broke.

I'd also think harder about model routing before building workflows. The temptation is to use the best model everywhere and optimize later. The right move is to decide which workflow steps need frontier reasoning, which need competent language generation, and which just need a fast classification. Build the routing into the workflow from the start.
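
Writing the routing down as data forces that decision early. A minimal sketch, assuming three step kinds that match the tiers above. The cloud model names are placeholders, not recommendations, and llama3.1 is simply the local model pulled earlier in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    provider: str  # "ollama" for local, "cloud" for a hosted API
    model: str


# Hypothetical tiers and model names; swap in whatever you actually run.
ROUTES = {
    "classification": ModelChoice("ollama", "llama3.1"),
    "generation": ModelChoice("cloud", "example-mid-tier-model"),
    "reasoning": ModelChoice("cloud", "example-frontier-model"),
}


def pick_model(step_kind: str) -> ModelChoice:
    """Return the model for a workflow step, failing loudly on unknown kinds."""
    try:
        return ROUTES[step_kind]
    except KeyError:
        raise ValueError(f"no route defined for step kind {step_kind!r}") from None
```

Raising on an unknown step kind is a deliberate choice here. A silent default would quietly send new steps to whichever model happened to be the fallback.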

Running costs

The whole stack runs on a $5/month VPS. The only variable cost is cloud model API calls, which I keep minimal by routing most traffic through local models and Ollama's cloud subscription.

For comparison: Zapier charges per task. Make charges per operation. n8n self-hosted charges nothing per execution. The tradeoff is you're running your own infrastructure, which means you handle updates, backups, and security. If you're comfortable with that, the cost savings are significant.
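
To put a number on that comparison, a back-of-the-envelope break-even helps. The sketch below uses the $5/month figure from this post and a made-up per-execution price. Real hosted pricing is tiered and changes, so plug in current numbers.

```python
from decimal import Decimal, ROUND_CEILING


def break_even_executions(flat_monthly_cost: str, price_per_execution: str) -> int:
    """Executions per month at which per-execution billing reaches the flat cost."""
    flat = Decimal(flat_monthly_cost)
    price = Decimal(price_per_execution)
    if price <= 0:
        raise ValueError("price_per_execution must be positive")
    # Round up: a partial execution does not exist.
    return int((flat / price).to_integral_value(rounding=ROUND_CEILING))


# $5.00 is the VPS cost from this post; $0.01 per execution is a hypothetical price.
print(break_even_executions("5.00", "0.01"))
```

This only compares hosting against execution fees. It ignores cloud model API calls and the time you spend on maintenance.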

The real cost isn't the hosting. It's the time to build, debug, and maintain workflows. Budget more time than you think for getting AI outputs into a consistent format and keeping them there as models change.

Who this is for

n8n self-hosted is for people who want control over their automation data, want to avoid per-execution pricing, and are comfortable running their own infrastructure. If you're a solo operator or small team automating internal processes, the economics work. If you're building customer-facing automation that needs five-nines uptime, you'll want managed infrastructure on top.

The AI agent nodes make n8n a serious option for people who would otherwise be writing custom Python orchestration from scratch. You get the visual editor for common patterns and the code nodes for everything else. The agent infrastructure handles memory, tools, and delegation without you having to build that layer yourself.

Start simple. One webhook. One classification. One action. Add complexity after the basic loop works reliably.
