A Practical Guide for Professional Translators
This quick guide explains how to create a fully local translation assistant that runs on your own computer, respects your glossaries and reference materials, and does not send any data to external servers.
The setup is suitable for translators working with confidential, regulated, or proprietary content.
This assistant allows you to:
Translate texts using a local large language model (LLM)
Reference your own glossaries, previous translations, and style guides
Enforce consistent terminology and style via system prompts
Work entirely offline (except for initial downloads)
It is designed as a translation assistant, not a fully automated CAT tool.
The system consists of four components:
Ollama: runs language models locally
Open WebUI: browser interface (similar to ChatGPT)
Embedding model (nomic-embed-text): indexes your documents
Adaptive Memory add-on: stores your preferences and habits
You need:
Windows, macOS, or Linux
A terminal (Command Prompt / PowerShell / Terminal)
At least 16 GB RAM recommended
Optional GPU for larger models (strongly recommended for professional use)
No programming skills are required.
Ollama runs the AI models locally.
Download Ollama from https://ollama.com
Follow the installer for your operating system.
For professional translation quality, avoid very small models.
Recommended options:
General-purpose, high quality: ollama pull gpt-oss:20b
Faster, lighter alternative: ollama pull llama3.1:8b
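Once a model has downloaded, it is worth verifying the installation from the terminal before moving on. These are standard Ollama CLI commands; the sample prompt is just an illustration:

ollama list
ollama run llama3.1:8b "Translate 'good morning' into French."

If your models are listed and the second command returns a translation, Ollama is working correctly.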
Open WebUI provides a browser-based interface to interact with your local models.
Download Docker Desktop:
https://www.docker.com/products/docker-desktop/
Then follow the Open WebUI Docker quick start:
https://docs.openwebui.com/getting-started/quick-start/
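At the time of writing, the quick start reduces to a single command when Ollama runs on the same machine; treat this as a sketch and check the linked documentation for the current version:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

The --add-host flag lets the container reach the Ollama server on your host machine, and the -v volume preserves your settings and documents across container restarts.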
After starting the container, open http://localhost:3000 in your browser.
To enable document-aware translation, install the embedding model: ollama pull nomic-embed-text
This model:
Does not appear in chat
Is used internally to index and retrieve document fragments
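You can confirm that the embedding model responds by calling Ollama's embeddings endpoint directly; the prompt string here is arbitrary:

curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "delivery terms"}'

A JSON response containing an array of numbers means the model is installed and ready for indexing.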
Open Admin Panel
Go to Settings → Documents
Configure Embedding:
Embedding Model Engine: Ollama
Embedding Model: nomic-embed-text
API key: leave empty
Under Retrieval, enable Full Context Mode
Save the settings.
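Note that when Open WebUI runs in Docker, it typically reaches Ollama at http://host.docker.internal:11434 rather than localhost. To confirm that Ollama is up and your models (including nomic-embed-text) are visible, you can list them from the host:

curl http://localhost:11434/api/tags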
Documents are split into chunks before indexing.
Recommended values for translation work:
Chunk size: 256–384 tokens
Overlap: 15–20%
This improves terminological consistency and reduces context loss.
⚠️ Changing these values later requires re-uploading documents.
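As a concrete example: with 320-token chunks and 20% overlap, consecutive chunks share roughly 64 tokens, so a term that falls near a chunk boundary still appears intact in at least one chunk.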
Go to Workspace → Knowledge
Click Create Knowledge
Give it a name, for example: Client Glossaries and Style Guides
Upload:
Glossaries
Previous translations
Style guides
Client instructions
When you upload documents to Knowledge, Open WebUI always performs background indexing. This process is automatic and consists of several internal steps:
The uploaded file is parsed (for example, PDF or DOCX is converted into plain text)
The extracted text is split into smaller chunks (chunking)
An embedding model (such as nomic-embed-text) is applied to each chunk
The resulting vectors are stored in an internal vector database
When you submit a query, semantic retrieval is performed to find the most relevant chunks, which are then injected into the model’s context.
This indexing step is essential for Knowledge to work. Without it, document-based retrieval would not be possible.
Depending on the Open WebUI version, there may be no explicit “indexing complete” indicator in the interface. Because indexing happens in the background, allow some time after uploading documents and watch for any error messages during processing.
Go to Workspace → Models
Click Create Model
Select your base model (e.g. the gpt-oss:20b or llama3.1:8b you pulled earlier)
Attach your Knowledge base
Optionally enable Adaptive Memory
Save the model.
This ensures that relevant document fragments are automatically included whenever you translate.
Adaptive Memory stores your working preferences (not documents).
Adaptive Memory download: https://openwebui.com/posts/9b50f29d-92c2-4028-b94e-78cead0d8c88
In Open WebUI:
Go to Workspace → Functions
Import or enable Adaptive Memory V3
Activate it for your custom model
Examples of what Adaptive Memory can store:
Preferred target language
Tone requirements
Formatting habits
Define strict translation behaviour in the model settings.
Example system prompt:
You are a professional translator. Translate texts, preserving terminology and phrasing from the provided documents. Prefer terms used in my reference materials. Do not paraphrase unless explicitly requested.
Do not invent terminology. Output only the translation, without commentary.
This prompt applies automatically to every chat using the custom model.
Start a new chat
Select your custom translation model
Provide clear instructions, for example:
Translate the following text into Italian. Use my documents for terminology and style consistency.
The assistant will:
Retrieve relevant document fragments
Inject them into the context
Generate a translation aligned with your materials.
Create separate Knowledge bases per client or domain.
Keep system prompts strict and unambiguous.
Always review output professionally.
Treat this as an assistant, not an autonomous translator.
You now have:
A fully local translation assistant
Document-aware translation support
No data leakage
A setup that any professional translator can maintain independently.
This guide is inspired by the freeCodeCamp guide on running local LLMs for document interaction, adapted here for the needs of professional translators.