Build an AI Translation App with Python: Beyond Google Translate
Build a context-aware translation app using LLMs that handles idioms, cultural context, and domain-specific terminology. Full Python code with Flask API and simple web frontend.
Google Translate handles 100+ billion words per day. It’s fast, free, and good enough for getting the gist of a menu in Tokyo. But “good enough” isn’t good enough for business documents, marketing copy, technical documentation, or any text where nuance matters.
The problem with traditional machine translation: it tends to translate sentences in isolation, without knowing the audience, the domain, or the intent. An idiom like “It’s raining cats and dogs” can come out as literal animals falling from the sky. Technical jargon gets mangled. Cultural context is ignored. Tone is lost.
LLM-based translation solves these problems by understanding context before translating. In this tutorial, we’ll build a translation app that handles idioms, domain terminology, and cultural adaptation — the things Google Translate gets wrong.
Architecture
User Input (text + source lang + target lang + context)
→ Preprocessing (language detection, text normalization)
→ Translation Engine (Claude API or local model)
→ Post-processing (formatting, quality check)
→ Output (translated text + alternatives + notes)
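The preprocessing stage is the one part of this pipeline that the rest of the tutorial does not spell out. As a rough sketch, assuming that Unicode NFC normalization and whitespace cleanup are all we need, text normalization could look like the following. The helper name `normalize_text` is our own and not part of any library.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Hypothetical preprocessing helper: tidy input before translation."""
    # NFC composes sequences such as "e" + combining accent into a single "é".
    text = unicodedata.normalize("NFC", text)
    # Unify Windows and old-Mac line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces/tabs inside each line, keeping the line breaks
    # so that paragraph structure survives.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    # Squeeze three or more consecutive newlines down to one blank line.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()
```

Calling `normalize_text(text)` before language detection keeps stray whitespace and decomposed accents from skewing either the detector or the prompt.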
Setup
mkdir ai-translator && cd ai-translator
python3 -m venv venv
source venv/bin/activate
pip install flask anthropic langdetect python-dotenv
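The app reads its API key from the `ANTHROPIC_API_KEY` environment variable, which python-dotenv can load from a `.env` file. A small guard like the one below is one way to fail early with a readable message when the key is missing. `require_env` is a hypothetical helper of our own, not something python-dotenv provides.

```python
import os


def require_env(name: str, environ=os.environ) -> str:
    """Hypothetical helper: return a required setting or fail with a clear message."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add a line like {name}=your-key to a .env file "
            "next to app.py, or export it in your shell."
        )
    return value


# Typical use, after load_dotenv() has run:
#   api_key = require_env("ANTHROPIC_API_KEY")
```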
The Translation Engine
# translator.py
"""AI-powered translation engine with context awareness."""
import json

import anthropic
from langdetect import detect


class AITranslator:
    def __init__(self, api_key: str):
        self.client = anthropic.Anthropic(api_key=api_key)

    def detect_language(self, text: str) -> str:
        """Detect the language of input text (returns a code such as "en")."""
        try:
            return detect(text)
        except Exception:
            # langdetect raises on empty or non-linguistic input
            return "unknown"

    def _resolve_source(self, text: str, source_lang: str) -> str:
        """Replace "auto" with the detected language code."""
        if source_lang != "auto":
            return source_lang
        return self.detect_language(text)

    def translate(
        self,
        text: str,
        target_lang: str,
        source_lang: str = "auto",
        context: str = "",
        domain: str = "general",
        formality: str = "neutral"
    ) -> dict:
        """
        Translate text with context awareness.

        Args:
            text: Text to translate
            target_lang: Target language (e.g., "Japanese", "Spanish")
            source_lang: Source language ("auto" for detection)
            context: Additional context about the text
            domain: Domain for terminology (general, medical, legal, tech,
                marketing, academic)
            formality: Formality level (formal, neutral, casual)

        Returns:
            dict with translation, alternatives, and notes
        """
        source_lang = self._resolve_source(text, source_lang)

        domain_instructions = {
            "general": "",
            "medical": "Use precise medical terminology. Maintain clinical accuracy.",
            "legal": "Use exact legal terminology. Preserve legal meaning precisely.",
            "tech": "Use standard technical terminology. Keep code/commands untranslated.",
            "marketing": "Adapt for cultural resonance. Prioritize impact over literal meaning.",
            "academic": "Maintain academic register. Preserve citation formats.",
        }
        formality_instructions = {
            "formal": "Use formal register and honorifics where appropriate.",
            "neutral": "Use standard register.",
            "casual": "Use conversational, casual language.",
        }

        # Doubled braces ({{ }}) are literal braces inside an f-string.
        prompt = f"""Translate the following text from {source_lang} to {target_lang}.

Text to translate:
"{text}"

{f'Context: {context}' if context else ''}
Domain: {domain}
{domain_instructions.get(domain, '')}
Formality: {formality}
{formality_instructions.get(formality, '')}

Return ONLY valid JSON:
{{
"translation": "the translated text",
"alternatives": ["1-2 alternative translations if applicable"],
"notes": ["any important translation notes, cultural context, or ambiguities"],
"idioms_adapted": ["list any idioms that were culturally adapted rather than literally translated"],
"confidence": 0.0 to 1.0
}}

Rules:
- Translate meaning, not just words
- Adapt idioms and cultural references for the target culture
- Preserve formatting (paragraphs, lists, emphasis)
- For technical terms with no standard translation, keep original in parentheses
- Flag any ambiguous passages in notes"""

        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            system=(
                "You are a professional translator with expertise in "
                "cultural adaptation and domain-specific terminology. "
                "Always return valid JSON."
            ),
            messages=[{"role": "user", "content": prompt}]
        )
        result_text = response.content[0].text

        # The model sometimes wraps its JSON in a Markdown code fence; unwrap it.
        if "```json" in result_text:
            result_text = result_text.split("```json")[1].split("```")[0]
        elif "```" in result_text:
            result_text = result_text.split("```")[1].split("```")[0]

        try:
            result = json.loads(result_text.strip())
        except json.JSONDecodeError:
            # Fall back to the raw reply so the caller still gets something usable.
            result = {
                "translation": result_text.strip(),
                "alternatives": [],
                "notes": ["JSON parsing failed; raw translation returned"],
                "idioms_adapted": [],
                "confidence": 0.5
            }

        result['source_language'] = source_lang
        result['target_language'] = target_lang
        result['original'] = text
        return result

    def translate_batch(
        self,
        texts: list[str],
        target_lang: str,
        **kwargs
    ) -> list[dict]:
        """Translate multiple texts, one API call per text."""
        return [
            self.translate(text, target_lang, **kwargs)
            for text in texts
        ]

    def compare_translations(
        self,
        text: str,
        target_lang: str,
        source_lang: str = "auto"
    ) -> dict:
        """Generate multiple translation styles for comparison."""
        # Without this step, "auto" would be sent to the model as if it were a language name.
        source_lang = self._resolve_source(text, source_lang)

        styles = {
            "literal": "Translate as literally as possible while remaining grammatically correct.",
            "natural": "Translate for natural, fluent reading in the target language.",
            "adapted": "Freely adapt for maximum cultural resonance and impact.",
        }
        results = {}
        for style_name, instruction in styles.items():
            response = self.client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                messages=[{
                    "role": "user",
                    "content": (
                        f"Translate from {source_lang} to {target_lang}. "
                        f"Style: {instruction}\n\n"
                        f"Text: \"{text}\"\n\n"
                        f"Return only the translation, nothing else."
                    )
                }]
            )
            # Drop surrounding whitespace and any quotes the model echoes back.
            results[style_name] = response.content[0].text.strip().strip('"')

        return {
            "original": text,
            "source_language": source_lang,
            "target_language": target_lang,
            "translations": results
        }
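The fence-stripping in `translate()` covers the common case, but a model reply can also carry stray prose around the JSON or values outside the range we asked for. As a hedged sketch of the post-processing stage, here are two helpers that could sit between the API call and the return value. Both names (`extract_json_object`, `normalize_result`) are our own inventions, and the 0.5 default confidence simply mirrors the fallback used above.

```python
import json
from typing import Optional


def extract_json_object(raw: str) -> Optional[dict]:
    """Hypothetical helper: pull the first JSON object out of a model reply."""
    decoder = json.JSONDecoder()
    # Try every "{" as a possible start; raw_decode ignores trailing text.
    for start, char in enumerate(raw):
        if char != "{":
            continue
        try:
            obj, _ = decoder.raw_decode(raw[start:])
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    return None


def normalize_result(result: dict) -> dict:
    """Fill in missing keys and clamp confidence to the 0.0-1.0 range."""
    cleaned = {"translation": str(result.get("translation", ""))}
    for key in ("alternatives", "notes", "idioms_adapted"):
        value = result.get(key)
        if isinstance(value, list):
            cleaned[key] = [str(item) for item in value]
        elif value:
            # The model may return a single string instead of a list.
            cleaned[key] = [str(value)]
        else:
            cleaned[key] = []
    try:
        confidence = float(result.get("confidence", 0.5))
    except (TypeError, ValueError):
        confidence = 0.5
    cleaned["confidence"] = min(1.0, max(0.0, confidence))
    return cleaned
```

The trade-off is a little extra code in exchange for a response shape the frontend can rely on, whatever the model decides to wrap around its answer.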
The Web API
# app.py
"""Flask API for the translation service."""
import os

from flask import Flask, request, jsonify, render_template_string
from dotenv import load_dotenv

from translator import AITranslator

load_dotenv()

app = Flask(__name__)
translator = AITranslator(api_key=os.getenv('ANTHROPIC_API_KEY'))

SUPPORTED_LANGUAGES = [
    "English", "Spanish", "French", "German", "Italian",
    "Portuguese", "Japanese", "Korean", "Chinese (Simplified)",
    "Chinese (Traditional)", "Arabic", "Hindi", "Russian",
    "Dutch", "Swedish", "Polish", "Turkish", "Thai",
    "Vietnamese", "Indonesian"
]


@app.route('/')
def index():
    # HTML_TEMPLATE is defined further down; it is looked up when a request arrives.
    return render_template_string(HTML_TEMPLATE, languages=SUPPORTED_LANGUAGES)


@app.route('/api/translate', methods=['POST'])
def api_translate():
    # silent=True returns None for a missing or malformed JSON body,
    # so the check below can answer with our own 400 message.
    data = request.get_json(silent=True)
    if not data or 'text' not in data or 'target_lang' not in data:
        return jsonify({'error': 'Missing required fields: text, target_lang'}), 400
    result = translator.translate(
        text=data['text'],
        target_lang=data['target_lang'],
        source_lang=data.get('source_lang', 'auto'),
        context=data.get('context', ''),
        domain=data.get('domain', 'general'),
        formality=data.get('formality', 'neutral')
    )
    return jsonify(result)


@app.route('/api/compare', methods=['POST'])
def api_compare():
    data = request.get_json(silent=True)
    if not data or 'text' not in data or 'target_lang' not in data:
        return jsonify({'error': 'Missing required fields: text, target_lang'}), 400
    result = translator.compare_translations(
        text=data['text'],
        target_lang=data['target_lang'],
        source_lang=data.get('source_lang', 'auto')
    )
    return jsonify(result)


HTML_TEMPLATE = """
<!DOCTYPE html>
<html>
<head>
<title>AI Translator</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: system-ui; max-width: 900px; margin: 0 auto; padding: 20px; background: #f5f5f5; }
h1 { margin-bottom: 20px; }
.container { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; }
textarea { width: 100%; height: 200px; padding: 12px; border: 1px solid #ddd; border-radius: 8px; font-size: 15px; resize: vertical; }
select, button { padding: 10px 16px; border-radius: 8px; font-size: 14px; }
select { border: 1px solid #ddd; background: white; }
button { background: #2563eb; color: white; border: none; cursor: pointer; }
button:hover { background: #1d4ed8; }
.controls { display: flex; gap: 12px; margin: 12px 0; flex-wrap: wrap; align-items: center; }
.result { background: white; padding: 16px; border-radius: 8px; border: 1px solid #ddd; margin-top: 8px; }
.notes { margin-top: 12px; padding: 12px; background: #f0f9ff; border-radius: 6px; font-size: 13px; }
.note-item { margin: 4px 0; color: #555; }
</style>
</head>
<body>
<h1>AI Translator</h1>
<div class="controls">
<select id="source-lang"><option value="auto">Auto-detect</option>
{% for lang in languages %}<option value="{{ lang }}">{{ lang }}</option>{% endfor %}
</select>
<span>→</span>
<select id="target-lang">
{% for lang in languages %}<option value="{{ lang }}" {{ 'selected' if lang == 'Spanish' }}>{{ lang }}</option>{% endfor %}
</select>
<select id="domain">
<option value="general">General</option>
<option value="medical">Medical</option>
<option value="legal">Legal</option>
<option value="tech">Technical</option>
<option value="marketing">Marketing</option>
</select>
<select id="formality">
<option value="neutral">Neutral</option>
<option value="formal">Formal</option>
<option value="casual">Casual</option>
</select>
<button onclick="translate()">Translate</button>
</div>
<div class="container">
<div>
<textarea id="input" placeholder="Enter text to translate..."></textarea>
</div>
<div>
<textarea id="output" placeholder="Translation will appear here..." readonly></textarea>
<div id="notes-area" class="notes" style="display:none"></div>
</div>
</div>
<script>
async function translate() {
const text = document.getElementById('input').value;
if (!text) return;
document.getElementById('output').value = 'Translating...';
const res = await fetch('/api/translate', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({
text: text,
target_lang: document.getElementById('target-lang').value,
source_lang: document.getElementById('source-lang').value,
domain: document.getElementById('domain').value,
formality: document.getElementById('formality').value
})
});
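const data = await res.json();
document.getElementById('output').value = data.translation || data.error || '';
const notesArea = document.getElementById('notes-area');
notesArea.textContent = '';
if (data.notes && data.notes.length > 0) {
// Build nodes with textContent so model output is never parsed as HTML.
const heading = document.createElement('strong');
heading.textContent = 'Notes:';
notesArea.appendChild(heading);
data.notes.forEach(n => {
const item = document.createElement('div');
item.className = 'note-item';
item.textContent = '• ' + n;
notesArea.appendChild(item);
});
notesArea.style.display = 'block';
} else { notesArea.style.display = 'none'; }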
}
</script>
</body>
</html>
"""
if __name__ == '__main__':
    # debug=True enables the reloader and interactive debugger; use it for local development only.
    app.run(debug=True, port=5000)
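The endpoints above only check that `text` and `target_lang` are present. Since `domain` and `formality` flow straight into the prompt, it may be worth validating them too. The following is a minimal sketch of such a check. The helper name `validate_payload` and the 5,000-character cap are assumptions of ours, and the allowed values are copied from the dictionaries in `translator.py`.

```python
ALLOWED_DOMAINS = {"general", "medical", "legal", "tech", "marketing", "academic"}
ALLOWED_FORMALITY = {"formal", "neutral", "casual"}
MAX_CHARS = 5000  # assumed limit; tune it to your max_tokens budget


def validate_payload(data):
    """Hypothetical helper: return (cleaned_fields, errors) for a translate request."""
    if not isinstance(data, dict):
        return {}, ["Request body must be a JSON object"]

    errors = []
    text = data.get("text")
    if not isinstance(text, str) or not text.strip():
        errors.append("text must be a non-empty string")
    elif len(text) > MAX_CHARS:
        errors.append(f"text exceeds {MAX_CHARS} characters")

    target = data.get("target_lang")
    if not isinstance(target, str) or not target.strip():
        errors.append("target_lang must be a non-empty string")

    domain = data.get("domain", "general")
    if not isinstance(domain, str) or domain not in ALLOWED_DOMAINS:
        errors.append(f"domain must be one of {sorted(ALLOWED_DOMAINS)}")

    formality = data.get("formality", "neutral")
    if not isinstance(formality, str) or formality not in ALLOWED_FORMALITY:
        errors.append(f"formality must be one of {sorted(ALLOWED_FORMALITY)}")

    if errors:
        return {}, errors

    # Keys match the keyword arguments of AITranslator.translate().
    cleaned = {
        "text": text,
        "target_lang": target.strip(),
        "source_lang": data.get("source_lang", "auto"),
        "context": data.get("context", ""),
        "domain": domain,
        "formality": formality,
    }
    return cleaned, []
```

Inside `api_translate()` this could be used as `fields, errors = validate_payload(request.get_json(silent=True))`, returning a 400 with the error list when it is non-empty and calling `translator.translate(**fields)` otherwise.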
Testing: Where AI Translation Wins
Idiom Handling
English: "Break a leg!"
Google Translate → Spanish: "¡Rómpete una pierna!" (literal - sounds violent)
AI Translator → Spanish: "¡Mucha mierda!" (actual Spanish theater idiom)
Note: "Spanish theater tradition uses 'mucha mierda'
as the equivalent good luck expression"
Domain-Specific Terminology
English (medical): "The patient presents with acute myocardial infarction
with ST-segment elevation in leads V1-V4."
Google Translate → Japanese: [correct medical terms but awkward phrasing]
AI Translator → Japanese: [correct terms with proper Japanese medical
register and standard clinical phrasing]
Note: "Used standard Japanese cardiology terminology
per JCS guidelines"
Formality Levels
English: "Could you send me the report?"
AI Translator → Japanese (formal):
"レポートをお送りいただけますでしょうか。"
(keigo / very polite business Japanese)
AI Translator → Japanese (casual):
"レポート送ってくれる?"
(casual friend-to-friend)
Google Translate → Japanese:
"レポートを送ってもらえますか?"
(one option, mid-formality, often inappropriate)
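To rerun comparisons like these after every prompt change, a tiny harness helps. The sketch below assumes nothing beyond a callable that accepts the same keyword arguments as `AITranslator.translate`. The names `run_cases` and `CASES` are hypothetical, and the case texts are the examples from this section.

```python
def run_cases(translate, cases):
    """Hypothetical harness: call `translate` once per case and collect rows."""
    rows = []
    for case in cases:
        result = translate(
            text=case["text"],
            target_lang=case["target_lang"],
            domain=case.get("domain", "general"),
            formality=case.get("formality", "neutral"),
        )
        rows.append({
            "label": case["label"],
            "translation": result.get("translation", ""),
            "notes": result.get("notes", []),
        })
    return rows


CASES = [
    {"label": "idiom", "text": "Break a leg!", "target_lang": "Spanish"},
    {"label": "formal", "text": "Could you send me the report?",
     "target_lang": "Japanese", "formality": "formal"},
    {"label": "casual", "text": "Could you send me the report?",
     "target_lang": "Japanese", "formality": "casual"},
]

# With a real key: rows = run_cases(AITranslator(api_key).translate, CASES)
```

Because the harness takes the translate function as a parameter, it can be exercised with a stub during development and pointed at the real engine only when you want to spend tokens.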
Cost Analysis
Per translation (average 200 words):
- Input: ~300 tokens
- Output: ~500 tokens
- Cost (Claude Sonnet): ~$0.009
Comparison:
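- Google Translate API: $20 per million characters (~$0.024 per translation, assuming 200 words is roughly 1,200 characters)
- DeepL API: $25 per million characters (~$0.03 per translation on the same assumption)
- This AI translator: ~$0.009 per translation
Per request that makes the LLM call the cheaper option at these list prices, and it also provides: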
- Context-aware translation
- Cultural adaptation
- Domain terminology
- Formality control
- Translation notes
- Multiple alternatives
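The per-translation figure for the LLM comes straight from the token counts above. Here is the arithmetic as a small sketch, assuming list prices of $3 per million input tokens and $15 per million output tokens for Claude Sonnet. Those two prices are assumptions that should be checked against the current pricing page before budgeting.

```python
def llm_cost_usd(input_tokens, output_tokens,
                 input_per_mtok=3.00, output_per_mtok=15.00):
    """Cost of one request at per-million-token prices (defaults are assumptions)."""
    return (input_tokens * input_per_mtok + output_tokens * output_per_mtok) / 1_000_000


per_request = llm_cost_usd(300, 500)
print(f"${per_request:.4f} per translation")                   # $0.0084 per translation
print(f"${per_request * 10_000:.2f} per 10,000 translations")  # $84.00 per 10,000 translations
```

Output tokens dominate the bill, so trimming the `alternatives` and `notes` fields is the first lever to pull if cost matters more than commentary.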
When to Use What
| Scenario | Best Tool |
|---|---|
| Quick understanding of foreign text | Google Translate (free, fast) |
| Bulk translation of simple content | DeepL API (cost-effective, good quality) |
| Business documents and contracts | This AI translator (context, formality) |
| Marketing copy localization | This AI translator (cultural adaptation) |
| Medical/legal translation | This AI translator (domain expertise) |
| Real-time conversation | Google Translate (speed) |
| Website localization (1000+ pages) | DeepL + human review (cost at scale) |
The AI translation app we built handles the cases where nuance matters — and those are exactly the cases where cheap, fast translation fails most dangerously. A mistranslated marketing slogan is embarrassing. A mistranslated medical instruction is dangerous. A mistranslated legal clause is expensive.
Build for the cases that matter. Leave the menu translations to Google.