Multilingual Support
News Agent supports articles in multiple languages, including Norwegian, English, and others.
Configuration
1. Add Norwegian RSS Sources
In config.yaml:
sources:
  rss:
    - name: "NRK Nyheter"
      url: "https://www.nrk.no/toppsaker.rss"
      category: "tech"
    - name: "Digi.no"
      url: "https://www.digi.no/rss"
      category: "tech"
    - name: "Kode24"
      url: "https://www.kode24.no/rss"
      category: "development"
2. Add Multilingual Interests
In config.yaml under ai.interests:
ai:
  interests:
    - "Technology news from Norway (Norwegian articles welcome)"
    - "Norwegian tech industry and startups"
    - "General news from Norway in Norwegian"
    - "AI and machine learning developments"
    - "Self-hosting solutions"
    # ... other interests
Key points:
- Be explicit: "Norwegian articles welcome" or "in Norwegian"
- Mention the country/region for local news
- Use natural language
3. Language Handling
The AI prompts now explicitly support multiple languages:
Filtering:
- Articles evaluated regardless of language
- Norwegian content given equal consideration
- Interest matching works across languages
Summarization:
- Summaries written in the SAME language as the source
- Norwegian articles → Norwegian summaries
- English articles → English summaries
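The prompt text itself is not shown in this document. As a rough illustration only, a summarization prompt that asks for the source language might be assembled like this. The helper `build_summary_prompt` and its wording are hypothetical and are not News Agent's actual prompt code.

```python
def build_summary_prompt(title: str, content: str) -> str:
    """Build a summarization prompt (hypothetical helper, not the project's code)."""
    instructions = (
        "Summarize the article below in 2-3 sentences. "
        "Write the summary in the SAME language as the article: "
        "Norwegian articles get Norwegian summaries, "
        "English articles get English summaries."
    )
    # Strip stray whitespace from the body so the prompt ends cleanly.
    return f"{instructions}\n\nTitle: {title}\n\n{content.strip()}"


prompt = build_summary_prompt(
    "Norsk AI-selskap lanserer ny tjeneste",
    "  Et norsk selskap lanserte i dag en ny tjeneste.  ",
)
print(prompt)
```

Stating the rule once with a concrete example for each language keeps the instruction unambiguous without listing every supported language.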
Tips for Better Norwegian Content
1. Be Specific with Interests
❌ Too vague:
- "News from Norway"
✅ Better:
- "Norwegian technology news (accept Norwegian language)"
- "Politikk og samfunn fra Norge"
- "Norsk tech-industri og oppstartselskaper"
2. Lower Filtering Threshold
Norwegian content might score slightly lower initially. Try:
ai:
  filtering:
    min_score: 5.0 # Lower threshold (was 5.5)
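To see why half a point can matter, here is a minimal sketch of threshold filtering. It assumes scores on a 0-10 scale and an inclusive `>=` comparison; the `ScoredArticle` class, the `filter_by_score` function, and the sample scores are illustrative and not taken from News Agent.

```python
from dataclasses import dataclass


@dataclass
class ScoredArticle:
    title: str
    relevance_score: float  # assumed to be on a 0-10 scale


def filter_by_score(articles, min_score):
    """Keep articles whose score is at or above the threshold (assumed inclusive)."""
    return [a for a in articles if a.relevance_score >= min_score]


# Made-up scores for illustration.
articles = [
    ScoredArticle("New Python framework released", 7.2),
    ScoredArticle("Norsk AI-selskap lanserer ny tjeneste", 5.2),
    ScoredArticle("Celebrity gossip roundup", 2.1),
]

kept_at_5_5 = filter_by_score(articles, 5.5)
kept_at_5_0 = filter_by_score(articles, 5.0)
print(len(kept_at_5_5), len(kept_at_5_0))
```

In this example the Norwegian article is dropped at 5.5 but kept at 5.0, while the clearly irrelevant item is rejected either way.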
3. Use Norwegian News Sources
Tech/Development:
- Digi.no: https://www.digi.no/rss
- Kode24: https://www.kode24.no/rss
- Tek.no: https://www.tek.no/rss
General News:
- NRK: https://www.nrk.no/toppsaker.rss
- VG: https://www.vg.no/rss/feed/
- Aftenposten: https://www.aftenposten.no/rss
Business/Tech:
- DN (Dagens Næringsliv): Check their RSS feeds
- E24: https://e24.no/rss
4. Mixed Language Email
Your email will contain both English and Norwegian articles:
TECH
----
• Norwegian startup raises $10M (English summary)
• Norsk AI-selskap lanserer ny tjeneste (Norwegian summary)
DEVELOPMENT
-----------
• New Python framework released (English summary)
• Kode24: Sånn bruker du GitHub Copilot (Norwegian summary)
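A layout like the one above can be produced by grouping titles under their category. The sketch below is only an illustration of that grouping; `render_digest` is a hypothetical function, and the real email template may differ.

```python
def render_digest(articles):
    """Render (category, title) pairs as plain-text sections, in first-seen order."""
    sections = {}
    for category, title in articles:
        sections.setdefault(category, []).append(title)

    lines = []
    for category, titles in sections.items():
        header = category.upper()
        lines.append(header)
        lines.append("-" * len(header))  # underline matches the header width
        lines.extend(f"• {title}" for title in titles)
    return "\n".join(lines)


digest = render_digest([
    ("tech", "Norwegian startup raises $10M"),
    ("development", "New Python framework released"),
    ("tech", "Norsk AI-selskap lanserer ny tjeneste"),
])
print(digest)
```

Because grouping is by category only, English and Norwegian articles from the same category end up side by side.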
Troubleshooting
Not Getting Norwegian Articles?
1. Check if articles are being fetched:
# Enable debug logging in config.yaml:
logging:
  level: "DEBUG"
# Then run the agent and search the log:
python -m src.main
grep -i "norwegian\|norge" data/logs/news-agent.log
2. Check filtering scores:
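# Match on the names of your Norwegian sources (here: the feeds configured above)
sqlite3 data/articles.db "SELECT title, relevance_score, source FROM articles WHERE source LIKE '%nrk%' OR source LIKE '%digi%' OR source LIKE '%kode24%' ORDER BY fetched_at DESC LIMIT 10;"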
3. Verify RSS feed works:
curl -s "https://www.digi.no/rss" | head -50
4. Manually check if feed has recent content: Visit the RSS URL in your browser
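If you would rather check a feed from Python than by eye, the sketch below parses RSS 2.0 with the standard library and lists each item's title and publication date. `SAMPLE_FEED` is made-up data and `list_items` is an illustrative helper, not part of News Agent.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document standing in for a downloaded feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Norsk AI-selskap lanserer ny tjeneste</title>
      <pubDate>Mon, 06 Jan 2025 08:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Sånn bruker du GitHub Copilot</title>
      <pubDate>Tue, 07 Jan 2025 09:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>
"""


def list_items(feed_xml):
    """Return (title, pubDate) for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("pubDate", default=""))
        for item in root.iter("item")
    ]


items = list_items(SAMPLE_FEED)
for title, published in items:
    print(published, "|", title)
```

For a live feed, the bytes returned by `urllib.request.urlopen(url).read()` can be passed to `list_items` in place of the sample string. An empty result or old dates suggest the feed itself is the problem rather than the filtering.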
Norwegian Articles Scoring Low?
Possible reasons:
- Interest not specific enough
  - Add: "Norwegian technology and business news in Norwegian language"
- Threshold too high
  - Lower to 4.5 or 5.0
- Content too general
  - Norwegian general news might not match "tech" interests
  - Add specific Norwegian interests
- Article content is short
  - Some RSS feeds only include headlines
  - AI can't judge relevance from title alone
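To find out whether short content is the cause, you can flag entries that carry little more than a headline. This is a minimal sketch; `looks_headline_only` and its `min_words` cutoff are arbitrary illustrative choices, not News Agent settings.

```python
def looks_headline_only(title, content, min_words=40):
    """Flag entries whose body is empty, repeats the title, or is very short.

    min_words is an arbitrary illustrative cutoff, not a News Agent setting.
    """
    body = (content or "").strip()
    if not body or body == title.strip():
        return True
    return len(body.split()) < min_words


entries = [
    ("Kort sak", ""),                          # no body at all
    ("Kort sak", "Kort sak"),                  # body just repeats the title
    ("Lang sak", "ord " * 50),                 # 50 words, enough to judge
]
flags = [looks_headline_only(title, content) for title, content in entries]
print(flags)
```

If most entries from a source are flagged, lowering the threshold is unlikely to help; a feed with fuller descriptions is the better fix.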
Mixed Results?
If you're getting English but not Norwegian:
- Check the interest phrasing:
  # Add to top of interests list:
  interests:
    - "Norwegian news and technology (Norwegian language accepted)"
    - "Norge: teknologi, samfunn, og næringsliv"
- Use a model that handles multiple languages better:
  ai:
    model: "anthropic/claude-3.5-haiku" # Better with multiple languages
- Test with debug mode:
  # Enable debug logging and run
  python -m src.main 2>&1 | grep -A 3 -B 3 "Norwegian\|Norge"
Example Configuration
Complete example supporting both English and Norwegian:
sources:
  rss:
    # English sources
    - name: "Hacker News"
      url: "https://news.ycombinator.com/rss"
      category: "tech"
    - name: "TechCrunch"
      url: "https://techcrunch.com/feed/"
      category: "tech"
    # Norwegian sources
    - name: "Digi.no"
      url: "https://www.digi.no/rss"
      category: "tech"
    - name: "Kode24"
      url: "https://www.kode24.no/rss"
      category: "development"
    - name: "NRK Nyheter"
      url: "https://www.nrk.no/toppsaker.rss"
      category: "tech"
ai:
  model: "openai/gpt-4o-mini"
  filtering:
    enabled: true
    min_score: 5.0 # Slightly lower for Norwegian content
    max_articles: 20 # More articles to ensure Norwegian included
  interests:
    # Norwegian-specific
    - "Norwegian technology news and developments (Norwegian language)"
    - "Norsk tech-industri, oppstartselskaper, og innovasjon"
    - "General news from Norway or about Norway"
    # General (works for both languages)
    - "AI and machine learning developments"
    - "Open source projects and tools"
    - "Self-hosting solutions"
    - "Python and software development"
Language Statistics
After running, check language distribution:
sqlite3 data/articles.db "
SELECT
  CASE
    WHEN source LIKE '%norsk%' OR source LIKE '%nrk%' OR source LIKE '%digi%' OR source LIKE '%kode%' THEN 'Norwegian'
    ELSE 'English'
  END as language,
  COUNT(*) as count,
  AVG(relevance_score) as avg_score
FROM articles
WHERE processed = 1
GROUP BY language;
"
Best Practices
- Be explicit - Tell AI that Norwegian is welcome
- Lower threshold - 5.0 instead of 5.5 or 6.5
- More articles - Increase max_articles to 20
- Specific interests - Mention Norwegian topics explicitly
- Good sources - Use active Norwegian tech/news RSS feeds
- Test first - Run manually with debug logging
Models and Multilingual Support
Most current models handle multiple languages well:
| Model | Norwegian Support | Recommendation |
|---|---|---|
| openai/gpt-4o-mini | Excellent | ✅ Recommended |
| anthropic/claude-3.5-haiku | Excellent | ✅ Best for multilingual |
| google/gemini-2.0-flash-exp:free | Good | ⚠️ Has rate limits |
Claude models are particularly good with Scandinavian languages.
Summary
To get Norwegian articles in your digest:
- ✅ Add Norwegian RSS sources
- ✅ Add explicit Norwegian interests ("Norwegian language accepted")
- ✅ Lower filtering threshold to 5.0
- ✅ Updated prompts (already done!)
- ✅ Test with: python -m src.main
The summaries will automatically be in the same language as the source article!