# News Agent

An AI-powered daily tech news aggregator that fetches articles from RSS feeds, filters them by relevance to your interests, generates AI summaries, and emails you a beautifully formatted digest every morning.

## Features

- **RSS Aggregation**: Fetches from 15+ tech news sources covering Development, Self-hosting, Enterprise Architecture, and Gadgets
- **AI Filtering**: Uses OpenRouter AI to score articles based on your interests (0-10 scale)
- **Smart Summarization**: Generates concise 2-3 sentence summaries of each relevant article
- **Beautiful Emails**: HTML email with responsive design, categorized sections, and relevance scores
- **Deduplication**: SQLite database prevents duplicate articles
- **Automated Scheduling**: Runs daily at 07:00 Europe/Oslo time via systemd timer
- **Production Ready**: Error handling, logging, resource limits, and monitoring

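
The deduplication feature can be pictured as a uniqueness constraint on the article URL. The following is a minimal sketch, not the project's actual storage code: the `articles` table and its `category` column appear in the statistics queries later in this README, but the `url` and `title` columns and both function names are assumptions made for illustration.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: the URL acts as the deduplication key.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL, category TEXT)"
    )


def insert_if_new(conn: sqlite3.Connection, url: str, title: str, category: str) -> bool:
    """Store the article and return True, or return False if the URL was already seen."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO articles (url, title, category) VALUES (?, ?, ?)",
        (url, title, category),
    )
    conn.commit()
    # rowcount is 1 when a row was inserted and 0 when the insert was ignored.
    return cur.rowcount == 1
```

With this approach, only articles for which `insert_if_new` returns `True` would need to be sent on to the AI filtering step, which keeps repeat runs cheap.
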
## Architecture

```
news-agent/
├── src/
│   ├── aggregator/     # RSS feed fetching
│   ├── ai/             # OpenRouter client, filtering, summarization
│   ├── storage/        # SQLite database operations
│   ├── email/          # Email generation and sending
│   └── main.py         # Main orchestrator
├── config.yaml         # Configuration
├── .env                # Secrets (API keys)
└── systemd/            # Service and timer files
```

## Prerequisites

- **Fedora Linux** (or other systemd-based distribution)
- **Python 3.11+**
- **SMTP Server** (your own mail server or service like Gmail, Outlook, etc.)
- **OpenRouter API Key** (get from https://openrouter.ai)

## Installation

### 1. Clone/Copy Project

```bash
# Copy this project to your home directory
mkdir -p ~/news-agent
cd ~/news-agent
```

### 2. Install Python and Dependencies

```bash
# Install Python 3.11+ if not already installed
sudo dnf install python3.11 python3-pip

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -e .
```

### 3. Configure News Agent

```bash
# Copy environment template
cp .env.example .env

# Edit .env and add your credentials
nano .env
```

**Required in `.env`:**
```bash
# OpenRouter API Key
OPENROUTER_API_KEY=sk-or-v1-...your-key-here...

# SMTP Credentials for your mail server
SMTP_USERNAME=your-email@yourdomain.com
SMTP_PASSWORD=your-smtp-password
```

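
A missing secret is easier to diagnose at startup than halfway through a run. The sketch below checks the three variables listed above. It assumes the values from `.env` have already been loaded into the process environment (how the project does that is not shown in this README), and the function name `missing_env_vars` is hypothetical.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("OPENROUTER_API_KEY", "SMTP_USERNAME", "SMTP_PASSWORD")


def missing_env_vars(environ=None) -> list[str]:
    """Return the required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit(f"Missing required settings: {', '.join(missing)}")
    print("All required settings are present.")
```
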
**Edit `config.yaml`:**
```bash
nano config.yaml
```

Update the email section:
```yaml
email:
  to: "your-email@example.com"        # Where to receive the digest
  from: "news-agent@yourdomain.com"   # Sender address
  smtp:
    host: "mail.yourdomain.com"       # Your mail server hostname
    port: 587                         # 587 for STARTTLS, 465 for implicit TLS (SSL)
    use_tls: true                     # true for port 587
    use_ssl: false                    # true for port 465
```

**Common SMTP Settings:**
- **Your own server**: Use your mail server hostname and credentials
- **Gmail**: `smtp.gmail.com:587`, use an App Password
- **Outlook/Office365**: `smtp.office365.com:587`
- **SendGrid**: `smtp.sendgrid.net:587`, use `apikey` as the username and your API key as the password

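
The `use_tls` and `use_ssl` flags map onto two different connection styles in Python's standard `smtplib`. The sketch below shows one way the two modes could be handled. It is not the project's actual sending code; `build_message` and `send` are hypothetical names, and the plain-text fallback line is invented for the example.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, html: str) -> EmailMessage:
    """Build an HTML email with a short plain-text fallback part."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("This digest is best viewed in an HTML-capable mail client.")
    msg.add_alternative(html, subtype="html")
    return msg


def send(msg, host, port, username, password, use_tls=True, use_ssl=False):
    """Send msg using implicit TLS (port 465) or STARTTLS (port 587)."""
    context = ssl.create_default_context()
    if use_ssl:
        # The connection is encrypted from the first byte.
        with smtplib.SMTP_SSL(host, port, context=context, timeout=30) as server:
            server.login(username, password)
            server.send_message(msg)
    else:
        with smtplib.SMTP(host, port, timeout=30) as server:
            if use_tls:
                # Upgrade the plain connection before sending credentials.
                server.starttls(context=context)
            server.login(username, password)
            server.send_message(msg)
```

Setting both flags to `true` is contradictory. In this sketch `use_ssl` simply takes precedence.
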
Optionally adjust:
- AI model (default: `google/gemini-flash-1.5-8b` - fast and cheap)
- Filtering threshold (default: 6.5/10)
- Max articles per digest (default: 15)
- RSS sources (add/remove feeds)
- Your interests for AI filtering

### 4. Test Run

```bash
# Activate virtual environment
source .venv/bin/activate

# Run manually to test
python -m src.main
```

Check:
- Console output for progress
- Logs in `data/logs/news-agent.log`
- Your email inbox for the digest

### 5. Set Up Systemd Timer

```bash
# Copy systemd files to user systemd directory
mkdir -p ~/.config/systemd/user
cp systemd/news-agent.service ~/.config/systemd/user/
cp systemd/news-agent.timer ~/.config/systemd/user/

# Edit service file to update paths if needed
nano ~/.config/systemd/user/news-agent.service

# Reload systemd
systemctl --user daemon-reload

# Enable and start timer
systemctl --user enable news-agent.timer
systemctl --user start news-agent.timer

# Check timer status
systemctl --user list-timers
systemctl --user status news-agent.timer
```

**Enable lingering** (allows user services to run when not logged in):
```bash
sudo loginctl enable-linger $USER
```

## Usage

### Manual Run

```bash
cd ~/news-agent
source .venv/bin/activate
python -m src.main
```

### Check Status

```bash
# Check timer status
systemctl --user status news-agent.timer

# View logs
journalctl --user -u news-agent.service -f

# Or check log file
tail -f data/logs/news-agent.log
```

### Trigger Manually

```bash
# Run service immediately (without waiting for timer)
systemctl --user start news-agent.service
```

### View Last Run

```bash
systemctl --user status news-agent.service
```

## Configuration

### RSS Sources

Add or remove sources in `config.yaml`:

```yaml
sources:
  rss:
    - name: "Your Source"
      url: "https://example.com/feed.xml"
      category: "tech"  # tech, development, selfhosting, architecture, gadgets
```

### AI Configuration

**Models** (from cheap to expensive):
- `google/gemini-flash-1.5-8b` - Fast, cheap, good quality (recommended)
- `google/gemini-2.0-flash-exp:free` - Free, experimental (good for testing)
- `meta-llama/llama-3.1-8b-instruct:free` - Free, decent quality
- `anthropic/claude-3.5-haiku` - Better quality, slightly more expensive
- `openai/gpt-4o-mini` - Good quality, moderate price

See **[MODELS.md](MODELS.md)** for detailed comparison and selection guide.
Full list at: https://openrouter.ai/models

**Filtering:**
```yaml
ai:
  filtering:
    enabled: true
    min_score: 6.5    # Articles below this score are filtered out
    max_articles: 15  # Maximum articles in daily digest
```

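
To make the interaction between `min_score` and `max_articles` concrete, here is a minimal sketch of the selection step. It assumes articles at exactly `min_score` are kept and that the highest-scoring articles win when more than `max_articles` pass the threshold. The project's real ordering rule is not documented here, and `ScoredArticle` and `select_for_digest` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ScoredArticle:
    title: str
    score: float  # relevance score on the 0-10 scale


def select_for_digest(articles, min_score=6.5, max_articles=15):
    """Drop articles below min_score, then keep the top max_articles by score."""
    kept = [a for a in articles if a.score >= min_score]
    kept.sort(key=lambda a: a.score, reverse=True)
    return kept[:max_articles]


if __name__ == "__main__":
    sample = [
        ScoredArticle("Rust 2.0 rumours", 9.0),
        ScoredArticle("Celebrity gadget review", 3.2),
        ScoredArticle("Self-hosted mail guide", 7.1),
    ]
    for article in select_for_digest(sample):
        print(f"{article.score:>4.1f}  {article.title}")
```
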
**Interests:**
```yaml
ai:
  interests:
    - "Your interest here"
    - "Another topic"
```

### Schedule

Change the time in `~/.config/systemd/user/news-agent.timer`. systemd does not support trailing comments on a setting line, so keep any comment on its own line:

```ini
[Timer]
# 24-hour format
OnCalendar=07:00
```

Then reload:
```bash
systemctl --user daemon-reload
systemctl --user restart news-agent.timer
```

## Troubleshooting

### No Email Received

1. **Check logs:**
   ```bash
   journalctl --user -u news-agent.service -n 50
   ```

2. **Check Postfix** (only relevant if you relay mail through a local Postfix server):
   ```bash
   sudo systemctl status postfix
   sudo tail -f /var/log/maillog
   ```

3. **Test email manually:**
   ```bash
   echo "Test email" | mail -s "Test" your-email@example.com
   ```

### API Errors

**Error: "No endpoints found for [model]"**
- The model name is incorrect
- Check correct model names in `MODELS.md`
- Update `config.yaml` with correct model ID

**Other API errors:**
1. **Verify API key in `.env`**
2. **Check OpenRouter credit balance:** https://openrouter.ai/credits
3. **Check rate limits in logs**
4. **Try a different model** (see `MODELS.md`)

### Service Not Running

```bash
# Check service status
systemctl --user status news-agent.service

# Check timer status
systemctl --user status news-agent.timer

# View detailed logs
journalctl --user -xe -u news-agent.service
```

### Database Issues

```bash
# Reset database (WARNING: deletes all history)
rm data/articles.db
python -m src.main
```

## Cost Estimation

Using `google/gemini-flash-1.5-8b` (recommended):

- **Daily:** ~$0.05-0.15 (varies by article count)
- **Monthly:** ~$1.50-4.50
- **Yearly:** ~$18-54

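
The monthly and yearly figures follow directly from the daily range. The short sketch below reproduces the arithmetic, assuming a 30-day month and a 365-day year; `project_cost` is a hypothetical helper, not part of the project.

```python
def project_cost(daily_low: float, daily_high: float, days: int) -> tuple[float, float]:
    """Scale a daily cost range (in USD) to a period of the given number of days."""
    return (daily_low * days, daily_high * days)


monthly = project_cost(0.05, 0.15, 30)
yearly = project_cost(0.05, 0.15, 365)

print(f"Monthly: ${monthly[0]:.2f}-{monthly[1]:.2f}")  # Monthly: $1.50-4.50
print(f"Yearly:  ${yearly[0]:.2f}-{yearly[1]:.2f}")    # Yearly:  $18.25-54.75
```

The yearly result of $18.25-54.75 is what the list above rounds to "~$18-54".
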
**Free Options:**
- `google/gemini-2.0-flash-exp:free` - Completely free (experimental)
- `meta-llama/llama-3.1-8b-instruct:free` - Completely free

Factors affecting cost:
- Number of new articles
- Content length
- Filtering threshold (lower = more articles = higher cost)

## Maintenance

### Update Dependencies

```bash
cd ~/news-agent
source .venv/bin/activate
pip install --upgrade -e .
```

### View Statistics

```bash
# Check database
sqlite3 data/articles.db "SELECT COUNT(*) FROM articles;"
sqlite3 data/articles.db "SELECT category, COUNT(*) FROM articles GROUP BY category;"
```

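
If the `sqlite3` command-line tool is not installed, the same per-category count can be run from Python's standard library. This is a sketch that reuses the query shown above; the function name `category_counts` is hypothetical, and it assumes `data/articles.db` exists when run as a script.

```python
import sqlite3
from contextlib import closing


def category_counts(conn: sqlite3.Connection) -> dict:
    """Return a mapping of category to article count."""
    rows = conn.execute(
        "SELECT category, COUNT(*) FROM articles GROUP BY category ORDER BY category"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    # Run from the project directory so the relative path resolves.
    with closing(sqlite3.connect("data/articles.db")) as conn:
        for category, count in category_counts(conn).items():
            print(f"{category}: {count}")
```
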
### Log Rotation

Logs automatically rotate at 10 MB with 5 backups (configured in `config.yaml`).

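
For reference, this rotation policy corresponds to Python's standard `RotatingFileHandler`. The sketch below shows an equivalent setup. It assumes 10 MB means 10 × 1024 × 1024 bytes and uses a hypothetical helper name and log format; the project's real logging setup reads its values from `config.yaml` and may differ.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def make_rotating_handler(
    log_path: str = "data/logs/news-agent.log",
    max_bytes: int = 10 * 1024 * 1024,  # assumed meaning of "10 MB"
    backup_count: int = 5,
) -> RotatingFileHandler:
    """Create a handler that rolls the log over at max_bytes, keeping backup_count old files."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count, encoding="utf-8"
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    return handler


if __name__ == "__main__":
    logger = logging.getLogger("news-agent")
    logger.setLevel(logging.INFO)
    logger.addHandler(make_rotating_handler())
    logger.info("Logging initialised")
```

With five backups, the log directory holds at most six files: the active log plus `news-agent.log.1` through `news-agent.log.5`.
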
## Advanced Features

### Add API News Sources

Extend `src/aggregator/api_fetcher.py` to support NewsAPI, Google News API, etc.

### Customize Email Template

Edit `src/email/templates/daily_digest.html` for different styling.

### Web Dashboard

Add Flask/FastAPI to create a web interface for viewing past digests.

## Contributing

This is a personal project template. Feel free to fork and customize to your needs.

## License

MIT License - Free to use and modify

## Support

For issues with:
- **OpenRouter:** https://openrouter.ai/docs
- **Postfix:** Fedora documentation
- **This code:** Check logs and configuration

## Credits

Built with:
- Python 3.11+
- OpenRouter AI (https://openrouter.ai)
- Feedparser, Jinja2, Pydantic, and other open-source libraries