diff --git a/COST_TRACKING.md b/COST_TRACKING.md
new file mode 100644
index 0000000..8b0b8f9
--- /dev/null
+++ b/COST_TRACKING.md
@@ -0,0 +1,258 @@
+# Cost Tracking
+
+News Agent automatically tracks and displays the cost of each run in your daily email digest.
+
+## Features
+
+- **Per-Run Cost**: Shows exactly how much each digest cost to generate
+- **Cumulative Total**: Tracks total spending across all runs
+- **Database Storage**: All cost data saved in SQLite database
+- **Email Display**: Costs shown at the bottom of every digest email
+
+## How It Works
+
+1. **OpenRouter API Returns Costs**: Each API call includes cost information
+2. **Client Tracks Session Cost**: Accumulates cost during the run
+3. **Database Stores Run Data**: Saves cost along with article counts
+4. **Email Shows Costs**: Displays both session and cumulative costs
+
+## Email Display
+
+At the bottom of each digest email, you'll see:
+
+```
+💰 Cost Information
+This digest: $0.0234 | Total spent: $1.2456
+```
+
+- **This digest**: Cost for generating this specific email (filtering + summarization)
+- **Total spent**: Cumulative cost across all runs since you started using the system
+
+## View Cost Statistics
+
+Run the cost viewer script:
+
+```bash
+cd ~/news-agent
+source .venv/bin/activate
+python -m src.view_costs
+```
+
+**Example output:**
+```
+============================================================
+News Agent Cost Statistics
+============================================================
+
+💰 Total Cumulative Cost: $1.2456
+
+Recent Runs (last 20):
+------------------------------------------------------------
+Date         Articles  Included  Cost
+------------------------------------------------------------
+2026-01-26   152       15        $0.0234
+2026-01-25   143       12        $0.0198
+2026-01-24   167       15        $0.0245
+...
+------------------------------------------------------------
+
+Averages (last 20 runs):
+  Cost per run: $0.0226
+  Articles per digest: 14.2
+  Cost per article: $0.0016
+
+============================================================
+```
+
+## Cost Breakdown
+
+### Typical Costs (with `openai/gpt-4o-mini`)
+
+| Operation | Articles | Cost |
+|-----------|----------|------|
+| Filtering | 150 articles | ~$0.015-0.020 |
+| Summarization | 15 articles | ~$0.005-0.008 |
+| **Total per run** | - | **~$0.020-0.028** |
+
+### Monthly Estimates
+
+| Frequency | Cost/Run | Monthly Cost |
+|-----------|----------|--------------|
+| Daily | $0.025 | ~$0.75/month |
+| Daily | $0.030 | ~$0.90/month |
+
+**Note:** Actual costs vary based on:
+- Number of articles fetched
+- Content length
+- Model used
+- Number of articles passing filter
+
+## Cost Optimization Tips
+
+### 1. Adjust Filtering Threshold
+
+Higher threshold = fewer articles = lower cost:
+
+```yaml
+ai:
+  filtering:
+    min_score: 7.0     # Stricter (was 5.5)
+    max_articles: 10   # Fewer articles (was 15)
+```
+
+### 2. Use Cheaper Models
+
+```yaml
+ai:
+  model: "google/gemini-2.0-flash-exp:free"  # FREE (has rate limits)
+  # OR (keep only one `model` key active):
+  # model: "openai/gpt-4o-mini"              # Cheap and reliable
+```
+
+### 3. Reduce RSS Sources
+
+Fewer sources = fewer articles = lower cost:
+
+```yaml
+sources:
+  rss:
+    # Comment out sources you don't need
+```
+
+### 4.
Track and Set Budget Alerts
+
+Monitor costs and adjust:
+
+```bash
+# View costs regularly
+python -m src.view_costs
+
+# If costs are too high, adjust config.yaml settings
+```
+
+## Database Schema
+
+The `runs` table stores:
+
+```sql
+CREATE TABLE runs (
+    id INTEGER PRIMARY KEY,
+    run_date TEXT NOT NULL,
+    articles_fetched INTEGER NOT NULL,
+    articles_processed INTEGER NOT NULL,
+    articles_included INTEGER NOT NULL,
+    total_cost REAL NOT NULL,
+    filtering_cost REAL DEFAULT 0,
+    summarization_cost REAL DEFAULT 0,
+    created_at TEXT NOT NULL
+);
+```
+
+## Query Costs Manually
+
+Using SQLite:
+
+```sql
+-- Run these inside the shell opened by: sqlite3 data/articles.db
+
+-- Total cost
+SELECT SUM(total_cost) FROM runs;
+
+-- Cost by date
+SELECT run_date, total_cost FROM runs ORDER BY run_date DESC;
+
+-- Average cost
+SELECT AVG(total_cost) FROM runs;
+
+-- Cost this month
+SELECT SUM(total_cost) FROM runs
+WHERE run_date >= date('now', 'start of month');
+```
+
+## Troubleshooting
+
+### Cost Shows $0.0000
+
+**Possible causes:**
+
+1. **Using free model** - Free models may not report costs
+2. **API response format** - OpenRouter might not include cost field
+3.
**First run** - Database just initialized
+
+**Check:**
+```bash
+# View logs for cost tracking
+grep -i "cost" data/logs/news-agent.log
+
+# Check if runs are being saved
+sqlite3 data/articles.db "SELECT * FROM runs ORDER BY created_at DESC LIMIT 5;"
+```
+
+### Cost Seems Wrong
+
+**Verify model pricing:**
+- Check https://openrouter.ai/models for current pricing
+- Different models have different costs
+- Costs change over time
+
+**Check API responses:**
+```bash
+# Enable debug logging
+# Edit config.yaml so that it contains:
+#   logging:
+#     level: "DEBUG"
+
+# Run and check logs
+python -m src.main
+grep "cost" data/logs/news-agent.log
+```
+
+## Export Cost Data
+
+Export to CSV:
+
+```bash
+sqlite3 -header -csv data/articles.db \
+  "SELECT run_date, articles_processed, articles_included, total_cost
+   FROM runs ORDER BY run_date" > costs.csv
+```
+
+Import into Excel/Sheets for analysis.
+
+## Cost Projections
+
+Based on current usage, project future costs:
+
+```bash
+# Get average cost
+python -m src.view_costs
+
+# Multiply by days
+# If avg = $0.025/day:
+# - Weekly: $0.025 × 7 = $0.175
+# - Monthly: $0.025 × 30 = $0.75
+# - Yearly: $0.025 × 365 = $9.13
+```
+
+## Privacy Note
+
+All cost data is stored **locally** in your SQLite database (`data/articles.db`). Cost information appears in only two places:
+- Your digest email (which you control)
+- Your local database
+
+## Future Enhancements
+
+Potential improvements:
+
+1. **Split costs** - Separate filtering vs summarization costs
+2. **Budget alerts** - Email warning if cost exceeds threshold
+3. **Cost predictions** - Estimate next run cost
+4. **Web dashboard** - Visualize cost trends over time
+5. **Cost per category** - Track which categories cost most
+
+## Need Help?
+
+- View current costs: `python -m src.view_costs`
+- Check database: `sqlite3 data/articles.db "SELECT * FROM runs;"`
+- Review logs: `tail -f data/logs/news-agent.log`
+- OpenRouter pricing: https://openrouter.ai/models
diff --git a/COST_TRACKING_SUMMARY.md b/COST_TRACKING_SUMMARY.md
new file mode 100644
index 0000000..e29dda4
--- /dev/null
+++ b/COST_TRACKING_SUMMARY.md
@@ -0,0 +1,247 @@
+# Cost Tracking Implementation Summary
+
+## ✅ What Was Added
+
+### 1. Database Schema
+- New `runs` table tracks each execution
+- Fields: articles_fetched, articles_processed, articles_included, costs, timestamps
+- Indexes for efficient queries
+
+### 2. Cost Extraction
+- OpenRouter API responses include cost in `usage.cost` field
+- AI client accumulates costs during session
+- Automatic tracking without manual intervention
+
+### 3. Database Methods
+- `save_run()` - Store run statistics and costs
+- `get_total_cost()` - Calculate cumulative spending
+- `get_run_stats()` - Retrieve recent run history
+
+### 4. Email Display
+- **HTML Email**: Nicely formatted cost box at bottom
+- **Plain Text**: Cost information in footer
+- Shows both session and cumulative costs
+
+### 5. Cost Viewer Script
+- `python -m src.view_costs` - View statistics
+- Shows recent runs, averages, totals
+- Export capability for analysis
+
+## 📊 What You'll See
+
+### In Your Email
+Bottom of every digest:
+
+```
+💰 Cost Information
+This digest: $0.0234 | Total spent: $1.2456
+```
+
+### In Cost Viewer
+```bash
+python -m src.view_costs
+```
+
+Output:
+```
+============================================================
+News Agent Cost Statistics
+============================================================
+
+💰 Total Cumulative Cost: $1.2456
+
+Recent Runs (last 20):
+------------------------------------------------------------
+Date         Articles  Included  Cost
+------------------------------------------------------------
+2026-01-26   152       15        $0.0234
+2026-01-25   143       12        $0.0198
+...
+
+Averages (last 20 runs):
+  Cost per run: $0.0226
+  Articles per digest: 14.2
+  Cost per article: $0.0016
+```
+
+## 🔍 How It Works
+
+1. **During Filtering**: AI scores each article
+   - OpenRouter returns cost per API call
+   - Client tracks: `self.total_cost += cost`
+
+2. **During Summarization**: AI summarizes selected articles
+   - More API calls, more cost
+   - Accumulated in same session
+
+3. **After Processing**: Save to database
+   ```python
+   await db.save_run(
+       articles_fetched=152,
+       articles_processed=25,
+       articles_included=15,
+       total_cost=0.0234,
+   )
+   ```
+
+4. **Before Email**: Calculate totals
+   ```python
+   session_cost = ai_client.get_session_cost()  # This run
+   cumulative_cost = await db.get_total_cost()  # All time
+   ```
+
+5. **In Email**: Display both values
+   - Session cost: Just this digest
+   - Cumulative cost: Total since start
+
+## 💰 Expected Costs
+
+### With `openai/gpt-4o-mini`
+
+| Scenario | Cost/Run | Monthly | Yearly |
+|----------|----------|---------|--------|
+| 150 articles, 15 selected | $0.02-0.03 | $0.60-0.90 | $7-11 |
+| 200 articles, 15 selected | $0.03-0.04 | $0.90-1.20 | $11-15 |
+| 100 articles, 10 selected | $0.01-0.02 | $0.30-0.60 | $4-7 |
+
+**Note:** Costs vary based on article length and model used.
+
+## 📁 Files Modified
+
+### Core Tracking
+- `src/storage/database.py` - Added runs table and cost methods
+- `src/ai/client.py` - Track costs from API responses
+- `src/main.py` - Save costs and pass to email generator
+
+### Email Display
+- `src/email/generator.py` - Accept cost parameters
+- `src/email/templates/daily_digest.html` - Display costs nicely
+
+### Utilities
+- `src/view_costs.py` - NEW: Cost statistics viewer
+- `COST_TRACKING.md` - NEW: Complete documentation
+
+## 🎯 Benefits
+
+1. **Transparency**: Know exactly what you're spending
+2. **Budgeting**: Track costs over time
+3. **Optimization**: Identify expensive runs
+4. **Accountability**: See whether configuration changes save or cost money
+5.
**Planning**: Estimate future costs accurately
+
+## 🔧 Usage
+
+### View Costs Anytime
+```bash
+python -m src.view_costs
+```
+
+### Query Database
+```sql
+-- Run these inside the shell opened by: sqlite3 data/articles.db
+
+-- Total spending
+SELECT SUM(total_cost) FROM runs;
+
+-- Last 10 runs
+SELECT run_date, articles_included, total_cost
+FROM runs ORDER BY run_date DESC LIMIT 10;
+
+-- This month
+SELECT SUM(total_cost) FROM runs
+WHERE run_date >= date('now', 'start of month');
+```
+
+### Export to CSV
+```bash
+sqlite3 -header -csv data/articles.db \
+  "SELECT * FROM runs ORDER BY run_date" > costs.csv
+```
+
+## ⚙️ Cost Optimization
+
+### Reduce Costs by:
+
+1. **Higher filter threshold** (fewer articles):
+   ```yaml
+   ai:
+     filtering:
+       min_score: 7.0  # Was 5.5
+   ```
+
+2. **Fewer max articles**:
+   ```yaml
+   ai:
+     filtering:
+       max_articles: 10  # Was 15
+   ```
+
+3. **Cheaper model**:
+   ```yaml
+   ai:
+     model: "google/gemini-2.0-flash-exp:free"  # FREE
+   ```
+
+4. **Fewer RSS sources** (less to process)
+
+## 🐛 Troubleshooting
+
+### Cost Shows $0.0000
+
+**Likely causes:**
+- Using a free model (may not report costs)
+- OpenRouter API response doesn't include cost field
+- First run (no history yet)
+
+**Check logs:**
+```bash
+grep -i "cost" data/logs/news-agent.log
+```
+
+### Cost Seems High
+
+**Review:**
+1. Number of articles being processed
+2. Model being used (check pricing at openrouter.ai)
+3. Article content length
+4. Recent runs: `python -m src.view_costs`
+
+## 📝 Notes
+
+- Costs are stored in US dollars (1 OpenRouter credit = $1)
+- The database is never deleted automatically (only if you delete it manually)
+- Cumulative cost includes ALL runs since database creation
+- Costs shown with 4 decimal places ($0.0234)
+- Free models may show $0.00 (they are rate limited rather than billed)
+
+## 🚀 Future Enhancements
+
+Possible additions:
+
+1. **Budget alerts** - Email if cost exceeds threshold
+2. **Cost breakdown** - Separate filtering vs summarization
+3. **Cost predictions** - Estimate before running
+4.
**Monthly reports** - Summary email at month end +5. **Cost per category** - Track which topics cost most +6. **Web dashboard** - Visualize trends + +## 📚 Documentation + +See full details in: +- **COST_TRACKING.md** - Complete documentation +- **README.md** - Updated with cost tracking feature +- **src/view_costs.py** - Source code for viewer + +## ✨ Example Output + +The cost information appears at the bottom of every email: + +``` +┌─────────────────────────────────────┐ +│ 💰 Cost Information │ +│ This digest: $0.0234 │ +│ Total spent: $1.2456 │ +└─────────────────────────────────────┘ +``` + +Clean, simple, and informative! diff --git a/README.md b/README.md index 5cd15c1..9230b43 100644 --- a/README.md +++ b/README.md @@ -8,6 +8,7 @@ An AI-powered daily tech news aggregator that fetches articles from RSS feeds, f - **AI Filtering**: Uses OpenRouter AI to score articles based on your interests (0-10 scale) - **Smart Summarization**: Generates concise 2-3 sentence summaries of each relevant article - **Beautiful Emails**: HTML email with responsive design, categorized sections, and relevance scores +- **Cost Tracking**: Automatic tracking and display of per-run and cumulative costs in emails - **Deduplication**: SQLite database prevents duplicate articles - **Automated Scheduling**: Runs daily at 07:00 Europe/Oslo time via systemd timer - **Production Ready**: Error handling, logging, resource limits, and monitoring diff --git a/src/ai/client.py b/src/ai/client.py index c7168f4..fec4c44 100644 --- a/src/ai/client.py +++ b/src/ai/client.py @@ -28,6 +28,7 @@ class OpenRouterClient: ) self.model = config.ai.model + self.total_cost = 0.0 # Track cumulative cost for this session logger.debug(f"Initialized OpenRouter client with model: {self.model}") async def chat_completion( @@ -73,7 +74,7 @@ class OpenRouterClient: if not content: raise ValueError("Empty response from API") - # Log token usage + # Track cost from OpenRouter response if response.usage: logger.debug( 
f"Tokens used - Prompt: {response.usage.prompt_tokens}, " @@ -81,6 +82,15 @@ class OpenRouterClient: f"Total: {response.usage.total_tokens}" ) + # OpenRouter returns cost in credits (1 credit = $1) + # The usage object has a 'cost' field in newer responses + if hasattr(response.usage, "cost") and response.usage.cost is not None: + cost = float(response.usage.cost) + self.total_cost += cost + logger.debug( + f"Request cost: ${cost:.6f}, Session total: ${self.total_cost:.6f}" + ) + return content except Exception as e: @@ -115,3 +125,11 @@ class OpenRouterClient: except json.JSONDecodeError as e: logger.error(f"Failed to parse JSON response: {content}") raise ValueError(f"Invalid JSON response: {e}") + + def get_session_cost(self) -> float: + """Get total cost accumulated during this session""" + return self.total_cost + + def reset_session_cost(self): + """Reset session cost counter""" + self.total_cost = 0.0 diff --git a/src/email/generator.py b/src/email/generator.py index 6b386f8..9ad3925 100644 --- a/src/email/generator.py +++ b/src/email/generator.py @@ -22,7 +22,12 @@ class EmailGenerator: self.env = Environment(loader=FileSystemLoader(template_dir)) def generate_digest_email( - self, entries: list[DigestEntry], date_str: str, subject: str + self, + entries: list[DigestEntry], + date_str: str, + subject: str, + session_cost: float = 0.0, + cumulative_cost: float = 0.0, ) -> tuple[str, str]: """ Generate HTML email for daily digest @@ -31,6 +36,8 @@ class EmailGenerator: entries: List of digest entries (articles with summaries) date_str: Date string for the digest subject: Email subject line + session_cost: Cost for this digest generation + cumulative_cost: Total cost across all runs Returns: Tuple of (html_content, text_content) @@ -54,6 +61,8 @@ class EmailGenerator: "total_sources": unique_sources, "total_categories": len(sorted_categories), "articles_by_category": {cat: articles_by_category[cat] for cat in sorted_categories}, + "session_cost": session_cost, 
+ "cumulative_cost": cumulative_cost, } # Render HTML template @@ -64,14 +73,21 @@ class EmailGenerator: html_inlined = transform(html) # Generate plain text version - text = self._generate_text_version(entries, date_str, subject) + text = self._generate_text_version( + entries, date_str, subject, session_cost, cumulative_cost + ) logger.debug(f"Generated email with {len(entries)} articles") return html_inlined, text def _generate_text_version( - self, entries: list[DigestEntry], date_str: str, subject: str + self, + entries: list[DigestEntry], + date_str: str, + subject: str, + session_cost: float = 0.0, + cumulative_cost: float = 0.0, ) -> str: """Generate plain text version of email""" lines = [ @@ -110,5 +126,9 @@ class EmailGenerator: lines.append("") lines.append("---") lines.append("Generated by News Agent | Powered by OpenRouter AI") + lines.append("") + lines.append("COST INFORMATION") + lines.append(f"This digest: ${session_cost:.4f}") + lines.append(f"Total spent: ${cumulative_cost:.4f}") return "\n".join(lines) diff --git a/src/email/templates/daily_digest.html b/src/email/templates/daily_digest.html index 5cb21f6..33405a6 100644 --- a/src/email/templates/daily_digest.html +++ b/src/email/templates/daily_digest.html @@ -188,6 +188,15 @@