Cost control

2026-01-27 09:25:11 +01:00
parent 37eb03583c
commit 0ed89a7045
9 changed files with 733 additions and 6 deletions

COST_TRACKING.md Normal file

@@ -0,0 +1,258 @@
# Cost Tracking
News Agent automatically tracks and displays the cost of each run in your daily email digest.
## Features
- **Per-Run Cost**: Shows exactly how much each digest cost to generate
- **Cumulative Total**: Tracks total spending across all runs
- **Database Storage**: All cost data saved in SQLite database
- **Email Display**: Costs shown at the bottom of every digest email
## How It Works
1. **OpenRouter API Returns Costs**: Each API response includes cost information
2. **Client Tracks Session Cost**: Accumulates cost during the run
3. **Database Stores Run Data**: Saves cost along with article counts
4. **Email Shows Costs**: Displays both session and cumulative costs
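The accumulation in step 2 can be sketched as follows. This is a minimal illustration, not the project's actual client: the class name `SessionCostTracker` is hypothetical, and the location of the cost at `response["usage"]["cost"]` is an assumption about the response shape.

```python
from dataclasses import dataclass


@dataclass
class SessionCostTracker:
    """Accumulates the cost of every API call made during one run (hypothetical helper)."""

    session_cost: float = 0.0
    calls: int = 0

    def record(self, response: dict) -> float:
        """Add the cost reported in one API response and return it.

        Assumption: the cost lives at response["usage"]["cost"], in US dollars.
        A missing field counts as 0.0 instead of raising.
        """
        usage = response.get("usage") or {}
        cost = float(usage.get("cost") or 0.0)
        self.session_cost += cost
        self.calls += 1
        return cost


tracker = SessionCostTracker()
tracker.record({"usage": {"cost": 0.0181}})  # e.g. a filtering call
tracker.record({"usage": {"cost": 0.0053}})  # e.g. a summarization call
tracker.record({"choices": []})              # response without a cost field
print(f"${tracker.session_cost:.4f} over {tracker.calls} calls")
```

Treating a missing field as zero keeps a run from failing on cost bookkeeping, at the price of silently under-reporting when the field is absent.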
## Email Display
At the bottom of each digest email, you'll see:
```
💰 Cost Information
This digest: $0.0234 | Total spent: $1.2456
```
- **This digest**: Cost for generating this specific email (filtering + summarization)
- **Total spent**: Cumulative cost across all runs since you started using the system
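As a rough sketch, the footer above could be rendered from the two numbers like this. The function name `format_cost_footer` is hypothetical; only the output format is taken from the example.

```python
def format_cost_footer(session_cost: float, total_cost: float) -> str:
    """Render the two-line cost footer with four decimal places (hypothetical helper)."""
    return (
        "💰 Cost Information\n"
        f"This digest: ${session_cost:.4f} | Total spent: ${total_cost:.4f}"
    )


print(format_cost_footer(0.0234, 1.2456))
```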
## View Cost Statistics
Run the cost viewer script:
```bash
cd ~/news-agent
source .venv/bin/activate
python -m src.view_costs
```
**Example output:**
```
============================================================
News Agent Cost Statistics
============================================================
💰 Total Cumulative Cost: $1.2456
Recent Runs (last 20):
------------------------------------------------------------
Date         Articles  Included  Cost
------------------------------------------------------------
2026-01-26   152       15        $0.0234
2026-01-25   143       12        $0.0198
2026-01-24   167       15        $0.0245
...
------------------------------------------------------------
Averages (last 20 runs):
Cost per run: $0.0226
Articles per digest: 14.2
Cost per article: $0.0016
============================================================
```
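The averages at the bottom of that output can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: `summarize_runs` is a hypothetical name, and it assumes "cost per article" means total cost divided by the number of included articles, which matches the sample figures.

```python
def summarize_runs(runs: list[tuple[str, int, int, float]]) -> dict[str, float]:
    """Compute averages over runs given as (date, fetched, included, cost)."""
    if not runs:
        return {"cost_per_run": 0.0, "articles_per_digest": 0.0, "cost_per_article": 0.0}
    total_cost = sum(cost for _, _, _, cost in runs)
    total_included = sum(included for _, _, included, _ in runs)
    return {
        "cost_per_run": total_cost / len(runs),
        "articles_per_digest": total_included / len(runs),
        # Assumption: averaged over included articles, not fetched ones.
        "cost_per_article": total_cost / total_included if total_included else 0.0,
    }


runs = [
    ("2026-01-26", 152, 15, 0.0234),
    ("2026-01-25", 143, 12, 0.0198),
    ("2026-01-24", 167, 15, 0.0245),
]
stats = summarize_runs(runs)
print(f"Cost per run: ${stats['cost_per_run']:.4f}")
print(f"Articles per digest: {stats['articles_per_digest']:.1f}")
print(f"Cost per article: ${stats['cost_per_article']:.4f}")
```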
## Cost Breakdown
### Typical Costs (with `openai/gpt-4o-mini`)
| Operation | Articles | Cost |
|-----------|----------|------|
| Filtering | 150 articles | ~$0.015-0.020 |
| Summarization | 15 articles | ~$0.005-0.008 |
| **Total per run** | - | **~$0.020-0.028** |
### Monthly Estimates
| Frequency | Cost/Run | Monthly Cost |
|-----------|----------|--------------|
| Daily | $0.025 | ~$0.75/month |
| Daily | $0.030 | ~$0.90/month |
**Note:** Actual costs vary based on:
- Number of articles fetched
- Content length
- Model used
- Number of articles passing filter
## Cost Optimization Tips
### 1. Adjust Filtering Threshold
Higher threshold = fewer articles = lower cost:
```yaml
ai:
  filtering:
    min_score: 7.0     # Stricter (was 5.5)
    max_articles: 10   # Fewer articles (was 15)
```
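To see why these two settings lower cost, consider how they might shrink the set of articles sent on to summarization. The sketch below is one plausible reading of `min_score` and `max_articles`, not the project's actual filter; `select_articles` and the sample scores are made up for illustration.

```python
def select_articles(
    scored: list[tuple[str, float]], min_score: float, max_articles: int
) -> list[str]:
    """Keep titles scoring at least min_score, best first, capped at max_articles."""
    passing = [(title, score) for title, score in scored if score >= min_score]
    passing.sort(key=lambda item: item[1], reverse=True)
    return [title for title, _ in passing[:max_articles]]


scored = [("A", 8.2), ("B", 6.1), ("C", 7.0), ("D", 5.6), ("E", 9.4)]
lenient = select_articles(scored, min_score=5.5, max_articles=15)
strict = select_articles(scored, min_score=7.0, max_articles=10)
print(len(lenient), "articles summarized with the old settings")
print(len(strict), "articles summarized with the stricter settings")
```

Under this reading, every article dropped here is one fewer summarization call, which is where the saving comes from. The filtering pass itself still scores every fetched article.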
### 2. Use Cheaper Models
```yaml
ai:
  model: "google/gemini-2.0-flash-exp:free"  # FREE (has rate limits)
  # OR
  # model: "openai/gpt-4o-mini"              # Cheap and reliable
```
### 3. Reduce RSS Sources
Fewer sources = fewer articles = lower cost:
```yaml
sources:
  rss:
    # Comment out sources you don't need
```
### 4. Track Costs Against a Budget
Automatic budget alerts are not built in yet, so check costs yourself and adjust:
```bash
# View costs regularly
python -m src.view_costs
# If costs are too high, adjust config.yaml settings
```
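Until alerts exist, a manual budget check can be sketched against the `runs` table. Everything named here is hypothetical (`month_to_date_cost`, `over_budget`, the budget values), and the demo uses a two-column in-memory stand-in for the real table. It also assumes `run_date` is stored as ISO `YYYY-MM-DD` text so that string comparison orders dates correctly.

```python
import sqlite3


def month_to_date_cost(conn: sqlite3.Connection) -> float:
    """Sum total_cost for runs dated on or after the first of the current month (UTC)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total_cost), 0) FROM runs "
        "WHERE run_date >= date('now', 'start of month')"
    ).fetchone()
    return float(row[0])


def over_budget(conn: sqlite3.Connection, monthly_budget: float) -> bool:
    """Return True when this month's spending exceeds the given budget in dollars."""
    return month_to_date_cost(conn) > monthly_budget


# Demo on an in-memory stand-in for data/articles.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_date TEXT NOT NULL, total_cost REAL NOT NULL)")
conn.execute("INSERT INTO runs VALUES (date('now'), 0.0234)")
conn.execute("INSERT INTO runs VALUES ('2000-01-01', 5.0)")  # old run, ignored
print("Over $1.00 budget:", over_budget(conn, 1.00))
```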
## Database Schema
The `runs` table stores:
```sql
CREATE TABLE runs (
id INTEGER PRIMARY KEY,
run_date TEXT NOT NULL,
articles_fetched INTEGER NOT NULL,
articles_processed INTEGER NOT NULL,
articles_included INTEGER NOT NULL,
total_cost REAL NOT NULL,
filtering_cost REAL DEFAULT 0,
summarization_cost REAL DEFAULT 0,
created_at TEXT NOT NULL
);
```
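A minimal sketch of writing to and reading from this table with Python's built-in `sqlite3` module follows. The helper names `save_run` and `cumulative_cost` are hypothetical, the demo runs against an in-memory database, and storing `run_date` and `created_at` as ISO strings is an assumption.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    id INTEGER PRIMARY KEY,
    run_date TEXT NOT NULL,
    articles_fetched INTEGER NOT NULL,
    articles_processed INTEGER NOT NULL,
    articles_included INTEGER NOT NULL,
    total_cost REAL NOT NULL,
    filtering_cost REAL DEFAULT 0,
    summarization_cost REAL DEFAULT 0,
    created_at TEXT NOT NULL
)
"""


def save_run(conn: sqlite3.Connection, fetched: int, processed: int,
             included: int, total_cost: float) -> None:
    """Insert one run; the per-phase cost columns keep their default of 0."""
    now = datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO runs (run_date, articles_fetched, articles_processed, "
        "articles_included, total_cost, created_at) VALUES (?, ?, ?, ?, ?, ?)",
        (now.date().isoformat(), fetched, processed, included, total_cost,
         now.isoformat()),
    )
    conn.commit()


def cumulative_cost(conn: sqlite3.Connection) -> float:
    """Return the sum of total_cost over all runs, or 0.0 for an empty table."""
    row = conn.execute("SELECT COALESCE(SUM(total_cost), 0) FROM runs").fetchone()
    return float(row[0])


conn = sqlite3.connect(":memory:")  # stand-in for data/articles.db
conn.execute(SCHEMA)
save_run(conn, fetched=152, processed=152, included=15, total_cost=0.0234)
save_run(conn, fetched=143, processed=143, included=12, total_cost=0.0198)
print(f"Total spent: ${cumulative_cost(conn):.4f}")
```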
## Query Costs Manually
Using SQLite:
```bash
sqlite3 data/articles.db <<'SQL'
-- Total cost
SELECT SUM(total_cost) FROM runs;
-- Cost by date
SELECT run_date, total_cost FROM runs ORDER BY run_date DESC;
-- Average cost
SELECT AVG(total_cost) FROM runs;
-- Cost this month
SELECT SUM(total_cost) FROM runs
WHERE run_date >= date('now', 'start of month');
SQL
```
## Troubleshooting
### Cost Shows $0.0000
**Possible causes:**
1. **Using free model** - Free models may not report costs
2. **API response format** - OpenRouter might not include cost field
3. **First run** - Database just initialized
**Check:**
```bash
# View logs for cost tracking
grep -i "cost" data/logs/news-agent.log
# Check if runs are being saved
sqlite3 data/articles.db "SELECT * FROM runs ORDER BY created_at DESC LIMIT 5;"
```
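To help tell the causes apart, it can be useful to count how many stored runs have a zero cost. The sketch below is a hypothetical diagnostic (`zero_cost_runs` is not part of the project), shown against an in-memory stand-in for the `runs` table.

```python
import sqlite3


def zero_cost_runs(conn: sqlite3.Connection) -> tuple[int, int]:
    """Return (number of runs with total_cost = 0, total number of runs)."""
    zero, total = conn.execute(
        "SELECT COALESCE(SUM(total_cost = 0), 0), COUNT(*) FROM runs"
    ).fetchone()
    return int(zero), int(total)


conn = sqlite3.connect(":memory:")  # stand-in for data/articles.db
conn.execute("CREATE TABLE runs (run_date TEXT NOT NULL, total_cost REAL NOT NULL)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [("2026-01-24", 0.0245), ("2026-01-25", 0.0), ("2026-01-26", 0.0)],
)
zero, total = zero_cost_runs(conn)
print(f"{zero} of {total} runs recorded a cost of $0.0000")
```

If every run is zero, the cost field is probably never being read from the response. If only some are, those runs may have used a free model.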
### Cost Seems Wrong
**Verify model pricing:**
- Check https://openrouter.ai/models for current pricing
- Different models have different costs
- Costs change over time
**Check API responses:**
```bash
# Enable debug logging by editing config.yaml:
#   logging:
#     level: "DEBUG"
# Run and check logs
python -m src.main
grep -i "cost" data/logs/news-agent.log
```
## Export Cost Data
Export to CSV:
```bash
sqlite3 -header -csv data/articles.db \
"SELECT run_date, articles_processed, articles_included, total_cost
FROM runs ORDER BY run_date" > costs.csv
```
Import into Excel/Sheets for analysis.
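The same export can be done from Python if the `sqlite3` command-line tool is not installed. This is a sketch using only the standard library; `export_costs` is a hypothetical helper and the demo table contains just the four exported columns.

```python
import csv
import io
import sqlite3
from typing import TextIO


def export_costs(conn: sqlite3.Connection, out: TextIO) -> int:
    """Write run costs as CSV with a header row; return the number of data rows."""
    cursor = conn.execute(
        "SELECT run_date, articles_processed, articles_included, total_cost "
        "FROM runs ORDER BY run_date"
    )
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for data/articles.db
conn.execute(
    "CREATE TABLE runs (run_date TEXT, articles_processed INTEGER, "
    "articles_included INTEGER, total_cost REAL)"
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    [("2026-01-26", 152, 15, 0.0234), ("2026-01-25", 143, 12, 0.0198)],
)
buffer = io.StringIO()
count = export_costs(conn, buffer)
print(buffer.getvalue())
```

To write a file instead, pass `open("costs.csv", "w", newline="")` as `out`.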
## Cost Projections
Based on current usage, project future costs:
```bash
# Get average cost
python -m src.view_costs
# Multiply by days
# If avg = $0.025/day:
# - Weekly: $0.025 × 7 = $0.175
# - Monthly: $0.025 × 30 = $0.75
# - Yearly: $0.025 × 365 ≈ $9.13
```
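The same arithmetic as a small function, assuming one run per day. `project_costs` is a hypothetical helper, and the 30-day month matches the estimate above.

```python
def project_costs(avg_cost_per_day: float) -> dict[str, float]:
    """Project weekly, monthly (30 days), and yearly (365 days) spending in dollars."""
    return {
        "weekly": avg_cost_per_day * 7,
        "monthly": avg_cost_per_day * 30,
        "yearly": avg_cost_per_day * 365,
    }


projection = project_costs(0.025)
for period, cost in projection.items():
    print(f"{period.capitalize()}: ${cost:.3f}")
```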
## Privacy Note
All cost data is stored **locally** in your SQLite database (`data/articles.db`). Cost information appears in only two places:
- Your digest email (which you control)
- Your local database
## Future Enhancements
Potential improvements:
1. **Split costs** - Report filtering and summarization costs separately (the `runs` table already has `filtering_cost` and `summarization_cost` columns)
2. **Budget alerts** - Email warning if cost exceeds threshold
3. **Cost predictions** - Estimate next run cost
4. **Web dashboard** - Visualize cost trends over time
5. **Cost per category** - Track which categories cost most
## Need Help?
- View current costs: `python -m src.view_costs`
- Check database: `sqlite3 data/articles.db "SELECT * FROM runs;"`
- Review logs: `tail -f data/logs/news-agent.log`
- OpenRouter pricing: https://openrouter.ai/models