Files
NewsAgent/COST_TRACKING_SUMMARY.md
2026-01-27 09:25:11 +01:00

6.2 KiB

Cost Tracking Implementation Summary

What Was Added

1. Database Schema

  • New runs table tracks each execution
  • Fields: articles_fetched, articles_processed, articles_included, costs, timestamps
  • Indexes for efficient queries

2. Cost Extraction

  • OpenRouter API responses include cost in usage.cost field
  • AI client accumulates costs during session
  • Automatic tracking without manual intervention

3. Database Methods

  • save_run() - Store run statistics and costs
  • get_total_cost() - Calculate cumulative spending
  • get_run_stats() - Retrieve recent run history

4. Email Display

  • HTML Email: Nice formatted cost box at bottom
  • Plain Text: Cost information in footer
  • Shows both session and cumulative costs

5. Cost Viewer Script

  • python -m src.view_costs - View statistics
  • Shows recent runs, averages, totals
  • Export capability for analysis

📊 What You'll See

In Your Email

Bottom of every digest:

💰 Cost Information
This digest: $0.0234  |  Total spent: $1.2456

In Cost Viewer

python -m src.view_costs

Output:

============================================================
News Agent Cost Statistics
============================================================

💰 Total Cumulative Cost: $1.2456

Recent Runs (last 20):
------------------------------------------------------------
Date         Articles   Included   Cost      
------------------------------------------------------------
2026-01-26   152        15         $0.0234   
2026-01-25   143        12         $0.0198   
...

Averages (last 20 runs):
  Cost per run: $0.0226
  Articles per digest: 14.2
  Cost per article: $0.0016

🔍 How It Works

  1. During Filtering: AI scores each article

    • OpenRouter returns cost per API call
    • Client tracks: self.total_cost += cost
  2. During Summarization: AI summarizes selected articles

    • More API calls, more cost
    • Accumulated in same session
  3. After Processing: Save to database

    await db.save_run(
        articles_fetched=152,
        articles_processed=25,
        articles_included=15,
        total_cost=0.0234,
    )
    
  4. Before Email: Calculate totals

    session_cost = ai_client.get_session_cost()  # This run
    cumulative_cost = await db.get_total_cost()  # All time
    
  5. In Email: Display both values

    • Session cost: Just this digest
    • Cumulative cost: Total since start

💰 Expected Costs

With openai/gpt-4o-mini

Scenario Cost/Run Monthly Yearly
150 articles, 15 selected $0.02-0.03 $0.60-0.90 $7-11
200 articles, 15 selected $0.03-0.04 $0.90-1.20 $11-15
100 articles, 10 selected $0.01-0.02 $0.30-0.60 $4-7

Note: Costs vary based on article length and model used.

📁 Files Modified

Core Tracking

  • src/storage/database.py - Added runs table and cost methods
  • src/ai/client.py - Track costs from API responses
  • src/main.py - Save costs and pass to email generator

Email Display

  • src/email/generator.py - Accept cost parameters
  • src/email/templates/daily_digest.html - Display costs nicely

Utilities

  • src/view_costs.py - NEW: Cost statistics viewer
  • COST_TRACKING.md - NEW: Complete documentation

🎯 Benefits

  1. Transparency: Know exactly what you're spending
  2. Budgeting: Track costs over time
  3. Optimization: Identify expensive runs
  4. Accountability: See if changes save/cost money
  5. Planning: Estimate future costs accurately

🔧 Usage

View Costs Anytime

python -m src.view_costs

Query Database

sqlite3 data/articles.db

-- Total spending
SELECT SUM(total_cost) FROM runs;

-- Last 10 runs
SELECT run_date, articles_included, total_cost 
FROM runs ORDER BY run_date DESC LIMIT 10;

-- This month
SELECT SUM(total_cost) FROM runs 
WHERE run_date >= date('now', 'start of month');

Export to CSV

sqlite3 -header -csv data/articles.db \
  "SELECT * FROM runs ORDER BY run_date" > costs.csv

⚙️ Cost Optimization

Reduce Costs by:

  1. Higher filter threshold (fewer articles):

    ai:
      filtering:
        min_score: 7.0  # Was 5.5
    
  2. Fewer max articles:

    ai:
      filtering:
        max_articles: 10  # Was 15
    
  3. Cheaper model:

    ai:
      model: "google/gemini-2.0-flash-exp:free"  # FREE
    
  4. Fewer RSS sources (less to process)

🐛 Troubleshooting

Cost Shows $0.0000

Likely causes:

  • Using free model (doesn't report costs)
  • OpenRouter API response doesn't include cost field
  • First run (no history yet)

Check logs:

grep -i "cost" data/logs/news-agent.log

Cost Seems High

Review:

  1. Number of articles being processed
  2. Model being used (check pricing at openrouter.ai)
  3. Article content length
  4. Recent runs: python -m src.view_costs

📝 Notes

  • Costs stored in cents/dollars (OpenRouter credits = USD)
  • Database never deleted (unless you manually delete it)
  • Cumulative cost includes ALL runs since database creation
  • Costs shown with 4 decimal places ($0.0234)
  • Free models may show $0.00 (rate limited instead)

🚀 Future Enhancements

Possible additions:

  1. Budget alerts - Email if cost exceeds threshold
  2. Cost breakdown - Separate filtering vs summarization
  3. Cost predictions - Estimate before running
  4. Monthly reports - Summary email at month end
  5. Cost per category - Track which topics cost most
  6. Web dashboard - Visualize trends

📚 Documentation

See full details in:

  • COST_TRACKING.md - Complete documentation
  • README.md - Updated with cost tracking feature
  • src/view_costs.py - Source code for viewer

Example Output

The cost information appears at the bottom of every email:

┌─────────────────────────────────────┐
│ 💰 Cost Information                 │
│ This digest: $0.0234                │
│ Total spent: $1.2456                │
└─────────────────────────────────────┘

Clean, simple, and informative!