162 lines
4.2 KiB
Markdown
162 lines
4.2 KiB
Markdown
# OpenRouter Model Reference
|
|
|
|
Quick reference for choosing the right AI model for News Agent.
|
|
|
|
## Recommended Models
|
|
|
|
### Best Overall: `google/gemini-flash-1.5-8b`
|
|
- **Cost:** ~$0.05-0.15/day
|
|
- **Quality:** Excellent
|
|
- **Speed:** Very fast
|
|
- **Best for:** Production use, daily digests
|
|
|
|
### Best Free: `google/gemini-2.0-flash-exp:free`
|
|
- **Cost:** FREE
|
|
- **Quality:** Good (experimental)
|
|
- **Speed:** Fast
|
|
- **Best for:** Testing, low-budget setups
|
|
- **Note:** Experimental, may change
|
|
|
|
### Budget Option: `meta-llama/llama-3.1-8b-instruct:free`
|
|
- **Cost:** FREE
|
|
- **Quality:** Decent
|
|
- **Speed:** Moderate
|
|
- **Best for:** Testing, development
|
|
|
|
### High Quality: `anthropic/claude-3.5-haiku`
|
|
- **Cost:** ~$0.10-0.25/day
|
|
- **Quality:** Excellent
|
|
- **Speed:** Fast
|
|
- **Best for:** When quality matters most
|
|
|
|
### OpenAI Option: `openai/gpt-4o-mini`
|
|
- **Cost:** ~$0.08-0.20/day
|
|
- **Quality:** Very good
|
|
- **Speed:** Fast
|
|
- **Best for:** Balanced quality/cost
|
|
|
|
## How to Change Model
|
|
|
|
Edit `config.yaml`:
|
|
|
|
```yaml
|
|
ai:
|
|
model: "google/gemini-flash-1.5-8b" # Change this line
|
|
```
|
|
|
|
## Full Model List
|
|
|
|
See all available models at: https://openrouter.ai/models
|
|
|
|
Filter by:
|
|
- **Free models** - Search for `:free` suffix
|
|
- **Context length** - Important for long articles
|
|
- **Supported features** - JSON mode, function calling, etc.
|
|
|
|
## Cost Comparison (Daily Estimates)
|
|
|
|
Based on processing ~50 articles/day with 15 summaries:
|
|
|
|
| Model | Daily Cost | Monthly Cost | Quality |
|
|
|-------|-----------|--------------|---------|
|
|
| `google/gemini-2.0-flash-exp:free` | $0.00 | $0.00 | Good |
|
|
| `meta-llama/llama-3.1-8b-instruct:free` | $0.00 | $0.00 | Decent |
|
|
| `google/gemini-flash-1.5-8b` | $0.05-0.15 | $1.50-4.50 | Excellent |
|
|
| `openai/gpt-4o-mini` | $0.08-0.20 | $2.40-6.00 | Very Good |
|
|
| `anthropic/claude-3.5-haiku` | $0.10-0.25 | $3.00-7.50 | Excellent |
|
|
| `openai/gpt-4o` | $0.50-1.50 | $15-45 | Outstanding |
|
|
|
|
*Costs vary based on article length and quantity*
|
|
|
|
## Testing a New Model
|
|
|
|
1. Update `config.yaml`:
|
|
```yaml
|
|
ai:
|
|
model: "new-model-name"
|
|
```
|
|
|
|
2. Run a test:
|
|
```bash
|
|
source .venv/bin/activate
|
|
python -m src.main
|
|
```
|
|
|
|
3. Check quality of summaries in the email
|
|
|
|
4. Monitor costs at: https://openrouter.ai/activity
|
|
|
|
## Model Selection Tips
|
|
|
|
### Choose FREE models if:
|
|
- You're testing the system
|
|
- Cost is a major concern
|
|
- Article quality/accuracy is less critical
|
|
|
|
### Choose PAID models if:
|
|
- You want best quality summaries
|
|
- You need reliable daily digests
|
|
- You value accurate relevance scoring
|
|
|
|
### Performance vs Cost:
|
|
- **Gemini Flash 1.5-8b** - Best balance
|
|
- **Claude Haiku** - Best quality for cost
|
|
- **GPT-4o-mini** - Good all-rounder
|
|
- **Free models** - Testing/development
|
|
|
|
## Troubleshooting
|
|
|
|
### Error: "No endpoints found for [model]"
|
|
|
|
The model name is incorrect or not available on OpenRouter.
|
|
|
|
**Solution:**
|
|
1. Check model name at: https://openrouter.ai/models
|
|
2. Copy exact model ID (e.g., `google/gemini-flash-1.5-8b`)
|
|
3. Update `config.yaml`
|
|
4. Restart the service
|
|
|
|
### Model Seems Slow
|
|
|
|
Some models are slower than others.
|
|
|
|
**Solutions:**
|
|
- Try `google/gemini-flash-1.5-8b` (fastest)
|
|
- Reduce `max_articles` in config (process fewer articles)
|
|
- Use cheaper/faster models for filtering, premium for summarization
|
|
|
|
### Poor Summary Quality
|
|
|
|
The model might not be good at summarization.
|
|
|
|
**Solutions:**
|
|
- Try `anthropic/claude-3.5-haiku` (excellent summarization)
|
|
- Adjust prompts in `src/ai/prompts.py`
|
|
- Increase temperature for more creative summaries
|
|
- Try different models
|
|
|
|
### High Costs
|
|
|
|
You're using an expensive model or processing too many articles.
|
|
|
|
**Solutions:**
|
|
1. Switch to cheaper model (Gemini Flash or free options)
|
|
2. Increase `min_score` to filter more aggressively
|
|
3. Reduce `max_articles` limit
|
|
4. Monitor usage: https://openrouter.ai/activity
|
|
|
|
## Advanced: Using Different Models for Different Tasks
|
|
|
|
You could modify the code to use:
|
|
- **Fast/cheap model** for filtering (scoring articles)
|
|
- **High-quality model** for summarization
|
|
|
|
Edit `src/main.py` to instantiate different clients for different tasks.
|
|
|
|
## Need Help?
|
|
|
|
- **Model pricing:** https://openrouter.ai/models (click model for details)
|
|
- **API docs:** https://openrouter.ai/docs
|
|
- **Check costs:** https://openrouter.ai/activity
|
|
- **Model comparison:** Test different models and compare results
|