Quick start
Submit an evaluation task via email or API and start testing your AI agents.
- Email Method
- API Method
Email submission (easiest)
Send an email to agent@mixus.com with a subject starting with “Eval:” to test your first evaluation. Here is what happens next:
- System creates an agent to solve the task
- AI automatically detects verification checkpoints
- Agent executes and pauses at checkpoint
- You receive email notification
- Reply “approve” to continue
- Get results email when complete
No API key needed - Just send an email!
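As a rough sketch, the submission email can be built with Python's standard library. The recipient address and the “Eval:” subject prefix come from the FAQ on this page. The sender address, the subject wording after the prefix, and the choice to put the task description in the message body are assumptions for illustration. Sending is left out because it depends on your own mail setup.

```python
from email.message import EmailMessage

EVAL_ADDRESS = "agent@mixus.com"  # recipient named in the FAQ on this page
SUBJECT_PREFIX = "Eval:"          # subject prefix named in the same FAQ


def build_eval_email(sender: str, title: str, task: str) -> EmailMessage:
    """Build (but do not send) an evaluation submission email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = EVAL_ADDRESS
    msg["Subject"] = f"{SUBJECT_PREFIX} {title}"
    # Assumption: the task description goes in the plain-text body.
    msg.set_content(task)
    return msg


msg = build_eval_email(
    sender="you@example.com",  # hypothetical sender address
    title="Commission calculation",
    task=(
        "Calculate 15% commission on a $50,000 sale and send result "
        "via email to manager@company.com"
    ),
)
print(msg["To"])
print(msg["Subject"])
```

The resulting `EmailMessage` can be handed to whatever mail client or SMTP library you already use.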
Test modes explained
Choose the right mode for your evaluation:
With Verification + Auto-detect (Recommended)
AI automatically determines where verification is needed.
Use when:
- You don’t know exact verification points
- Task involves external actions (emails, purchases)
- You want AI to identify critical steps
With Verification + Manual Checkpoints
You specify exact verification points.
Use when:
- You know exactly where verification should happen
- Testing specific decision points
- Task has well-defined stages
Without Verification (Baseline)
Agent runs autonomously without human oversight.
Use when:
- Speed benchmarking
- Comparing runs with and without human verification
- Low-risk tasks
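The “Use when” guidance for the three modes can be condensed into a small decision helper. This is only a sketch of the reasoning: the `EvalMode` names and the two yes/no questions are hypothetical and do not correspond to any documented setting.

```python
from enum import Enum


class EvalMode(Enum):
    """The three test modes described above (names are illustrative)."""
    AUTO_DETECT = "With Verification + Auto-detect"
    MANUAL = "With Verification + Manual Checkpoints"
    BASELINE = "Without Verification (Baseline)"


def choose_mode(needs_oversight: bool, knows_checkpoints: bool) -> EvalMode:
    """Pick a mode from two questions about the task."""
    if not needs_oversight:
        # Speed benchmarks, comparisons, and low-risk tasks.
        return EvalMode.BASELINE
    if knows_checkpoints:
        # Well-defined stages or specific decision points.
        return EvalMode.MANUAL
    # Unknown verification points or external actions.
    return EvalMode.AUTO_DETECT


print(choose_mode(needs_oversight=True, knows_checkpoints=False).value)
```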
Verification workflow
When using verification mode, here’s what to expect:
1. Checkpoint reached
Agent pauses execution and sends an email notification.
2. Review agent's work
Open chat or check email to see what the agent plans to do.
3. Respond
- Type “approve” - Agent continues
- Type “reject” - Agent stops
- Type “hint: [guidance]” - Agent adjusts approach
4. Agent continues
After approval, the agent proceeds to the next step or completes the task.
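The three reply forms in the Respond step can be checked on your side before you send them. The sketch below is a hypothetical client-side helper, not the service's own parser. It assumes replies are matched case-insensitively with surrounding whitespace ignored, which this page does not confirm.

```python
from typing import NamedTuple, Optional


class Reply(NamedTuple):
    action: str              # "approve", "reject", or "hint"
    guidance: Optional[str]  # text after "hint:", otherwise None


def parse_reply(text: str) -> Reply:
    """Classify a checkpoint reply as approve, reject, or hint."""
    cleaned = text.strip()
    lowered = cleaned.lower()
    if lowered == "approve":
        return Reply("approve", None)
    if lowered == "reject":
        return Reply("reject", None)
    if lowered.startswith("hint:"):
        guidance = cleaned[len("hint:"):].strip()
        if not guidance:
            raise ValueError("a hint needs guidance text after 'hint:'")
        return Reply("hint", guidance)
    raise ValueError(f"unrecognized reply: {text!r}")


print(parse_reply("approve"))
print(parse_reply("hint: use the Q3 price list"))
```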
Example: Complete flow
Here’s a complete example from start to finish:
1. Submit via email
2. Confirmation received
3. Checkpoint notification
4. You approve
Reply to email: approve
5. Second checkpoint
6. You approve again
Reply: approve
7. Completion notification
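The seven steps above can be modeled as a small state machine. This is a toy simulation for building intuition, not the service's implementation: the `EvaluationRun` class, its status names, and the log messages are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvaluationRun:
    """Toy model of one evaluation with a fixed number of checkpoints."""
    checkpoints: int
    status: str = "running"  # hypothetical status names throughout
    passed: int = 0
    log: List[str] = field(default_factory=list)

    def advance(self) -> None:
        """Agent works until the next checkpoint or the end of the task."""
        if self.status != "running":
            return
        if self.passed < self.checkpoints:
            self.status = "awaiting_approval"
            self.log.append(f"checkpoint {self.passed + 1} notification")
        else:
            self.status = "completed"
            self.log.append("completion notification")

    def reply(self, text: str) -> None:
        """Apply an 'approve' or 'reject' reply at a waiting checkpoint."""
        if self.status != "awaiting_approval":
            raise RuntimeError("no checkpoint is waiting for a reply")
        answer = text.strip().lower()
        if answer == "approve":
            self.passed += 1
            self.status = "running"
        elif answer == "reject":
            self.status = "stopped"
        else:
            raise ValueError(f"unsupported reply in this sketch: {text!r}")


run = EvaluationRun(checkpoints=2)
run.log.append("confirmation received")
run.advance()          # first checkpoint
run.reply("approve")
run.advance()          # second checkpoint
run.reply("approve")
run.advance()          # task finishes
print(run.status)
print(run.log)
```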
Where to track evaluations
Dashboard
Visual overview of all evaluations
Chat Interface
See full conversation and agent work
API Status
Programmatic status checks
Email Notifications
Receive updates via email
Tips for success
Write clear task descriptions
Good: “Calculate 15% commission on a $50,000 sale and send result via email to manager@company.com”
Bad: “Do commission stuff”
Start with auto-detect mode
Let AI determine checkpoints until you understand the system better.
Test simple tasks first
Start with calculations or research before complex multi-step workflows.
Use baseline mode for comparisons
Run the same task with and without verification to measure impact.
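One reason the “good” task description in the first tip works well is that its answer can be checked independently. A minimal sketch of that check, using `Decimal` to avoid floating-point rounding in currency arithmetic (the variable names are illustrative):

```python
from decimal import Decimal

sale = Decimal("50000")  # sale amount from the example task
rate = Decimal("0.15")   # 15% commission rate

commission = sale * rate
print(f"${commission:,.2f}")  # prints "$7,500.00"
```

Having a known expected value makes it easier to judge the agent's output at a checkpoint or in the results email.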
Next steps
Task Preparation
Learn how to write effective evaluation tasks
Examples
See 10+ ready-to-use example tasks
API Reference
Complete API documentation
Best Practices
Tips for getting the most from evaluations
Common questions
Do I need an API key for email submissions?
No! Email submissions don’t require an API key. Just send your task to agent@mixus.com with a subject starting with “Eval:”.
How long does an evaluation take?
It depends on task complexity and how quickly you respond to checkpoints. Simple tasks take 2-5 minutes; complex tasks take 10-30 minutes.
Can I submit multiple tasks at once?
Yes! Via API you can submit multiple tasks. They’ll run in parallel.
What if I don't respond to a checkpoint?
The evaluation will wait for your response. You can respond anytime via email or chat.
Can I cancel a running evaluation?
Yes, reply “reject” at any checkpoint or stop it from the dashboard.
Need help?
- Email: support@mixus.ai
- Dashboard: app.mixus.ai/eval
- API Keys: app.mixus.ai/integrations/api-keys