To: agent@mixus.com
Subject: Eval: Sales Commission

Calculate 15% commission on a $50,000 sale.

Test type: with-verification
Auto-detect: true
Reviewer: your-email@company.com

To: agent@mixus.com
Subject: Eval: Customer Follow-up

Draft a follow-up email to a customer who requested a product demo.

Include:
- Thank them for their interest
- Propose 3 time slots next week
- Link to product features page
- Warm, professional tone

Test type: with-verification
Auto-detect: true
Reviewer: your-email@company.com

To: agent@mixus.com
Subject: Eval: ROI Calculation

Calculate ROI for marketing campaign with:
- Total spend: $50,000
- Revenue generated: $175,000
- Campaign duration: 3 months

Show ROI % and monthly breakdown.

Test type: with-verification
Auto-detect: true
Reviewer: your-email@company.com

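A reviewer may want reference figures to compare against the agent's answer. The sketch below assumes the common definition ROI = (revenue - spend) / spend, and it assumes spend and revenue are spread evenly across the three months, since the task does not say how they are distributed. The function names are illustrative and not part of mixus.

```python
def roi_percent(spend: float, revenue: float) -> float:
    """Return ROI as a percentage: (revenue - spend) / spend * 100."""
    return (revenue - spend) / spend * 100


def even_monthly_breakdown(spend: float, revenue: float, months: int) -> list[dict]:
    """Split spend, revenue, and net return evenly across months.

    ASSUMPTION: an even split. The task gives only campaign totals.
    """
    return [
        {
            "month": month,
            "spend": spend / months,
            "revenue": revenue / months,
            "net": (revenue - spend) / months,
        }
        for month in range(1, months + 1)
    ]


roi = roi_percent(50_000, 175_000)
breakdown = even_monthly_breakdown(50_000, 175_000, 3)

print(f"ROI: {roi:.0f}%")
for row in breakdown:
    print(f"Month {row['month']}: net return ${row['net']:,.2f}")
```

With the figures from the task, the ROI comes out to 250%. An agent that reports a different monthly breakdown is not necessarily wrong; it may simply have made a different distribution assumption.
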
To: agent@mixus.com
Subject: Eval: Quarterly Report

1. Search for our Q3 sales data
2. Calculate total revenue and growth rate
3. Identify top 3 products by sales
4. Create executive summary
5. Email summary to team@company.com

Checkpoints:
1. Verify data accuracy before analysis
2. Review summary before sending

Test type: with-verification
Reviewer: your-email@company.com

To: agent@mixus.com
Subject: Eval: Quick Math Baseline

Calculate:
1. 15% of $50,000
2. 23% of $75,000
3. 8.5% of $120,000

Test type: without-verification
Reviewer: your-email@company.com

curl -X POST https://app.mixus.ai/api/eval/create-task-agent \
  -H "Authorization: Bearer mxs_eval_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "taskName": "Quick Math Baseline",
    "taskDescription": "Calculate: 15% of $50k, 23% of $75k, 8.5% of $120k",
    "testMode": "without-verification",
    "assignedReviewer": "your-email@company.com"
  }'

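Because the baseline runs without verification, it helps to have the expected answers on hand when scoring the result. The following is a minimal sketch that computes them with Python's `decimal` module to avoid floating-point rounding; the helper name `percent_of` is illustrative only.

```python
from decimal import Decimal


def percent_of(rate_percent: str, amount: str) -> Decimal:
    """Return rate_percent % of amount using exact decimal arithmetic."""
    return Decimal(rate_percent) / Decimal(100) * Decimal(amount)


# The three calculations from the Quick Math Baseline task.
expected = {
    "15% of $50,000": percent_of("15", "50000"),
    "23% of $75,000": percent_of("23", "75000"),
    "8.5% of $120,000": percent_of("8.5", "120000"),
}

for label, value in expected.items():
    print(f"{label} = ${value:,.2f}")
```

The expected values are $7,500, $17,250, and $10,200 respectively.
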
To: agent@mixus.com
Subject: Eval: Crypto Portfolio Value

Research current Bitcoin and Ethereum prices from CoinMarketCap.
Then calculate portfolio value for:
- 0.5 BTC
- 10 ETH

Show individual values and total.

Test type: with-verification
Auto-detect: true
Reviewer: your-email@company.com

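The prices in this task are live, so there is no fixed expected answer; what a reviewer can check is that the arithmetic matches the prices the agent reports. A minimal sketch of that check follows. The prices below are placeholders chosen for easy arithmetic, not real quotes, and `portfolio_value` is a hypothetical helper.

```python
def portfolio_value(holdings: dict[str, float], prices: dict[str, float]):
    """Return (per-asset values, total) for the given holdings and prices."""
    values = {symbol: qty * prices[symbol] for symbol, qty in holdings.items()}
    return values, sum(values.values())


holdings = {"BTC": 0.5, "ETH": 10}

# PLACEHOLDER prices, not real market data. Substitute the prices
# the agent reports from CoinMarketCap when checking its answer.
placeholder_prices = {"BTC": 60_000.0, "ETH": 3_000.0}

values, total = portfolio_value(holdings, placeholder_prices)
for symbol, value in values.items():
    print(f"{symbol}: ${value:,.2f}")
print(f"Total: ${total:,.2f}")
```
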
To: agent@mixus.com
Subject: Eval: R&D Tax Credit

Calculate R&D tax credit using Alternative Simplified Credit method.

Data:
- Current year QRE: $120,000
- Prior 3-year average: $90,000
- Formula: 14% × (current_qre - base_amount)
- Base amount = 50% of prior average

Checkpoints:
1. Verify base amount is $45,000
2. Verify final credit is $10,500

Test type: with-verification
Reviewer: your-email@company.com

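The two checkpoint values follow directly from the formula stated in the task. As a sketch, the calculation can be reproduced as shown below; the function name `asc_credit` is illustrative, and the code applies only the simplified formula given in the email, not the full tax rules.

```python
from decimal import Decimal


def asc_credit(current_qre: str, prior_avg: str,
               rate: str = "0.14", base_fraction: str = "0.50"):
    """Apply the task's formula: rate * (current_qre - base_amount),
    where base_amount = base_fraction * prior_avg."""
    base_amount = Decimal(base_fraction) * Decimal(prior_avg)
    credit = Decimal(rate) * (Decimal(current_qre) - base_amount)
    return base_amount, credit


base_amount, credit = asc_credit("120000", "90000")
print(f"Base amount: ${base_amount:,.2f}")  # checkpoint 1
print(f"Credit:      ${credit:,.2f}")       # checkpoint 2
```

Step by step: the base amount is 50% of $90,000 = $45,000, the excess is $120,000 - $45,000 = $75,000, and 14% of $75,000 is $10,500, matching both checkpoints.
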
To: agent@mixus.com
Subject: Eval: Calendar Scheduling

Find 3 available 30-minute slots next week when both
john@company.com and sarah@company.com are free. Check their
Google calendars.

Available hours: 9am-5pm EST
Prefer afternoon slots

Test type: with-verification
Auto-detect: true
Reviewer: your-email@company.com

curl -X POST https://app.mixus.ai/api/eval/create-task-agent \
  -H "Authorization: Bearer mxs_eval_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "taskName": "Calendar Scheduling",
    "taskDescription": "Find 3 available 30min slots next week for john@company.com and sarah@company.com. 9am-5pm EST, prefer afternoon.",
    "autoDetectCheckpoints": true,
    "testMode": "with-verification",
    "assignedReviewer": "your-email@company.com",
    "expectedTools": ["google_calendar"]
  }'

Expected result: 3 available time slots
Duration: ~4-6 minutes
Checkpoints: 1 (approve proposed times)
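
For teams that script their evals rather than typing curl commands, the same request can be built from Python's standard library. This is a sketch only: the URL, headers, and JSON fields are copied from the curl example above, while the helper names `build_request` and `send` are hypothetical and the shape of the API response is not documented here, so the response body is returned as raw text.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://app.mixus.ai/api/eval/create-task-agent"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for creating an eval task."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the raw response body.

    ASSUMPTION: the response format is unknown, so no parsing is attempted.
    """
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Same fields as the Calendar Scheduling curl example.
payload = {
    "taskName": "Calendar Scheduling",
    "taskDescription": (
        "Find 3 available 30min slots next week for john@company.com "
        "and sarah@company.com. 9am-5pm EST, prefer afternoon."
    ),
    "autoDetectCheckpoints": True,
    "testMode": "with-verification",
    "assignedReviewer": "your-email@company.com",
    "expectedTools": ["google_calendar"],
}

request = build_request("mxs_eval_YOUR_KEY", payload)
```

Calling `send(request)` with a real key in place of `mxs_eval_YOUR_KEY` submits the task. Keeping request construction separate from sending makes the payload easy to inspect or log before anything goes over the network.
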