Task Preparation

Writing effective tasks

Good task descriptions lead to better results. Follow these guidelines:

Be specific and clear

Good:

Calculate 15% commission on a $50,000 sale and send the result
via email to manager@company.com with subject "Q4 Commission Report"

Bad:

Calculate commission and send email

Include expected outcomes

Good:

Research the current Bitcoin price from CoinMarketCap and calculate
the value of 0.5 BTC. Expected result should be around $20,000-$25,000.

Bad:

Check crypto prices

Provide necessary context

Good:

Calculate R&D tax credit using Alternative Simplified Credit method
(IRS Form 6765 Section B). Current year QRE: $120,000. Prior 3-year
average: $90,000. Formula: 14% × (current_qre - base_amount), where
base_amount is 50% of the prior 3-year average.

Bad:

Calculate tax credit
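
As a quick check of the figures in the detailed task description above, the sketch below applies the Alternative Simplified Credit formula. It assumes the base amount is 50% of the prior 3-year average QRE, which matches the $45,000 base amount used in the API checkpoint example later on this page. The function name `asc_credit` is illustrative only.

```python
def asc_credit(current_qre: float, prior_3yr_avg_qre: float, rate: float = 0.14) -> float:
    """Alternative Simplified Credit: rate x (current QRE - base amount)."""
    # Assumption: the base amount is 50% of the prior 3-year average QRE.
    base_amount = 0.5 * prior_3yr_avg_qre
    # No credit is produced when current QRE does not exceed the base amount.
    return rate * max(current_qre - base_amount, 0.0)


credit = asc_credit(current_qre=120_000, prior_3yr_avg_qre=90_000)
print(f"${credit:,.2f}")  # $10,500.00
```

Stating the expected figure ($10,500) in the task description gives the reviewer something concrete to verify against.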

Email format templates

Template 1: Auto-detect (Recommended)

To: agent@mixus.com
Subject: Eval: [Your Task Name]

[Clear description of what you want done]

Test type: with-verification
Auto-detect: true
Reviewer: your-email@company.com

Example:

To: agent@mixus.com
Subject: Eval: Market Research

Research the top 3 AI agent platforms by market share.
For each, find: pricing model, key features, and target market.

Test type: with-verification
Auto-detect: true
Reviewer: analyst@company.com

Template 2: Manual checkpoints

To: agent@mixus.com
Subject: Eval: [Your Task Name]

[Task description]

Checkpoints:
1. [First verification point]
2. [Second verification point]
3. [Third verification point]

Test type: with-verification
Reviewer: your-email@company.com

Example:

To: agent@mixus.com
Subject: Eval: Financial Calculation

Calculate the R&D tax credit for $120,000 in qualified expenses.

Checkpoints:
1. Verify the base amount calculation is correct
2. Verify the final tax credit amount
3. Confirm the IRS form is properly filled

Test type: with-verification
Reviewer: cfo@company.com

Template 3: Baseline (no verification)

To: agent@mixus.com
Subject: Eval: [Your Task Name]

[Task description]

Test type: without-verification
Reviewer: your-email@company.com

Example:

To: agent@mixus.com
Subject: Eval: Simple Calculation

Calculate 15% of $50,000

Test type: without-verification
Reviewer: test@company.com
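
If you generate evaluation emails from a script, the three templates above can be produced by one helper. This is a minimal sketch using Python's standard `email.message.EmailMessage`; the function name `build_eval_email`, its parameters, and the default sender address are assumptions for illustration, and actually sending the message (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_eval_email(task_name, description, reviewer, *,
                     checkpoints=None, verify=True, sender="you@company.com"):
    """Build an evaluation email in the plain-text layout of the templates above."""
    lines = [description.strip(), ""]
    if checkpoints:
        # Template 2: manual checkpoints as a numbered list.
        lines.append("Checkpoints:")
        lines += [f"{i}. {text}" for i, text in enumerate(checkpoints, start=1)]
        lines.append("")
    lines.append(f"Test type: {'with-verification' if verify else 'without-verification'}")
    if verify and not checkpoints:
        # Template 1: auto-detect is only added when no manual checkpoints are given.
        lines.append("Auto-detect: true")
    lines.append(f"Reviewer: {reviewer}")

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "agent@mixus.com"
    msg["Subject"] = f"Eval: {task_name}"
    msg.set_content("\n".join(lines))
    return msg


msg = build_eval_email(
    "Market Research",
    "Research the top 3 AI agent platforms by market share.",
    "analyst@company.com",
)
print(msg["Subject"])
print(msg.get_content())
```

Passing `checkpoints=[...]` yields Template 2, and `verify=False` yields the baseline Template 3.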

API format templates

Template 1: Auto-detect (Recommended)

{
  "taskName": "Market Research",
  "taskDescription": "Research top 3 AI agent platforms, find pricing, features, and target market",
  "autoDetectCheckpoints": true,
  "testMode": "with-verification",
  "assignedReviewer": "analyst@company.com"
}

Template 2: Manual checkpoints

{
  "taskName": "Financial Calculation",
  "taskDescription": "Calculate R&D tax credit for $120,000 QRE",
  "checkpoints": [
    {
      "stage": "base_calculation",
      "description": "Calculate base amount",
      "verificationQuestion": "Is the base amount $45,000?"
    },
    {
      "stage": "final_calculation",
      "description": "Calculate final tax credit",
      "verificationQuestion": "Is the tax credit $10,500?"
    }
  ],
  "testMode": "with-verification",
  "assignedReviewer": "cfo@company.com"
}
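
The verification questions in a manual-checkpoint payload are easier to keep correct when they are derived from the same numbers as the task. The sketch below builds the payload above in Python; the helper name `build_manual_payload` is hypothetical, and the $90,000 prior 3-year average is taken from the "Provide necessary context" example rather than from this template.

```python
import json


def build_manual_payload(task_name, description, reviewer, checkpoints):
    """checkpoints: iterable of (stage, description, verification_question) tuples."""
    return {
        "taskName": task_name,
        "taskDescription": description,
        "checkpoints": [
            {"stage": stage, "description": desc, "verificationQuestion": question}
            for stage, desc, question in checkpoints
        ],
        "testMode": "with-verification",
        "assignedReviewer": reviewer,
    }


# Assumed inputs: current QRE $120,000, prior 3-year average QRE $90,000.
base_amount = 0.5 * 90_000
credit = 0.14 * (120_000 - base_amount)

payload = build_manual_payload(
    "Financial Calculation",
    "Calculate R&D tax credit for $120,000 QRE",
    "cfo@company.com",
    [
        ("base_calculation", "Calculate base amount",
         f"Is the base amount ${base_amount:,.0f}?"),
        ("final_calculation", "Calculate final tax credit",
         f"Is the tax credit ${credit:,.0f}?"),
    ],
)
print(json.dumps(payload, indent=2))
```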

Template 3: With webhook callback

{
  "taskName": "Automated Test",
  "taskDescription": "Calculate 15% commission on $50,000 sale",
  "autoDetectCheckpoints": true,
  "testMode": "with-verification",
  "assignedReviewer": "manager@company.com",
  "webhookUrl": "https://your-system.com/webhook/eval-complete",
  "externalId": "test-001"
}

Task categories and examples

Financial calculations

Commission Calculation

Subject: Eval: Sales Commission Q4

Calculate commission for Q4 sales:

- Base salary: $60,000/year
- Commission rate: 3% of sales over $100,000
- Q4 sales: $450,000

Calculate total Q4 compensation.

Test type: with-verification
Auto-detect: true
Reviewer: finance@company.com
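
Working out the expected answer before submitting helps the reviewer judge the agent's result. The sketch below does so for the commission task above under two assumptions that the task text leaves open: the base salary is prorated to one quarter, and the $100,000 threshold applies to Q4 sales alone. The function name is illustrative.

```python
from decimal import Decimal


def q4_compensation(annual_salary: Decimal, q4_sales: Decimal,
                    rate: Decimal = Decimal("0.03"),
                    threshold: Decimal = Decimal("100000")) -> Decimal:
    """Quarterly salary plus commission on sales above the threshold."""
    quarterly_salary = annual_salary / 4                           # assumption: salary prorated to Q4
    commission = rate * max(q4_sales - threshold, Decimal("0"))   # assumption: threshold is per quarter
    return quarterly_salary + commission


total = q4_compensation(Decimal("60000"), Decimal("450000"))
print(f"${total:,.2f}")  # $25,500.00
```

If your plan uses an annual threshold instead, say so in the task description; ambiguity like this is a common source of disputed results.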

ROI Analysis

Subject: Eval: Marketing ROI

Calculate ROI for Q3 marketing campaign:

- Total spend: $75,000
- Revenue generated: $225,000
- Campaign duration: 3 months

Provide ROI percentage and monthly breakdown.

Test type: with-verification
Auto-detect: true
Reviewer: cmo@company.com
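
For reference, the expected ROI for the task above can be computed directly. This sketch uses the common definition ROI = (revenue - spend) / spend; the even monthly split is an assumption, since the task gives only campaign totals.

```python
def roi_percent(spend: float, revenue: float) -> float:
    """Return on investment as a percentage of spend."""
    return (revenue - spend) / spend * 100


spend, revenue, months = 75_000, 225_000, 3
print(roi_percent(spend, revenue))  # 200.0

# Assumption: spend and revenue are spread evenly across the campaign months.
monthly_spend = spend / months
monthly_revenue = revenue / months
print(monthly_spend, monthly_revenue)  # 25000.0 75000.0
```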

Research tasks

Competitor Analysis

Subject: Eval: Competitor Research

Research OpenAI ChatGPT, Anthropic Claude, and Google Gemini.
For each, find:

1. Pricing (all tiers)
2. Key features
3. API availability
4. Target market

Create comparison table.

Test type: with-verification
Auto-detect: true
Reviewer: product@company.com

Market Trends

Subject: Eval: AI Market Trends

Research current AI agent market trends:

1. Market size and growth rate
2. Top 5 companies by market share
3. Emerging technologies
4. Future outlook (2-year)

Summarize findings in 2-page report.

Test type: with-verification
Auto-detect: true
Reviewer: strategy@company.com

Communication tasks

Customer Follow-up

Subject: Eval: Customer Email Draft

Draft follow-up email to customer who requested product demo.
Include:

- Thank them for interest
- Propose 3 time slots next week (9am, 2pm, 4pm EST)
- Link to product features page
- Professional, warm tone

Test type: with-verification
Auto-detect: true
Reviewer: sales@company.com

Meeting Summary

Subject: Eval: Meeting Summary Email

Create meeting summary email for quarterly review:

- Key decisions made
- Action items with owners
- Next meeting date
- Send to: team@company.com

Test type: with-verification
Auto-detect: true
Reviewer: manager@company.com

Best practices

Do’s ✅

Be specific about numbers and amounts

Good: "Calculate 15% commission on $50,000"
Bad: "Calculate some commission"

Include verification criteria

Good: "Expected result should be $7,500 (15% of $50,000)"
Bad: "Calculate it"

Specify data sources when relevant

Good: "Get Bitcoin price from CoinMarketCap"
Bad: "Get crypto price"

Break complex tasks into steps

Good: "1) Research competitors 2) Analyze pricing 3) Create table 4) Email summary"
Bad: "Do competitor analysis"

Don’ts ❌

Don’t use vague language

Bad: "Do some research about AI"
Good: "Research top 3 AI companies: OpenAI, Anthropic, Google DeepMind"

Don’t forget to specify reviewer

Bad: Missing "Reviewer:" field
Good: "Reviewer: manager@company.com"

Don’t mix manual checkpoints with auto-detect

Bad: Both "Checkpoints: 1, 2, 3" AND "Auto-detect: true"
Good: Choose ONE approach

Checkpoint placement guide

AI auto-detects checkpoints before:

External Communications

Sending emails
Posting to social media
Sending SMS/messages

Financial Actions

Making purchases
Transferring money
Processing payments

Irreversible Operations

Deleting data
Canceling subscriptions
Submitting forms

High-Value Decisions

Strategic recommendations
Large calculations
Important approvals

AI does NOT checkpoint for:

Web searches and research
Reading files or data
Analysis and calculations (unless result triggers action)
Information gathering
Report generation (unless sending/publishing)
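
To reason about where checkpoints are likely to land in your own task, the placement rules above can be expressed as a simple lookup. This is an illustrative sketch only, not the product's detection logic; the action labels such as `send_email` are hypothetical names invented for this example.

```python
from typing import Optional

# Hypothetical action labels, grouped by the checkpoint categories listed above.
CHECKPOINT_CATEGORIES = {
    "External Communications": {"send_email", "post_social_media", "send_sms"},
    "Financial Actions": {"make_purchase", "transfer_money", "process_payment"},
    "Irreversible Operations": {"delete_data", "cancel_subscription", "submit_form"},
    "High-Value Decisions": {"strategic_recommendation", "large_calculation",
                             "important_approval"},
}


def checkpoint_category(action: str) -> Optional[str]:
    """Return the category that makes this action a checkpoint, or None."""
    for category, actions in CHECKPOINT_CATEGORIES.items():
        if action in actions:
            return category
    return None


def plan_checkpoints(steps):
    """List (step index, category) for each step that should pause for review first."""
    return [(i, cat) for i, step in enumerate(steps)
            if (cat := checkpoint_category(step)) is not None]


# Research and analysis run freely; the pause comes before the email is sent.
print(plan_checkpoints(["web_search", "analysis", "send_email"]))
# [(2, 'External Communications')]
```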

Field reference

Required fields

| Field | Description | Example |
| --- | --- | --- |
| `taskName` / Subject | Task identifier | "Calculate Commission" |
| `taskDescription` / Body | What to do | "Calculate 15% of $50,000" |
| `testMode` / Test type | Verification mode | "with-verification" |
| `assignedReviewer` / Reviewer | Who verifies | "manager@company.com" |

Optional fields

| Field | Description | Example |
| --- | --- | --- |
| `autoDetectCheckpoints` / Auto-detect | Let AI find checkpoints | true |
| `checkpoints` / Checkpoints | Manual verification points | See "Template 2: Manual checkpoints" under API format templates |
| `webhookUrl` | Callback URL | "https://your-app.com/webhook" |
| `externalId` | Your tracking ID | "task-001" |
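
A small pre-flight check can catch missing fields before a payload is submitted. The sketch below validates a payload against the required fields listed above and the rule against combining manual checkpoints with auto-detect. The function name is illustrative, and treating "without-verification" as a valid `testMode` value is an assumption based on the email templates.

```python
REQUIRED_FIELDS = ("taskName", "taskDescription", "testMode", "assignedReviewer")
# Assumption: the API accepts the same two modes as the email "Test type" line.
TEST_MODES = {"with-verification", "without-verification"}


def validate_payload(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means the payload looks fine."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if not payload.get(name)]
    mode = payload.get("testMode")
    if mode and mode not in TEST_MODES:
        problems.append(f"unknown testMode: {mode}")
    if payload.get("autoDetectCheckpoints") and payload.get("checkpoints"):
        problems.append("use either autoDetectCheckpoints or checkpoints, not both")
    return problems


payload = {
    "taskName": "Market Research",
    "taskDescription": "Research top 3 AI agent platforms",
    "autoDetectCheckpoints": True,
    "testMode": "with-verification",
}
print(validate_payload(payload))  # ['missing required field: assignedReviewer']
```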

Common mistakes

Mistake: Subject doesn't start with 'Eval:'

Wrong: Subject: Calculate Commission
Right: Subject: Eval: Calculate Commission

Mistake: Vague task description

Wrong: Do some calculations
Right: Calculate 15% commission on a $50,000 sale

Mistake: Missing reviewer

Wrong: No reviewer specified
Right: Reviewer: manager@company.com

Mistake: Too many checkpoints

Wrong: 10 checkpoints for a simple task
Right: 2-4 checkpoints for most tasks, or use auto-detect
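
The mistakes above are mechanical enough to check before sending. The following sketch lints an email subject and body for them; the function name, the messages, and the limit of 4 checkpoints are assumptions drawn from the guidance on this page, not an official validator.

```python
import re

MAX_CHECKPOINTS = 4  # this guide suggests 2-4 checkpoints for most tasks


def lint_eval_email(subject: str, body: str) -> list:
    """Return a list of problems found in an evaluation email draft."""
    problems = []
    if not subject.startswith("Eval:"):
        problems.append("subject should start with 'Eval:'")
    if not re.search(r"^Reviewer:\s*\S+", body, flags=re.MULTILINE):
        problems.append("no 'Reviewer:' line found")

    # Count numbered lines that follow a 'Checkpoints:' header line.
    header = re.search(r"^Checkpoints:\s*$", body, flags=re.MULTILINE)
    count = 0
    if header:
        count = len(re.findall(r"^\d+\.\s", body[header.end():], flags=re.MULTILINE))
    if count > MAX_CHECKPOINTS:
        problems.append(f"{count} checkpoints; consider {MAX_CHECKPOINTS} or fewer, or auto-detect")
    if count and re.search(r"^Auto-detect:\s*true\s*$", body,
                           flags=re.MULTILINE | re.IGNORECASE):
        problems.append("manual checkpoints and 'Auto-detect: true' are both set")
    return problems


draft = "Calculate 15% of $50,000\n\nTest type: with-verification"
print(lint_eval_email("Calculate Commission", draft))
# ["subject should start with 'Eval:'", "no 'Reviewer:' line found"]
```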

Next steps

See Examples

10+ ready-to-use example tasks

API Reference

Complete API documentation

Best Practices

Advanced tips and tricks

Get Started

Submit your first task