## Overview
mixus implements intelligent rate limiting to ensure fair usage across all users while maintaining optimal performance. Our rate limiting system protects against abuse while providing generous limits for legitimate usage patterns.
## Types of Rate Limits
### Platform Access Limits
Limits on platform access:
**Authenticated Access:**
```text
Scope: Authenticated users
Method: All platform features
Identification: User account
Rate: Based on subscription plan
```
**Public Access:**
```text
Scope: Public features only
Method: Limited functionality
Identification: IP address
Rate: Conservative limits
```
### Interactive Usage Limits
Limits on web application usage:
**Chat Interface:**
- Message sending frequency
- File upload rates
- Search query limits
- Agent execution frequency
**User Actions:**
- Account modifications
- Settings updates
- File management operations
## Rate Limit Structure
### Current Rate Limits by Plan
**Free Plan:**
```text
API Requests:
├── Rate: 60 requests per hour
├── Burst: 10 requests per minute
├── Daily cap: 1,440 requests
└── Reset: Rolling 60-minute window
Interactive Usage:
├── Chat messages: 20 per hour
├── File uploads: 5 per month
├── Web searches: 10 per hour
└── Agent runs: 5 per day
```
**Pro Plan:**
```text
API Requests:
├── Rate: 1,000 requests per hour
├── Burst: 100 requests per minute
├── Daily cap: 24,000 requests
└── Reset: Rolling 60-minute window
Interactive Usage:
├── Chat messages: 500 per hour
├── File uploads: 100 per day
├── Web searches: 200 per hour
└── Agent runs: 100 per day
```
**Team Plan:**
```text
API Requests:
├── Rate: 10,000 requests per hour
├── Burst: 500 requests per minute
├── Daily cap: 240,000 requests
└── Reset: Rolling 60-minute window
Interactive Usage:
├── Chat messages: 2,000 per hour
├── File uploads: 1,000 per day
├── Web searches: 1,000 per hour
└── Agent runs: 1,000 per day
```
**Enterprise:**
```text
API Requests:
├── Rate: Custom (typically 50,000+ per hour)
├── Burst: Custom (typically 2,000+ per minute)
├── Daily cap: Custom or unlimited
└── Reset: Configurable windows
Interactive Usage:
├── All limits: Custom or unlimited
├── Priority processing: Guaranteed
├── Dedicated resources: Available
└── SLA guarantees: Included
```
## Rate Limit Headers
### HTTP Response Headers
Every API response includes rate limit information:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640995200
X-RateLimit-Window: 3600
X-RateLimit-Type: sliding
```
**Header Explanations:**
- `X-RateLimit-Limit`: Maximum requests allowed in the time window
- `X-RateLimit-Remaining`: Number of requests left in current window
- `X-RateLimit-Reset`: Unix timestamp when the limit resets
- `X-RateLimit-Window`: Time window in seconds (3600 = 1 hour)
- `X-RateLimit-Type`: Type of rate limiting (sliding, fixed, token-bucket)
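The reset and remaining headers together are enough to pace a client. A minimal sketch in Python (header names as documented above; the even-spacing heuristic is one possible client-side policy, not an official recommendation):

```python
import time

def seconds_until_reset(headers):
    """Seconds left in the current window, from X-RateLimit-Reset (Unix time)."""
    reset = int(headers.get("X-RateLimit-Reset", 0))
    return max(0, reset - int(time.time()))

def suggested_delay(headers):
    """Spread the remaining request budget evenly over the rest of the window."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    window_left = seconds_until_reset(headers)
    if remaining <= 0:
        return window_left  # out of budget: wait for the reset
    return window_left / remaining
```

With 50 requests remaining and 100 seconds until reset, this suggests waiting about 2 seconds between requests.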
### Rate Limit Exceeded Response
When limits are exceeded:
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640995200
Retry-After: 3600

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 3600 seconds.",
    "details": {
      "limit": 1000,
      "remaining": 0,
      "reset_at": "2022-01-01T00:00:00Z",
      "retry_after": 3600
    }
  }
}
```
## Rate Limiting Algorithms
### Sliding Window
mixus uses a sliding window algorithm for most rate limits:
**How it works:**
```text
Time Window: 1 hour (3600 seconds)
Request Limit: 1000 requests
Example Timeline:
├── 10:00 AM: 500 requests made
├── 10:30 AM: 300 requests made
├── 11:00 AM: Check limit from 10:00-11:00 (800 total) ✅
├── 11:15 AM: 300 requests made
└── 11:30 AM: Check limit from 10:30-11:30 (600 total) ✅
```
**Benefits:**
- Smooth traffic distribution
- Prevents request hoarding
- Fair usage across time periods
- Predictable behavior
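The behavior above can be sketched as a sliding window log: keep the timestamps of accepted requests and count only those inside the rolling window. This is an illustrative model of the algorithm, not mixus's internal implementation:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` events within any rolling `window_seconds` span."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # timestamps of accepted requests

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Drop timestamps that have aged out of the window
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
        if len(self.events) < self.limit:
            self.events.append(now)
            return True
        return False
```

A 1,000-requests-per-hour limit would be `SlidingWindowLimiter(1000, 3600)`.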
### Token Bucket (Burst Handling)
For burst protection, mixus implements a token bucket algorithm:
**Configuration:**
```text
Bucket Capacity: 100 tokens
Refill Rate: 1 token per 36 seconds (100 per hour)
Burst Allowance: Use all 100 tokens immediately
Recovery Time: 1 hour to fully refill
```
**Example Scenario:**
```text
Initial State: 100 tokens available
├── Send 50 requests: 50 tokens remaining
├── Wait 18 minutes: 80 tokens available (30 refilled)
├── Send 80 requests: 0 tokens remaining
└── Must wait for token refill to continue
```
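The same configuration can be modeled in a few lines: a counter refilled continuously at the stated rate and capped at the bucket capacity (again an illustrative sketch, not the production implementation):

```python
import time

class TokenBucket:
    """Bucket of `capacity` tokens, refilled at `refill_rate` tokens per second."""
    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last = time.monotonic() if now is None else now

    def consume(self, n=1, now=None):
        now = time.monotonic() if now is None else now
        # Refill for the elapsed time, never exceeding capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

The scenario above corresponds to `TokenBucket(100, 100 / 3600)`: 50 requests leave 50 tokens, and 18 minutes (1,080 seconds) refill 30 more.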
## Best Practices for Rate Limits
### Client-Side Implementation
**Respect Rate Limits:**
```python
import time
import requests

def api_request_with_backoff(url, headers, data=None, max_retries=3):
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            # Rate limit exceeded: honor the Retry-After header
            retry_after = int(response.headers.get('Retry-After', 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
        else:
            response.raise_for_status()
    raise Exception("Max retries exceeded")
```
**Exponential Backoff:**
```python
import random
import time

def exponential_backoff(attempt, base_delay=1, max_delay=60):
    """Calculate delay with jitter."""
    delay = min(base_delay * (2 ** attempt), max_delay)
    jitter = random.uniform(0.1, 0.9)
    return delay * jitter

class RateLimitError(Exception):
    """Raised by your request code when the API responds with HTTP 429."""

def retry_with_backoff(func, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = exponential_backoff(attempt)
            time.sleep(delay)
```
### Batch Processing
**Efficient Batch Operations:**
```python
import time

def batch_requests(items, batch_size=10):
    """Process items in batches to stay within rate limits."""
    results = []
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        # process_batch is your own handler for a batch of items
        batch_results = process_batch(batch)
        results.extend(batch_results)
        # Rate-limit-friendly delay between batches
        if i + batch_size < len(items):
            time.sleep(1)  # 1 second between batches
    return results
```
### Monitoring Usage
**Track Rate Limit Usage:**
```python
import time

class RateLimitMonitor:
    def __init__(self):
        self.usage_log = []

    def log_request(self, response):
        rate_limit_info = {
            'timestamp': time.time(),
            'limit': response.headers.get('X-RateLimit-Limit'),
            'remaining': response.headers.get('X-RateLimit-Remaining'),
            'reset': response.headers.get('X-RateLimit-Reset')
        }
        self.usage_log.append(rate_limit_info)

    def get_usage_summary(self):
        if not self.usage_log:
            return None
        latest = self.usage_log[-1]
        usage_percentage = (
            (int(latest['limit']) - int(latest['remaining']))
            / int(latest['limit']) * 100
        )
        return {
            'usage_percentage': usage_percentage,
            'remaining_requests': latest['remaining'],
            'reset_time': latest['reset']
        }
```
## WebSocket Rate Limits
### Real-Time Connections
**Connection Limits:**
```text
Free Plan: 1 concurrent connection
Pro Plan: 5 concurrent connections
Team Plan: 20 concurrent connections
Enterprise: Custom limits
```
**Message Limits:**
```text
Messages per second: 10 (all plans)
Message size: 64KB maximum
Connection duration: 24 hours maximum
Reconnect throttling: 1 second minimum delay
```
### WebSocket Events
**Event Rate Limits:**
```javascript
// Client-side rate limiting for WebSocket
class WebSocketRateLimit {
  constructor(maxMessages = 10, windowMs = 1000) {
    this.maxMessages = maxMessages;
    this.windowMs = windowMs;
    this.messages = [];
  }

  canSendMessage() {
    const now = Date.now();
    const cutoff = now - this.windowMs;
    // Remove timestamps outside the current window
    this.messages = this.messages.filter(time => time > cutoff);
    if (this.messages.length < this.maxMessages) {
      this.messages.push(now);
      return true;
    }
    return false;
  }
}
```
## Regional Rate Limits
### Geographic Distribution
**Rate Limits by Region:**
```text
US East (Primary):
├── Standard rate limits apply
├── Lowest latency processing
└── Primary data center
US West:
├── Standard rate limits apply
├── Low latency processing
└── Secondary data center
Europe:
├── Standard rate limits apply
├── GDPR compliant processing
└── EU data residency
Asia-Pacific:
├── 90% of standard rate limits
├── Higher latency tolerance
└── Regional optimization ongoing
```
### CDN and Edge Limits
**Static Content:**
- No rate limits on cached content
- 1GB bandwidth per user per day (Free)
- 10GB bandwidth per user per day (Pro)
- 100GB bandwidth per user per day (Team)
## Troubleshooting Rate Limits
### Common Issues
**1. Unexpected 429 Errors**
```text
Cause: Burst requests exceeding limits
Solution: Implement proper request spacing
Code: Add delays between requests
```
**2. Inconsistent Rate Limit Behavior**
```text
Cause: Multiple API keys or shared IP addresses
Solution: Check API key usage in dashboard
Monitoring: Review rate limit headers
```
**3. WebSocket Connection Drops**
```text
Cause: Exceeding connection or message limits
Solution: Implement connection pooling
Monitoring: Track WebSocket events
```
### Debugging Tools
**Rate Limit Inspector:**
```bash
# Check current rate limit status
curl -H "Authorization: Bearer $API_KEY" \
-I https://api.mixus.ai/v1/chat/completions
# Response headers show current limits
```
**Usage Analytics:**
```javascript
// Browser-based rate limit monitoring
function checkRateLimit() {
  return fetch('/status', {
    method: 'HEAD',
    headers: {
      'Authorization': `Bearer ${apiKey}`
    }
  }).then(response => ({
    limit: response.headers.get('X-RateLimit-Limit'),
    remaining: response.headers.get('X-RateLimit-Remaining'),
    reset: response.headers.get('X-RateLimit-Reset')
  }));
}
```
## Rate Limit Optimization
### Plan Upgrade Considerations
**When to Upgrade:**
- Consistently hitting 80% of rate limits
- Frequent 429 errors in logs
- Business needs require higher throughput
- Development teams need more API access
**Expected Improvements:**
```text
Free → Pro: 16.7x increase in API limits
Pro → Team: 10x increase in API limits
Team → Enterprise: Custom, typically 5x+ increase
```
### Architecture Optimization
**Caching Strategies:**
- Implement Redis/Memcached for API responses
- Cache static content locally
- Use ETags for conditional requests
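The ETag round trip can be sketched as follows; `fetch_fn` is a hypothetical stand-in for whatever HTTP transport you use, assumed to return a status, headers, and body:

```python
def get_with_etag(url, fetch_fn, cache):
    """Revalidate a cached response with If-None-Match; a 304 reply is cheap.

    fetch_fn(url, headers) -> (status, response_headers, body)
    cache maps url -> (etag, body)
    """
    headers = {}
    cached = cache.get(url)
    if cached:
        headers["If-None-Match"] = cached[0]  # ask the server to compare ETags
    status, resp_headers, body = fetch_fn(url, headers)
    if status == 304 and cached:
        return cached[1]  # server confirmed our copy is still fresh
    etag = resp_headers.get("ETag")
    if etag:
        cache[url] = (etag, body)
    return body
```

Whether a 304 response counts against your rate limit depends on the endpoint, but it always saves bandwidth and response-generation time.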
**Request Optimization:**
- Combine multiple operations in single requests
- Use GraphQL-style queries where available
- Implement request deduplication
## Support and Escalation
### Rate Limit Increases
**Temporary Increases:**
- Available for special events or launches
- Request through support channel
- Typically granted for 24-48 hours
- Requires business justification
**Permanent Increases:**
- Available for Enterprise plans
- Requires custom contract negotiation
- Includes SLA guarantees
- Dedicated support included
### Support Channels
**Rate Limit Support:**
- Documentation: This page and API docs
- Email support: support@mixus.com
- Priority support: Available for Pro+ plans
- Enterprise support: Dedicated account manager
## Next Steps
- [Understand storage limits](/limits/storage)
- [Learn about model limits](/limits/models)
- [Monitor your usage](/tokens/tracking)
## Related Resources
- [Platform Authentication](/security/authentication)
- [Token Usage](/tokens/how-it-works)
- [Billing & Pricing](/tokens/billing)