## Overview
mixus implements intelligent rate limiting to ensure fair usage across all users while maintaining optimal performance. Our rate limiting system protects against abuse while providing generous limits for legitimate usage patterns.
## Types of Rate Limits
### Platform Access Limits
Limits on platform access:
**Authenticated Access:**
```text
Scope: Authenticated users
Method: All platform features
Identification: User account
Rate: Based on subscription plan
```
**Public Access:**
```text
Scope: Public features only
Method: Limited functionality
Identification: IP address
Rate: Conservative limits
```
### Interactive Usage Limits
Limits on web application usage:
**Chat Interface:**
- Message sending frequency
- File upload rates
- Search query limits
- Agent execution frequency
**User Actions:**
- Account modifications
- Settings updates
- File management operations
## Rate Limit Structure
### Current Rate Limits by Plan
**Free Plan:**
```text
API Requests:
├── Rate: 60 requests per hour
├── Burst: 10 requests per minute
├── Daily cap: 1,440 requests
└── Reset: Rolling 60-minute window
Interactive Usage:
├── Chat messages: 20 per hour
├── File uploads: 5 per month
├── Web searches: 10 per hour
└── Agent runs: 5 per day
```
**Pro Plan:**
```text
API Requests:
├── Rate: 1,000 requests per hour
├── Burst: 100 requests per minute
├── Daily cap: 24,000 requests
└── Reset: Rolling 60-minute window
Interactive Usage:
├── Chat messages: 500 per hour
├── File uploads: 100 per day
├── Web searches: 200 per hour
└── Agent runs: 100 per day
```
**Team Plan:**
```text
API Requests:
├── Rate: 10,000 requests per hour
├── Burst: 500 requests per minute
├── Daily cap: 240,000 requests
└── Reset: Rolling 60-minute window
Interactive Usage:
├── Chat messages: 2,000 per hour
├── File uploads: 1,000 per day
├── Web searches: 1,000 per hour
└── Agent runs: 1,000 per day
```
**Enterprise:**
```text
API Requests:
├── Rate: Custom (typically 50,000+ per hour)
├── Burst: Custom (typically 2,000+ per minute)
├── Daily cap: Custom or unlimited
└── Reset: Configurable windows
Interactive Usage:
├── All limits: Custom or unlimited
├── Priority processing: Guaranteed
├── Dedicated resources: Available
└── SLA guarantees: Included
```
## Rate Limit Headers
### HTTP Response Headers
Every API response includes rate limit information:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640995200
X-RateLimit-Window: 3600
X-RateLimit-Type: sliding
```
**Header Explanations:**
- `X-RateLimit-Limit`: Maximum requests allowed in the time window
- `X-RateLimit-Remaining`: Number of requests left in current window
- `X-RateLimit-Reset`: Unix timestamp when the limit resets
- `X-RateLimit-Window`: Time window in seconds (3600 = 1 hour)
- `X-RateLimit-Type`: Type of rate limiting (sliding, fixed, token-bucket)
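The reset and remaining headers together are enough to pace a client. A minimal sketch in Python (header names as documented above; the even-spacing heuristic is one possible client-side policy, not an official recommendation):

```python
import time

def seconds_until_reset(headers):
    """Seconds left in the current window, from X-RateLimit-Reset (Unix time)."""
    reset = int(headers.get("X-RateLimit-Reset", 0))
    return max(0, reset - int(time.time()))

def suggested_delay(headers):
    """Spread the remaining request budget evenly over the rest of the window."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    window_left = seconds_until_reset(headers)
    if remaining <= 0:
        return window_left  # out of budget: wait for the reset
    return window_left / remaining
```

With 50 requests remaining and 100 seconds until reset, this suggests waiting about 2 seconds between requests.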
### Rate Limit Exceeded Response
When limits are exceeded:
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640995200
Retry-After: 3600

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 3600 seconds.",
    "details": {
      "limit": 1000,
      "remaining": 0,
      "reset_at": "2022-01-01T00:00:00Z",
      "retry_after": 3600
    }
  }
}
```
## Rate Limiting Algorithms
### Sliding Window
mixus uses a sliding window algorithm for most rate limits:
**How it works:**
```text
Time Window: 1 hour (3600 seconds)
Request Limit: 1000 requests
Example Timeline:
├── 10:00 AM: 500 requests made
├── 10:30 AM: 300 requests made
├── 11:00 AM: Check limit from 10:00-11:00 (800 total) ✅
├── 11:15 AM: 300 requests made
└── 11:30 AM: Check limit from 10:30-11:30 (600 total) ✅
```
**Benefits:**
- Smooth traffic distribution
- Prevents request hoarding
- Fair usage across time periods
- Predictable behavior
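The behavior above can be sketched as a sliding window log: keep the timestamps of accepted requests and count only those inside the rolling window. This is an illustrative model of the algorithm, not mixus's internal implementation:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` events within any rolling `window_seconds` span."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # timestamps of accepted requests

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Drop timestamps that have aged out of the window
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
        if len(self.events) < self.limit:
            self.events.append(now)
            return True
        return False
```

A 1,000-requests-per-hour limit would be `SlidingWindowLimiter(1000, 3600)`.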
### Token Bucket (Burst Handling)
For burst protection, mixus implements a token bucket algorithm:
**Configuration:**
```text
Bucket Capacity: 100 tokens
Refill Rate: 1 token per 36 seconds (100 per hour)
Burst Allowance: Use all 100 tokens immediately
Recovery Time: 1 hour to fully refill
```
**Example Scenario:**
```text
Initial State: 100 tokens available
├── Send 50 requests: 50 tokens remaining
├── Wait 18 minutes: 80 tokens available (30 refilled)
├── Send 80 requests: 0 tokens remaining
└── Must wait for token refill to continue
```
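The same configuration can be modeled in a few lines: a counter refilled continuously at the stated rate and capped at the bucket capacity (again an illustrative sketch, not the production implementation):

```python
import time

class TokenBucket:
    """Bucket of `capacity` tokens, refilled at `refill_rate` tokens per second."""
    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last = time.monotonic() if now is None else now

    def consume(self, n=1, now=None):
        now = time.monotonic() if now is None else now
        # Refill for the elapsed time, never exceeding capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

The scenario above corresponds to `TokenBucket(100, 100 / 3600)`: 50 requests leave 50 tokens, and 18 minutes (1,080 seconds) refill 30 more.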
## Best Practices for Rate Limits
### Client-Side Implementation
**Respect Rate Limits:**
```python
import time
import requests

def api_request_with_backoff(url, headers, data=None, max_retries=3):
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            # Rate limit exceeded: honor the Retry-After header
            retry_after = int(response.headers.get('Retry-After', 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
        else:
            response.raise_for_status()
    raise Exception("Max retries exceeded")
```
**Exponential Backoff:**
```python
import random
import time

def exponential_backoff(attempt, base_delay=1, max_delay=60):
    """Calculate delay with jitter."""
    delay = min(base_delay * (2 ** attempt), max_delay)
    jitter = random.uniform(0.1, 0.9)
    return delay * jitter

class RateLimitError(Exception):
    """Raised by your request code when the API responds with HTTP 429."""

def retry_with_backoff(func, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = exponential_backoff(attempt)
            time.sleep(delay)
```
### Batch Processing
**Efficient Batch Operations:**
```python
import time

def batch_requests(items, batch_size=10):
    """Process items in batches to stay within rate limits."""
    results = []
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        # process_batch is your own handler for a batch of items
        batch_results = process_batch(batch)
        results.extend(batch_results)
        # Rate-limit-friendly delay between batches
        if i + batch_size < len(items):
            time.sleep(1)  # 1 second between batches
    return results
```
### Monitoring Usage
**Track Rate Limit Usage:**
```python
import time

class RateLimitMonitor:
    def __init__(self):
        self.usage_log = []

    def log_request(self, response):
        rate_limit_info = {
            'timestamp': time.time(),
            'limit': response.headers.get('X-RateLimit-Limit'),
            'remaining': response.headers.get('X-RateLimit-Remaining'),
            'reset': response.headers.get('X-RateLimit-Reset')
        }
        self.usage_log.append(rate_limit_info)

    def get_usage_summary(self):
        if not self.usage_log:
            return None
        latest = self.usage_log[-1]
        usage_percentage = (
            (int(latest['limit']) - int(latest['remaining']))
            / int(latest['limit']) * 100
        )
        return {
            'usage_percentage': usage_percentage,
            'remaining_requests': latest['remaining'],
            'reset_time': latest['reset']
        }
```
## WebSocket Rate Limits
### Real-Time Connections
**Connection Limits:**
```text
Free Plan: 1 concurrent connection
Pro Plan: 5 concurrent connections
Team Plan: 20 concurrent connections
Enterprise: Custom limits
```
**Message Limits:**
```text
Messages per second: 10 (all plans)
Message size: 64KB maximum
Connection duration: 24 hours maximum
Reconnect throttling: 1 second minimum delay
```
### WebSocket Events
**Event Rate Limits:**
```javascript
// Client-side rate limiting for WebSocket
class WebSocketRateLimit {
  constructor(maxMessages = 10, windowMs = 1000) {
    this.maxMessages = maxMessages;
    this.windowMs = windowMs;
    this.messages = [];
  }

  canSendMessage() {
    const now = Date.now();
    const cutoff = now - this.windowMs;
    // Remove timestamps outside the current window
    this.messages = this.messages.filter(time => time > cutoff);
    if (this.messages.length < this.maxMessages) {
      this.messages.push(now);
      return true;
    }
    return false;
  }
}
```
## Regional Rate Limits
### Geographic Distribution
**Rate Limits by Region:**
```text
US East (Primary):
├── Standard rate limits apply
├── Lowest latency processing
└── Primary data center
US West:
├── Standard rate limits apply
├── Low latency processing
└── Secondary data center
Europe:
├── Standard rate limits apply
├── GDPR compliant processing
└── EU data residency
Asia-Pacific:
├── 90% of standard rate limits
├── Higher latency tolerance
└── Regional optimization ongoing
```
### CDN and Edge Limits
**Static Content:**
- No rate limits on cached content
- 1GB bandwidth per user per day (Free)
- 10GB bandwidth per user per day (Pro)
- 100GB bandwidth per user per day (Team)
## Troubleshooting Rate Limits
### Common Issues
**1. Unexpected 429 Errors**
```text
Cause: Burst requests exceeding limits
Solution: Implement proper request spacing
Code: Add delays between requests
```
**2. Inconsistent Rate Limit Behavior**
```text
Cause: Multiple API keys or shared IP addresses
Solution: Check API key usage in dashboard
Monitoring: Review rate limit headers
```
**3. WebSocket Connection Drops**
```text
Cause: Exceeding connection or message limits
Solution: Implement connection pooling
Monitoring: Track WebSocket events
```
### Debugging Tools
**Rate Limit Inspector:**
```bash
# Check current rate limit status
curl -H "Authorization: Bearer $API_KEY" \
-I https://api.mixus.ai/v1/chat/completions
# Response headers show current limits
```
**Usage Analytics:**
```javascript
// Browser-based rate limit monitoring
function checkRateLimit() {
  return fetch('/status', {
    method: 'HEAD',
    headers: {
      'Authorization': `Bearer ${apiKey}`
    }
  }).then(response => ({
    limit: response.headers.get('X-RateLimit-Limit'),
    remaining: response.headers.get('X-RateLimit-Remaining'),
    reset: response.headers.get('X-RateLimit-Reset')
  }));
}
```
## Rate Limit Optimization
### Plan Upgrade Considerations
**When to Upgrade:**
- Consistently hitting 80% of rate limits
- Frequent 429 errors in logs
- Business needs require higher throughput
- Development teams need more API access
**Expected Improvements:**
```text
Free → Pro: 16.7x increase in API limits
Pro → Team: 10x increase in API limits
Team → Enterprise: Custom, typically 5x+ increase
```
### Architecture Optimization
**Caching Strategies:**
- Implement Redis/Memcached for API responses
- Cache static content locally
- Use ETags for conditional requests
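The ETag round trip can be sketched as follows; `fetch_fn` is a hypothetical stand-in for whatever HTTP transport you use, assumed to return a status, headers, and body:

```python
def get_with_etag(url, fetch_fn, cache):
    """Revalidate a cached response with If-None-Match; a 304 reply is cheap.

    fetch_fn(url, headers) -> (status, response_headers, body)
    cache maps url -> (etag, body)
    """
    headers = {}
    cached = cache.get(url)
    if cached:
        headers["If-None-Match"] = cached[0]  # ask the server to compare ETags
    status, resp_headers, body = fetch_fn(url, headers)
    if status == 304 and cached:
        return cached[1]  # server confirmed our copy is still fresh
    etag = resp_headers.get("ETag")
    if etag:
        cache[url] = (etag, body)
    return body
```

Whether a 304 response counts against your rate limit depends on the endpoint, but it always saves bandwidth and response-generation time.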
**Request Optimization:**
- Combine multiple operations in single requests
- Use GraphQL-style queries where available
- Implement request deduplication
## Support and Escalation
### Rate Limit Increases
**Temporary Increases:**
- Available for special events or launches
- Request through support channel
- Typically granted for 24-48 hours
- Requires business justification
**Permanent Increases:**
- Available for Enterprise plans
- Requires custom contract negotiation
- Includes SLA guarantees
- Dedicated support included
### Support Channels
**Rate Limit Support:**
- Documentation: This page and API docs
- Email support: support@mixus.com
- Priority support: Available for Pro+ plans
- Enterprise support: Dedicated account manager
## Next Steps
- [Understand storage limits](/limits/storage)
- [Learn about model limits](/limits/models)
- [Monitor your usage](/tokens/tracking)
## Related Resources
- [Platform Authentication](/security/authentication)
- [Token Usage](/tokens/how-it-works)
- [Billing & Pricing](/tokens/billing)