
Batch Processing & Priority System

ObjectWeaver provides a sophisticated batch processing and priority system that allows you to optimize costs, manage processing speed, and control the execution order of field generation based on your application's requirements.

Overview

The priority system enables you to:

  • Reduce API costs by 50% using OpenAI's Batch API for non-urgent requests
  • Control processing order by assigning priority levels to fields
  • Balance speed vs. cost by routing urgent requests to real-time processing and background tasks to batch processing
  • Optimize resource usage with automatic batching based on size, memory, and time thresholds

How It Works

Priority-Based Routing

Each field in your schema can have a Priority value that determines how it's processed:

| Priority Range | Processing Mode | Use Case | Response Time | Cost |
|---|---|---|---|---|
| < 0 (e.g., -1, -2, -3) | Batch Processing | Background tasks, reports, bulk operations | 5+ minutes | 50% cheaper |
| ≥ 0 (e.g., 0, 1, 2, 3) | Real-time Processing | User-facing content, time-sensitive data | Immediate | Standard cost |
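The routing rule itself is simple and can be sketched in a few lines. This is an illustrative sketch, not the SDK's internal API: `routingMode` is a hypothetical helper showing the comparison the SDK applies against `LLM_BATCH_PRIORITY_THRESHOLD`.

```go
package main

import "fmt"

// routingMode is a hypothetical illustration of the routing rule:
// priorities below the threshold go to the Batch API, everything
// else is processed in real time.
func routingMode(priority, threshold int) string {
	if priority < threshold {
		return "batch"
	}
	return "realtime"
}

func main() {
	// With the default threshold of 0:
	fmt.Println(routingMode(-1, 0)) // batch
	fmt.Println(routingMode(2, 0))  // realtime
}
```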

Batch Processing Flow

Jobs whose Priority falls below LLM_BATCH_PRIORITY_THRESHOLD are queued into a pending batch instead of being sent immediately. The pending batch is flushed to OpenAI's Batch API as soon as any of the triggers below fires.

Flush Triggers

Batches are automatically sent when ANY of these conditions are met:

  1. Request Count: Reaches LLM_BATCH_MAX_REQUESTS (default: 50 jobs)
  2. Memory Size: Exceeds LLM_BATCH_MAX_MEMORY_MB (default: 190 MB)
  3. Time Elapsed: LLM_BATCH_FLUSH_INTERVAL_SEC passes (default: 300 seconds / 5 minutes)

Configuration

Environment Variables

Configure the batch processing system using these environment variables:

# Enable batch processing
LLM_ENABLE_BATCH=true

# Priority threshold: jobs with priority below this value will be batched
LLM_BATCH_PRIORITY_THRESHOLD=0

# Maximum requests per batch
LLM_BATCH_MAX_REQUESTS=50

# Maximum memory per batch (in MB)
LLM_BATCH_MAX_MEMORY_MB=190

# Flush interval (in seconds)
LLM_BATCH_FLUSH_INTERVAL_SEC=300

# Optional: Webhook for batch completion notifications
LLM_BATCH_WEBHOOK_URL=https://your-server.com/webhook
LLM_BATCH_WEBHOOK_API_KEY=your-api-key

Tuning for Different Workloads

High-Volume Background Processing:

LLM_BATCH_MAX_REQUESTS=100      # Larger batches
LLM_BATCH_FLUSH_INTERVAL_SEC=600 # Wait longer (10 min)

Low-Latency Requirements:

LLM_BATCH_MAX_REQUESTS=20       # Smaller batches
LLM_BATCH_FLUSH_INTERVAL_SEC=60 # Flush faster (1 min)

Examples

Example 1: Analytics Dashboard

Generate a comprehensive analytics report with mixed priorities:

package main

import (
	"github.com/objectweaver/go-sdk/jsonSchema"
)

func main() {
	// Analytics report with mixed priorities
	analyticsDefinition := jsonSchema.Definition{
		Type:        jsonSchema.Object,
		Instruction: "Generate a comprehensive analytics report with real-time and background data.",
		Model:       jsonSchema.GPT4oMini,
		Properties: map[string]jsonSchema.Definition{
			// Real-time data - high priority
			"currentMetrics": {
				Type:        jsonSchema.Object,
				Instruction: "Provide current key metrics that users need to see immediately.",
				Priority:    2, // Real-time processing
				Properties: map[string]jsonSchema.Definition{
					"activeUsers": {
						Type:        jsonSchema.Integer,
						Instruction: "Current number of active users.",
					},
					"systemStatus": {
						Type:        jsonSchema.String,
						Instruction: "Current system health status.",
					},
				},
			},
			// Background analysis - low priority
			"detailedAnalysis": {
				Type:        jsonSchema.String,
				Instruction: "Provide a detailed statistical analysis of trends over the past month. Include insights and recommendations.",
				Priority:    -2, // Batch processing - can wait
			},
			"historicalComparison": {
				Type:        jsonSchema.String,
				Instruction: "Compare current performance with historical data from the past year.",
				Priority:    -1, // Batch processing
			},
			// Summary - medium priority
			"executiveSummary": {
				Type:        jsonSchema.String,
				Instruction: "Create a concise executive summary highlighting key findings.",
				Priority:    1, // Real-time but lower than currentMetrics
			},
		},
	}

	_ = analyticsDefinition // pass the definition to your ObjectWeaver request as usual
}

Expected Behavior:

  • currentMetrics and executiveSummary return immediately (real-time)
  • detailedAnalysis and historicalComparison are batched and return after 5+ minutes
  • Cost savings: ~50% reduction on the batch-processed fields

Example 2: Content Generation Pipeline

Generate blog content with optimized priorities:

package main

import (
	"github.com/objectweaver/go-sdk/jsonSchema"
)

func main() {
	// Blog post generation with priority optimization
	blogDefinition := jsonSchema.Definition{
		Type:        jsonSchema.Object,
		Instruction: "Generate a complete blog post with optimized processing priorities.",
		Model:       jsonSchema.GPT4oMini,
		Properties: map[string]jsonSchema.Definition{
			// User-facing content - immediate
			"title": {
				Type:        jsonSchema.String,
				Instruction: "Create an engaging, SEO-optimized title (max 60 characters).",
				Priority:    3, // Highest priority - needed first
			},
			"excerpt": {
				Type:        jsonSchema.String,
				Instruction: "Write a compelling excerpt (150-200 words) that hooks the reader.",
				Priority:    2, // High priority - shows in previews
			},
			"mainContent": {
				Type:        jsonSchema.String,
				Instruction: "Write the main blog post content (800-1200 words) with proper structure and engaging narrative.",
				Priority:    1, // Medium priority - important but can wait slightly
			},
			// Background content - can be batch processed
			"seoKeywords": {
				Type:        jsonSchema.Array,
				Instruction: "Generate 10-15 relevant SEO keywords and phrases.",
				Priority:    -1, // Batch - not immediately visible
				Items: &jsonSchema.Definition{
					Type: jsonSchema.String,
				},
			},
			"metaDescription": {
				Type:        jsonSchema.String,
				Instruction: "Create an SEO-optimized meta description (150-160 characters).",
				Priority:    -1, // Batch - backend optimization
			},
			"relatedTopics": {
				Type:        jsonSchema.Array,
				Instruction: "Suggest 5 related topics for future blog posts.",
				Priority:    -2, // Batch - planning content
				Items: &jsonSchema.Definition{
					Type: jsonSchema.String,
				},
			},
			"socialMediaSnippets": {
				Type:        jsonSchema.Object,
				Instruction: "Create social media promotional content.",
				Priority:    -1, // Batch - used later for promotion
				Properties: map[string]jsonSchema.Definition{
					"twitter": {
						Type:        jsonSchema.String,
						Instruction: "Twitter/X post (max 280 characters).",
					},
					"linkedin": {
						Type:        jsonSchema.String,
						Instruction: "LinkedIn post (max 3000 characters).",
					},
				},
			},
		},
	}

	_ = blogDefinition // pass the definition to your ObjectWeaver request as usual
}

Expected Behavior:

  • title, excerpt, and mainContent return immediately for user display
  • SEO and social media content are batched and available 5+ minutes later
  • Result: Fast user experience with 50% cost savings on background processing

Example 3: E-commerce Product Enhancement

Bulk product data enrichment with batch processing:

package main

import (
	"github.com/objectweaver/go-sdk/jsonSchema"
)

func main() {
	// Product enhancement - perfect for bulk batch processing
	productDefinition := jsonSchema.Definition{
		Type:        jsonSchema.Object,
		Instruction: "Enhance product information with AI-generated content.",
		Model:       jsonSchema.GPT4oMini,
		Properties: map[string]jsonSchema.Definition{
			// All fields use batch processing for cost optimization
			"enhancedDescription": {
				Type:        jsonSchema.String,
				Instruction: "Create a compelling, detailed product description (200-300 words) that highlights features and benefits.",
				Priority:    -1, // Batch processing
			},
			"bulletPoints": {
				Type:        jsonSchema.Array,
				Instruction: "Generate 5-7 concise bullet points highlighting key product features.",
				Priority:    -1, // Batch processing
				Items: &jsonSchema.Definition{
					Type: jsonSchema.String,
				},
			},
			"targetAudience": {
				Type:        jsonSchema.String,
				Instruction: "Describe the ideal customer for this product.",
				Priority:    -2, // Batch processing - lower priority
			},
			"useCases": {
				Type:        jsonSchema.Array,
				Instruction: "List 3-5 common use cases or scenarios for this product.",
				Priority:    -1, // Batch processing
				Items: &jsonSchema.Definition{
					Type: jsonSchema.String,
				},
			},
			"seoTags": {
				Type:        jsonSchema.Array,
				Instruction: "Generate 15-20 relevant search tags for better discoverability.",
				Priority:    -2, // Batch processing - lower priority
				Items: &jsonSchema.Definition{
					Type: jsonSchema.String,
				},
			},
		},
	}

	_ = productDefinition // pass the definition to your ObjectWeaver request as usual
}

Expected Behavior:

  • All fields are batch processed for maximum cost savings
  • Perfect for overnight bulk processing of product catalogs
  • Cost savings: 50% reduction on all product enhancements

Best Practices

1. Choose the Right Priority

| Scenario | Recommended Priority | Reasoning |
|---|---|---|
| User-facing content displayed immediately | 2-3 | Users are waiting for the response |
| Important but not urgent content | 0-1 | Balance between speed and cost |
| SEO metadata, tags, keywords | -1 | Not immediately visible to users |
| Analytics, reports, bulk operations | -2 to -3 | Can wait; maximize cost savings |

2. Avoid Batch Processing for Streaming

Important

Batch processing is incompatible with streaming requests. If you're using streaming endpoints, set all field priorities to ≥ 0 to ensure immediate response.

Streaming requires immediate, incremental responses, while batch processing is asynchronous and can take 5+ minutes to complete.

3. Monitor Batch Performance

Track your batch processing metrics:

// Get batch statistics (if self-hosting); requires the standard
// library "log" and "time" packages.
stats := orchestrator.GetBatchManager().GetStats()

log.Printf(`
Batch Processing Stats:
  Pending Jobs: %d
  Total Queued: %d
  Batches Sent: %d
  Memory Usage: %.2f MB
  Last Flush:   %s
`,
	stats.PendingJobs,
	stats.TotalJobsQueued,
	stats.TotalBatchesSent,
	stats.CurrentMemoryMB,
	stats.LastFlushTime.Format(time.RFC3339),
)

4. Cost Calculation Example

Scenario: Processing 1,000 product descriptions

Real-time API:

  • Cost per request: $0.002
  • Total: $2.00

Batch API:

  • Cost per request: $0.001 (50% off)
  • Total: $1.00
  • Savings: $1.00 (50%)

For high-volume applications processing thousands of requests daily, this can save hundreds or thousands of dollars per month.

Limitations and Considerations

Batch Processing Limitations

  1. Latency: Batch requests take 5+ minutes to complete
  2. No Streaming: Cannot use with streaming endpoints
  3. Memory Limits: Individual batches limited to 190-200 MB
  4. Request Limits: Maximum 50,000 requests per batch (configurable to lower values)