Tasks

Collection Tasks

Collection tasks provide powerful iteration patterns for processing arrays and collections in Compozy workflows. They transform array data into parallel or sequential task executions, enabling efficient batch processing with sophisticated filtering, error handling, and result aggregation capabilities.

Overview

Collection tasks automatically transform an input array into individual task executions, providing orchestration patterns for batch operations with filtering, error handling, and result aggregation.


Key Capabilities

Intelligent Processing Modes

Choose between sequential ordering and parallel performance based on your workflow requirements

Advanced Filtering

Use CEL expressions to process only items that meet specific conditions, reducing unnecessary processing

Batch Optimization

Process large datasets efficiently with configurable batch sizes and memory management

Rich Context Access

Access current item, processing index, collection metadata, and parent workflow context

Failure Resilience

Sophisticated error handling with partial results, best-effort strategies, and graceful degradation

Result Aggregation

Automatically collect and organize results from all iterations with template-based transformations

Task Structure

Basic Collection Task

```yaml
id: process-users
mode: parallel
strategy: best_effort

# Source collection
items: "{{ .workflow.input.users }}"

# Optional filtering
filter: "{{ ne .item.status 'inactive' }}"

# Task template applied to each item
task:
  id: "process-user-{{ .index }}"
  $use: agent(local::agents.#(id=="user-processor"))
  action: process_user
  with:
    user_id: "{{ .item.id }}"
    user_data: "{{ .item }}"
    processing_index: "{{ .index }}"

outputs:
  processed_users: "{{ .output }}"
  total_processed: "{{ len .output }}"
```

Configuration Options

Control how collection items are processed:

Sequential Processing

```yaml
id: sequential-collection
mode: sequential
items: "{{ .workflow.input.documents }}"

task:
  id: "process-doc-{{ .index }}"
  $use: tool(local::tools.#(id=="document-processor"))
  with:
    document: "{{ .item }}"
    sequence_number: "{{ .index }}"
```

Parallel Processing

```yaml
id: parallel-collection
mode: parallel
strategy: wait_all
max_workers: 8
items: "{{ .workflow.input.images }}"

task:
  id: "process-image-{{ .index }}"
  $use: tool(local::tools.#(id=="image-processor"))
  with:
    image: "{{ .item }}"
    parallel_index: "{{ .index }}"
```

Batched Processing

```yaml
id: batched-collection
mode: parallel
batch_size: 5
items: "{{ .workflow.input.records }}"

task:
  id: "process-batch-{{ .batch_index }}"
  $use: tool(local::tools.#(id=="batch-processor"))
  with:
    records: "{{ .batch }}"
    batch_number: "{{ .batch_index }}"
```

Processing Patterns

Sequential Processing

Process items one after another when order matters or when each task depends on previous results:

When to Use Sequential Processing:

  • Order-dependent operations (document processing, data transformations)
  • Resource-constrained environments with limited parallel capacity
  • Operations that build on previous results
  • Rate-limited external APIs that require sequential calls

Parallel Processing

Process items concurrently when order doesn't matter and you need maximum throughput:

When to Use Parallel Processing:

  • Independent operations that don't depend on each other
  • CPU or I/O intensive tasks that benefit from concurrency
  • Large datasets where processing time is a constraint
  • Multiple API calls that can be made simultaneously

Performance Considerations:

  • Use max_workers to control resource usage and prevent overwhelming external services
  • Consider memory usage when processing large items in parallel
  • Monitor external API rate limits and adjust concurrency accordingly
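Putting the last two points together, one way to throttle calls to a rate-limited service is to keep `max_workers` low while still running in parallel mode. This is a hedged sketch: the `rate-limited-api-caller` tool id and the `payload` input field are hypothetical, chosen only to illustrate the pattern.

```yaml
id: throttled-api-calls
mode: parallel
# Keep concurrency well below the external API's rate limit
max_workers: 2
items: "{{ .workflow.input.requests }}"

task:
  id: "call-api-{{ .index }}"
  $use: tool(local::tools.#(id=="rate-limited-api-caller"))
  with:
    payload: "{{ .item }}"
```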

Batch Processing

Process large datasets in manageable chunks to optimize memory usage and provide better control over resource consumption:

When to Use Batch Processing:

  • Processing thousands or millions of items
  • Memory-constrained environments
  • External APIs with bulk operation support
  • Database operations that benefit from batch inserts/updates

Batch Size Guidelines:

  • Small batches (1-10): Real-time processing, low latency requirements
  • Medium batches (10-100): Balanced performance and resource usage
  • Large batches (100-1000+): Maximum throughput for bulk operations


Advanced Features

Conditional Processing
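Combine the `filter` field with a task template to process only the items that matter. Items failing the filter are skipped before any task is scheduled. The sketch below follows the filter syntax from the basic example earlier; the `notifier` agent id, `send_notification` action, and the `tier`/`email` item fields are hypothetical.

```yaml
id: notify-premium-users
mode: parallel
items: "{{ .workflow.input.users }}"

# Only items passing the filter become task executions;
# everything else is skipped up front
filter: "{{ and (eq .item.tier 'premium') (ne .item.email '') }}"

task:
  id: "notify-{{ .index }}"
  $use: agent(local::agents.#(id=="notifier"))
  action: send_notification
  with:
    recipient: "{{ .item.email }}"
```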

Best Practices

Choose the Right Processing Mode

Use sequential for order-dependent processing, parallel for independent operations, and batch for large datasets

Implement Early Filtering

Filter items before processing to reduce load and improve performance using CEL expressions

Handle Partial Failures Gracefully

Use best_effort strategy for non-critical operations and implement proper error handling
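As a minimal sketch of this practice, the `best_effort` strategy lets the collection complete even when individual items fail, so partial results are still aggregated. The `record-enricher` tool id and `record` input field are hypothetical.

```yaml
id: enrich-records
mode: parallel
# best_effort: a failed item does not abort the whole collection
strategy: best_effort
items: "{{ .workflow.input.records }}"

task:
  id: "enrich-{{ .index }}"
  $use: tool(local::tools.#(id=="record-enricher"))
  with:
    record: "{{ .item }}"

outputs:
  # Aggregated results from the iterations that succeeded
  enriched: "{{ .output }}"
```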

Optimize Batch Configurations

Balance memory usage with processing efficiency by choosing appropriate batch sizes

Monitor Performance Metrics

Track processing times, success rates, and resource usage to optimize collection performance

Set Realistic Timeouts

Configure appropriate timeouts for individual items and overall collection operations

Performance Guidelines

Memory Management

  • Use batch processing for datasets larger than 1000 items

  • Monitor memory usage in production environments

  • Consider streaming patterns for extremely large datasets

Concurrency Control

  • Set max_workers based on external service limits

  • Use sequential mode for rate-limited APIs

  • Implement backoff strategies for failed requests

Error Recovery

  • Always use best_effort for non-critical operations

  • Implement proper logging for failed items

  • Consider retry mechanisms for transient failures

Next Steps


Collection tasks are the backbone of batch processing in Compozy: they turn arrays into reliable, high-throughput workflows. Master these patterns to build data processing pipelines that scale with your needs.