: Keep default XCom payloads under a few kilobytes.
"Airflow XCom Exclusive" does not refer to a specific standalone product, but rather to the exclusive control and management of data shared between tasks within Apache Airflow In Airflow,
XCom operations involve two main actions: (sending data) and Pulling (retrieving data). 1. Pushing Data
Even with a powerful custom backend, pushing and pulling tens of thousands of XComs can create performance issues. Be strategic about what you share and how often. airflow xcom exclusive
To save storage costs and improve transfer speeds, you can enable compression for the data stored in your object store.
def push_explicit(**context): context['ti'].xcom_push(key='my_key', value='my_value')
This keeps your database clean and light while allowing tasks to share massive amounts of data seamlessly. 2. XCom Cleaners : Keep default XCom payloads under a few kilobytes
The keyword "exclusive" in the context of XCom is not an official Airflow term, but it encapsulates several important characteristics that make XCom a unique and mechanism for data exchange.
This article explores the concept of data sharing—how to push and pull XCom data that is uniquely targeted to a specific task, limiting visibility and ensuring clean, decoupled workflow design. What is Airflow XCom? (The Foundation)
If you do not want to set up a Custom XCom Backend, you can use the "Cloud Path" strategy. Task A uploads a file to S3 and returns the file path string. Task B pulls the file path string via XCom and opens the file directly from S3. Best Practices for Using XCom Pushing Data Even with a powerful custom backend,
Any serializable object, typically strings, numbers, or small JSON-compatible dictionaries.
def consume_metadata(**kwargs): ti = kwargs['ti'] # Pull from specific task with explicit key file_path = ti.xcom_pull(task_ids='push_metadata', key='source_file_path') record_count = ti.xcom_pull(task_ids='push_metadata', key='record_count') # Pull the return_value (default XCom) from another task result = ti.xcom_pull(task_ids='another_task') # key='return_value' is implicit
For structured data that needs to be shared between pipelines, consider writing directly to a staging table in your data warehouse and passing the table name via XCom.
def extract(**context): context['ti'].xcom_push(key='user_id', value=42) return "raw": "data"