Destinations¶
base
¶
Abstract base classes for destination writers and load strategies.
BaseDestinationWriter[DF] uses the Template Method pattern for
writes and the Strategy pattern for load-type dispatch.
BaseLoadStrategy defines the contract for individual load strategies
(overwrite, append, merge_upsert, merge_overwrite, scd2).
BaseDestinationWriter
¶
BaseDestinationWriter(engine: BaseEngine[DF])
Bases: ABC, Generic[DF]
Abstract destination writer with Template Method lifecycle.
Subclasses implement :meth:_write_internal. The base class
manages timing, error wrapping, runtime info, and maintenance.
get_runtime_info
¶
Return runtime information from the most recent write.
run_maintenance
¶
run_maintenance(dataflow: DataFlow, *, do_compact: bool = True, do_cleanup: bool = True, retention_hours: Optional[int] = None) -> DestinationRuntimeInfo
Run maintenance operations (compact + cleanup) as a single result.
Sub-operation outcomes are stored in operation_details as a list
of dicts, each with an "operation" key ("compact" / "cleanup"
/ "history"). File/byte metrics are summed across both operations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataflow
|
DataFlow
|
Pipeline configuration. |
required |
do_compact
|
bool
|
Whether to run compaction. |
True
|
do_cleanup
|
bool
|
Whether to run cleanup. |
True
|
retention_hours
|
Optional[int]
|
Cleanup retention override (default: 168 h). |
None
|
Returns:
| Type | Description |
|---|---|
DestinationRuntimeInfo
|
A single :class: |
write
¶
Write data to the destination (Template Method).
- Validate destination path.
- Delegate to :meth:
_write_internal. - Populate and return runtime info.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DF
|
DataFrame to write. |
required |
dataflow
|
DataFlow
|
Full pipeline configuration. |
required |
Returns:
| Type | Description |
|---|---|
DestinationRuntimeInfo
|
Runtime metrics for the write operation. |
Raises:
| Type | Description |
|---|---|
DestinationError
|
On any write failure. |
BaseLoadStrategy
¶
Bases: ABC
Abstract load strategy (Strategy pattern).
Each strategy knows how to write a DataFrame to a destination path using a specific load type (overwrite, append, merge, etc.).
load_type
abstractmethod
property
¶
The load type this strategy handles (e.g. "overwrite").
execute
abstractmethod
¶
execute(df: DF, table_name: str, dataflow: DataFlow, engine: BaseEngine[DF], path: Optional[str] = None) -> None
Execute the load strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DF
|
DataFrame to write. |
required |
table_name
|
str
|
Fully qualified table name (always available). |
required |
dataflow
|
DataFlow
|
Pipeline configuration. |
required |
engine
|
BaseEngine[DF]
|
DataFrame engine. |
required |
path
|
Optional[str]
|
Optional destination path for path-based fallback. |
None
|
file_writer
¶
File-format destination writer (Parquet, CSV, JSON, JSONL, Avro).
Writes DataFrames to storage paths, supporting date-folder partitioning
via connection.date_folder_partitions resolved with the current UTC
timestamp. Only append, overwrite, and full_load load types
are supported. Maintenance (compact / cleanup) is not applicable.
FileWriter
¶
FileWriter(engine: BaseEngine[DF])
Bases: BaseDestinationWriter[DF]
Destination writer for flat-file formats (Parquet, CSV, JSON, …).
Delegates to :func:get_load_strategy for the actual write logic.
Date-folder partitioning is resolved at write time using utc_now().
Maintenance operations (compact / cleanup) are not supported for
flat files and will raise :class:DestinationError if invoked.
delta_writer
¶
Delta Lake destination writer.
Writes DataFrames to Delta Lake tables, dispatching to the appropriate
load strategy based on dataflow.load_type.
DeltaWriter
¶
DeltaWriter(engine: BaseEngine[DF])
Bases: BaseDestinationWriter[DF]
Destination writer for Delta Lake tables.
Delegates to :func:get_load_strategy for the actual write logic
based on the dataflow's load type.
Also supports: * Maintenance operations (compact, cleanup) via the base class.
iceberg_writer
¶
Iceberg destination writer.
Writes DataFrames to Iceberg tables, preferring table-name-based operations (DataFrameWriterV2 / SQL MERGE) via catalog. Falls back to path-based operations when no catalog is configured.
IcebergWriter
¶
IcebergWriter(engine: BaseEngine[DF])
Bases: BaseDestinationWriter[DF]
Destination writer for Apache Iceberg tables.
Prefers table-name-based operations (catalog-aware) for Iceberg. Falls back to path-based operations when no catalog is configured.
load_strategies
¶
Load strategies for destination writers.
Each strategy implements a specific write mode (overwrite, append,
merge_upsert, merge_overwrite, scd2) via the :class:BaseLoadStrategy
interface.
The :data:LOAD_STRATEGIES registry and :func:get_load_strategy helper
provide lookup by load-type string.
AppendStrategy
¶
MergeOverwriteStrategy
¶
MergeUpsertStrategy
¶
OverwriteStrategy
¶
SCD2Strategy
¶
Bases: BaseLoadStrategy
Slowly Changing Dimension Type 2.
Closes existing current records and inserts new versions using
the staged-updates MERGE pattern. Requires merge_keys and
scd2_effective_column in destination.configure.
get_load_strategy
¶
get_load_strategy(load_type: str) -> BaseLoadStrategy
Look up a load strategy by type string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
load_type
|
str
|
Load type (e.g. |
required |
Returns:
| Name | Type | Description |
|---|---|---|
Matching |
BaseLoadStrategy
|
class: |
Raises:
| Type | Description |
|---|---|
DestinationError
|
If the load type is not registered. |