Concepts
Explanation-tier documentation. Each page answers "why does DataCoolie work this way?" rather than "how do I do X?". For task recipes see How-to; for precise field-level contracts see Reference.
This is the best starting section if you want to understand the workflow, terminology, and operating model without running code.
Start with Architecture if you're new to the framework. Otherwise jump to the concept you need:
- Architecture — component diagram, dependency directions, runtime flow.
- Engines — BaseEngine[DF], the fmt parameter contract, format dispatch.
- Platforms — file I/O and secret-provider responsibilities, per-cloud specifics.
- Metadata model — connections, dataflows, transforms.
- Metadata providers — file vs database vs API, picking the right one.
- Sources & destinations — plugin registries, format → reader mapping.
- Transformers & pipeline — ordering slots, tracking labels.
- Load strategies — append / overwrite / merge / SCD2.
- Watermarks — raw-JSON contract, __datetime__ sentinel.
- Orchestration — driver, job distributor, parallel executor, retry handler.
- Logging — ETL logger vs system logger, LogPurpose, partitioning.
- Secrets — provider vs resolver, secrets_ref schema.