# How-to guides
Task-oriented recipes. Each guide opens with Prerequisites + End state so you can tell at a glance whether it fits your situation.
Most guides reference scenarios from the usecase-sim testbed: we link to the scenarios rather than copy their contents, so they stay executable and in sync with the framework.
## New to DataCoolie? Start here
Metadata guide for new users
If you are new to DataCoolie and unsure how to configure your first pipeline, the metadata guide walks through the configuration field by field (connections, sources, destinations, transforms, load strategies, and validation) in an order designed for beginners.
| Step | Page |
|---|---|
| 1 | Build your first metadata file |
| 2 | Source patterns |
| 3 | Destination & load patterns |
| 4 | Transform patterns |
| 5 | Validation checklist |
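To make the field-by-field walkthrough concrete, here is a minimal sketch of what a pipeline metadata file might contain. Every field name below is hypothetical, invented for illustration only; the guide pages in the table above document DataCoolie's actual schema.

```python
import json

# Hypothetical metadata structure -- field names are illustrative,
# not DataCoolie's real schema.
metadata = {
    "connections": {
        "landing": {"type": "filesystem", "path": "/data/landing"},
    },
    "sources": [
        {"name": "orders_raw", "connection": "landing", "format": "csv"},
    ],
    "destinations": [
        {"name": "orders", "load_strategy": "merge"},
    ],
    "transforms": [
        {"input": "orders_raw", "output": "orders", "drop_duplicates": True},
    ],
}

# A metadata file is just this structure serialized to JSON.
text = json.dumps(metadata, indent=2)
parsed = json.loads(text)

# Cheap sanity check: every source must reference a defined connection.
for src in parsed["sources"]:
    assert src["connection"] in parsed["connections"]
```

The validation checklist page covers the real checks; the round-trip above only shows that the file is plain, inspectable JSON.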
### After the guide, choose your next task
- The sample pipeline already works and I need my own input next: Use your own data after the quickstart
- I need a specific metadata storage backend: File metadata, Database metadata, or API metadata
- I need runtime behavior recipes such as merge, partitioning, or maintenance: start in Authoring pipelines below
- I need platform deployment steps: jump to Deploying below
## Configuring metadata (by storage backend)
Once you understand the metadata shape, choose a storage backend:
- File metadata — JSON as canonical, YAML/Excel generated.
- Database metadata — SQLAlchemy schema, DDL, concurrency.
- API metadata — bringing your own metadata service.
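As a rough mental model of the choice, a backend just determines where the same metadata structure is read from. The sketch below is hypothetical (the function name, backend labels, and error handling are all invented, not DataCoolie's API); the three guide pages above document the real loaders.

```python
import json
import tempfile

# Hypothetical dispatcher illustrating the backend choice -- invented
# names, not DataCoolie's real API.
def load_metadata(backend: str, location: str) -> dict:
    if backend == "file":
        # File backend: JSON on disk is canonical; YAML/Excel are generated.
        with open(location) as f:
            return json.load(f)
    if backend in ("database", "api"):
        raise NotImplementedError(f"see the {backend} metadata guide")
    raise ValueError(f"unknown backend: {backend!r}")

# Demo against a throwaway JSON file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"sources": []}, f)
meta = load_metadata("file", f.name)
```

Whichever backend you pick, the metadata shape from the guide above stays the same; only storage, DDL, and concurrency concerns differ.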
## Authoring pipelines
- Run a stage
- Merge & SCD2
- Partitioning & column sanitization
- Maintenance (vacuum / optimize)
- Expected-failure scenarios
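For readers unfamiliar with the SCD2 term in the merge recipe: slowly changing dimension type 2 keeps history by closing the current row of a changed key and appending a new one. The sketch below is a generic illustration of that idea, not DataCoolie's merge API; the row layout and function name are assumptions.

```python
from datetime import date

# Generic SCD2 illustration: each row carries valid_from / valid_to,
# and the current version of a key has valid_to = None.
dim = [
    {"id": 1, "city": "Oslo", "valid_from": date(2024, 1, 1), "valid_to": None},
]

def scd2_merge(dim, updates, as_of):
    """Close the current row for each changed key and append a new version."""
    current = {r["id"]: r for r in dim if r["valid_to"] is None}
    for upd in updates:
        row = current.get(upd["id"])
        if row is not None and row["city"] == upd["city"]:
            continue  # unchanged: leave the current row open
        if row is not None:
            row["valid_to"] = as_of  # close the old version
        dim.append({"id": upd["id"], "city": upd["city"],
                    "valid_from": as_of, "valid_to": None})
    return dim

dim = scd2_merge(dim, [{"id": 1, "city": "Bergen"}], date(2024, 6, 1))
# dim now holds both versions of id=1: the closed Oslo row and the open Bergen row.
```

The Merge & SCD2 guide covers how the framework drives this pattern from metadata, including key columns and tracked attributes.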