Getting started

For most new users, the smoothest path is:

1. Installation: install the smallest useful set of engine and table-format extras.
2. Quickstart · Polars: laptop-only, no Docker, no JVM. This is the fastest way to see a pipeline run.
3. Use your own data after the quickstart: keep the same runner and swap the sample input for your own files.
4. Your first dataflow: move from a single stage to an ordered bronze→silver flow.

Choose Quickstart · Spark instead of Polars only if Spark is already your target runtime or you want early parity with Fabric, Databricks, or another Spark-first environment.
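
If the phrase "ordered bronze→silver flow" is new to you, the following minimal sketch may help before you open the quickstarts. It is plain Python and does not use this framework's API. The function names `bronze`, `silver`, and `run_flow`, the column names, and the sample records are all hypothetical, chosen only to show two stages that run in a fixed order.

```python
# Illustrative sketch only: plain Python, NOT this framework's API.
# All names and sample values below are hypothetical.


def bronze(raw_lines):
    """Bronze stage: land raw records as-is, one dict per line, no cleaning."""
    rows = []
    for line in raw_lines:
        order_id, amount = line.split(",")
        rows.append({"order_id": order_id, "amount": amount})
    return rows


def silver(bronze_rows):
    """Silver stage: trim whitespace, cast types, drop rows that fail to parse."""
    cleaned = []
    for row in bronze_rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            # Skip records whose amount is not numeric.
            continue
        cleaned.append({"order_id": row["order_id"].strip(), "amount": amount})
    return cleaned


def run_flow(raw_lines):
    """Run the stages in order: silver always consumes bronze output."""
    return silver(bronze(raw_lines))


sample = ["A1, 10.50", "A2, not-a-number", " A3 ,7"]
result = run_flow(sample)
print(result)
```

The point of the sketch is the ordering: the silver stage never reads the raw input directly, so each stage can be rerun or inspected on its own. The real dataflow pages describe how the framework expresses this ordering.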

Choose your route

Situation	Start here	Then go to
I am new and want the first successful run	Installation	Quickstart · Polars
I already know my runtime will be Spark	Installation	Quickstart · Spark
The sample worked and now I need my own files	Use your own data after the quickstart	Metadata guide for new users
I want the deeper workflow model before building	Concepts	Metadata guide or the quickstarts
I need multi-stage orchestration	Your first dataflow	How-to guides

Audience

This documentation is for both hands-on builders and readers who mainly want to understand the workflow. The getting-started pages are written for new data engineers, analytics engineers, and adjacent backend teams who want a concrete starting point before learning the full model.

If you do not need to run code yet, start with Concepts, especially Architecture, Metadata model, and Orchestration.

If you do want to run the examples, basic familiarity with Python and the command line will help. Unfamiliar framework terms are defined or linked on first use.