Secrets¶
TL;DR DataCoolie has two secret interfaces: BaseSecretProvider
(fetches secrets from a backend) and BaseSecretResolver (resolves
placeholder strings inside a Connection.configure dict). The platform is
the default provider; resolvers are per-syntax (env:FOO_PASSWORD, vault:…).
Provider vs resolver¶
flowchart LR
A[Connection.configure] --> B{Resolver matches?}
B -->|env:…| C[EnvResolver]
B -->|vault:…| D[Custom resolver]
B -->|plain string| E[Pass through]
C --> F[os.environ]
D --> G[BaseSecretProvider.fetch_secret]
G --> H[(Fabric Key Vault / AWS SM / dbutils.secrets / env)]
- Provider (
BaseSecretProvider) = where secrets live. Each platform is a native provider by subclassingBasePlatform, so every platform brings its own secret backend: Local usesos.environ, Fabric uses Azure Key Vault throughnotebookutils.credentials, Databricks usesdbutils.secrets, and AWS uses AWS Secrets Manager. - Resolver (
BaseSecretResolver) = how to interpret a placeholder string. Built-inEnvResolverhandlesenv:VAR_NAME; you can add more.
This split means a third-party plugin can define its own placeholder syntax without caring about the vault backend, and vice versa. See ADR-0002.
secrets_ref schema¶
Connection.secrets_ref maps each secret source to the configure fields that
should be resolved from that source. Each listed field must already exist in
configure, and its current value must be the vault key or secret name to look
up:
{
"configure": {
"host": "db.internal",
"port": 5432,
"username": "db-user-secret",
"password": "db-password-secret"
},
"secrets_ref": {
"keyvault": ["password"],
"env": ["username"]
}
}
At resolve time DataCoolie:
- For each
source, for eachfield: fetch the secret value from the provider and replaceconfigure[field]with the resolved value. - Calls
connection.refresh_from_configure()so first-class attributes (database,catalog) pick up resolved values.
If a field is listed in secrets_ref but missing from configure, DataCoolie
raises an error instead of guessing where the secret should be written.
Constraint: a field must appear under exactly one source. Listing the
same field under two sources is ambiguous and raises ConfigurationError.
Built-in resolvers¶
Only one: EnvResolver for env:* lookups. Register more via the
datacoolie.resolvers entry-point group.
SecretStr — Opaque secret wrapper¶
Resolved secret values are wrapped in SecretStr, an opaque object that
prevents accidental exposure through str(), repr(), print(),
f-strings, and tracebacks. All public representations render ***.
There is no public method to extract the underlying string. Framework internals use two private helpers at I/O boundaries:
| Helper | Purpose |
|---|---|
unwrap_secret(value) |
Extract the raw str from a SecretStr (identity for plain strings) |
unwrap_configure(configure) |
Deep-copy a configure dict, unwrapping all SecretStr values |
This replaces the earlier SensitiveValueFilter log filter approach. Instead
of scrubbing secrets from log messages after the fact, the framework now
ensures secrets never reach log formatters in the first place.
Extension authors
If your plugin receives a Connection.configure dict, call
unwrap_configure(configure) before passing values to external clients
(HTTP auth, JDBC connection strings, etc.). The wrapped values will not
work as raw strings.
Built-in providers¶
All four platforms. AWSPlatform._fetch_secret goes to AWS Secrets Manager;
FabricPlatform uses notebookutils.credentials; DatabricksPlatform uses
dbutils.secrets; LocalPlatform reads os.environ.