dbt™️ Modifiers

generate-missing-sources

What it does

If any source is missing, this hook tries to create it.

When to use it

You are too lazy to define schemas manually :D.

Arguments

--manifest: Location of the manifest.json file. Usually target/manifest.json. This file contains a full representation of the dbt project. Default: target/manifest.json.

--schema-file: Location of the schema.yml file where new source tables should be created.

Example

repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: generate-missing-sources
        args: ["--schema-file", "models/schema.yml", "--"]

⚠️ Do not forget to include -- as the last argument. Otherwise pre-commit cannot separate the list of files from the args.

Requirements

Model exists in manifest.json [1]: ❌ Not needed, since this hook tries to generate even non-existent sources

Model exists in catalog.json [2]: ❌ Not needed

[1] You need to run dbt parse before running this hook (dbt >= 1.5).

[2] You need to run dbt docs generate before running this hook.

How it works

  • Hook takes all changed SQL files.

  • SQL is parsed to find all sources.

  • If the source exists in the manifest, nothing is done.

  • If not, a new source is created in the specified schema-file and the hook fails.
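
The steps above can be sketched in a few lines of Python. This is only an illustration, not the hook's actual code: the regular expression, the function name find_missing_sources, and the sample manifest are assumptions, and the real hook may parse SQL differently.

```python
import re

# Matches {{ source('schema', 'table') }} with single or double quotes.
SOURCE_CALL = re.compile(
    r"""\{\{\s*source\(\s*['"]([^'"]+)['"]\s*,\s*['"]([^'"]+)['"]\s*\)\s*\}\}"""
)


def find_missing_sources(sql, manifest):
    """Return (source_name, table_name) pairs used in sql but absent from the manifest."""
    # Assumption: every entry under manifest["sources"] has "source_name" and "name".
    known = {
        (node["source_name"], node["name"])
        for node in manifest.get("sources", {}).values()
    }
    missing = []
    for pair in SOURCE_CALL.findall(sql):
        if pair not in known and pair not in missing:
            missing.append(pair)
    return missing


# Hypothetical manifest with a single known source table.
manifest = {
    "sources": {
        "source.shop.raw.orders": {"source_name": "raw", "name": "orders"},
    }
}
sql = (
    "select * from {{ source('raw', 'orders') }} "
    "join {{ source('raw', 'customers') }} using (id)"
)
print(find_missing_sources(sql, manifest))  # [('raw', 'customers')]
```

A non-empty result corresponds to the case where the hook writes the new source table and fails.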

Known limitations

The source "envelope" has to exist in the specified schema-file, something like this:

version: 2
sources:
- name: <source_name>

Otherwise, it is not possible to automatically generate a new source table.

Unfortunately, this hook breaks your formatting.


unify-column-description

What it does

Unify column descriptions across all models.

When to use it

You want the descriptions of the same column to be identical everywhere. E.g. two of your models have customer_id with the description This is customer_id, but a third model describes customer_id as Something else.

This hook finds such discrepancies between column descriptions and replaces them. As a result, every customer_id column ends up with the description This is customer_id.

Arguments

--ignore: Columns to exclude from the description check.

Example

repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: unify-column-description

⚠️ Do not forget to include -- as the last argument. Otherwise pre-commit cannot separate the list of files from the args.
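
Why the separator matters can be shown with Python's argparse. The parser below is a hypothetical stand-in, not the hook's real one; in particular, declaring --ignore with nargs="*" is an assumption made for illustration.

```python
import argparse

parser = argparse.ArgumentParser()
# Assumption: the option accepts any number of values.
parser.add_argument("--ignore", nargs="*", default=[])
# pre-commit appends the changed file names as positional arguments.
parser.add_argument("filenames", nargs="*")

with_separator = parser.parse_args(
    ["--ignore", "customer_id", "--", "models/schema.yml"]
)
without_separator = parser.parse_args(
    ["--ignore", "customer_id", "models/schema.yml"]
)

print(with_separator.ignore, with_separator.filenames)
print(without_separator.ignore, without_separator.filenames)
```

With the separator, the file name lands in filenames. Without it, the list-valued option swallows the file name and the hook would receive no files to check.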

Requirements

Model exists in manifest.json [1]: ❌ Not needed, since this hook uses only YAML files

Model exists in catalog.json [2]: ❌ Not needed

[1] You need to run dbt parse before running this hook (dbt >= 1.5).

[2] You need to run dbt docs generate before running this hook.

How it works

  • Hook takes all changed YAML files.

  • From those files columns are parsed and compared.

  • If one column name has more than one (not empty) description, the description with the most occurrences is taken and the hook fails.

  • If it is not possible to decide which description is dominant, no changes are made.
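
The "dominant description" rule can be sketched as follows. The function name dominant_description is hypothetical, and treating an exact tie for first place as "not possible to decide" is an assumption about how the hook resolves ties.

```python
from collections import Counter


def dominant_description(descriptions):
    """Return the most frequent non-empty description, or None if there is a tie."""
    counts = Counter(d for d in descriptions if d)  # empty descriptions are skipped
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # no dominant description, so nothing is changed
    return ranked[0][0]


print(dominant_description(
    ["This is customer_id", "This is customer_id", "Something else", ""]
))  # This is customer_id
print(dominant_description(["This is customer_id", "Something else"]))  # None
```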

Known limitations

If it is not possible to decide which description is dominant, no changes are made.


replace-script-table-names

What it does

Replace table names with source or ref macros in the script.

When to use it

You run and debug your SQL in an editor that does not know the source or ref macros. So every time you copy a script from the editor into the dbt project, you need to rewrite all table names to source or ref. That's boring and error-prone. This hook replaces all table names with macros for you.

Arguments

--manifest: location of manifest.json file. Usually target/manifest.json. This file contains a full representation of dbt project. Default: target/manifest.json.

Example

repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: replace-script-table-names

⚠️ Do not forget to include -- as the last argument. Otherwise pre-commit cannot separate the list of files from the args.

Requirements

Model exists in manifest.json [1]: ✅ Yes

Model exists in catalog.json [2]: ❌ Not needed

[1] You need to run dbt parse before running this hook (dbt >= 1.5).

[2] You need to run dbt docs generate before running this hook.

How it works

  • Hook takes all changed SQL files.

  • SQL is parsed and table names are found.

  • First, it looks for the table name among the models and, if found, replaces it with ref.

  • Then it looks for the table among the sources and, if found, replaces it with source.

  • If nothing is found, it creates an unknown source as source('<schema_name>', '<table_name>').

  • If the script contains only ref and source macros, the hook succeeds.
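
The lookup order above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: only schema.table names directly after FROM or JOIN are recognized, and the models and sources lookups are hypothetical structures rather than the manifest format the hook actually reads.

```python
import re

# Simplification: only schema.table names that directly follow FROM or JOIN are handled.
TABLE_NAME = re.compile(
    r"\b(from|join)\s+([A-Za-z_]\w*)\.([A-Za-z_]\w*)", re.IGNORECASE
)


def replace_table_names(sql, models, sources):
    """Rewrite schema.table names to ref/source macros.

    models:  hypothetical lookup {(schema, table): model_name}
    sources: hypothetical lookup {(schema, table): (source_name, table_name)}
    """

    def to_macro(match):
        keyword, schema, table = match.groups()
        key = (schema, table)
        if key in models:  # models are checked first
            macro = "{{ ref('%s') }}" % models[key]
        elif key in sources:  # then sources
            macro = "{{ source('%s', '%s') }}" % sources[key]
        else:  # unknown tables become a source built from schema and table
            macro = "{{ source('%s', '%s') }}" % key
        return f"{keyword} {macro}"

    return TABLE_NAME.sub(to_macro, sql)


models = {("analytics", "orders"): "orders"}
sources = {("raw", "customers"): ("raw", "customers")}
sql = "select * from analytics.orders o join raw.customers c on o.customer_id = c.id"
print(replace_table_names(sql, models, sources))
```

A script that already contains only ref and source macros passes through unchanged, which matches the case where the hook succeeds.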


generate-model-properties-file

What it does

Generates a model properties file if one does not exist.

When to use it

You do not want to write a properties file by hand for every new model. This hook generates one from the manifest and the catalog for each changed model that does not have a properties file yet.

Arguments

--manifest: Location of the manifest.json file. Usually target/manifest.json. This file contains a full representation of the dbt project. Default: target/manifest.json.

--catalog: Location of the catalog.json file. Usually target/catalog.json. dbt uses this file to render information like column types and table statistics into the docs site. dbt-checkpoint uses it for column operations. Default: target/catalog.json.

--properties-file: Location of the file where new model properties should be generated. The suffix has to be yml or yaml. The path can also include the {database}, {schema}, {name} and {alias} variables. E.g. /models/{schema}/{name}.yml for model foo.bar will create the properties file in /models/foo/bar.yml. If the path already exists, the properties are appended.
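
How the variables in --properties-file expand can be sketched with Python's str.format. The helper render_properties_path is hypothetical, and whether the hook substitutes the variables exactly this way is an assumption.

```python
def render_properties_path(template, database, schema, name, alias):
    """Fill the {database}, {schema}, {name} and {alias} variables in the template."""
    if not template.endswith((".yml", ".yaml")):
        raise ValueError("the properties file suffix has to be yml or yaml")
    return template.format(database=database, schema=schema, name=name, alias=alias)


# Model foo.bar: schema "foo", name "bar". Database and alias are sample values.
path = render_properties_path(
    "/models/{schema}/{name}.yml",
    database="analytics",
    schema="foo",
    name="bar",
    alias="bar",
)
print(path)  # /models/foo/bar.yml
```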

Example

repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: generate-model-properties-file
        args: ["--properties-file", "models/{schema}/{name}.yml", "--"]

⚠️ Do not forget to include -- as the last argument. Otherwise pre-commit cannot separate the list of files from the args.

Requirements

Model exists in manifest.json [1]: ✅ Yes

Model exists in catalog.json [2]: ✅ Yes

[1] You need to run dbt parse before running this hook (dbt >= 1.5).

[2] You need to run dbt docs generate before running this hook.

How it works

  • Hook takes all changed SQL files.

  • The model name is obtained from the SQL file name.

  • The manifest is scanned for a model.

  • The catalog is scanned for a model.

  • If the model does not have patch_path in the manifest, the new schema is written to the specified path. The hook fails.
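
The decision in the last step can be sketched like this. The helper names and the sample manifest are hypothetical; the sketch assumes model nodes sit under manifest["nodes"] with a resource_type of "model", and it leaves out the catalog lookup and the file writing.

```python
from pathlib import Path


def model_name_from_file(filename):
    """The model name is the SQL file name without its suffix."""
    return Path(filename).stem


def needs_properties_file(manifest, model_name):
    """True if the model is in the manifest and has no patch_path yet."""
    # Assumption: model nodes live under manifest["nodes"].
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") == "model" and node.get("name") == model_name:
            return not node.get("patch_path")
    return False  # model not found in the manifest


# Hypothetical manifest: "bar" has no properties file, "baz" already has one.
manifest = {
    "nodes": {
        "model.shop.bar": {"resource_type": "model", "name": "bar", "patch_path": None},
        "model.shop.baz": {
            "resource_type": "model",
            "name": "baz",
            "patch_path": "shop://models/schema.yml",
        },
    }
}
name = model_name_from_file("models/foo/bar.sql")
print(name, needs_properties_file(manifest, name))  # bar True
```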

Known limitations

Unfortunately, this hook breaks your formatting in the written file.


remove-script-semicolon

What it does

Remove the semicolon at the end of the script.

When to use it

You are too lazy or forgetful to delete one character at the end of the script.

Example

repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: remove-script-semicolon

Requirements

Model exists in manifest.json [1]: ❌ Not needed

Model exists in catalog.json [2]: ❌ Not needed

[1] You need to run dbt parse before running this hook (dbt >= 1.5).

[2] You need to run dbt docs generate before running this hook.

How it works

  • Hook takes all changed SQL files.

  • If the file contains a semicolon at the end of the file, it is removed and the hook fails.
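
A minimal sketch of that check, assuming trailing whitespace after the semicolon is ignored. The function name is hypothetical, and the real hook may treat the final newline differently.

```python
def remove_trailing_semicolon(sql):
    """Return (new_sql, changed); changed is True when a semicolon was removed."""
    stripped = sql.rstrip()  # ignore trailing whitespace and newlines
    if stripped.endswith(";"):
        return stripped[:-1], True
    return sql, False


print(remove_trailing_semicolon("select 1;\n"))  # ('select 1', True)
print(remove_trailing_semicolon("select 1\n"))   # ('select 1\n', False)
```

A changed value of True corresponds to the case where the hook rewrites the file and fails.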
