Hooks & Operational Tasks
Hooks in dbt allow you to execute SQL statements or custom logic at specific points in your dbt workflow. They're powerful tools for automating operational tasks, managing permissions, and enhancing your data pipeline with custom functionality.
What Are Hooks?
Hooks are snippets of SQL that run at predefined moments during the dbt execution process. They help automate operational tasks like:
Granting permissions on objects
Setting table properties or attributes
Adding comments or metadata
Creating indexes or optimizing data structures
Executing database-specific operations
There are four main types of hooks in dbt:

| Hook | When it runs | Scope | Typical uses |
| --- | --- | --- | --- |
| `pre-hook` | Before a model, seed, or snapshot is built | Model-specific | Data validation, temporary table setup |
| `post-hook` | After a model, seed, or snapshot is built | Model-specific | Adding indexes, granting permissions |
| `on-run-start` | At the start of dbt commands (`run`, `build`, `test`, etc.) | Project/folder-wide | Session configuration, global setup |
| `on-run-end` | At the end of dbt commands (`run`, `build`, `test`, etc.) | Project/folder-wide | Cleanup, notifications, audit logging |
Using Model-Level Hooks
Pre-hooks and Post-hooks
You can define pre-hooks and post-hooks directly in your model SQL files using the `config()` function:
```sql
-- models/customers.sql
{{
  config(
    materialized='table',
    pre_hook="DELETE FROM {{ this }} WHERE customer_id < 0",
    post_hook="GRANT SELECT ON {{ this }} TO ROLE analyst"
  )
}}

SELECT * FROM {{ ref('stg_customers') }}
```
In this example:
The pre-hook deletes invalid records before the model builds
The post-hook grants select permissions after the model is created
Multiple Hooks
You can specify multiple hooks as a list:
```sql
{{
  config(
    post_hook=[
      "GRANT SELECT ON {{ this }} TO ROLE analyst",
      "ANALYZE TABLE {{ this }}",
      "ALTER TABLE {{ this }} ADD COMMENT '{{ doc('customers_table_description') }}'"
    ]
  )
}}
```
Project and Folder-Level Hooks
You can configure hooks that apply to groups of models in your `dbt_project.yml` file:
```yaml
models:
  my_project:
    # Hooks for all models
    +post_hook: "GRANT SELECT ON {{ this }} TO ROLE reporter"

    # Hooks for specific folders
    marts:
      +post_hook: "ANALYZE TABLE {{ this }}"
    staging:
      +pre_hook: "SET query_tag = 'staging_models'"
```
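A project-level hook value can also be given as a list, just as in model-level `config()`, when a folder needs several hooks:

```yaml
models:
  my_project:
    marts:
      +post_hook:
        - "ANALYZE TABLE {{ this }}"
        - "GRANT SELECT ON {{ this }} TO ROLE reporter"
```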
On-Run-Start and On-Run-End Hooks
These hooks run once at the beginning or end of your dbt run:
```yaml
# dbt_project.yml
on-run-start:
  - "SET timezone = 'America/Los_Angeles'"
  - "SET query_tag = 'dbt_run_{{ run_started_at.strftime('%Y%m%d_%H%M%S') }}'"

on-run-end:
  - "CALL audit.log_dbt_run('{{ run_started_at }}', '{{ invocation_id }}')"
```
Hook Context Variables
In hooks, you can access several useful context variables:

`{{ this }}`: the relation being built (table/view)
`{{ target }}`: information about the current target (database, schema, etc.)
`{{ run_started_at }}`: the timestamp when the run started
`{{ invocation_id }}`: a unique ID for the current dbt run
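Put together, these variables make hooks useful for lightweight audit logging. A minimal sketch, assuming an `audit.dbt_events` table with matching columns already exists:

```sql
{{
  config(
    post_hook="INSERT INTO audit.dbt_events (model_name, target_name, run_started_at, invocation_id)
               VALUES ('{{ this.name }}', '{{ target.name }}', '{{ run_started_at }}', '{{ invocation_id }}')"
  )
}}
```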
Using Macros in Hooks
You can make your hooks more reusable by calling macros:
```sql
-- models/customers.sql
{{
  config(
    post_hook="{{ grant_select(this, 'analyst') }}"
  )
}}

SELECT * FROM {{ ref('stg_customers') }}
```
With a corresponding macro defined:
```sql
-- macros/grant_select.sql
{% macro grant_select(relation, role) %}
  grant select on {{ relation }} to role {{ role }};
{% endmacro %}
```
This approach allows you to centralize and reuse your hook logic across multiple models.
You can also configure hooks with macros in your YAML files:
```yaml
# models/schema.yml
models:
  - name: customers
    config:
      post_hook: "{{ grant_select(this, 'analyst') }}"
```

```yaml
# dbt_project.yml
models:
  my_project:
    +post_hook: "{{ grant_select(this, 'reporter') }}"
```
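Because hook strings are rendered with the full Jinja context, the role does not have to be hardcoded; it can come from a project variable via `var()` (here `reporting_role` is a hypothetical variable with a default):

```yaml
# dbt_project.yml
models:
  my_project:
    +post_hook: "{{ grant_select(this, var('reporting_role', 'reporter')) }}"
```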
Common Hook Use Cases
Permission Management
A common use for hooks is automating permission grants:
```sql
-- Granting multiple permissions
{{
  config(
    post_hook=[
      "GRANT SELECT ON {{ this }} TO ROLE analyst",
      "GRANT SELECT ON {{ this }} TO ROLE reporter"
    ]
  )
}}
```
Database-Specific Operations
Perform operations specific to your data warehouse:
```sql
-- Snowflake example
{{
  config(
    post_hook="ALTER TABLE {{ this }} SET DATA_RETENTION_TIME_IN_DAYS = 90"
  )
}}
```

```sql
-- Redshift example
{{
  config(
    post_hook="VACUUM {{ this }}"
  )
}}
```
Environment-Specific Actions
Apply different hooks based on your deployment environment. Note that Jinja statement blocks like `{% if %}` cannot be nested inside the `{{ config() }}` expression; use an inline conditional instead. Because hook strings are rendered again when the model runs, `target` is available inside them:

```sql
{{
  config(
    post_hook="GRANT SELECT ON {{ this }} TO ROLE {{ 'business_users' if target.name == 'prod' else 'dbt_developers' }}"
  )
}}
```
Operations with run-operation
Operations are a way to execute standalone macros using the run-operation
command. This is useful for administrative tasks that you want to run on demand, rather than as part of a model build.
Creating an Operation Macro
To create an operation, define a macro that performs the desired actions:
Because a macro named `grant_select` already exists in this project (the hook helper above), and macro names must be unique within a dbt project, this operation gets its own name:

```sql
-- macros/grant_select_all.sql
{% macro grant_select_all(role) %}
  {% set sql %}
    grant usage on schema {{ target.schema }} to role {{ role }};
    grant select on all tables in schema {{ target.schema }} to role {{ role }};
    grant select on all views in schema {{ target.schema }} to role {{ role }};
  {% endset %}

  {% do run_query(sql) %}
  {% do log("Privileges granted", info=True) %}
{% endmacro %}
```

Note two important points:

The SQL is defined within a `{% set sql %}` block
The `run_query()` function is used to actually execute the SQL

Running an Operation
To run this operation from the command line:

```shell
dbt run-operation grant_select_all --args '{role: reporter}'
```

This grants select privileges on all tables and views in your schema to the `reporter` role.
Key Difference Between Hooks and Operations
Hooks are executed automatically at specific points during dbt runs
Operations are explicitly run on demand using the `run-operation` command
With operations, you must use `run_query()` or a statement block to actually execute the SQL
Operation Examples
Refreshing a Snowflake Pipe
```sql
{% macro refresh_pipe(pipe_name) %}
  {% set sql %}
    ALTER PIPE {{ pipe_name }} REFRESH;
  {% endset %}

  {% do run_query(sql) %}
  {% do log("Pipe refreshed: " ~ pipe_name, info=True) %}
{% endmacro %}
```
Creating Multiple Objects
```sql
{% macro setup_monitoring() %}
  {% set sql %}
    CREATE SCHEMA IF NOT EXISTS {{ target.schema }}_monitor;

    CREATE TABLE IF NOT EXISTS {{ target.schema }}_monitor.audit_log (
      event_time TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
      event_type VARCHAR(100),
      model_name VARCHAR(100),
      duration_seconds FLOAT
    );
  {% endset %}

  {% do run_query(sql) %}
  {% do log("Monitoring setup complete", info=True) %}
{% endmacro %}
```
Passing Complex Arguments
You can pass complex arguments to operations:
```shell
dbt run-operation create_test_data --args '{"schema": "analytics", "tables": ["customers", "orders"], "row_count": 1000}'
```
With a corresponding macro:
```sql
{% macro create_test_data(schema, tables, row_count) %}
  {% for table in tables %}
    {% set sql %}
      INSERT INTO {{ schema }}.{{ table }} /* Generate test data SQL here */
    {% endset %}

    {% do run_query(sql) %}
    {% do log("Generated " ~ row_count ~ " rows for " ~ schema ~ "." ~ table, info=True) %}
  {% endfor %}
{% endmacro %}
```
Best Practices
Hooks
Use macros for repeated hook logic
Create reusable macro functions instead of duplicating hook SQL.
Keep hooks focused
Each hook should do one thing well.
Consider hook execution order
Remember that project-level hooks run before/after model-level hooks.
Be careful with transactions
Understand your database's transaction behavior with hooks.
Test hooks in development
Verify hook behavior before deploying to production.
Operations
Always use `run_query()` or statement blocks
Operations must explicitly execute SQL.
Add logging
Use `log()` to provide feedback about operation progress.
Handle errors gracefully
Jinja has no try/except; validate inputs and raise clear errors (for example with `exceptions.raise_compiler_error()`) in complex operations.
Document operation parameters
Make it clear what arguments your operations accept.
Use for administrative tasks
Operations are well suited to one-off maintenance tasks.
When to Use Hooks vs. Operations
Use hooks when you need to:
Execute SQL automatically at specific points in your dbt workflow
Apply consistent actions across multiple models
Implement pre/post processing that's tightly coupled to models
Use operations when you need to:
Run administrative tasks on-demand
Perform one-off database maintenance
Execute complex logic that doesn't fit into the model build process
Create setup/teardown scripts for your environment
By mastering hooks and operations, you can significantly extend dbt's capabilities and automate many aspects of database administration and data pipeline management.