Skip to main content

Teradata configurations

General​

  • Set quote_columns - to prevent a warning, make sure to explicitly set a value for quote_columns in your dbt_project.yml. See the doc on quote_columns for more information.

    seeds:
    +quote_columns: false #or `true` if you have csv column headers with spaces
  • Enable view column types in docs - Teradata Vantage has a dbscontrol configuration flag called DisableQVCI. This flag instructs the database to create DBC.ColumnsJQV with view column type definitions. To enable this functionality you need to:

    1. Enable QVCI mode in Vantage. Use dbscontrol utility and then restart Teradata. Run these commands as a privileged user on a Teradata node:

      # option 551 is DisableQVCI. Setting it to false enables QVCI.
      dbscontrol << EOF
      M internal 551=false
      W
      EOF

      # restart Teradata
      tpareset -y Enable QVCI
    2. Instruct dbt to use QVCI mode. Include the following variable in your dbt_project.yml:

      vars:
      use_qvci: true

      For example configuration, see dbt_project.yml in dbt-teradata QVCI tests.

Models​

table​

  • table_kind - define the table kind. Legal values are MULTISET (default for ANSI transaction mode required by dbt-teradata) and SET, e.g.:

    {{
    config(
    materialized="table",
    table_kind="SET"
    )
    }}

    For details, see CREATE TABLE documentation.

  • table_option - defines table options. The config supports multiple statements. The definition below uses the Teradata syntax definition to explain what statements are allowed. Square brackets [] denote optional parameters. The pipe symbol | separates statements. Use commas to combine multiple statements as shown in the examples below:

    { MAP = map_name [COLOCATE USING colocation_name] |
    [NO] FALLBACK [PROTECTION] |
    WITH JOURNAL TABLE = table_specification |
    [NO] LOG |
    [ NO | DUAL ] [BEFORE] JOURNAL |
    [ NO | DUAL | LOCAL | NOT LOCAL ] AFTER JOURNAL |
    CHECKSUM = { DEFAULT | ON | OFF } |
    FREESPACE = integer [PERCENT] |
    mergeblockratio |
    datablocksize |
    blockcompression |
    isolated_loading
    }

    where:

    • mergeblockratio:
      { DEFAULT MERGEBLOCKRATIO |
      MERGEBLOCKRATIO = integer [PERCENT] |
      NO MERGEBLOCKRATIO
      }
    • datablocksize:
      DATABLOCKSIZE = {
      data_block_size [ BYTES | KBYTES | KILOBYTES ] |
      { MINIMUM | MAXIMUM | DEFAULT } DATABLOCKSIZE
      }
    • blockcompression:
      BLOCKCOMPRESSION = { AUTOTEMP | MANUAL | ALWAYS | NEVER | DEFAULT }
      [, BLOCKCOMPRESSIONALGORITHM = { ZLIB | ELZS_H | DEFAULT } ]
      [, BLOCKCOMPRESSIONLEVEL = { value | DEFAULT } ]
    • isolated_loading:
      WITH [NO] [CONCURRENT] ISOLATED LOADING [ FOR { ALL | INSERT | NONE } ]

    Examples:

    Separators between statements

    Note the commas that separate statements in table_option config.

    {{
    config(
    materialized="table",
    table_option="NO FALLBACK"
    )
    }}
    {{
    config(
    materialized="table",
    table_option="NO FALLBACK, NO JOURNAL"
    )
    }}
    {{
    config(
    materialized="table",
    table_option="NO FALLBACK, NO JOURNAL, CHECKSUM = ON,
    NO MERGEBLOCKRATIO,
    WITH CONCURRENT ISOLATED LOADING FOR ALL"
    )
    }}

    For details, see CREATE TABLE documentation.

  • with_statistics - should statistics be copied from the base table, e.g.:

    {{
    config(
    materialized="table",
    with_statistics="true"
    )
    }}

    For details, see CREATE TABLE documentation.

  • index - defines table indices:

    [UNIQUE] PRIMARY INDEX [index_name] ( index_column_name [,...] ) |
    NO PRIMARY INDEX |
    PRIMARY AMP [INDEX] [index_name] ( index_column_name [,...] ) |
    PARTITION BY { partitioning_level | ( partitioning_level [,...] ) } |
    UNIQUE INDEX [ index_name ] [ ( index_column_name [,...] ) ] [loading] |
    INDEX [index_name] [ALL] ( index_column_name [,...] ) [ordering] [loading]
    [,...]

    where:

    • partitioning_level:
      { partitioning_expression |
      COLUMN [ [NO] AUTO COMPRESS |
      COLUMN [ [NO] AUTO COMPRESS ] [ ALL BUT ] column_partition ]
      } [ ADD constant ]
    • ordering:
      ORDER BY [ VALUES | HASH ] [ ( order_column_name ) ]
    • loading:
      WITH [NO] LOAD IDENTITY

    Examples:

    Separators between statements

    Note, unlike with table_option statements, there are no commas between statements in index config.

    {{
    config(
    materialized="table",
    index="UNIQUE PRIMARY INDEX ( GlobalID )"
    )
    }}
    {{
    config(
    materialized="table",
    index="PRIMARY INDEX(id)
    PARTITION BY RANGE_N(create_date
    BETWEEN DATE '2020-01-01'
    AND DATE '2021-01-01'
    EACH INTERVAL '1' MONTH)"
    )
    }}
    {{
    config(
    materialized="table",
    index="PRIMARY INDEX(id)
    PARTITION BY RANGE_N(create_date
    BETWEEN DATE '2020-01-01'
    AND DATE '2021-01-01'
    EACH INTERVAL '1' MONTH)
    INDEX index_attrA (attrA) WITH LOAD IDENTITY"
    )
    }}

Seeds​

Using seeds to load raw data

As explained in dbt seeds documentation, seeds should not be used to load raw data (for example, large CSV exports from a production database).

Since seeds are version controlled, they are best suited to files that contain business-specific logic, for example a list of country codes or user IDs of employees.

Loading CSVs using dbt's seed functionality is not performant for large files. Consider using a different tool to load these CSVs into your data warehouse.

  • use_fastload - use fastload when handling dbt seed command. The option will likely speed up loading when your seed files have hundreds of thousands of rows. You can set this seed configuration option in your project.yml file, e.g.:

    seeds:
    <project-name>:
    +use_fastload: true

Common Teradata-specific tasks​

  • collect statistics - when a table is created or modified significantly, there might be a need to tell Teradata to collect statistics for the optimizer. It can be done using COLLECT STATISTICS command. You can perform this step using dbt's post-hooks, e.g.:

    {{ config(
    post_hook=[
    "COLLECT STATISTICS ON {{ this }} COLUMN (column_1, column_2 ...);"
    ]
    )}}

    See Collecting Statistics documentation for more information.