Upsolver configurations

Supported Upsolver SQLake functionality

| Command | State | Materialized |
|---|---|---|
| SQL compute cluster | not supported | - |
| SQL connections | supported | connection |
| SQL copy job | supported | incremental |
| SQL merge job | supported | incremental |
| SQL insert job | supported | incremental |
| SQL materialized views | supported | materializedview |
| Expectations | supported | incremental |

Configs materialization

| Config | Required | Materialization | Description | Example |
|---|---|---|---|---|
| connection_type | Yes | connection | Connection identifier: S3/GLUE_CATALOG/KINESIS | connection_type='S3' |
| connection_options | Yes | connection | Dictionary of options supported by the selected connection | connection_options={ 'aws_role': 'aws_role', 'external_id': 'SAMPLES', 'read_only': True } |
| incremental_strategy | No | incremental | Define one of the incremental strategies: merge/copy/insert. Default: copy | incremental_strategy='merge' |
| source | No | incremental | Define the source to copy from: S3/KAFKA/KINESIS | source = 'S3' |
| target_type | No | incremental | Define the target type: REDSHIFT/ELASTICSEARCH/S3/SNOWFLAKE/POSTGRES. Default: None (Data lake) | target_type='Snowflake' |
| target_prefix | No | incremental | Define the PREFIX for the ELASTICSEARCH target type | target_prefix = 'orders' |
| target_location | No | incremental | Define the LOCATION for the S3 target type | target_location = 's3://your-bucket-name/path/to/folder/' |
| schema | Yes/No | incremental | Define the target schema. Required if target_type is set (no table is created in a metastore connection) | schema = 'target_schema' |
| database | Yes/No | incremental | Define the target connection. Required if target_type is set (no table is created in a metastore connection) | database = 'target_connection' |
| alias | Yes/No | incremental | Define the target table. Required if target_type is set (no table is created in a metastore connection) | alias = 'target_table' |
| delete_condition | No | incremental | Records that match both the ON condition and the delete condition can be deleted | delete_condition='nettotal > 1000' |
| partition_by | No | incremental | List of dictionaries defining partition_by for the target metastore table | partition_by=[{'field':'$field_name'}] |
| primary_key | No | incremental | List of dictionaries defining the primary key for the target metastore table | primary_key=[{'field':'customer_email', 'type':'string'}] |
| map_columns_by_name | No | incremental | Maps columns from the SELECT statement to the table. Boolean. Default: False | map_columns_by_name=True |
| sync | No | incremental/materializedview | Boolean option defining whether the job is synchronized or non-synchronized. Default: False | sync=True |
| options | No | incremental/materializedview | Dictionary of job options | options={ 'START_FROM': 'BEGINNING', 'ADD_MISSING_COLUMNS': True } |

SQL connection

Connections provide Upsolver with the credentials to bring your data into SQLake and to write your transformed data out to various services. For more details, see "Upsolver SQL connections". As a dbt model, a connection is a model with materialized='connection':

```sql
{{ config(
    materialized='connection',
    connection_type={ 'S3' | 'GLUE_CATALOG' | 'KINESIS' | 'KAFKA' | 'SNOWFLAKE' },
    connection_options={}
) }}
```

Running this model compiles CREATE CONNECTION (or ALTER CONNECTION if it already exists) SQL and sends it to the Upsolver engine. The name of the connection will be the name of the model.
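
As a minimal sketch, an S3 connection model might look like the following (the file name, role ARN, and external ID are illustrative placeholders, not values from a real account):

```sql
-- models/my_s3_connection.sql (hypothetical)
-- Compiles to CREATE CONNECTION my_s3_connection in SQLake.
{{ config(
    materialized='connection',
    connection_type='S3',
    connection_options={
        'aws_role': 'arn:aws:iam::123456789012:role/upsolver-role',
        'external_id': 'SAMPLES',
        'read_only': True
    }
) }}
```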

SQL copy job

A COPY FROM job lets you copy your data from a given source into a table created in a metastore connection. This table then serves as your staging table and can be used with SQLake transformation jobs to write to various target locations. For more details, see "Upsolver SQL copy-from".

As a dbt model, a copy job is a model with materialized='incremental':

```sql
{{ config(
    materialized='incremental',
    sync=True|False,
    source='S3'|'KAFKA'|...,
    options={
        'option_name': 'option_value'
    },
    partition_by=[{}]
) }}
SELECT * FROM {{ ref(<model>) }}
```

Running this model compiles CREATE TABLE SQL for the Data lake target type (or ALTER TABLE if the table exists) plus CREATE COPY JOB (or ALTER COPY JOB if the job exists) SQL and sends them to the Upsolver engine. The name of the table will be the name of the model; the name of the job will be the name of the model plus '_job'.
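
For instance, a hypothetical staging model copying raw events from S3 could be sketched as follows (the bucket path, option values, and referenced connection model are placeholders):

```sql
-- models/orders_raw_data.sql (hypothetical)
-- Compiles to CREATE TABLE orders_raw_data and CREATE JOB orders_raw_data_job.
{{ config(
    materialized='incremental',
    sync=True,
    source='S3',
    options={
        'LOCATION': 's3://your-bucket-name/path/to/folder/',
        'CONTENT_TYPE': 'AUTO',
        'START_FROM': 'BEGINNING'
    }
) }}
SELECT * FROM {{ ref('my_s3_connection') }}
```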

SQL insert job

An INSERT job defines a query that pulls in a set of data based on the given SELECT statement and inserts it into the designated target. The query then runs periodically based on the RUN_INTERVAL defined within the job. For more details, see "Upsolver SQL insert".

As a dbt model, an insert job is a model with materialized='incremental' and incremental_strategy='insert':

```sql
{{ config(
    materialized='incremental',
    sync=True|False,
    map_columns_by_name=True|False,
    incremental_strategy='insert',
    options={
        'option_name': 'option_value'
    },
    primary_key=[{}]
) }}
SELECT ...
FROM {{ ref(<model>) }}
WHERE ...
GROUP BY ...
HAVING COUNT(DISTINCT orderid::string) ...
```

Running this model compiles CREATE TABLE SQL for the Data lake target type (or ALTER TABLE if the table exists) plus CREATE INSERT JOB (or ALTER INSERT JOB if the job exists) SQL and sends them to the Upsolver engine. The name of the table will be the name of the model; the name of the job will be the name of the model plus '_job'.
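
A hypothetical insert job aggregating orders per customer might look like this (model, column, and option names are placeholders):

```sql
-- models/orders_per_customer.sql (hypothetical)
-- Periodically inserts aggregated rows into the data lake table.
{{ config(
    materialized='incremental',
    incremental_strategy='insert',
    map_columns_by_name=True,
    options={
        'START_FROM': 'BEGINNING',
        'RUN_INTERVAL': '1 MINUTE'
    }
) }}
SELECT customer_email,
       COUNT(DISTINCT orderid::string) AS order_count
FROM {{ ref('orders_raw_data') }}
GROUP BY customer_email
```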

SQL merge job

A MERGE job defines a query that pulls in a set of data based on the given SELECT statement and inserts into, replaces, or deletes data from the designated target based on the job definition. The query then runs periodically based on the RUN_INTERVAL defined within the job. For more details, see "Upsolver SQL merge".

As a dbt model, a merge job is a model with materialized='incremental' and incremental_strategy='merge':

```sql
{{ config(
    materialized='incremental',
    sync=True|False,
    map_columns_by_name=True|False,
    incremental_strategy='merge',
    options={
        'option_name': 'option_value'
    },
    primary_key=[{}]
) }}
SELECT ...
FROM {{ ref(<model>) }}
WHERE ...
GROUP BY ...
HAVING COUNT ...
```

Running this model compiles CREATE TABLE SQL for the Data lake target type (or ALTER TABLE if the table exists) plus CREATE MERGE JOB (or ALTER MERGE JOB if the job exists) SQL and sends them to the Upsolver engine. The name of the table will be the name of the model; the name of the job will be the name of the model plus '_job'.
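
A hypothetical merge job that upserts one row per customer, and deletes rows matching a delete condition, might be sketched as follows (all names and conditions are placeholders):

```sql
-- models/customer_totals.sql (hypothetical)
-- Merges on the primary key instead of appending rows.
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    map_columns_by_name=True,
    primary_key=[{'field': 'customer_email', 'type': 'string'}],
    delete_condition='nettotal > 1000',
    options={'START_FROM': 'BEGINNING'}
) }}
SELECT customer_email,
       SUM(nettotal) AS nettotal
FROM {{ ref('orders_raw_data') }}
GROUP BY customer_email
```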

SQL materialized views

When transforming your data, you may find that you need data from multiple source tables to achieve your desired result. In such a case, you can create a materialized view from one SQLake table in order to join it with your other table (which in this case is considered the main table). For more details, see "Upsolver SQL materialized views".

As a dbt model, a materialized view is a model with materialized='materializedview':

```sql
{{ config(
    materialized='materializedview',
    sync=True|False,
    options={'option_name': 'option_value'}
) }}
SELECT ...
FROM {{ ref(<model>) }}
WHERE ...
GROUP BY ...
```

Running this model compiles CREATE MATERIALIZED VIEW (or ALTER MATERIALIZED VIEW if it exists) SQL and sends it to the Upsolver engine. The name of the materialized view will be the name of the model.
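
A sketch of a materialized view keyed by store (the model and column names are placeholders):

```sql
-- models/orders_by_store_mv.sql (hypothetical)
-- Compiles to CREATE MATERIALIZED VIEW orders_by_store_mv.
{{ config(
    materialized='materializedview',
    sync=True
) }}
SELECT storeid,
       COUNT(*) AS order_count
FROM {{ ref('orders_raw_data') }}
GROUP BY storeid
```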

Expectations/constraints

Data quality conditions can be added to your job to drop a row or trigger a warning when a column violates a predefined condition.

```sql
WITH EXPECTATION <expectation_name> EXPECT <sql_predicate>
ON VIOLATION WARN
```

Expectations can be implemented with dbt constraints. Supported constraints: check and not_null.

```yaml
models:
  - name: <model name>
    # required
    config:
      contract:
        enforced: true
    # model-level constraints
    constraints:
      - type: check
        columns: ['<column1>', '<column2>']
        expression: "column1 <= column2"
        name: <constraint_name>
      - type: not_null
        columns: ['column1', 'column2']
        name: <constraint_name>

    columns:
      - name: <column3>
        data_type: string
        # column-level constraints
        constraints:
          - type: not_null
          - type: check
            expression: "REGEXP_LIKE(<column3>, '^[0-9]{4}[a-z]{5}$')"
            name: <constraint_name>
```

Project examples

Example projects: github.com/dbt-upsolver/examples/

Connection options

| Option | Storage | Editable | Optional | Config Syntax |
|---|---|---|---|---|
| aws_role | s3 | True | True | 'aws_role': '<aws_role>' |
| external_id | s3 | True | True | 'external_id': '<external_id>' |
| aws_access_key_id | s3 | True | True | 'aws_access_key_id': '<aws_access_key_id>' |
| aws_secret_access_key | s3 | True | True | 'aws_secret_access_key': '<aws_secret_access_key>' |
| path_display_filter | s3 | True | True | 'path_display_filter': '<path_display_filter>' |
| path_display_filters | s3 | True | True | 'path_display_filters': ('<filter>', ...) |
| read_only | s3 | True | True | 'read_only': True/False |
| encryption_kms_key | s3 | True | True | 'encryption_kms_key': '<encryption_kms_key>' |
| encryption_customer_managed_key | s3 | True | True | 'encryption_customer_kms_key': '<encryption_customer_kms_key>' |
| comment | s3 | True | True | 'comment': '<comment>' |
| host | kafka | False | False | 'host': '<host>' |
| hosts | kafka | False | False | 'hosts': ('<host>', ...) |
| consumer_properties | kafka | True | True | 'consumer_properties': '<consumer_properties>' |
| version | kafka | False | True | 'version': '<value>' |
| require_static_ip | kafka | True | True | 'require_static_ip': True/False |
| ssl | kafka | True | True | 'ssl': True/False |
| topic_display_filter | kafka | True | True | 'topic_display_filter': '<topic_display_filter>' |
| topic_display_filters | kafka | True | True | 'topic_display_filters': ('<filter>', ...) |
| comment | kafka | True | True | 'comment': '<comment>' |
| aws_role | glue_catalog | True | True | 'aws_role': '<aws_role>' |
| external_id | glue_catalog | True | True | 'external_id': '<external_id>' |
| aws_access_key_id | glue_catalog | True | True | 'aws_access_key_id': '<aws_access_key_id>' |
| aws_secret_access_key | glue_catalog | True | True | 'aws_secret_access_key': '<aws_secret_access_key>' |
| default_storage_connection | glue_catalog | False | False | 'default_storage_connection': '<default_storage_connection>' |
| default_storage_location | glue_catalog | False | False | 'default_storage_location': '<default_storage_location>' |
| region | glue_catalog | False | True | 'region': '<region>' |
| database_display_filter | glue_catalog | True | True | 'database_display_filter': '<database_display_filter>' |
| database_display_filters | glue_catalog | True | True | 'database_display_filters': ('<filter>', ...) |
| comment | glue_catalog | True | True | 'comment': '<comment>' |
| aws_role | kinesis | True | True | 'aws_role': '<aws_role>' |
| external_id | kinesis | True | True | 'external_id': '<external_id>' |
| aws_access_key_id | kinesis | True | True | 'aws_access_key_id': '<aws_access_key_id>' |
| aws_secret_access_key | kinesis | True | True | 'aws_secret_access_key': '<aws_secret_access_key>' |
| region | kinesis | False | False | 'region': '<region>' |
| read_only | kinesis | False | True | 'read_only': True/False |
| max_writers | kinesis | True | True | 'max_writers': <integer> |
| stream_display_filter | kinesis | True | True | 'stream_display_filter': '<stream_display_filter>' |
| stream_display_filters | kinesis | True | True | 'stream_display_filters': ('<filter>', ...) |
| comment | kinesis | True | True | 'comment': '<comment>' |
| connection_string | snowflake | True | False | 'connection_string': '<connection_string>' |
| user_name | snowflake | True | False | 'user_name': '<user_name>' |
| password | snowflake | True | False | 'password': '<password>' |
| max_concurrent_connections | snowflake | True | True | 'max_concurrent_connections': <integer> |
| comment | snowflake | True | True | 'comment': '<comment>' |
| connection_string | redshift | True | False | 'connection_string': '<connection_string>' |
| user_name | redshift | True | False | 'user_name': '<user_name>' |
| password | redshift | True | False | 'password': '<password>' |
| max_concurrent_connections | redshift | True | True | 'max_concurrent_connections': <integer> |
| comment | redshift | True | True | 'comment': '<comment>' |
| connection_string | mysql | True | False | 'connection_string': '<connection_string>' |
| user_name | mysql | True | False | 'user_name': '<user_name>' |
| password | mysql | True | False | 'password': '<password>' |
| comment | mysql | True | True | 'comment': '<comment>' |
| connection_string | postgres | True | False | 'connection_string': '<connection_string>' |
| user_name | postgres | True | False | 'user_name': '<user_name>' |
| password | postgres | True | False | 'password': '<password>' |
| comment | postgres | True | True | 'comment': '<comment>' |
| connection_string | elasticsearch | True | False | 'connection_string': '<connection_string>' |
| user_name | elasticsearch | True | False | 'user_name': '<user_name>' |
| password | elasticsearch | True | False | 'password': '<password>' |
| comment | elasticsearch | True | True | 'comment': '<comment>' |
| connection_string | mongodb | True | False | 'connection_string': '<connection_string>' |
| user_name | mongodb | True | False | 'user_name': '<user_name>' |
| password | mongodb | True | False | 'password': '<password>' |
| timeout | mongodb | True | True | 'timeout': "INTERVAL 'N' SECONDS" |
| comment | mongodb | True | True | 'comment': '<comment>' |
| connection_string | mssql | True | False | 'connection_string': '<connection_string>' |
| user_name | mssql | True | False | 'user_name': '<user_name>' |
| password | mssql | True | False | 'password': '<password>' |
| comment | mssql | True | True | 'comment': '<comment>' |

Target options

| Option | Storage | Editable | Optional | Config Syntax |
|---|---|---|---|---|
| globally_unique_keys | datalake | False | True | 'globally_unique_keys': True/False |
| storage_connection | datalake | False | True | 'storage_connection': '<storage_connection>' |
| storage_location | datalake | False | True | 'storage_location': '<storage_location>' |
| compute_cluster | datalake | True | True | 'compute_cluster': '<compute_cluster>' |
| compression | datalake | True | True | 'compression': 'SNAPPY/GZIP' |
| compaction_processes | datalake | True | True | 'compaction_processes': <integer> |
| disable_compaction | datalake | True | True | 'disable_compaction': True/False |
| retention_date_partition | datalake | False | True | 'retention_date_partition': '<column>' |
| table_data_retention | datalake | True | True | 'table_data_retention': '<N DAYS>' |
| column_data_retention | datalake | True | True | 'column_data_retention': ({'COLUMN' : '<column>','DURATION': '<N DAYS>'}) |
| comment | datalake | True | True | 'comment': '<comment>' |
| storage_connection | materialized_view | False | True | 'storage_connection': '<storage_connection>' |
| storage_location | materialized_view | False | True | 'storage_location': '<storage_location>' |
| max_time_travel_duration | materialized_view | True | True | 'max_time_travel_duration': '<N DAYS>' |
| compute_cluster | materialized_view | True | True | 'compute_cluster': '<compute_cluster>' |
| column_transformations | snowflake | False | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| deduplicate_with | snowflake | False | True | 'deduplicate_with': {'COLUMNS' : ['col1', 'col2'],'WINDOW': 'N HOURS'} |
| exclude_columns | snowflake | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| create_table_if_missing | snowflake | False | True | 'create_table_if_missing': True/False |
| run_interval | snowflake | False | True | 'run_interval': '<N MINUTES/HOURS/DAYS>' |
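
Target options are passed through the model's options dictionary alongside the target_* configs. A sketch of an incremental model writing to Snowflake (the connection, schema, table, and referenced model names are placeholders):

```sql
-- models/orders_snowflake.sql (hypothetical)
-- Writes query results to a table in the Snowflake connection.
{{ config(
    materialized='incremental',
    incremental_strategy='insert',
    target_type='Snowflake',
    database='snowflake_connection',
    schema='target_schema',
    alias='target_table',
    options={
        'CREATE_TABLE_IF_MISSING': True,
        'RUN_INTERVAL': '5 MINUTES'
    }
) }}
SELECT * FROM {{ ref('orders_raw_data') }}
```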

Transformation options

| Option | Storage | Editable | Optional | Config Syntax |
|---|---|---|---|---|
| run_interval | s3 | False | True | 'run_interval': '<N MINUTES/HOURS/DAYS>' |
| start_from | s3 | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| end_at | s3 | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | s3 | True | True | 'compute_cluster': '<compute_cluster>' |
| comment | s3 | True | True | 'comment': '<comment>' |
| skip_validations | s3 | False | True | 'skip_validations': ('ALLOW_CARTESIAN_PRODUCT', ...) |
| skip_all_validations | s3 | False | True | 'skip_all_validations': True/False |
| aggregation_parallelism | s3 | True | True | 'aggregation_parallelism': <integer> |
| run_parallelism | s3 | True | True | 'run_parallelism': <integer> |
| file_format | s3 | False | False | 'file_format': '(type = <file_format>)' |
| compression | s3 | False | True | 'compression': 'SNAPPY/GZIP ...' |
| date_pattern | s3 | False | True | 'date_pattern': '<date_pattern>' |
| output_offset | s3 | False | True | 'output_offset': '<N MINUTES/HOURS/DAYS>' |
| run_interval | elasticsearch | False | True | 'run_interval': '<N MINUTES/HOURS/DAYS>' |
| routing_field_name | elasticsearch | True | True | 'routing_field_name': '<routing_field_name>' |
| start_from | elasticsearch | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| end_at | elasticsearch | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | elasticsearch | True | True | 'compute_cluster': '<compute_cluster>' |
| skip_validations | elasticsearch | False | True | 'skip_validations': ('ALLOW_CARTESIAN_PRODUCT', ...) |
| skip_all_validations | elasticsearch | False | True | 'skip_all_validations': True/False |
| aggregation_parallelism | elasticsearch | True | True | 'aggregation_parallelism': <integer> |
| run_parallelism | elasticsearch | True | True | 'run_parallelism': <integer> |
| bulk_max_size_bytes | elasticsearch | True | True | 'bulk_max_size_bytes': <integer> |
| index_partition_size | elasticsearch | True | True | 'index_partition_size': 'HOURLY/DAILY ...' |
| comment | elasticsearch | True | True | 'comment': '<comment>' |
| custom_insert_expressions | snowflake | True | True | 'custom_insert_expressions': {'INSERT_TIME' : 'CURRENT_TIMESTAMP()','MY_VALUE': '<value>'} |
| custom_update_expressions | snowflake | True | True | 'custom_update_expressions': {'UPDATE_TIME' : 'CURRENT_TIMESTAMP()','MY_VALUE': '<value>'} |
| keep_existing_values_when_null | snowflake | True | True | 'keep_existing_values_when_null': True/False |
| add_missing_columns | snowflake | False | True | 'add_missing_columns': True/False |
| run_interval | snowflake | False | True | 'run_interval': '<N MINUTES/HOURS/DAYS>' |
| commit_interval | snowflake | True | True | 'commit_interval': '<N MINUTE[S]/HOUR[S]/DAY[S]>' |
| start_from | snowflake | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| end_at | snowflake | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | snowflake | True | True | 'compute_cluster': '<compute_cluster>' |
| skip_validations | snowflake | False | True | 'skip_validations': ('ALLOW_CARTESIAN_PRODUCT', ...) |
| skip_all_validations | snowflake | False | True | 'skip_all_validations': True/False |
| aggregation_parallelism | snowflake | True | True | 'aggregation_parallelism': <integer> |
| run_parallelism | snowflake | True | True | 'run_parallelism': <integer> |
| comment | snowflake | True | True | 'comment': '<comment>' |
| add_missing_columns | datalake | False | True | 'add_missing_columns': True/False |
| run_interval | datalake | False | True | 'run_interval': '<N MINUTES/HOURS/DAYS>' |
| start_from | datalake | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| end_at | datalake | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | datalake | True | True | 'compute_cluster': '<compute_cluster>' |
| skip_validations | datalake | False | True | 'skip_validations': ('ALLOW_CARTESIAN_PRODUCT', ...) |
| skip_all_validations | datalake | False | True | 'skip_all_validations': True/False |
| aggregation_parallelism | datalake | True | True | 'aggregation_parallelism': <integer> |
| run_parallelism | datalake | True | True | 'run_parallelism': <integer> |
| comment | datalake | True | True | 'comment': '<comment>' |
| run_interval | redshift | False | True | 'run_interval': '<N MINUTES/HOURS/DAYS>' |
| start_from | redshift | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| end_at | redshift | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | redshift | True | True | 'compute_cluster': '<compute_cluster>' |
| skip_validations | redshift | False | True | 'skip_validations': ('ALLOW_CARTESIAN_PRODUCT', ...) |
| skip_all_validations | redshift | False | True | 'skip_all_validations': True/False |
| aggregation_parallelism | redshift | True | True | 'aggregation_parallelism': <integer> |
| run_parallelism | redshift | True | True | 'run_parallelism': <integer> |
| skip_failed_files | redshift | False | True | 'skip_failed_files': True/False |
| fail_on_write_error | redshift | False | True | 'fail_on_write_error': True/False |
| comment | redshift | True | True | 'comment': '<comment>' |
| run_interval | postgres | False | True | 'run_interval': '<N MINUTES/HOURS/DAYS>' |
| start_from | postgres | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| end_at | postgres | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | postgres | True | True | 'compute_cluster': '<compute_cluster>' |
| skip_validations | postgres | False | True | 'skip_validations': ('ALLOW_CARTESIAN_PRODUCT', ...) |
| skip_all_validations | postgres | False | True | 'skip_all_validations': True/False |
| aggregation_parallelism | postgres | True | True | 'aggregation_parallelism': <integer> |
| run_parallelism | postgres | True | True | 'run_parallelism': <integer> |
| comment | postgres | True | True | 'comment': '<comment>' |

Copy options

| Option | Storage | Category | Editable | Optional | Config Syntax |
|---|---|---|---|---|---|
| topic | kafka | source_options | False | False | 'topic': '<topic>' |
| exclude_columns | kafka | job_options | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| deduplicate_with | kafka | job_options | False | True | 'deduplicate_with': {'COLUMNS' : ['col1', 'col2'],'WINDOW': 'N HOURS'} |
| consumer_properties | kafka | job_options | True | True | 'consumer_properties': '<consumer_properties>' |
| reader_shards | kafka | job_options | True | True | 'reader_shards': <integer> |
| store_raw_data | kafka | job_options | False | True | 'store_raw_data': True/False |
| start_from | kafka | job_options | False | True | 'start_from': 'BEGINNING/NOW' |
| end_at | kafka | job_options | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | kafka | job_options | True | True | 'compute_cluster': '<compute_cluster>' |
| run_parallelism | kafka | job_options | True | True | 'run_parallelism': <integer> |
| content_type | kafka | job_options | True | True | 'content_type': 'AUTO/CSV/...' |
| compression | kafka | job_options | False | True | 'compression': 'AUTO/GZIP/...' |
| column_transformations | kafka | job_options | False | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| commit_interval | kafka | job_options | True | True | 'commit_interval': '<N MINUTE[S]/HOUR[S]/DAY[S]>' |
| skip_validations | kafka | job_options | False | True | 'skip_validations': ('MISSING_TOPIC') |
| skip_all_validations | kafka | job_options | False | True | 'skip_all_validations': True/False |
| comment | kafka | job_options | True | True | 'comment': '<comment>' |
| table_include_list | mysql | source_options | True | True | 'table_include_list': ('<regexFilter>', ...) |
| column_exclude_list | mysql | source_options | True | True | 'column_exclude_list': ('<regexFilter>', ...) |
| exclude_columns | mysql | job_options | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| column_transformations | mysql | job_options | False | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| skip_snapshots | mysql | job_options | True | True | 'skip_snapshots': True/False |
| end_at | mysql | job_options | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | mysql | job_options | True | True | 'compute_cluster': '<compute_cluster>' |
| snapshot_parallelism | mysql | job_options | True | True | 'snapshot_parallelism': <integer> |
| ddl_filters | mysql | job_options | False | True | 'ddl_filters': ('<filter>', ...) |
| comment | mysql | job_options | True | True | 'comment': '<comment>' |
| table_include_list | postgres | source_options | False | False | 'table_include_list': ('<regexFilter>', ...) |
| column_exclude_list | postgres | source_options | False | True | 'column_exclude_list': ('<regexFilter>', ...) |
| heartbeat_table | postgres | job_options | False | True | 'heartbeat_table': '<heartbeat_table>' |
| skip_snapshots | postgres | job_options | False | True | 'skip_snapshots': True/False |
| publication_name | postgres | job_options | False | False | 'publication_name': '<publication_name>' |
| end_at | postgres | job_options | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | postgres | job_options | True | True | 'compute_cluster': '<compute_cluster>' |
| comment | postgres | job_options | True | True | 'comment': '<comment>' |
| parse_json_columns | postgres | job_options | False | False | 'parse_json_columns': True/False |
| column_transformations | postgres | job_options | False | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| snapshot_parallelism | postgres | job_options | True | True | 'snapshot_parallelism': <integer> |
| exclude_columns | postgres | job_options | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| location | s3 | source_options | False | False | 'location': '<location>' |
| date_pattern | s3 | job_options | False | True | 'date_pattern': '<date_pattern>' |
| file_pattern | s3 | job_options | False | True | 'file_pattern': '<file_pattern>' |
| initial_load_pattern | s3 | job_options | False | True | 'initial_load_pattern': '<initial_load_pattern>' |
| initial_load_prefix | s3 | job_options | False | True | 'initial_load_prefix': '<initial_load_prefix>' |
| delete_files_after_load | s3 | job_options | False | True | 'delete_files_after_load': True/False |
| deduplicate_with | s3 | job_options | False | True | 'deduplicate_with': {'COLUMNS' : ['col1', 'col2'],'WINDOW': 'N HOURS'} |
| end_at | s3 | job_options | True | True | 'end_at': '<timestamp>/NOW' |
| start_from | s3 | job_options | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| compute_cluster | s3 | job_options | True | True | 'compute_cluster': '<compute_cluster>' |
| run_parallelism | s3 | job_options | True | True | 'run_parallelism': <integer> |
| content_type | s3 | job_options | True | True | 'content_type': 'AUTO/CSV...' |
| compression | s3 | job_options | False | True | 'compression': 'AUTO/GZIP...' |
| comment | s3 | job_options | True | True | 'comment': '<comment>' |
| column_transformations | s3 | job_options | False | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| commit_interval | s3 | job_options | True | True | 'commit_interval': '<N MINUTE[S]/HOUR[S]/DAY[S]>' |
| skip_validations | s3 | job_options | False | True | 'skip_validations': ('EMPTY_PATH') |
| skip_all_validations | s3 | job_options | False | True | 'skip_all_validations': True/False |
| exclude_columns | s3 | job_options | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| stream | kinesis | source_options | False | False | 'stream': '<stream>' |
| reader_shards | kinesis | job_options | True | True | 'reader_shards': <integer> |
| store_raw_data | kinesis | job_options | False | True | 'store_raw_data': True/False |
| start_from | kinesis | job_options | False | True | 'start_from': '<timestamp>/NOW/BEGINNING' |
| end_at | kinesis | job_options | False | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | kinesis | job_options | True | True | 'compute_cluster': '<compute_cluster>' |
| run_parallelism | kinesis | job_options | False | True | 'run_parallelism': <integer> |
| content_type | kinesis | job_options | True | True | 'content_type': 'AUTO/CSV...' |
| compression | kinesis | job_options | False | True | 'compression': 'AUTO/GZIP...' |
| comment | kinesis | job_options | True | True | 'comment': '<comment>' |
| column_transformations | kinesis | job_options | True | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| deduplicate_with | kinesis | job_options | False | True | 'deduplicate_with': {'COLUMNS' : ['col1', 'col2'],'WINDOW': 'N HOURS'} |
| commit_interval | kinesis | job_options | True | True | 'commit_interval': '<N MINUTE[S]/HOUR[S]/DAY[S]>' |
| skip_validations | kinesis | job_options | False | True | 'skip_validations': ('MISSING_STREAM') |
| skip_all_validations | kinesis | job_options | False | True | 'skip_all_validations': True/False |
| exclude_columns | kinesis | job_options | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| table_include_list | mssql | source_options | True | True | 'table_include_list': ('<regexFilter>', ...) |
| column_exclude_list | mssql | source_options | True | True | 'column_exclude_list': ('<regexFilter>', ...) |
| exclude_columns | mssql | job_options | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| column_transformations | mssql | job_options | False | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| skip_snapshots | mssql | job_options | True | True | 'skip_snapshots': True/False |
| end_at | mssql | job_options | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | mssql | job_options | True | True | 'compute_cluster': '<compute_cluster>' |
| snapshot_parallelism | mssql | job_options | True | True | 'snapshot_parallelism': <integer> |
| parse_json_columns | mssql | job_options | False | False | 'parse_json_columns': True/False |
| comment | mssql | job_options | True | True | 'comment': '<comment>' |
| collection_include_list | mongodb | source_options | True | True | 'collection_include_list': ('<regexFilter>', ...) |
| exclude_columns | mongodb | job_options | False | True | 'exclude_columns': ('<exclude_column>', ...) |
| column_transformations | mongodb | job_options | False | True | 'column_transformations': {'<column>' : '<expression>' , ...} |
| skip_snapshots | mongodb | job_options | True | True | 'skip_snapshots': True/False |
| end_at | mongodb | job_options | True | True | 'end_at': '<timestamp>/NOW' |
| compute_cluster | mongodb | job_options | True | True | 'compute_cluster': '<compute_cluster>' |
| snapshot_parallelism | mongodb | job_options | True | True | 'snapshot_parallelism': <integer> |
| comment | mongodb | job_options | True | True | 'comment': '<comment>' |