Skip to main content

SQL TRIM

We’ve been there: pesky blank spaces, weird, inconsistent formats, or unaccountable asterisks hiding at the end of your column value—strings are one of the most variable data types in your datasets. They likely lack a uniform casing, vary in length, and will inevitably have characters you need to trim from them.

Introducing: the SQL TRIM function, which removes the leading and trailing characters of a string. By default, it removes the blank space character from the beginning and end of a string.

How to use the SQL TRIM function

The syntax for using TRIM function looks like the following:

trim(<field_name> [, <characters_to_remove>])

Like we said earlier, the default <characters_to_remove> is a blank space, such that if you were to trim(' string with extra leading space') it would return 'string with extra leading space'. You can explicitly specify single characters or a pattern to trim from your strings.

SQL TRIM function example

select
first_name,
concat('*', first_name, '**') as test_string,
trim(test_string, '*') as back_to_first_name
from {{ ref('customers') }}
limit 3

After running this query, the resulting orders table will look like this:

first_nametest_stringback_to_first_name
Julia*Julia**Julia
Max*Max**Max
Laura*Laura**Laura

In this query, you’re adding superfluous asterisks to a string using the CONCAT function and recleaning it using the TRIM function. Even though I specified one asterisk in the TRIM function itself, it recognizes that as a pattern to remove from the beginning and end of a string, which is why the double asterisks (**) were removed from the end of the test_string column.

SQL TRIM function syntax in Snowflake, Databricks, BigQuery, and Redshift

Google BigQuery, Amazon Redshift, Snowflake, and Databricks all support the ability to use the TRIM function. In addition, the syntax to trim strings is the same across all of them using the TRIM function. These data warehouses also support the RTRIM and LTRIM functions, which allow you to only trim characters from the right side and left side of a string, respectively.

TRIM function use cases

If string values in your raw data have extra white spaces or miscellaneous characters, you’ll leverage the TRIM (and subset RTRIM AND LTRIM) functions to help you quickly remove them. You’ll likely do this cleanup in staging models, where you’re probably standardizing casing and doing other minor formatting changes to string values, so you can use a clean and consistent format across your downstream models.

0