Snowflake
Extract Parts of a String

Snowflake SUBSTRING: How to Extract Parts of a String

What it does

SUBSTRING helps you extract a specific piece of a string based on the starting position and length you define. It's perfect for pulling out relevant information from a larger text block, such as a name from a full address or a keyword from a sentence.

Syntax

You can use SUBSTR or SUBSTRING interchangeably to perform this action:


_10
SUBSTR(`<base_expr>`, `<start_expr>` [ , `<length_expr>` ])
_10
_10
SUBSTRING(`<base_expr>`, `<start_expr>` [ , `<length_expr>` ])

where

  • <base_expr>: The string you want to extract from.
  • <start_expr>: The starting position for the extraction (position counts start at 1).
  • <length_expr> (optional): How many characters to include in the extracted string. If omitted, the substring will extend to the end of the base string.

Common use cases

  • Extracting first names: If you have a database column with full names and you need to get just the first name
  • Pulling out area codes from phone numbers: If you're dealing with North American phone numbers and need to extract the area code
  • Analyzing product codes: To analyze a specific segment of a product code where, for instance, the first two characters represent the product category

Examples

Basic data extraction with SUBSTRING

In this example from "The Hitchhiker's Guide to the Galaxy," we extract fixed-length temperature data directly from structured text entries about different planets.

Input

Table: galactic_observations

observation_idplanetdata
1BetelgeuseTemperature: 450K, Atmosphere: 72% nitrogen, 21% oxygen, Gravity: 1.1G
2FrogstarTemperature: 290K, Atmosphere: 95% carbon dioxide, Gravity: 3.7G
3MagratheaTemperature: 235K, Atmosphere: 85% argon, Gravity: 0.9G

Snowflake SQL Query


_10
SELECT
_10
observation_id,
_10
planet,
_10
data,
_10
SUBSTRING(data, 14, 4) as temperature
_10
FROM
_10
galactic_observations;

Output

observation_idplanetdatatemperature
1BetelgeuseTemperature: 450K, Atmosphere: 72% nitrogen, 21% oxygen, Gravity: 1.1G450K
2FrogstarTemperature: 290K, Atmosphere: 95% carbon dioxide, Gravity: 3.7G290K
3MagratheaTemperature: 235K, Atmosphere: 85% argon, Gravity: 0.9G235K

Using SUBSTRING and POSITION functions for data extraction

This approach goes a step further by dynamically identifying where the gravity data starts in the text. It adapts to different lengths of preceding text, which is useful when dealing with varying data entries.

Input

Table: galactic_observations

observation_idplanetdata
1BetelgeuseTemperature: 450K, Atmosphere: 72% nitrogen, 21% oxygen, Gravity: 1.1G
2FrogstarTemperature: 290K, Atmosphere: 95% carbon dioxide, Gravity: 3.7G
3MagratheaTemperature: 235K, Atmosphere: 85% argon, Gravity: 0.9G

Snowflake SQL Query


_10
SELECT
_10
observation_id,
_10
planet,
_10
data,
_10
SUBSTRING(data, POSITION('Gravity: ' IN data) + 9, 4) AS gravity
_10
FROM
_10
galactic_observations;

Output

observation_idplanetdatagravity
1BetelgeuseTemperature: 450K, Atmosphere: 72% nitrogen, 21% oxygen, Gravity: 1.1G1.1G
2FrogstarTemperature: 290K, Atmosphere: 95% carbon dioxide, Gravity: 3.7G3.7G
3MagratheaTemperature: 235K, Atmosphere: 85% argon, Gravity: 0.9G0.9G

Explanation

  • Locate keyword: Uses POSITION('Gravity: ' IN data) to find where the gravity information starts within the string.
  • Adjust start: Adds 9 to the position to skip past 'Gravity: ' and target the start of the numerical gravity data.
  • Extract data: Applies SUBSTRING from the adjusted position to pull out the next 4 characters, capturing the gravity value like '1.1G'.
Y42 SQL completion user interface

Your AI copilot for Snowflake SQL

Write production-ready SQL with AI-based suggestions.

Get Started