TOKENIZE

Syntax

TOKENIZE(<string>; <regular expression for separator>[; <number>])

Description

Splits text into tokens at every separator that matches the regular expression. TOKENIZE can also return a single token if you specify its index as the optional third argument. The index is zero-based, so 1 selects the second token. If no token exists at the specified index (because the index is out of range), TOKENIZE returns null. Because this function is meant for text analytics, it doesn't return a token for empty strings.

The regular expression describes the separators, not the tokens. To extract tokens that themselves match a regular expression, use the REGEXTRACT function instead.
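
As a rough illustration of the behavior described above, the following Python sketch models TOKENIZE with the standard `re` module. It is not the actual implementation: the helper name `tokenize` is hypothetical, Python's regular expression dialect is assumed to be close enough for these simple patterns, and "no token for empty strings" is interpreted here as dropping empty tokens. The sketch doesn't attempt to reproduce the rows with consecutive separators in the Examples table.

```python
import re
from typing import List, Optional, Union


def tokenize(text: str, separator: str,
             index: Optional[int] = None) -> Union[List[str], Optional[str]]:
    """Hypothetical model of TOKENIZE, not the real implementation."""
    # Split at every match of the separator regex and drop empty tokens
    # (assumed reading of "doesn't return a token for empty strings").
    tokens = [t for t in re.split(separator, text) if t]
    if index is None:
        # No index given: return all tokens.
        return tokens
    # The index is zero-based; an out-of-range index yields None (null).
    if 0 <= index < len(tokens):
        return tokens[index]
    return None


print(tokenize("hello world", " "))        # ['hello', 'world']
print(tokenize("hello world\t2", r"\W+"))  # ['hello', 'world', '2']
print(tokenize("12-11-2006", "-", 1))      # 11
print(tokenize("12-11-2006", "-", 3))      # None
```

Returning `None` for an out-of-range index mirrors the null result described above, so a caller can tell "no such token" apart from an error.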

Examples

String	Regular expression	Number	TOKENIZE returns
hello world	" "	null	"hello" "world"
hello world\t2	\\W+	null	"hello" "world" "2"
a.c,d	[.,]	null	"a" "c" "d"
12-11-2006	-	1	11
12-11-2006	-	3	null
"a.c,,d"	"[.,]"	null	"a" "c" ",d"
"a,,c,d"	","	null	"a" ",c" "d"