Syntax

TOKENIZE(<string>; <regular expression for separator>[; <number>])

Description

Tokenizes text by separators that match the regular expression. TOKENIZE can also return a single token if you specify its index as the optional third argument. The index is zero-based, as the examples show: index 1 of 12-11-2006 split on - returns 11. If no token exists at the specified index (because the index is out of range), TOKENIZE returns null. Because this function is meant for text analytics, it doesn't return a token for empty strings.
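The behavior described above can be approximated in Python. This is only a sketch under stated assumptions, not the actual implementation: the name tokenize is hypothetical, Python's re module stands in for the regular expression engine, None stands in for null, and the index is assumed to be zero-based. It follows the description literally (split on the separator, skip empty tokens), so it returns "a" "c" "d" for inputs with consecutive separators and does not reproduce the last two examples on this page.

```python
import re
from typing import List, Optional, Union


def tokenize(text: str, separator: str,
             index: Optional[int] = None) -> Union[List[str], Optional[str]]:
    """Hypothetical stand-in for TOKENIZE, based only on the description."""
    # Split on every match of the separator regex and skip empty tokens.
    tokens = [t for t in re.split(separator, text) if t]
    if index is None:
        # No index given: return all tokens.
        return tokens
    # Assumed zero-based index; out of range yields None (standing in for null).
    if 0 <= index < len(tokens):
        return tokens[index]
    return None


print(tokenize("hello world", " "))
print(tokenize("12-11-2006", "-", 1))
print(tokenize("12-11-2006", "-", 3))
```

In Python the separator \\W+ from the examples is written as the raw string r"\W+".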

If you want to extract tokens that match a regular expression, use the REGEXTRACT function instead.
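The difference between the two approaches can be illustrated in Python. This is a rough analogy only: re.split and re.findall are used as stand-ins, the sample text is made up, and nothing here reflects the actual signature or behavior of REGEXTRACT, which this page does not describe.

```python
import re

text = "order 12, order 7"  # made-up sample input

# Tokenizing: the regular expression describes what separates the tokens.
tokens = [t for t in re.split(r"\W+", text) if t]

# Extracting: the regular expression describes the tokens themselves.
numbers = re.findall(r"\d+", text)

print(tokens)
print(numbers)
```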

Examples

| String | Regular expression | Number | TOKENIZE returns |
|---|---|---|---|
| hello world | " " | null | "hello" "world" |
| hello world\t2 | \\W+ | null | "hello" "world" "2" |
| a.c,d | [.,] | null | "a" "c" "d" |
| 12-11-2006 | - | 1 | 11 |
| 12-11-2006 | - | 3 | null |
| "a.c,,d" | "[.,]" | null | "a" "c" ",d" |
| "a,,c,d" | "," | null | "a" ",c" "d" |