Navigate to: Marketing & Site Search > Site Search
The following definitions are provided to help in understanding Znode Search.
Analyzer:
Used to give Elasticsearch detailed instructions on how the data should be indexed and stored. An analyzer combines character filters, a tokenizer, and token filters.
Tokenizer:
Used to decide how Elasticsearch takes a set of words and divides it into separate terms called “tokens”. The tokens generated depend on the type of tokenizer used and can be modified afterwards by token filters.
Character Filters:
Used to preprocess the stream of characters before it is passed to the tokenizer. A character filter receives the original text as a stream of characters and can transform the stream by adding, removing, or changing characters.
Ex: If the character mapping is defined as "- => &" and a user enters the text 3-3, it is converted to 3&3 before it is passed to the tokenizer.
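The mapping in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Znode or Elasticsearch implements it. The function name `apply_char_mapping` and the dictionary format are assumptions made for this sketch.

```python
def apply_char_mapping(text: str, mappings: dict[str, str]) -> str:
    """Rewrite the raw character stream before it reaches the tokenizer."""
    # Hypothetical helper: apply longer source sequences first so that
    # overlapping keys behave predictably.
    for source in sorted(mappings, key=len, reverse=True):
        text = text.replace(source, mappings[source])
    return text


# The mapping "- => &" from the example above.
filtered = apply_char_mapping("3-3", {"-": "&"})
print(filtered)  # 3&3
```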
Stemmer:
Used to reduce a word to its root form to ensure variants of a word match during a search.
Ex: walking and walked can be stemmed to the same root word: walk
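A deliberately naive suffix-stripping sketch shows the idea behind the example above. Real stemmers apply far more rules than this. The suffix list and the name `naive_stem` are assumptions made for illustration only.

```python
# Toy suffix list, checked in order; not a real stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters of the word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print(naive_stem("walking"))  # walk
print(naive_stem("walked"))   # walk
```

Because both variants reduce to the same root, a search for one can match documents that contain the other.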
Fuzziness:
Used to match text, strings, or entries that are approximately similar to the search keyword but not exactly the same.
Ex: If the text entered = Cat, the search will also match Mat, Bat, Rat, Sat, etc.
Ex: If the text entered = Black, the search will also match Lack, Slack, etc.
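Fuzzy matching is commonly measured as an edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one word into another. The sketch below uses plain Levenshtein distance with a limit of one edit. Elasticsearch by default also counts a swap of two adjacent characters as a single edit, which this sketch does not. The function names and the `max_edits` parameter are assumptions for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute (or keep)
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(query: str, candidates: list[str], max_edits: int = 1) -> list[str]:
    """Return candidates within max_edits of the query, ignoring case."""
    query = query.lower()
    return [c for c in candidates if edit_distance(query, c.lower()) <= max_edits]


print(fuzzy_matches("Cat", ["Mat", "Bat", "Rat", "Dog"]))      # ['Mat', 'Bat', 'Rat']
print(fuzzy_matches("Black", ["Lack", "Slack", "Blocked"]))    # ['Lack', 'Slack']
```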
Token Filter:
A token filter is an operation performed on tokens that modifies them in some way.
The following token filters are used in Elasticsearch for Znode (base):
Lowercase:
Used to change token text to lowercase.
Ex: If the text entered = “The Quick Brown Fox”, it will be converted to “the quick brown fox”
Synonyms:
Used to create a list of words that can be used as alternatives, which is saved as the synonym list. When a user enters a search keyword, it is compared with the synonym list to find matches before the search results are displayed.
Ex: If the text entered = “Brown Fox” and the synonym list has Fox = “coyote, dingo”, then the search results will display products that contain fox, coyote, or dingo
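As a rough sketch of the synonym lookup described above, the snippet below expands each search token with its entries from a synonym list. The dictionary layout and the name `expand_with_synonyms` are assumptions for illustration and do not reflect how Znode stores synonyms.

```python
# Hypothetical synonym list, keyed by lowercase term.
SYNONYMS = {"fox": ["coyote", "dingo"]}


def expand_with_synonyms(tokens: list[str], synonyms: dict[str, list[str]]) -> list[str]:
    """Keep every token and append any synonyms registered for it."""
    expanded = []
    for token in tokens:
        expanded.append(token)
        expanded.extend(synonyms.get(token, []))
    return expanded


tokens = "Brown Fox".lower().split()
print(expand_with_synonyms(tokens, SYNONYMS))  # ['brown', 'fox', 'coyote', 'dingo']
```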
Stopwords:
Used to ignore any stop words found in the search keyword. Stop words are usually words like “to”, “I”, “has”, “the”, “be”, “or”, etc. They are filler words that help sentences flow better but provide very little context on their own.
Ex: If the text entered = “The Quick Brown Fox”, it will be considered as “quick brown fox”
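The combined effect of lowercasing and stop word removal can be sketched as follows. The stop word set contains only the words listed above, and splitting on whitespace is a simplifying assumption rather than the tokenizer Elasticsearch would use.

```python
# Stop words taken from the examples above; real lists are longer.
STOP_WORDS = {"to", "i", "has", "the", "be", "or"}


def remove_stop_words(text: str) -> list[str]:
    """Lowercase the text, split on whitespace, and drop stop words."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


print(remove_stop_words("The Quick Brown Fox"))  # ['quick', 'brown', 'fox']
```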
Shingle:
Used when tokens are to be generated by concatenating adjacent tokens. Shingles are generally used to help speed up phrase queries.
Ex: If the text entered = “The Quick Brown Fox”, then the tokens generated would be “the”, “the quick”, “quick”, “quick brown”, “brown”, “brown fox”, and “fox”
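A minimal sketch of shingle generation is shown below, assuming the input has already been lowercased and split into words, and that single-word tokens are kept alongside two-word shingles. The function name `shingles` and the `max_size` parameter are illustrative assumptions.

```python
def shingles(tokens: list[str], max_size: int = 2) -> list[str]:
    """Emit each token followed by the shingles that start at that token."""
    result = []
    for start in range(len(tokens)):
        for size in range(1, max_size + 1):
            if start + size <= len(tokens):
                result.append(" ".join(tokens[start:start + size]))
    return result


tokens = "The Quick Brown Fox".lower().split()
print(shingles(tokens))
# ['the', 'the quick', 'quick', 'quick brown', 'brown', 'brown fox', 'fox']
```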
Ngram:
Used to break each word into short sequences of adjacent characters. This is helpful for creating tokens (sets of characters) that allow partial-word matches to appear in the search results.
Ex: If the text entered = “Quick fox”, then the tokens generated would be [ Q, Qu, u, ui, i, ic, c, ck, k, f, fo, o, ox, x ]
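The token list in the example above corresponds to character n-grams of length 1 and 2 taken from each word separately. A minimal sketch, assuming whitespace splitting and those two lengths (the names `char_ngrams`, `ngram_tokens`, `min_gram`, and `max_gram` are chosen for illustration):

```python
def char_ngrams(token: str, min_gram: int = 1, max_gram: int = 2) -> list[str]:
    """Return every substring of length min_gram..max_gram, ordered by start position."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                grams.append(token[start:start + size])
    return grams


def ngram_tokens(text: str) -> list[str]:
    """Split on whitespace, then build n-grams for each word separately."""
    grams = []
    for token in text.split():
        grams.extend(char_ngrams(token))
    return grams


print(ngram_tokens("Quick fox"))
# ['Q', 'Qu', 'u', 'ui', 'i', 'ic', 'c', 'ck', 'k', 'f', 'fo', 'o', 'ox', 'x']
```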