Digits
This module contains the parser that converts the string representation of a sequence of digits into the corresponding sequence of digits. The input may use English cardinal number words, along with some homophones. Numbers from twenty through ninety-nine can be hyphenated or unhyphenated; unhyphenated numbers are joined automatically. Accepting unhyphenated numbers introduces ambiguity. For example, “sixty five thousand” could be parsed as “605000” or “65000”. The parser outputs the latter. However, this can be an issue with values such as “sixty five thousand one”, which parses as “650001”. This limitation should be acceptable for most multi-digit use cases, such as telephone numbers and social security numbers.
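The joining behavior described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name `parse_digits`, the homophone list, and the treatment of “hundred” and “thousand” as appended zeros are all assumptions chosen to reproduce the two examples in the text.

```python
# Hypothetical sketch of digit-sequence parsing; names and tables are assumptions.
UNITS = {
    "zero": "0", "oh": "0", "one": "1", "won": "1", "two": "2", "to": "2",
    "too": "2", "three": "3", "four": "4", "for": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "ate": "8", "nine": "9",
}
TEENS = {
    "ten": "10", "eleven": "11", "twelve": "12", "thirteen": "13",
    "fourteen": "14", "fifteen": "15", "sixteen": "16", "seventeen": "17",
    "eighteen": "18", "nineteen": "19",
}
TENS = {
    "twenty": "2", "thirty": "3", "forty": "4", "fifty": "5",
    "sixty": "6", "seventy": "7", "eighty": "8", "ninety": "9",
}
# Assumption: multipliers simply append zeros to the digit string.
MULTIPLIERS = {"hundred": "00", "thousand": "000"}


def parse_digits(text):
    """Convert number words into a string of digits."""
    # Treat "sixty-five" and "sixty five" identically.
    tokens = text.lower().replace("-", " ").split()
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in TENS:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            # Join a tens word with a following non-zero unit: "sixty five" -> "65".
            if nxt in UNITS and UNITS[nxt] != "0":
                out.append(TENS[tok] + UNITS[nxt])
                i += 2
                continue
            out.append(TENS[tok] + "0")
        elif tok in UNITS:
            out.append(UNITS[tok])
        elif tok in TEENS:
            out.append(TEENS[tok])
        elif tok in MULTIPLIERS:
            out.append(MULTIPLIERS[tok])
        else:
            raise ValueError(f"unrecognized token: {tok!r}")
        i += 1
    return "".join(out)
```

With this sketch, `parse_digits("sixty five thousand")` returns `"65000"` and `parse_digits("sixty five thousand one")` returns `"650001"`, matching the behavior described above. The result is kept as a string so that leading zeros in a sequence survive.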
Entity
This module contains the logic to parse entities from NLU results. The entity parser is a pass-through for string values, which leaves room for custom logic to resolve the entities. For example, the entity can be used as a keyword in a database search.
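As a rough illustration of the database-search use case, the sketch below passes the entity string through unchanged and uses it as a search keyword. The names `parse_entity` and `search_products`, and the `products` table, are hypothetical and exist only for this example; only the standard-library `sqlite3` module is used.

```python
import sqlite3


def parse_entity(value):
    """Pass the raw entity string through unchanged (assumed behavior)."""
    return value


def search_products(connection, keyword):
    """Custom resolution step: use the entity as a keyword in a LIKE search."""
    rows = connection.execute(
        "SELECT name FROM products WHERE name LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical in-memory table used only to demonstrate the lookup.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE products (name TEXT)")
connection.executemany(
    "INSERT INTO products VALUES (?)",
    [("desk lamp",), ("floor lamp",), ("kettle",)],
)

matches = search_products(connection, parse_entity("lamp"))
```

The keyword is bound as a query parameter rather than interpolated into the SQL string, which matters because the entity text comes from user input.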
Integer
This module contains the logic to parse integers from NLU results. Integers can be given as words (e.g. one, two, three) or as numerals (e.g. 1, 2, 3). Either form resolves to Python’s built-in ‘int’ type. The metadata must contain a range key that holds the minimum and maximum values of the expected integer range. It is important to note the difference between digits and integers. Integers are counting numbers: 2 apples, a table for two. In contrast, digits are used for sequences of numbers such as phone numbers or social security numbers.
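A minimal sketch of range-checked integer parsing is shown below. The function name `parse_integer`, the `minimum` and `maximum` key names inside `range`, the small word table, and the choice to raise `ValueError` for out-of-range values are all assumptions; the text above only states that the metadata must carry a range with a minimum and a maximum.

```python
# Hypothetical word table; a real parser would likely cover a larger vocabulary.
WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12,
}


def parse_integer(value, metadata):
    """Resolve a word or numeral to an int and check it against the range."""
    # Assumed metadata layout: {"range": {"minimum": ..., "maximum": ...}}
    low = metadata["range"]["minimum"]
    high = metadata["range"]["maximum"]
    token = value.strip().lower()
    # int() raises ValueError for text that is neither a known word nor a numeral.
    number = WORDS[token] if token in WORDS else int(token)
    if not low <= number <= high:
        raise ValueError(f"{number} is outside the range {low} to {high}")
    return number
```

For a request such as “a table for two”, an upstream NLU step would supply the value `"two"`, and `parse_integer("two", {"range": {"minimum": 1, "maximum": 10}})` would return the int `2`.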
Selset
This module contains the logic to parse selsets from NLU results. A selset contains a name along with one or more aliases, which makes it possible to map any of the listed aliases to a single word. For example, if a selset’s name is “light” and its aliases are “bulbs”, “light”, “beam”, “lamp”, etc., occurrences of any alias will be parsed as “light”.
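The alias mapping can be sketched as a simple lookup. The function name `parse_selset` and the list-of-dicts layout with `name` and `aliases` keys are assumptions made for illustration, and the sketch matches the whole value exactly rather than searching inside a longer utterance.

```python
def parse_selset(value, selections):
    """Return the selset name whose name or aliases match the value, else None."""
    normalized = value.strip().lower()
    for selection in selections:
        # The name itself counts as a match, alongside every listed alias.
        candidates = [selection["name"], *selection["aliases"]]
        if normalized in (candidate.lower() for candidate in candidates):
            return selection["name"]
    return None


# Hypothetical selset definitions mirroring the "light" example.
SELECTIONS = [
    {"name": "light", "aliases": ["bulbs", "light", "beam", "lamp"]},
    {"name": "fan", "aliases": ["fan", "blower"]},
]
```

Here `parse_selset("lamp", SELECTIONS)` and `parse_selset("Bulbs", SELECTIONS)` both return `"light"`, while a value that matches no alias returns `None`.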