Class JaspellTernarySearchTrie

  • All Implemented Interfaces:
    org.apache.lucene.util.Accountable

    @Deprecated
    public class JaspellTernarySearchTrie
    extends Object
    implements org.apache.lucene.util.Accountable
    Deprecated.
    Migrate to one of the newer suggesters which are much more RAM efficient.
    Implementation of a Ternary Search Trie, a data structure for storing String objects that combines the compact size of a binary search tree with the speed of a digital search trie, and is therefore ideal for practical use in sorting and searching data.

    This data structure is faster than hashing for many typical search problems, and supports a broader range of useful problems and operations. Ternary searches are faster than hashing and more powerful, too.

    The theory of ternary search trees was described at a symposium in 1997 (see "Fast Algorithms for Sorting and Searching Strings," by J.L. Bentley and R. Sedgewick, Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, January 1997). Algorithms in C, Third Edition, by Robert Sedgewick (Addison-Wesley, 1998) provides yet another view of ternary search trees.

    • Constructor Detail

      • JaspellTernarySearchTrie

        public JaspellTernarySearchTrie()
        Deprecated.
        Constructs an empty Ternary Search Trie.
      • JaspellTernarySearchTrie

        public JaspellTernarySearchTrie​(Locale locale)
        Deprecated.
        Constructs an empty Ternary Search Trie, specifying the Locale used for lowercasing.
      • JaspellTernarySearchTrie

        public JaspellTernarySearchTrie​(Path file)
                                 throws IOException
        Deprecated.
        Constructs a Ternary Search Trie and loads data from a Path into the Trie. The file is a normal text document, where each line is of the form word TAB float.
        Parameters:
        file - The Path with the data to load into the Trie.
        Throws:
        IOException - A problem occurred while reading the data.
      • JaspellTernarySearchTrie

        public JaspellTernarySearchTrie​(Path file,
                                        boolean compression)
                                 throws IOException
        Deprecated.
        Constructs a Ternary Search Trie and loads data from a File into the Trie. The file is a normal text document, where each line is of the form "word TAB float".
        Parameters:
        file - The File with the data to load into the Trie.
        compression - If true, the file is compressed with the GZIP algorithm, and if false, the file is a normal text document.
        Throws:
        IOException - A problem occurred while reading the data.
    • Method Detail

      • get

        public Object get​(CharSequence key)
        Deprecated.
        Retrieve the object indexed by a key.
        Parameters:
        key - A String index.
        Returns:
        The object retrieved from the Ternary Search Trie.
      • getAndIncrement

        public Float getAndIncrement​(String key)
        Deprecated.
        Retrieve the Float indexed by key, increment it by one unit and store the new Float.
        Parameters:
        key - A String index.
        Returns:
        The Float retrieved from the Ternary Search Trie.
      • getKey

        protected String getKey​(JaspellTernarySearchTrie.TSTNode node)
        Deprecated.
        Returns the key that indexes the node argument.
        Parameters:
        node - The node whose index is to be calculated.
        Returns:
        The String that indexes the node argument.
      • getNode

        public JaspellTernarySearchTrie.TSTNode getNode​(CharSequence key)
        Deprecated.
        Returns the node indexed by key, or null if that node doesn't exist. Search begins at root node.
        Parameters:
        key - A String that indexes the node that is returned.
        Returns:
        The node object indexed by key. This object is an instance of an inner class named TernarySearchTrie.TSTNode.
      • getNode

        protected JaspellTernarySearchTrie.TSTNode getNode​(CharSequence key,
                                                           JaspellTernarySearchTrie.TSTNode startNode)
        Deprecated.
        Returns the node indexed by key, or null if that node doesn't exist. The search begins at root node.
        Parameters:
        key - A String that indexes the node that is returned.
        startNode - The top node defining the subtrie to be searched.
        Returns:
        The node object indexed by key. This object is an instance of an inner class named TernarySearchTrie.TSTNode.
      • matchAlmost

        public List<String> matchAlmost​(String key)
        Deprecated.
        Returns a List of keys that almost match the argument key. Keys returned will have exactly diff characters that do not match the target key, where diff is equal to the last value passed in as an argument to the setMatchAlmostDiff method.

        If the matchAlmost method is called before the setMatchAlmostDiff method has been called for the first time, then diff = 0.

        Parameters:
        key - The target key.
        Returns:
        A List with the results.
      • matchAlmost

        public List<String> matchAlmost​(CharSequence key,
                                        int numReturnValues)
        Deprecated.
        Returns a List of keys that almost match the argument key. Keys returned will have exactly diff characters that do not match the target key, where diff is equal to the last value passed in as an argument to the setMatchAlmostDiff method.

        If the matchAlmost method is called before the setMatchAlmostDiff method has been called for the first time, then diff = 0.

        Parameters:
        key - The target key.
        numReturnValues - The maximum number of values returned by this method.
        Returns:
        A List with the results
      • matchPrefix

        public List<String> matchPrefix​(String prefix)
        Deprecated.
        Returns an alphabetical List of all keys in the trie that begin with a given prefix. Only keys for nodes having non-null data are included in the List.
        Parameters:
        prefix - Each key returned from this method will begin with the characters in prefix.
        Returns:
        A List with the results.
      • matchPrefix

        public List<String> matchPrefix​(CharSequence prefix,
                                        int numReturnValues)
        Deprecated.
        Returns an alphabetical List of all keys in the trie that begin with a given prefix. Only keys for nodes having non-null data are included in the List.
        Parameters:
        prefix - Each key returned from this method will begin with the characters in prefix.
        numReturnValues - The maximum number of values returned from this method.
        Returns:
        A List with the results
      • numDataNodes

        public int numDataNodes()
        Deprecated.
        Returns the number of nodes in the trie that have non-null data.
        Returns:
        The number of nodes in the trie that have non-null data.
      • numDataNodes

        protected int numDataNodes​(JaspellTernarySearchTrie.TSTNode startingNode)
        Deprecated.
        Returns the number of nodes in the subtrie below and including the starting node. The method counts only nodes that have non-null data.
        Parameters:
        startingNode - The top node of the subtrie. the node that defines the subtrie.
        Returns:
        The total number of nodes in the subtrie.
      • numNodes

        public int numNodes()
        Deprecated.
        Returns the total number of nodes in the trie. The method counts nodes whether or not they have data.
        Returns:
        The total number of nodes in the trie.
      • numNodes

        protected int numNodes​(JaspellTernarySearchTrie.TSTNode startingNode)
        Deprecated.
        Returns the total number of nodes in the subtrie below and including the starting Node. The method counts nodes whether or not they have data.
        Parameters:
        startingNode - The top node of the subtrie. The node that defines the subtrie.
        Returns:
        The total number of nodes in the subtrie.
      • put

        public void put​(CharSequence key,
                        Object value)
        Deprecated.
        Stores a value in the trie. The value may be retrieved using the key.
        Parameters:
        key - A String that indexes the object to be stored.
        value - The object to be stored in the Trie.
      • remove

        public void remove​(String key)
        Deprecated.
        Removes the value indexed by key. Also removes all nodes that are rendered unnecessary by the removal of this data.
        Parameters:
        key - A string that indexes the object to be removed from the Trie.
      • setMatchAlmostDiff

        public void setMatchAlmostDiff​(int diff)
        Deprecated.
        Sets the number of characters by which words can differ from target word when calling the matchAlmost method.

        Arguments less than 0 will set the char difference to 0, and arguments greater than 3 will set the char difference to 3.

        Parameters:
        diff - The number of characters by which words can differ from target word.
      • setNumReturnValues

        public void setNumReturnValues​(int num)
        Deprecated.
        Sets the default maximum number of values returned from the matchPrefix and matchAlmost methods.

        The value should be set this to -1 to get an unlimited number of return values. note that the methods mentioned above provide overloaded versions that allow you to specify the maximum number of return values, in which case this value is temporarily overridden.

        *@param num The number of values that will be returned when calling the methods above.

      • sortKeys

        protected List<String> sortKeys​(JaspellTernarySearchTrie.TSTNode startNode,
                                        int numReturnValues)
        Deprecated.
        Returns keys sorted in alphabetical order. This includes the start Node and all nodes connected to the start Node.

        The number of keys returned is limited to numReturnValues. To get a list that isn't limited in size, set numReturnValues to -1.

        Parameters:
        startNode - The top node defining the subtrie to be searched.
        numReturnValues - The maximum number of values returned from this method.
        Returns:
        A List with the results.
      • ramBytesUsed

        public long ramBytesUsed()
        Deprecated.
        Specified by:
        ramBytesUsed in interface org.apache.lucene.util.Accountable