The present invention provides a computer-readable medium and system for selecting a set of n-grams for indexing string data in a DBMS system. Aspects of the invention include providing a set of candidate n-grams, each n-gram comprising a sequence of characters; identifying sample queries having character...

http://www.google.de/patents/US7478081?utm_source=gb-gplus-share

Patent US7478081 - Selection of a set of optimal n-grams for indexing string data in a DBMS system under space constraints introduced by the system
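
To make the phrase "candidate n-grams, each n-gram comprising a sequence of characters" concrete, here is a minimal sketch of one way such candidates could be enumerated from string data. The excerpt above is truncated and does not say how the candidates are produced, so the function name `candidate_ngrams`, the sliding-window approach, and the sample strings are all assumptions for illustration, not the patented method.

```python
def candidate_ngrams(strings, n):
    """Collect every distinct run of n consecutive characters in the given strings.

    Hypothetical helper: the source does not specify how candidates are generated.
    """
    grams = set()
    for s in strings:
        # Slide a window of width n across the string; strings shorter than n yield nothing.
        for i in range(len(s) - n + 1):
            grams.add(s[i:i + n])
    return grams


# Illustrative input only; any collection of strings works.
candidates = candidate_ngrams(["database", "data"], 3)
```

Under the space constraints named in the patent title, only a subset of such candidates would be kept for the index; how that subset is chosen is not recoverable from this excerpt.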