Enhances the original implementation in Java.
Written with reference to a C++ implementation (suffix tree on one string).
All-in-one header: `SuffixTree.h`.
- `put(list, value)`: adds a `list` associated with a `value`. `value` will be returned at later retrievals.
- `search(sub-list)`: returns a `std::set` of the `value`s of the lists containing `sub-list`.
More examples in `main.cpp`.
```cpp
vector<string> words = {"qwe", "rtyr", "uio", "pas", "dfg", "hjk", "lzx", "cvb", "bnm"};
SuffixTree<string, int> tree;
for (int idx = 0; idx < static_cast<int>(words.size()); idx++) {
    // Putting `words[idx]` into the SuffixTree makes it a candidate for future searches.
    // If `words[idx]` is part of a search result, SuffixTree will return `idx`
    // (the 2nd param) as this word's identifier.
    tree.put(words[idx], idx);
}
for (int idx = 0; idx < static_cast<int>(words.size()); idx++) {
    // Search for a two-character substring taken from the word itself.
    set<int> result = tree.search(words[idx].substr(1, 2));
    // found!
    assert(result.find(idx) != result.end());
}
```
- Besides searching on strings, this template allows searching on any other type of list / array / ... (a container that stores objects of the same type in a linear arrangement), if:
  - Typenames `value_type` (type of list elements), `size_type` and `const_iterator` are public.
  - `begin` and `end` iterators meet `LegacyInputIterator` at minimum.

  This means you can search on C++ std containers like `std::vector` and `std::list`. Other std containers may be applicable as well, but I haven't checked.
- If you use a container other than string, the element type must satisfy:
  - `<` operator is defined (so that Suffix Tree can operate on it)
- If you use an arbitrary type (other than index integers) as identifier, the type must satisfy:
  - `<` operator is defined (so that it can be put into `std::set`)
- DO NOT DESTROY the lists while the tree still refers to them. The tree stores only their begin and end iterators, not copies.
- Requires C++11 at minimum.
- This template was originally created to help perform search queries in a dictionary.