Over the last few weeks I have been trying to figure out how to efficiently find a string pattern within another string.
I found out that, for a long time, the most efficient approach was to use a suffix tree. However, since that data structure is very expensive in terms of space, I looked further into suffix arrays, which use far less space. Papers such as "Suffix Arrays: A new method for on-line string searches" (Manber & Myers, 1993) state that searching for a substring can be done in O(P + log(N)) time (where P is the length of the pattern and N is the length of the string) by using binary search on a suffix array together with LCP arrays.
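For context, here is a minimal sketch of the plain binary search over a suffix array, without any LCP information. It runs in O(P log(N)) because every probe may compare up to P characters; this is the baseline that the LCP-based search improves on. The function names are my own, and the suffix array is built naively for illustration only, not the way the paper builds it.

```python
def build_suffix_array(text):
    # Naive construction for illustration: sort suffix start positions
    # by the suffix they denote. Far too slow for large inputs.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrence(text, sa, pattern):
    # Lower-bound binary search: find the first suffix whose first
    # len(pattern) characters are not smaller than the pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        start = sa[mid]
        if text[start:start + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    # If the pattern occurs at all, it is a prefix of that suffix.
    if lo < len(sa) and text.startswith(pattern, sa[lo]):
        return sa[lo]
    return -1
```

Calling `find_occurrence(text, build_suffix_array(text), pattern)` returns the start position of one occurrence, or -1 if there is none.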
I studied that paper in particular to understand the search algorithm. This answer did a great job of helping me understand the algorithm (and incidentally made it into the LCP Wikipedia page).
But I am still looking for a way to implement this algorithm. In particular, the construction of the LCP-LR arrays it mentions seems very complicated.
References:
Manber & Myers, 1993: Manber, Udi; Myers, Gene. "Suffix Arrays: A new method for on-line string searches". SIAM Journal on Computing, 1993, Vol. 22(5), pp. 935-948, http://epubs.siam.org/doi/pdf/10.1137/0222058
UPDATE 1
Just to emphasize what I am interested in: I understand LCP arrays and I have found ways to implement them. However, the "plain" LCP array is not sufficient for efficient pattern matching (as described in the reference). Thus I am interested in implementing LCP-LR arrays, which seems much more complicated than just implementing an LCP array.
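To make concrete what I mean by LCP-LR arrays, here is a rough sketch of how I currently understand them, which may well be wrong or incomplete. It assumes that lcp[i] holds the common prefix length of the suffixes at sorted positions i-1 and i, that the binary search runs over intervals (L, R) with midpoint M = (L + R) // 2, and that llcp[M] and rlcp[M] store the LCP of the suffix at M with the suffixes at L and R respectively. Since the LCP of two sorted positions is the minimum of the adjacent LCP values between them, the values can be filled in by recursing over the same intervals the search would visit. All names are my own, and the naive helpers are for illustration only.

```python
def build_suffix_array(text):
    # Naive construction, for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    # lcp[i] = length of the common prefix of the suffixes at sorted
    # positions i-1 and i; lcp[0] is unused and left at 0.
    n = len(text)
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b = sa[i - 1], sa[i]
        k = 0
        while a + k < n and b + k < n and text[a + k] == text[b + k]:
            k += 1
        lcp[i] = k
    return lcp


def build_lcp_lr(lcp):
    # Assumed convention: for a search interval (left, right) with
    # midpoint mid, llcp[mid] = LCP(left, mid), rlcp[mid] = LCP(mid, right).
    n = len(lcp)
    llcp = [0] * n
    rlcp = [0] * n

    def fill(left, right):
        # Returns min(lcp[left+1 .. right]), i.e. the LCP of the
        # suffixes at sorted positions left and right.
        if right - left == 1:
            return lcp[right]
        mid = (left + right) // 2
        llcp[mid] = fill(left, mid)
        rlcp[mid] = fill(mid, right)
        return min(llcp[mid], rlcp[mid])

    if n >= 2:
        fill(0, n - 1)
    return llcp, rlcp
```

Every interior position is the midpoint of exactly one interval, so this fills each entry once in O(N) total, given the plain LCP array. What I am still missing is how to use these two arrays correctly inside the search itself.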
UPDATE 2
Added link to referenced paper