What algorithm is used for finding ngrams?
Suppose my input is an array of words plus the size of the n-grams I want to find. What algorithm should I use?
I'm asking for code, preferably in R. The data is stored in a database, so a PL/pgSQL function would work too. Java is the language I know best, so I can "translate" a Java solution into another language.
I'm not being lazy. I'm only asking for code because I don't want to reinvent the wheel by writing an algorithm that already exists.
Edit: it's important to know how many times each n-gram appears.
Edit 2: is there an R package for n-grams?
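To make the expected input and output concrete, here is a minimal sketch of the usual sliding-window approach, written in Python only for illustration (not R, PL/pgSQL, or Java). The function name `count_ngrams` and the sample sentence are my own assumptions, not taken from any library.

```python
from collections import Counter


def count_ngrams(words, n):
    """Count every run of n consecutive words in `words`.

    `count_ngrams` is a hypothetical helper name used for this sketch.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Build n staggered views of the list (offsets 0 .. n-1); zipping them
    # yields each window of n adjacent words as a tuple. zip stops at the
    # shortest view, so a list shorter than n produces no windows.
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(windows)


# Made-up sample input: an array of words and n = 2.
words = "to be or not to be".split()
bigrams = count_ngrams(words, 2)
```

The result maps each n-gram (as a tuple of words) to the number of times it occurs, which is the count asked for in the first edit. A list of length m yields m - n + 1 windows, so the work is linear in the number of words for a fixed n.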