This article collects typical usage examples of the Java class org.apache.lucene.analysis.util.CharArrayMap. If you have been wondering what CharArrayMap is for, how to use it, or want to see it in real code, the curated class examples below may help.
The CharArrayMap class belongs to the org.apache.lucene.analysis.util package. 16 code examples of the class are shown below, ordered by popularity by default. You can upvote the examples you like or find useful; your feedback helps the system recommend better Java code examples.
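Before the examples, it may help to see what CharArrayMap does conceptually: it maps values keyed by a region of a char[] buffer (with optional case-insensitivity), so a token inside an analysis buffer can be looked up without allocating a String per probe. The following is a minimal, self-contained sketch of that idea in plain Java; it is not Lucene's implementation, and the class and method names here are illustrative only (Lucene's real class additionally avoids the String allocation via its own open-addressed hash table).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch of the idea behind Lucene's CharArrayMap:
// look up a value by a slice of a char[] buffer, optionally ignoring case.
public class CharSliceMapSketch {
  private final Map<String, Integer> map = new HashMap<>();
  private final boolean ignoreCase;

  public CharSliceMapSketch(boolean ignoreCase) {
    this.ignoreCase = ignoreCase;
  }

  private String normalize(char[] buf, int off, int len) {
    String s = new String(buf, off, len);
    return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
  }

  public void put(String key, int value) {
    map.put(ignoreCase ? key.toLowerCase(Locale.ROOT) : key, value);
  }

  // Mirrors CharArrayMap's get(char[], int, int) style of lookup.
  public Integer get(char[] buf, int off, int len) {
    return map.get(normalize(buf, off, len));
  }

  public static void main(String[] args) {
    CharSliceMapSketch m = new CharSliceMapSketch(true);
    m.put("Test", 1);
    char[] token = "xxTESTxx".toCharArray();
    System.out.println(m.get(token, 2, 4)); // prints 1: case-insensitive slice lookup
  }
}
```

Examples 7 and 12 below exercise exactly this contract on the real class: the same value must be reachable via the char[] key, an offset/length slice, and the equivalent String key.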
Example 1: convertPhraseSet
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

private CharArrayMap<CharArraySet> convertPhraseSet(CharArraySet phraseSet) {
  CharArrayMap<CharArraySet> phraseMap = new CharArrayMap<CharArraySet>(100, false);
  Iterator<Object> phraseIt = phraseSet.iterator();
  while (phraseIt != null && phraseIt.hasNext()) {
    char[] phrase = (char[]) phraseIt.next();
    Log.debug("'" + new String(phrase) + "'");
    char[] firstTerm = getFirstTerm(phrase);
    Log.debug("'" + new String(firstTerm) + "'");
    CharArraySet itsPhrases = phraseMap.get(firstTerm, 0, firstTerm.length);
    if (itsPhrases == null) {
      itsPhrases = new CharArraySet(5, false);
      phraseMap.put(new String(firstTerm), itsPhrases);
    }
    itsPhrases.add(phrase);
  }
  return phraseMap;
}
Developer: lucidworks | Project: auto-phrase-tokenfilter | Lines: 23 | Source: AutoPhrasingTokenFilter.java
Example 2: readAffixFile
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

/**
 * Reads the affix file through the provided InputStream, building up the prefix and suffix maps
 *
 * @param affixStream InputStream to read the content of the affix file from
 * @param decoder CharsetDecoder to decode the content of the file
 * @param strict whether problems while parsing affix rules should be treated as errors
 * @throws IOException Can be thrown while reading from the InputStream
 * @throws ParseException Can be thrown if the affix file content is malformed
 */
private void readAffixFile(InputStream affixStream, CharsetDecoder decoder, boolean strict) throws IOException, ParseException {
  prefixes = new CharArrayMap<List<HunspellAffix>>(version, 8, ignoreCase);
  suffixes = new CharArrayMap<List<HunspellAffix>>(version, 8, ignoreCase);
  LineNumberReader reader = new LineNumberReader(new InputStreamReader(affixStream, decoder));
  String line = null;
  while ((line = reader.readLine()) != null) {
    if (line.startsWith(ALIAS_KEY)) {
      parseAlias(line);
    } else if (line.startsWith(PREFIX_KEY)) {
      parseAffix(prefixes, line, reader, PREFIX_CONDITION_REGEX_PATTERN, strict);
    } else if (line.startsWith(SUFFIX_KEY)) {
      parseAffix(suffixes, line, reader, SUFFIX_CONDITION_REGEX_PATTERN, strict);
    } else if (line.startsWith(FLAG_KEY)) {
      // Assume that the FLAG line comes before any prefixes or suffixes
      // Store the strategy so it can be used when parsing the dic file
      flagParsingStrategy = getFlagParsingStrategy(line);
    }
  }
}
Developer: pkarmstr | Project: NYBC | Lines: 28 | Source: HunspellDictionary.java
Example 3: DutchAnalyzer
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

public DutchAnalyzer(Version matchVersion, CharArraySet stopwords, CharArraySet stemExclusionTable, CharArrayMap<String> stemOverrideDict) {
  this.matchVersion = matchVersion;
  this.stoptable = CharArraySet.unmodifiableSet(CharArraySet.copy(matchVersion, stopwords));
  this.excltable = CharArraySet.unmodifiableSet(CharArraySet.copy(matchVersion, stemExclusionTable));
  if (stemOverrideDict.isEmpty() || !matchVersion.onOrAfter(Version.LUCENE_31)) {
    this.stemdict = null;
    this.origStemdict = CharArrayMap.unmodifiableMap(CharArrayMap.copy(matchVersion, stemOverrideDict));
  } else {
    this.origStemdict = null;
    // we don't need to ignore case here since we lowercase in this analyzer anyway
    StemmerOverrideFilter.Builder builder = new StemmerOverrideFilter.Builder(false);
    CharArrayMap<String>.EntryIterator iter = stemOverrideDict.entrySet().iterator();
    CharsRef spare = new CharsRef();
    while (iter.hasNext()) {
      char[] nextKey = iter.nextKey();
      spare.copyChars(nextKey, 0, nextKey.length);
      builder.add(spare, iter.currentValue());
    }
    try {
      this.stemdict = builder.build();
    } catch (IOException ex) {
      throw new RuntimeException("can not build stem dict", ex);
    }
  }
}
Developer: yintaoxue | Project: read-open-source-code | Lines: 26 | Source: DutchAnalyzer.java
Example 4: DutchAnalyzer
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

/**
 * @deprecated Use {@link #DutchAnalyzer(CharArraySet)}
 */
@Deprecated
public DutchAnalyzer(Version matchVersion, CharArraySet stopwords) {
  // historically, this ctor never initialized the stem dict,
  // so we populate it only for >= 3.6
  this(matchVersion, stopwords, CharArraySet.EMPTY_SET,
      matchVersion.onOrAfter(Version.LUCENE_3_6)
          ? DefaultSetHolder.DEFAULT_STEM_DICT
          : CharArrayMap.<String>emptyMap());
}
Developer: lamsfoundation | Project: lams | Lines: 13 | Source: DutchAnalyzer.java
Example 5: add
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

/**
 * @param singleMatch List<String>, the sequence of strings to match
 * @param replacement List<Token> the list of tokens to use on a match
 * @param includeOrig sets a flag on this mapping signaling the generation of matched tokens in addition to the replacement tokens
 * @param mergeExisting merge the replacement tokens with any other mappings that exist
 */
public void add(List<String> singleMatch, List<Token> replacement, boolean includeOrig, boolean mergeExisting) {
  SlowSynonymMap currMap = this;
  for (String str : singleMatch) {
    if (currMap.submap == null) {
      // for now hardcode at 4.0, as it's what the old code did.
      // would be nice to fix, but shouldn't store a version in each submap!
      currMap.submap = new CharArrayMap<>(Version.LUCENE_CURRENT, 1, ignoreCase());
    }
    SlowSynonymMap map = currMap.submap.get(str);
    if (map == null) {
      map = new SlowSynonymMap();
      map.flags |= flags & IGNORE_CASE;
      currMap.submap.put(str, map);
    }
    currMap = map;
  }
  if (currMap.synonyms != null && !mergeExisting) {
    throw new IllegalArgumentException("SynonymFilter: there is already a mapping for " + singleMatch);
  }
  List<Token> superset = currMap.synonyms == null ? replacement :
      mergeTokens(Arrays.asList(currMap.synonyms), replacement);
  currMap.synonyms = superset.toArray(new Token[superset.size()]);
  if (includeOrig) currMap.flags |= INCLUDE_ORIG;
}
Developer: lamsfoundation | Project: lams | Lines: 34 | Source: SlowSynonymMap.java
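The add method above (repeated as Examples 11 and 16 from other projects) uses CharArrayMap to build a token-level trie: each node's submap takes the next token of a multi-word phrase to a child node, and the node reached by the last token carries the synonym payload. The following self-contained sketch shows that structure in plain Java; it is illustrative only (class and method names are invented, and it uses String keys and a plain HashMap where Lucene uses CharArrayMap and Token payloads).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch (not Lucene code) of the token-trie idea behind
// SlowSynonymMap.add(): each node maps the next token of a phrase to a
// child node; the node reached by the last token stores the replacement.
public class TokenTrieSketch {
  private final Map<String, TokenTrieSketch> submap = new HashMap<>();
  private List<String> synonyms;

  public void add(List<String> match, List<String> replacement) {
    TokenTrieSketch curr = this;
    for (String token : match) {
      // walk/extend the trie one token at a time
      curr = curr.submap.computeIfAbsent(token, t -> new TokenTrieSketch());
    }
    curr.synonyms = replacement; // terminal node holds the mapping
  }

  public List<String> lookup(List<String> tokens) {
    TokenTrieSketch curr = this;
    for (String token : tokens) {
      curr = curr.submap.get(token);
      if (curr == null) return null; // no registered phrase starts this way
    }
    return curr.synonyms; // null unless a phrase ends exactly here
  }

  public static void main(String[] args) {
    TokenTrieSketch trie = new TokenTrieSketch();
    trie.add(Arrays.asList("big", "apple"), Arrays.asList("new", "york"));
    System.out.println(trie.lookup(Arrays.asList("big", "apple"))); // [new, york]
  }
}
```

In the real class the per-node CharArrayMap is what lets the filter advance through the trie using the analyzer's char[] term buffers directly, with ignoreCase handled by the map rather than by lowercasing every token.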
Example 6: MyAnalyzer
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

/**
 * Initializes the analyzer for a given language
 *
 * @param language language
 * @param stopwords stop words
 * @param stemExclusionSet set of terms that must not be stemmed
 * @param stemOverrideDict dictionary of overriding terms
 */
public MyAnalyzer(String language, CharArraySet stopwords, CharArraySet stemExclusionSet, CharArrayMap<String> stemOverrideDict) {
  super(stopwords);
  this.language = language;
  this.stemExclusionSet = CharArraySet.unmodifiableSet(CharArraySet.copy(stemExclusionSet));
  this.stemTable = DefaultSetHolder.DEFAULT_TABLE;
  if (stemOverrideDict.isEmpty()) {
    this.stemdict = null;
  } else {
    Builder builder = new Builder(false);
    EntryIterator iter = stemOverrideDict.entrySet().iterator();
    CharsRefBuilder spare = new CharsRefBuilder();
    while (iter.hasNext()) {
      char[] ex = iter.nextKey();
      spare.copyChars(ex, 0, ex.length);
      builder.add(spare.get(), (CharSequence) iter.currentValue());
    }
    try {
      this.stemdict = builder.build();
    } catch (IOException var8) {
      throw new RuntimeException("can not build stem dict", var8);
    }
  }
}
Developer: fiohol | Project: theSemProject | Lines: 36 | Source: MyAnalyzer.java
Example 7: doRandom
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

public void doRandom(int iter, boolean ignoreCase) {
  CharArrayMap<Integer> map = new CharArrayMap<>(1, ignoreCase);
  HashMap<String, Integer> hmap = new HashMap<>();
  char[] key;
  for (int i = 0; i < iter; i++) {
    int len = random().nextInt(5);
    key = new char[len];
    for (int j = 0; j < key.length; j++) {
      key[j] = (char) random().nextInt(127);
    }
    String keyStr = new String(key);
    String hmapKey = ignoreCase ? keyStr.toLowerCase(Locale.ROOT) : keyStr;
    int val = random().nextInt();
    Object o1 = map.put(key, val);
    Object o2 = hmap.put(hmapKey, val);
    assertEquals(o1, o2);
    // add it again with the string method
    assertEquals(val, map.put(keyStr, val).intValue());
    assertEquals(val, map.get(key, 0, key.length).intValue());
    assertEquals(val, map.get(key).intValue());
    assertEquals(val, map.get(keyStr).intValue());
    assertEquals(hmap.size(), map.size());
  }
}
Developer: europeana | Project: search | Lines: 31 | Source: TestCharArrayMap.java
Example 8: testToString
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

public void testToString() {
  CharArrayMap<Integer> cm = new CharArrayMap<>(Collections.singletonMap("test", 1), false);
  assertEquals("[test]", cm.keySet().toString());
  assertEquals("[1]", cm.values().toString());
  assertEquals("[test=1]", cm.entrySet().toString());
  assertEquals("{test=1}", cm.toString());
  cm.put("test2", 2);
  assertTrue(cm.keySet().toString().contains(", "));
  assertTrue(cm.values().toString().contains(", "));
  assertTrue(cm.entrySet().toString().contains(", "));
  assertTrue(cm.toString().contains(", "));
}
Developer: europeana | Project: search | Lines: 13 | Source: TestCharArrayMap.java
Example 9: create
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

@Override public Object create(Random random) {
  int num = random.nextInt(10);
  CharArrayMap<String> map = new CharArrayMap<>(num, random.nextBoolean());
  for (int i = 0; i < num; i++) {
    // TODO: make nastier
    map.put(TestUtil.randomSimpleString(random), TestUtil.randomSimpleString(random));
  }
  return map;
}
Developer: europeana | Project: search | Lines: 10 | Source: TestRandomChains.java
Example 10: DutchAnalyzer
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

public DutchAnalyzer(Version matchVersion, CharArraySet stopwords) {
  // historically, this ctor never initialized the stem dict,
  // so we populate it only for >= 3.6
  this(matchVersion, stopwords, CharArraySet.EMPTY_SET,
      matchVersion.onOrAfter(Version.LUCENE_36)
          ? DefaultSetHolder.DEFAULT_STEM_DICT
          : CharArrayMap.<String>emptyMap());
}
Developer: pkarmstr | Project: NYBC | Lines: 9 | Source: DutchAnalyzer.java
Example 11: add
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

/**
 * @param singleMatch List<String>, the sequence of strings to match
 * @param replacement List<Token> the list of tokens to use on a match
 * @param includeOrig sets a flag on this mapping signaling the generation of matched tokens in addition to the replacement tokens
 * @param mergeExisting merge the replacement tokens with any other mappings that exist
 */
public void add(List<String> singleMatch, List<Token> replacement, boolean includeOrig, boolean mergeExisting) {
  SlowSynonymMap currMap = this;
  for (String str : singleMatch) {
    if (currMap.submap == null) {
      // for now hardcode at 4.0, as it's what the old code did.
      // would be nice to fix, but shouldn't store a version in each submap!
      currMap.submap = new CharArrayMap<SlowSynonymMap>(Version.LUCENE_40, 1, ignoreCase());
    }
    SlowSynonymMap map = currMap.submap.get(str);
    if (map == null) {
      map = new SlowSynonymMap();
      map.flags |= flags & IGNORE_CASE;
      currMap.submap.put(str, map);
    }
    currMap = map;
  }
  if (currMap.synonyms != null && !mergeExisting) {
    throw new IllegalArgumentException("SynonymFilter: there is already a mapping for " + singleMatch);
  }
  List<Token> superset = currMap.synonyms == null ? replacement :
      mergeTokens(Arrays.asList(currMap.synonyms), replacement);
  currMap.synonyms = superset.toArray(new Token[superset.size()]);
  if (includeOrig) currMap.flags |= INCLUDE_ORIG;
}
Developer: pkarmstr | Project: NYBC | Lines: 34 | Source: SlowSynonymMap.java
Example 12: doRandom
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

public void doRandom(int iter, boolean ignoreCase) {
  CharArrayMap<Integer> map = new CharArrayMap<Integer>(TEST_VERSION_CURRENT, 1, ignoreCase);
  HashMap<String, Integer> hmap = new HashMap<String, Integer>();
  char[] key;
  for (int i = 0; i < iter; i++) {
    int len = random().nextInt(5);
    key = new char[len];
    for (int j = 0; j < key.length; j++) {
      key[j] = (char) random().nextInt(127);
    }
    String keyStr = new String(key);
    String hmapKey = ignoreCase ? keyStr.toLowerCase(Locale.ROOT) : keyStr;
    int val = random().nextInt();
    Object o1 = map.put(key, val);
    Object o2 = hmap.put(hmapKey, val);
    assertEquals(o1, o2);
    // add it again with the string method
    assertEquals(val, map.put(keyStr, val).intValue());
    assertEquals(val, map.get(key, 0, key.length).intValue());
    assertEquals(val, map.get(key).intValue());
    assertEquals(val, map.get(keyStr).intValue());
    assertEquals(hmap.size(), map.size());
  }
}
Developer: pkarmstr | Project: NYBC | Lines: 31 | Source: TestCharArrayMap.java
Example 13: testToString
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

public void testToString() {
  CharArrayMap<Integer> cm = new CharArrayMap<Integer>(TEST_VERSION_CURRENT, Collections.singletonMap("test", 1), false);
  assertEquals("[test]", cm.keySet().toString());
  assertEquals("[1]", cm.values().toString());
  assertEquals("[test=1]", cm.entrySet().toString());
  assertEquals("{test=1}", cm.toString());
  cm.put("test2", 2);
  assertTrue(cm.keySet().toString().contains(", "));
  assertTrue(cm.values().toString().contains(", "));
  assertTrue(cm.entrySet().toString().contains(", "));
  assertTrue(cm.toString().contains(", "));
}
Developer: pkarmstr | Project: NYBC | Lines: 13 | Source: TestCharArrayMap.java
Example 14: testOverride
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

public void testOverride() throws IOException {
  // let's make "booked" stem to "books":
  // the override filter will convert "booked" to "books",
  // but also mark it with KeywordAttribute so Porter will not change it.
  CharArrayMap<String> dictionary = new CharArrayMap<String>(TEST_VERSION_CURRENT, 1, false);
  dictionary.put("booked", "books");
  Tokenizer tokenizer = new KeywordTokenizer(new StringReader("booked"));
  TokenStream stream = new PorterStemFilter(
      new StemmerOverrideFilter(tokenizer, dictionary));
  assertTokenStreamContents(stream, new String[] { "books" });
}
Developer: pkarmstr | Project: NYBC | Lines: 12 | Source: TestStemmerOverrideFilter.java
Example 15: create
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

@Override public Object create(Random random) {
  int num = random.nextInt(10);
  CharArrayMap<String> map = new CharArrayMap<String>(TEST_VERSION_CURRENT, num, random.nextBoolean());
  for (int i = 0; i < num; i++) {
    // TODO: make nastier
    map.put(_TestUtil.randomSimpleString(random), _TestUtil.randomSimpleString(random));
  }
  return map;
}
Developer: pkarmstr | Project: NYBC | Lines: 10 | Source: TestRandomChains.java
Example 16: add
import org.apache.lucene.analysis.util.CharArrayMap; // import the required package/class

/**
 * @param singleMatch List<String>, the sequence of strings to match
 * @param replacement List<Token> the list of tokens to use on a match
 * @param includeOrig sets a flag on this mapping signaling the generation of matched tokens in addition to the replacement tokens
 * @param mergeExisting merge the replacement tokens with any other mappings that exist
 */
public void add(List<String> singleMatch, List<Token> replacement, boolean includeOrig, boolean mergeExisting) {
  SlowSynonymMap currMap = this;
  for (String str : singleMatch) {
    if (currMap.submap == null) {
      // for now hardcode at 4.0, as it's what the old code did.
      // would be nice to fix, but shouldn't store a version in each submap!
      currMap.submap = new CharArrayMap<SlowSynonymMap>(Version.LUCENE_CURRENT, 1, ignoreCase());
    }
    SlowSynonymMap map = currMap.submap.get(str);
    if (map == null) {
      map = new SlowSynonymMap();
      map.flags |= flags & IGNORE_CASE;
      currMap.submap.put(str, map);
    }
    currMap = map;
  }
  if (currMap.synonyms != null && !mergeExisting) {
    throw new IllegalArgumentException("SynonymFilter: there is already a mapping for " + singleMatch);
  }
  List<Token> superset = currMap.synonyms == null ? replacement :
      mergeTokens(Arrays.asList(currMap.synonyms), replacement);
  currMap.synonyms = superset.toArray(new Token[superset.size()]);
  if (includeOrig) currMap.flags |= INCLUDE_ORIG;
}
Developer: yintaoxue | Project: read-open-source-code | Lines: 34 | Source: SlowSynonymMap.java
Note: the org.apache.lucene.analysis.util.CharArrayMap class examples in this article were collected from source-code and documentation platforms such as GitHub/MSDocs, with the snippets drawn from open-source projects contributed by their developers. Copyright of the source code remains with the original authors; for distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.