This article collects typical usage examples of the Java class edu.stanford.nlp.process.WordSegmentingTokenizer.
The WordSegmentingTokenizer class belongs to the edu.stanford.nlp.process package. The five code examples below are ordered by popularity by default.
Example 1: parse
import edu.stanford.nlp.process.WordSegmentingTokenizer; // import the required package/class
/**
 * Tokenizes the highlighted text (using a tokenizer appropriate for the
 * selected language) and initiates the ParseThread to parse the tokenized
 * text.
 */
public void parse() {
  if (textPane.getText().length() == 0) {
    return;
  }
  // substring's end index is exclusive, so use endIndex + 1 to include the character at endIndex
  String text = textPane.getText().substring(startIndex, endIndex + 1).trim();
  if (parser != null && text.length() > 0) {
    if (segmentWords) {
      // the Chinese lexicon doubles as the word segmenter; install it as the tokenizer factory
      ChineseLexiconAndWordSegmenter lex = (ChineseLexiconAndWordSegmenter) parser.getLexicon();
      ChineseTreebankLanguagePack.setTokenizerFactory(WordSegmentingTokenizer.factory(lex));
    }
    Tokenizer<? extends HasWord> toke = tlp.getTokenizerFactory().getTokenizer(new CharArrayReader(text.toCharArray()));
    List<? extends HasWord> wordList = toke.tokenize();
    // parse on a background thread and show progress
    parseThread = new ParseThread(wordList);
    parseThread.start();
    startProgressMonitor("Parsing", PARSE_TIME);
  }
}
Developer: FabianFriedrich, Project: Text2Process, Lines of code: 26, Source: ParserPanel.java
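The endIndex + 1 in Example 1 follows from String.substring treating its second argument as an exclusive bound. The sketch below isolates that one step; the class and method names are hypothetical and are not part of ParserPanel.

```java
// Hypothetical helper illustrating the inclusive-end selection used in Example 1.
final class SelectionSubstring {

    /** Returns the characters from startIndex through endIndex (inclusive), trimmed. */
    static String selected(String text, int startIndex, int endIndex) {
        // String.substring(begin, end) excludes the character at 'end',
        // so an inclusive endIndex has to be passed as endIndex + 1.
        return text.substring(startIndex, endIndex + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(selected("parse this text", 6, 9));
    }
}
```

If the selection is widened to include the surrounding spaces, for example indices 5 through 10, the trim() call removes them again.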
Example 2: parse
import edu.stanford.nlp.process.WordSegmentingTokenizer; // import the required package/class
/**
 * Tokenizes the highlighted text (using a tokenizer appropriate for the
 * selected language) and initiates the ParseThread to parse the tokenized
 * text.
 */
public void parse() {
  if (textPane.getText().length() == 0) {
    return;
  }
  // substring's end index is exclusive, so use endIndex + 1 to include the character at endIndex
  String text = textPane.getText().substring(startIndex, endIndex + 1).trim();
  if (parser != null && text.length() > 0) {
    if (segmentWords) {
      // the Chinese lexicon doubles as the word segmenter; install it as the tokenizer factory
      ChineseLexiconAndWordSegmenter lex = (ChineseLexiconAndWordSegmenter) parser.getLexicon();
      ChineseTreebankLanguagePack.setTokenizerFactory(WordSegmentingTokenizer.factory(lex));
    }
    //Tokenizer<? extends HasWord> toke = tlp.getTokenizerFactory().getTokenizer(new CharArrayReader(text.toCharArray()));
    // read the String directly instead of copying it into a char array first
    Tokenizer<? extends HasWord> toke = tlp.getTokenizerFactory().getTokenizer(new StringReader(text));
    List<? extends HasWord> wordList = toke.tokenize();
    // parse on a background thread and show progress
    parseThread = new ParseThread(wordList);
    parseThread.start();
    startProgressMonitor("Parsing", PARSE_TIME);
  }
}
Developer: amark-india, Project: eventspotter, Lines of code: 27, Source: ParserPanel.java
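The only difference between Examples 1 and 2 is the Reader handed to the tokenizer. As a rough check that the swap does not change the input, the sketch below drains both reader types for the same string and compares the results. The class name and the drain helper are illustrative assumptions; only java.io classes are used.

```java
import java.io.CharArrayReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical comparison of the two Reader choices seen in Examples 1 and 2.
final class ReaderEquivalence {

    /** Reads every character from the reader and returns them as one String. */
    static String drain(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            // In-memory readers should not fail; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "这是一个句子";
        String viaChars = drain(new CharArrayReader(text.toCharArray()));
        String viaString = drain(new StringReader(text));
        System.out.println(viaChars.equals(viaString));
    }
}
```

Both readers deliver the same character sequence; StringReader simply avoids the extra array copy made by toCharArray().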
Example 3: lex
import edu.stanford.nlp.process.WordSegmentingTokenizer; // import the required package/class
/**
 * Returns a Chinese lexicon: character-based if so configured, otherwise a
 * ChineseLexicon, wrapped together with a word segmenter when one is available.
 */
@Override
public Lexicon lex(Options op, Index<String> wordIndex, Index<String> tagIndex) {
  if (useCharacterBasedLexicon) {
    return lex = new ChineseCharacterBasedLexicon(this, wordIndex, tagIndex);
  // } else if (useMaxentLexicon) {
  //   return lex = new ChineseMaxentLexicon();
  }
  // default to the Chinese unknown-word model trainer when none is configured
  if (op.lexOptions.uwModelTrainer == null) {
    op.lexOptions.uwModelTrainer = "edu.stanford.nlp.parser.lexparser.ChineseUnknownWordModelTrainer";
  }
  ChineseLexicon clex = new ChineseLexicon(op, this, wordIndex, tagIndex);
  if (segmenterClass != null) {
    try {
      // first try a constructor that takes (params, wordIndex, tagIndex)
      segmenter = ReflectionLoading.loadByReflection(segmenterClass, this,
          wordIndex, tagIndex);
    } catch (ReflectionLoading.ReflectionLoadingException e) {
      // fall back to the no-argument constructor
      segmenter = ReflectionLoading.loadByReflection(segmenterClass);
    }
  }
  if (segmenter != null) {
    lex = new ChineseLexiconAndWordSegmenter(clex, segmenter);
    ctlp.setTokenizerFactory(WordSegmentingTokenizer.factory(segmenter));
  } else {
    lex = clex;
  }
  return lex;
}
Developer: paulirwin, Project: Stanford.NER.Net, Lines of code: 33, Source: ChineseTreebankParserParams.java
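Example 3 loads the segmenter reflectively, trying a constructor with arguments first and then falling back to a no-argument one. The sketch below shows the same fallback idea using only java.lang.reflect. It is not the ReflectionLoading implementation: the loader, both stand-in segmenter classes, and the single String argument are assumptions made for illustration, and it falls back only when the argument-taking constructor is missing, whereas the original retries on any ReflectionLoadingException.

```java
// Hypothetical stand-in for the "try with arguments, then without" loading in Example 3.
final class ReflectiveLoader {

    /** Stand-in segmenter that accepts a context argument. */
    public static class ContextSegmenter {
        public final String context;
        public ContextSegmenter(String context) { this.context = context; }
    }

    /** Stand-in segmenter that only has a no-argument constructor. */
    public static class PlainSegmenter {
        public PlainSegmenter() { }
    }

    /** Instantiates className, preferring a (String) constructor over the no-arg one. */
    static Object load(String className, String context) {
        try {
            Class<?> cls = Class.forName(className);
            try {
                return cls.getConstructor(String.class).newInstance(context);
            } catch (NoSuchMethodException e) {
                // No (String) constructor: fall back to the no-argument constructor.
                return cls.getConstructor().newInstance();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        Object a = load(ContextSegmenter.class.getName(), "ctb");
        Object b = load(PlainSegmenter.class.getName(), "ctb");
        System.out.println(a.getClass().getSimpleName() + ", " + b.getClass().getSimpleName());
    }
}
```

Trying the richer constructor first lets one configuration key name either kind of segmenter class without the caller knowing which constructor it offers.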
Example 4: ChineseLexiconAndWordSegmenter
import edu.stanford.nlp.process.WordSegmentingTokenizer; // import the required package/class
public ChineseLexiconAndWordSegmenter(ChineseLexicon lex, WordSegmenter seg) {
  chineseLexicon = lex;
  wordSegmenter = seg;
  // side effect: installs the segmenter-backed tokenizer factory on the language pack
  ChineseTreebankLanguagePack.setTokenizerFactory(WordSegmentingTokenizer.factory(seg));
}
Developer: FabianFriedrich, Project: Text2Process, Lines of code: 6, Source: ChineseLexiconAndWordSegmenter.java
Example 5: readObject
import edu.stanford.nlp.process.WordSegmentingTokenizer; // import the required package/class
private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException {
  in.defaultReadObject();
  // re-install the tokenizer factory, which is not part of the serialized state
  ChineseTreebankLanguagePack.setTokenizerFactory(WordSegmentingTokenizer.factory(wordSegmenter));
}
Developer: FabianFriedrich, Project: Text2Process, Lines of code: 5, Source: ChineseLexiconAndWordSegmenter.java
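Example 5 uses a custom readObject to repeat, after deserialization, the registration that the constructor in Example 4 performs. Static state is not written to the object stream, so without this hook a deserialized instance would leave the tokenizer factory unset. The sketch below reproduces the pattern with a plain static String standing in for the factory; every name in it is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical holder showing how readObject can restore static registration.
final class SegmenterHolder implements Serializable {
    private static final long serialVersionUID = 1L;

    // Stand-in for the static tokenizer factory; static fields are never serialized.
    static String registeredSegmenter = null;

    private final String segmenterName;

    SegmenterHolder(String segmenterName) {
        this.segmenterName = segmenterName;
        registeredSegmenter = segmenterName; // same side effect as the constructor in Example 4
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();              // restores segmenterName
        registeredSegmenter = segmenterName; // repeat the registration for the new instance
    }

    /** Serializes the holder, clears the static state, and deserializes it again. */
    static SegmenterHolder roundTrip(SegmenterHolder holder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(holder);
            }
            registeredSegmenter = null; // simulate a fresh JVM where nothing is registered yet
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SegmenterHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        roundTrip(new SegmenterHolder("ctb-segmenter"));
        System.out.println(registeredSegmenter);
    }
}
```

A consequence of this design is that the registration is global: deserializing a second holder with a different segmenter replaces the first one's registration.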
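Note: The edu.stanford.nlp.process.WordSegmentingTokenizer examples in this article were collected from source-code and documentation hosting platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright remains with the original authors; consult each project's license before redistributing or using the code. Do not reproduce without permission.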