This article collects typical usage examples of the Java class org.apache.lucene.queries.mlt.MoreLikeThis. If you are wondering how the MoreLikeThis class is used in practice, the selected examples below may help.
The MoreLikeThis class belongs to the org.apache.lucene.queries.mlt package. A total of 11 code examples are shown below, sorted by popularity by default.
Example 1: getSample
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
public synchronized Sample getSample(String id) throws Exception {
    // Look up the document whose ID field matches the given id.
    Term term = new Term(Sample.ID, id);
    TermQuery tq = new TermQuery(term);
    TopDocs td = getIndexSearcher().search(tq, 1);
    if (td.totalHits > 0) {
        Document doc = indexSearcher.doc(td.scoreDocs[0].doc);
        Sample s = (Sample) SerializationUtil.deserialize(doc.getBinaryValue("RawData").bytes);
        MoreLikeThis mlt = new MoreLikeThis(indexSearcher.getIndexReader());
        mlt.setFieldNames(new String[] { Variable.LABEL, Variable.DESCRIPTION, Variable.TAGS });
        mlt.setMaxWordLen(MAXWORDLENGTH);
        // Offer up to ten of the document's most characteristic terms as suggested tags.
        String[] terms = mlt.retrieveInterestingTerms(td.scoreDocs[0].doc);
        for (int i = 0; i < 10 && i < terms.length; i++) {
            s.put(Variable.SUGGESTEDTAGS, terms[i]);
        }
        return s;
    } else {
        return null;
    }
}
Developer: jdmp, Project: java-data-mining-package, Lines of code: 22, Source: LuceneIndex.java
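To make the idea behind retrieveInterestingTerms more concrete, here is a minimal, Lucene-free sketch of tf-idf based term selection. It is not Lucene's actual implementation: the class name InterestingTermsSketch, the docFreq map, and the idf formula 1 + ln(numDocs / (docFreq + 1)) are assumptions made for illustration. The filters correspond to the minimum term frequency, minimum document frequency, and maximum word length settings used in the examples on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified illustration of tf-idf term selection; hypothetical, not Lucene code. */
class InterestingTermsSketch {

    /**
     * Returns the distinct terms of docTokens ordered by descending tf * idf.
     * docFreq is a hypothetical map from term to the number of documents containing it.
     * A maxWordLen of 0 disables the word-length check.
     */
    static List<String> interestingTerms(List<String> docTokens, Map<String, Integer> docFreq,
            int numDocs, int minTermFreq, int minDocFreq, int maxWordLen) {
        // Count how often each term occurs in the source document.
        Map<String, Integer> termFreq = new HashMap<>();
        for (String token : docTokens) {
            termFreq.merge(token, 1, Integer::sum);
        }
        // Score the terms that pass all filters.
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> entry : termFreq.entrySet()) {
            String term = entry.getKey();
            int tf = entry.getValue();
            int df = docFreq.getOrDefault(term, 0);
            if (tf < minTermFreq || df < minDocFreq) {
                continue;
            }
            if (maxWordLen > 0 && term.length() > maxWordLen) {
                continue;
            }
            // Assumed idf formula: rarer terms receive a higher weight.
            double idf = Math.log((double) numDocs / (df + 1)) + 1.0;
            scores.put(term, tf * idf);
        }
        // Highest score first; ties are broken alphabetically to keep the order stable.
        List<String> result = new ArrayList<>(scores.keySet());
        result.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("lucene", "search", "lucene", "index", "the", "the", "the");
        Map<String, Integer> df = Map.of("lucene", 2, "search", 10, "index", 5, "the", 100);
        System.out.println(interestingTerms(tokens, df, 100, 1, 1, 0));
    }
}
```

With the sample data in main, the sketch prints [lucene, index, search, the]: the rare, repeated term ranks first, and the very common term ranks last even though it occurs most often in the document.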
Example 2: searchSimilar
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
public synchronized ListDataSet searchSimilar(Sample sample, int start, int count) throws Exception {
    // Find the indexed document that corresponds to the given sample.
    Term term = new Term(Sample.ID, sample.getId());
    TermQuery tq = new TermQuery(term);
    TopDocs td = getIndexSearcher().search(tq, count);
    if (td == null || td.totalHits == 0) {
        return ListDataSet.Factory.emptyDataSet();
    }
    MoreLikeThis mlt = new MoreLikeThis(indexSearcher.getIndexReader());
    mlt.setFieldNames(new String[] { Variable.LABEL, Variable.DESCRIPTION, Variable.TAGS });
    mlt.setMaxWordLen(MAXWORDLENGTH);
    Query query = mlt.like(td.scoreDocs[0].doc);
    BooleanQuery bq = new BooleanQuery();
    bq.add(query, Occur.MUST);
    // Exclude the sample itself, using the same ID field as the lookup above.
    bq.add(new TermQuery(new Term(Sample.ID, sample.getId())), Occur.MUST_NOT);
    return search(bq, start, count);
}
Developer: jdmp, Project: java-data-mining-package, Lines of code: 19, Source: LuceneIndex.java
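The combination of a MUST clause (the similarity query) and a MUST_NOT clause (the source document's ID) is what keeps a document from being reported as similar to itself. The following Lucene-free sketch illustrates that clause logic with plain predicates. The class MustNotSketch, the helper similarIds, and the field names "Id" and "Tags" are hypothetical stand-ins, and the tag comparison is only a placeholder for a real similarity query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Lucene-free sketch of MUST / MUST_NOT clause semantics; all names are hypothetical. */
class MustNotSketch {

    /** Keeps the documents that satisfy every MUST clause and no MUST_NOT clause. */
    static <D> List<D> filter(List<D> docs, List<Predicate<D>> must, List<Predicate<D>> mustNot) {
        List<D> hits = new ArrayList<>();
        for (D doc : docs) {
            boolean matchesAllMust = must.stream().allMatch(p -> p.test(doc));
            boolean matchesAnyMustNot = mustNot.stream().anyMatch(p -> p.test(doc));
            if (matchesAllMust && !matchesAnyMustNot) {
                hits.add(doc);
            }
        }
        return hits;
    }

    /** Returns the IDs of documents carrying the given tag, excluding the source document. */
    static List<String> similarIds(List<Map<String, String>> docs, String tag, String sourceId) {
        // Placeholder for the similarity query: here, "similar" just means "same tag".
        Predicate<Map<String, String>> sharesTag = d -> tag.equals(d.get("Tags"));
        // Counterpart of the MUST_NOT clause on the source document's ID.
        Predicate<Map<String, String>> isSource = d -> sourceId.equals(d.get("Id"));
        List<String> ids = new ArrayList<>();
        for (Map<String, String> d : filter(docs, List.of(sharesTag), List.of(isSource))) {
            ids.add(d.get("Id"));
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
                Map.of("Id", "a", "Tags", "lucene"),
                Map.of("Id", "b", "Tags", "lucene"),
                Map.of("Id", "c", "Tags", "cooking"));
        System.out.println(similarIds(docs, "lucene", "a"));
    }
}
```

For the three sample documents, only "b" is returned: "a" matches the similarity clause but is removed by the exclusion clause, and "c" never matches.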
Example 3: train
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
/**
 * {@inheritDoc}
 */
@Override
public void train(AtomicReader atomicReader, String[] textFieldNames, String classFieldName, Analyzer analyzer, Query query) throws IOException {
    this.textFieldNames = textFieldNames;
    this.classFieldName = classFieldName;
    mlt = new MoreLikeThis(atomicReader);
    mlt.setAnalyzer(analyzer);
    mlt.setFieldNames(textFieldNames);
    indexSearcher = new IndexSearcher(atomicReader);
    // Override the MoreLikeThis frequency thresholds only when positive values were configured.
    if (minDocsFreq > 0) {
        mlt.setMinDocFreq(minDocsFreq);
    }
    if (minTermFreq > 0) {
        mlt.setMinTermFreq(minTermFreq);
    }
    this.query = query;
}
Developer: europeana, Project: search, Lines of code: 20, Source: KNearestNeighborClassifier.java
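The excerpt above only prepares the MoreLikeThis instance; the classification step itself is not shown. As a rough illustration of what a k-nearest-neighbour classifier typically does with the documents a similarity query returns, the sketch below takes the class labels of the top-ranked neighbours and picks the most frequent one. The class KnnVoteSketch and the method majorityLabel are hypothetical names, and the tie-breaking rule (the label that reaches the highest count first wins) is an assumption, not something taken from the project.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of k-nearest-neighbour majority voting over class labels. */
class KnnVoteSketch {

    /**
     * Returns the most frequent label among the first k entries of neighbourLabels,
     * which are assumed to be ordered from most to least similar.
     * Returns null when there are no neighbours to vote.
     */
    static String majorityLabel(List<String> neighbourLabels, int k) {
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        int limit = Math.min(k, neighbourLabels.size());
        for (int i = 0; i < limit; i++) {
            String label = neighbourLabels.get(i);
            int count = votes.merge(label, 1, Integer::sum);
            // Strictly greater: on a tie, the label that got there first is kept.
            if (count > bestVotes) {
                bestVotes = count;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> labels = List.of("sports", "politics", "sports", "tech");
        System.out.println(majorityLabel(labels, 3));
    }
}
```

With k = 3, the sample labels yield "sports", since two of the three nearest neighbours carry that label.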
Example 4: newMoreLikeThis
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
/**
 * Returns a new instance of Lucene's {@link MoreLikeThis} with the
 * right {@link IndexReader}.
 */
public MoreLikeThis newMoreLikeThis(final Locale locale) {
    final Index index = IndexManager.getInstance().getIndex();
    final MoreLikeThis mlt = new MoreLikeThis(index.getIndexReader());
    mlt.setAnalyzer(index.getAnalyzer(locale));
    return mlt;
}
Developer: XMBomb, Project: InComb, Lines of code: 13, Source: IndexSearch.java
Example 5: MoreLikeThisHelper
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
public MoreLikeThisHelper( SolrParams params, SolrIndexSearcher searcher )
{
    this.searcher = searcher;
    this.reader = searcher.getIndexReader();
    this.uniqueKeyField = searcher.getSchema().getUniqueKeyField();
    this.needDocSet = params.getBool(FacetParams.FACET,false);
    SolrParams required = params.required();
    String[] fields = splitList.split( required.get(MoreLikeThisParams.SIMILARITY_FIELDS) );
    if( fields.length < 1 ) {
        throw new SolrException( SolrException.ErrorCode.BAD_REQUEST,
            "MoreLikeThis requires at least one similarity field: "+MoreLikeThisParams.SIMILARITY_FIELDS );
    }
    this.mlt = new MoreLikeThis( reader ); // TODO -- after LUCENE-896, we can use , searcher.getSimilarity() );
    mlt.setFieldNames(fields);
    mlt.setAnalyzer( searcher.getSchema().getIndexAnalyzer() );
    // configurable params
    mlt.setMinTermFreq( params.getInt(MoreLikeThisParams.MIN_TERM_FREQ, MoreLikeThis.DEFAULT_MIN_TERM_FREQ));
    mlt.setMinDocFreq( params.getInt(MoreLikeThisParams.MIN_DOC_FREQ, MoreLikeThis.DEFAULT_MIN_DOC_FREQ));
    mlt.setMaxDocFreq( params.getInt(MoreLikeThisParams.MAX_DOC_FREQ, MoreLikeThis.DEFAULT_MAX_DOC_FREQ));
    mlt.setMinWordLen( params.getInt(MoreLikeThisParams.MIN_WORD_LEN, MoreLikeThis.DEFAULT_MIN_WORD_LENGTH));
    mlt.setMaxWordLen( params.getInt(MoreLikeThisParams.MAX_WORD_LEN, MoreLikeThis.DEFAULT_MAX_WORD_LENGTH));
    mlt.setMaxQueryTerms( params.getInt(MoreLikeThisParams.MAX_QUERY_TERMS, MoreLikeThis.DEFAULT_MAX_QUERY_TERMS));
    mlt.setMaxNumTokensParsed(params.getInt(MoreLikeThisParams.MAX_NUM_TOKENS_PARSED, MoreLikeThis.DEFAULT_MAX_NUM_TOKENS_PARSED));
    mlt.setBoost( params.getBool(MoreLikeThisParams.BOOST, false ) );
    boostFields = SolrPluginUtils.parseFieldBoosts(params.getParams(MoreLikeThisParams.QF));
}
Developer: europeana, Project: search, Lines of code: 31, Source: MoreLikeThisHandler.java
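The constructor above follows one pattern throughout: read an optional request parameter and fall back to a default when it is absent. A minimal, Solr-free sketch of that pattern is shown below. The class MltParamsSketch and its getInt helper are hypothetical, a plain Map stands in for the request parameters, and the keys and default values in main are illustrative assumptions rather than values taken from the constructor.

```java
import java.util.Map;

/** Hypothetical, Solr-free sketch of reading an integer parameter with a fallback default. */
class MltParamsSketch {

    /** Returns the parsed value stored under key, or defaultValue when the key is absent. */
    static int getInt(Map<String, String> params, String key, int defaultValue) {
        String raw = params.get(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Illustrative request: only the minimum term frequency is overridden.
        Map<String, String> request = Map.of("mlt.mintf", "1");
        int minTermFreq = getInt(request, "mlt.mintf", 2); // present in the request
        int minDocFreq = getInt(request, "mlt.mindf", 5);  // absent, so the default applies
        System.out.println(minTermFreq + " " + minDocFreq);
    }
}
```

A malformed number still raises NumberFormatException in this sketch; whether to reject the request or fall back to the default in that case is a design decision the sketch leaves open.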
Example 6: train
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
/**
 * {@inheritDoc}
 */
@Override
public void train(AtomicReader atomicReader, String textFieldName, String classFieldName, Analyzer analyzer) throws IOException {
    this.textFieldName = textFieldName;
    this.classFieldName = classFieldName;
    mlt = new MoreLikeThis(atomicReader);
    mlt.setAnalyzer(analyzer);
    mlt.setFieldNames(new String[]{textFieldName});
    indexSearcher = new IndexSearcher(atomicReader);
}
Developer: pkarmstr, Project: NYBC, Lines of code: 13, Source: KNearestNeighborClassifier.java
Example 7: MoreLikeThisHelper
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
public MoreLikeThisHelper( SolrParams params, SolrIndexSearcher searcher )
{
    this.searcher = searcher;
    this.reader = searcher.getIndexReader();
    this.uniqueKeyField = searcher.getSchema().getUniqueKeyField();
    this.needDocSet = params.getBool(FacetParams.FACET,false);
    SolrParams required = params.required();
    String[] fields = splitList.split( required.get(MoreLikeThisParams.SIMILARITY_FIELDS) );
    if( fields.length < 1 ) {
        throw new SolrException( SolrException.ErrorCode.BAD_REQUEST,
            "MoreLikeThis requires at least one similarity field: "+MoreLikeThisParams.SIMILARITY_FIELDS );
    }
    this.mlt = new MoreLikeThis( reader ); // TODO -- after LUCENE-896, we can use , searcher.getSimilarity() );
    mlt.setFieldNames(fields);
    mlt.setAnalyzer( searcher.getSchema().getAnalyzer() );
    // configurable params
    mlt.setMinTermFreq( params.getInt(MoreLikeThisParams.MIN_TERM_FREQ, MoreLikeThis.DEFAULT_MIN_TERM_FREQ));
    mlt.setMinDocFreq( params.getInt(MoreLikeThisParams.MIN_DOC_FREQ, MoreLikeThis.DEFAULT_MIN_DOC_FREQ));
    mlt.setMaxDocFreq( params.getInt(MoreLikeThisParams.MAX_DOC_FREQ, MoreLikeThis.DEFAULT_MAX_DOC_FREQ));
    mlt.setMinWordLen( params.getInt(MoreLikeThisParams.MIN_WORD_LEN, MoreLikeThis.DEFAULT_MIN_WORD_LENGTH));
    mlt.setMaxWordLen( params.getInt(MoreLikeThisParams.MAX_WORD_LEN, MoreLikeThis.DEFAULT_MAX_WORD_LENGTH));
    mlt.setMaxQueryTerms( params.getInt(MoreLikeThisParams.MAX_QUERY_TERMS, MoreLikeThis.DEFAULT_MAX_QUERY_TERMS));
    mlt.setMaxNumTokensParsed(params.getInt(MoreLikeThisParams.MAX_NUM_TOKENS_PARSED, MoreLikeThis.DEFAULT_MAX_NUM_TOKENS_PARSED));
    mlt.setBoost( params.getBool(MoreLikeThisParams.BOOST, false ) );
    boostFields = SolrPluginUtils.parseFieldBoosts(params.getParams(MoreLikeThisParams.QF));
}
Developer: pkarmstr, Project: NYBC, Lines of code: 31, Source: MoreLikeThisHandler.java
Example 8: train
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
/**
 * {@inheritDoc}
 */
@Override
public void train(AtomicReader atomicReader, String textFieldName, String classFieldName, Analyzer analyzer, Query query) throws IOException {
    this.textFieldNames = new String[]{textFieldName};
    this.classFieldName = classFieldName;
    mlt = new MoreLikeThis(atomicReader);
    mlt.setAnalyzer(analyzer);
    mlt.setFieldNames(new String[]{textFieldName});
    indexSearcher = new IndexSearcher(atomicReader);
    this.query = query;
}
Developer: jimaguere, Project: Maskana-Gestor-de-Conocimiento, Lines of code: 14, Source: KNearestNeighborClassifier.java
Example 9: main
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
public static void main(String[] args) throws Throwable {
    String indexDir = System.getProperty("index.dir");
    FSDirectory directory = FSDirectory.open(new File(indexDir));
    IndexReader reader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(reader);
    int numDocs = reader.maxDoc();
    MoreLikeThis mlt = new MoreLikeThis(reader);
    mlt.setFieldNames(new String[] { "title", "author" });
    mlt.setMinTermFreq(1);
    mlt.setMinDocFreq(1);
    // For every document in the index, log the titles of up to ten similar documents.
    for (int docID = 0; docID < numDocs; docID++) {
        LOGGER.info(""); // blank separator line between documents
        StoredDocument doc = reader.document(docID);
        LOGGER.info(doc.get("title"));
        Query query = mlt.like(docID);
        LOGGER.info(" query=" + query);
        TopDocs similarDocs = searcher.search(query, 10);
        if (similarDocs.totalHits == 0) {
            LOGGER.info(" None like this");
        }
        for (int i = 0; i < similarDocs.scoreDocs.length; i++) {
            // Skip the document itself, which normally matches its own query.
            if (similarDocs.scoreDocs[i].doc != docID) {
                doc = reader.document(similarDocs.scoreDocs[i].doc);
                LOGGER.info(" -> " + doc.getField("title").stringValue());
            }
        }
    }
    reader.close();
    directory.close();
}
Developer: xuzhikethinker, Project: t4f-data, Lines of code: 38, Source: BooksMoreLikeThis.java
Example 10: getSimilar
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
/**
 * Searches for similar {@link News}.
 * @return a {@link List} with the similar {@link News}. Can be empty.
 * The more similar a {@link News} is, the higher it appears in the {@link List}.
 * Returns {@code null} if an {@link IOException} occurs while searching.
 */
public List<News> getSimilar() {
    final int docId = IndexSearch.getInstance().getDocIdForId(
            NewsIndexType.getInstance(), String.valueOf(news.getId()));
    // configure "more like this"
    final MoreLikeThis moreLikeThis = IndexSearch.getInstance().newMoreLikeThis(news.getLocale());
    moreLikeThis.setMinWordLen(3);
    moreLikeThis.setBoost(true);
    moreLikeThis.setBoostFactor(10);
    moreLikeThis.setFieldNames(new String[] {
        NewsIndexType.FIELD_TITLE,
        NewsIndexType.FIELD_DESCRIPTION
    });
    try {
        final BooleanQuery query = new BooleanQuery();
        // it must have the same locale
        QueryUtil.addLocale(query, news.getLocale());
        // filter with publish date
        final NumericRangeQuery<Long> dateQuery = NumericRangeQuery.newLongRange(
                NewsIndexType.FIELD_PUBLISH_DATE, getDate(-PUBLISH_DATE_DELTA),
                getDate(PUBLISH_DATE_DELTA), true, true);
        query.add(dateQuery, Occur.MUST);
        // it must be in the same category.
        final NumericRangeQuery<Integer> categoryQuery = NumericRangeQuery.newIntRange(
                NewsIndexType.FIELD_CATEGORYID, news.getCategoryId(), news.getCategoryId(), true, true);
        query.add(categoryQuery, Occur.MUST);
        final Query moreQuery = moreLikeThis.like(docId);
        query.add(moreQuery, Occur.MUST);
        // not the same news
        query.add(new TermQuery(new Term(IIndexElement.FIELD_ID, String.valueOf(news.getId()))), Occur.MUST_NOT);
        // execute query
        final DocumentsSearchResult result = IndexSearch.getInstance().search(query, new SearchOptions());
        final List<Document> resultDocs = new ArrayList<>();
        for (final Document doc : result.getResults()) {
            final float score = result.getScore(doc);
            // use only news with a sufficient score.
            if(score >= MIN_SCORE) {
                resultDocs.add(doc);
                LOGGER.debug("News {} is similar to news {}.", doc.get(IIndexElement.FIELD_ID), news.getId());
            }
            else {
                LOGGER.debug("News {} has not a sufficient score to be similar to news {}.",
                        doc.get(IIndexElement.FIELD_ID), news.getId());
            }
        }
        // convert the Documents to News.
        return NewsIndexType.docsToNews(resultDocs);
    }
    catch (final IOException e) {
        // pass the exception as the last argument so the stack trace is logged as well
        LOGGER.error("Can't build query for news with id '{}'.", news.getId(), e);
    }
    return null;
}
Developer: XMBomb, Project: InComb, Lines of code: 70, Source: SimilarNewsFinder.java
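The final loop of this example keeps only hits whose score reaches MIN_SCORE while preserving the ranking order. A small, Lucene-free sketch of that filtering step follows. The class ScoreThresholdSketch, the method keepSufficientlySimilar, and the sample IDs and scores are hypothetical; a LinkedHashMap stands in for the ranked search result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: keep only hits whose score reaches a minimum, in ranking order. */
class ScoreThresholdSketch {

    /**
     * Returns the IDs whose score is at least minScore.
     * The iteration order of scoredIds is assumed to be the ranking order.
     */
    static List<String> keepSufficientlySimilar(Map<String, Float> scoredIds, float minScore) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Float> entry : scoredIds.entrySet()) {
            if (entry.getValue() >= minScore) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // LinkedHashMap preserves insertion order, standing in for a ranked result list.
        Map<String, Float> hits = new LinkedHashMap<>();
        hits.put("news-1", 0.9f);
        hits.put("news-2", 0.5f);
        hits.put("news-3", 0.1f);
        System.out.println(keepSufficientlySimilar(hits, 0.5f));
    }
}
```

Because similarity scores are generally only comparable within a single query, a fixed threshold like this usually needs tuning against real data.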
Example 11: getMoreLikeThis
import org.apache.lucene.queries.mlt.MoreLikeThis; // import the required package/class
public MoreLikeThis getMoreLikeThis()
{
    return mlt;
}
Developer: europeana, Project: search, Lines of code: 5, Source: MoreLikeThisHandler.java
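Note: The org.apache.lucene.queries.mlt.MoreLikeThis examples in this article were compiled from source code and documentation hosting platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.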