This article collects typical usage examples of the Java class cc.mallet.types.InfoGain. If you are unsure what InfoGain does or how to use it, the curated class examples below should help.
InfoGain belongs to the cc.mallet.types package. Four code examples are shown, ordered by popularity by default; upvoting the ones you find useful helps the site surface better Java examples.
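Before the examples, a minimal usage sketch: construct an InfoGain directly from a labeled InstanceList, then read features back in descending-gain order. Here ilist is an assumed, already-built InstanceList; the accessor calls are the same ones the examples below use.

import cc.mallet.types.InfoGain;
import cc.mallet.types.InstanceList;

// 'ilist' is assumed to be an existing labeled InstanceList.
InfoGain ig = new InfoGain(ilist);
for (int rank = 0; rank < 10 && rank < ig.numLocations(); rank++) {
    int fi = ig.getIndexAtRank(rank);
    System.out.println(ilist.getDataAlphabet().lookupObject(fi)
            + "\t" + ig.getValueAtRank(rank));
}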
Example 1: Node
import cc.mallet.types.InfoGain; // import the required package/class
public Node (InstanceList ilist, Node parent, FeatureSelection fs)
{
    // Choose the split feature with the maximum information gain
    // among the features allowed by the FeatureSelection.
    InfoGain ig = new InfoGain (ilist);
    this.featureIndex = ig.getMaxValuedIndexIn (fs);
    this.infoGain = ig.value(featureIndex);
    this.ilist = ilist;
    this.dictionary = ilist.getDataAlphabet();
    this.parent = parent;
    // Label distribution and entropy of the full list, before splitting.
    this.labeling = ig.getBaseLabelDistribution();
    this.labelEntropy = ig.getBaseEntropy();
    this.child0 = this.child1 = null;
}
Author: kostagiolasn, Project: NucleosomePatternClassifier, Lines: 13, Source: DecisionTree.java
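In this constructor, getBaseEntropy() is the entropy of the label distribution over the whole InstanceList, and each feature's gain is that base entropy minus the expected label entropy after splitting on the feature. A minimal sketch of the underlying entropy formula follows; this is a standalone illustration, not MALLET's actual implementation (which may use natural log rather than base 2).

// Shannon entropy of a normalized label distribution.
// Illustrative only; mirrors what getBaseEntropy() conceptually returns.
static double entropy(double[] labelDist) {
    double h = 0.0;
    for (double p : labelDist) {
        if (p > 0.0) {
            h -= p * (Math.log(p) / Math.log(2.0)); // log base 2: bits
        }
    }
    return h;
}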
Example 2: selectFeaturesByInfoGain
import cc.mallet.types.InfoGain; // import the required package/class
/**
 * Select features with the highest information gain.
 *
 * @param list InstanceList for computing information gain.
 * @param numFeatures Number of features to select.
 * @return List of features with the highest information gains.
 */
public static ArrayList<Integer> selectFeaturesByInfoGain(InstanceList list, int numFeatures) {
    ArrayList<Integer> features = new ArrayList<Integer>();
    InfoGain infogain = new InfoGain(list);
    for (int rank = 0; rank < numFeatures; rank++) {
        features.add(infogain.getIndexAtRank(rank));
    }
    return features;
}
Author: kostagiolasn, Project: NucleosomePatternClassifier, Lines: 17, Source: FeatureConstraintUtil.java
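A hedged usage sketch for the method above; trainList is an assumed InstanceList, and FeatureConstraintUtil is the containing class named in the source attribution.

import java.util.ArrayList;

// Select the 50 features with the highest information gain and print their names.
ArrayList<Integer> top = FeatureConstraintUtil.selectFeaturesByInfoGain(trainList, 50);
for (int fi : top) {
    System.out.println(trainList.getDataAlphabet().lookupObject(fi));
}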
Example 3: labelFeatures
import cc.mallet.types.InfoGain; // import the required package/class
/**
 * Label features using the heuristic described in
 * "Learning from Labeled Features using Generalized Expectation Criteria"
 * by Gregory Druck, Gideon Mann, and Andrew McCallum.
 *
 * @param list InstanceList used to compute statistics for labeling features.
 * @param features List of features to label.
 * @param reject If true, skip features whose information gain is below the
 *               cutoff or whose labeling is ambiguous.
 * @return Labeled features, as a HashMap mapping feature indices to lists of label indices.
 */
public static HashMap<Integer, ArrayList<Integer>> labelFeatures(InstanceList list, ArrayList<Integer> features, boolean reject) {
    HashMap<Integer,ArrayList<Integer>> labeledFeatures = new HashMap<Integer,ArrayList<Integer>>();
    double[][] featureLabelCounts = getFeatureLabelCounts(list, true);
    int numLabels = list.getTargetAlphabet().size();

    // Use the mean information gain of the top (100 * numLabels) ranked
    // features as the rejection cutoff.
    int minRank = 100 * numLabels;
    InfoGain infogain = new InfoGain(list);
    double sum = 0;
    for (int rank = 0; rank < minRank; rank++) {
        sum += infogain.getValueAtRank(rank);
    }
    double mean = sum / minRank;

    for (int i = 0; i < features.size(); i++) {
        int fi = features.get(i);
        // Reject features with information gain below the cutoff.
        if (reject && infogain.value(fi) < mean) {
            //System.err.println("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
            logger.info("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
            continue;
        }
        // Smooth and normalize the feature-label counts into a distribution.
        double[] prob = featureLabelCounts[fi];
        MatrixOps.plusEquals(prob, 1e-8);
        MatrixOps.timesEquals(prob, 1. / MatrixOps.sum(prob));
        int[] sortedIndices = getMaxIndices(prob);
        ArrayList<Integer> labels = new ArrayList<Integer>();
        if (numLabels > 2) {
            // Take any label within a factor of 2 of the best,
            // but no more than numLabels / 2 labels in total.
            boolean discard = false;
            double threshold = prob[sortedIndices[0]] / 2;
            for (int li = 0; li < numLabels; li++) {
                if (prob[li] > threshold) {
                    labels.add(li);
                }
                if (reject && labels.size() > (numLabels / 2)) {
                    //System.err.println("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
                    logger.info("Oracle labeler rejected labeling: " + list.getDataAlphabet().lookupObject(fi));
                    discard = true;
                    break;
                }
            }
            if (discard) {
                continue;
            }
        }
        else {
            // Binary case: take only the single best label.
            labels.add(sortedIndices[0]);
        }
        labeledFeatures.put(fi, labels);
    }
    return labeledFeatures;
}
Author: kostagiolasn, Project: NucleosomePatternClassifier, Lines: 72, Source: FeatureConstraintUtil.java
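The two utilities are naturally chained: select candidates by information gain, then label them. A sketch, assuming list is a labeled InstanceList:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

ArrayList<Integer> candidates =
        FeatureConstraintUtil.selectFeaturesByInfoGain(list, 100);
HashMap<Integer, ArrayList<Integer>> labeled =
        FeatureConstraintUtil.labelFeatures(list, candidates, true);
for (Map.Entry<Integer, ArrayList<Integer>> e : labeled.entrySet()) {
    System.out.println(list.getDataAlphabet().lookupObject(e.getKey())
            + " -> " + e.getValue());
}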
Example 4: run
import cc.mallet.types.InfoGain; // import the required package/class
public void run () {
    Alphabet alphabet = dictOfSize(20);

    // TRAIN
    Clustering training = sampleClustering(alphabet);
    Pipe clusterPipe = new OverlappingFeaturePipe();
    System.err.println("Training with " + training);
    InstanceList trainList = new InstanceList(clusterPipe);
    trainList.addThruPipe(new ClusterSampleIterator(training, random, 0.5, 100));
    System.err.println("Created " + trainList.size() + " instances.");
    Classifier me = new MaxEntTrainer().train(trainList);
    ClassifyingNeighborEvaluator eval =
        new ClassifyingNeighborEvaluator(me, "YES");
    Trial trial = new Trial(me, trainList);
    System.err.println(new ConfusionMatrix(trial));
    InfoGain ig = new InfoGain(trainList);
    ig.print();

    //Clusterer clusterer = new GreedyAgglomerative(training.getInstances().getPipe(),
    //                                              eval, 0.5);
    Clusterer clusterer = new GreedyAgglomerativeByDensity(training.getInstances().getPipe(),
                                                           eval, 0.5, false,
                                                           new java.util.Random(1));
    // TEST
    Clustering testing = sampleClustering(alphabet);
    InstanceList testList = testing.getInstances();
    Clustering predictedClusters = clusterer.cluster(testList);

    // EVALUATE
    System.err.println("\n\nEvaluating System: " + clusterer);
    ClusteringEvaluators evaluators = new ClusteringEvaluators(new ClusteringEvaluator[]{
        new BCubedEvaluator(),
        new PairF1Evaluator(),
        new MUCEvaluator(),
        new AccuracyEvaluator()});
    System.err.println("truth:" + testing);
    System.err.println("pred: " + predictedClusters);
    System.err.println(evaluators.evaluate(testing, predictedClusters));
}
Author: kostagiolasn, Project: NucleosomePatternClassifier, Lines: 43, Source: FirstOrderClusterExample.java
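The example relies on helpers defined elsewhere in FirstOrderClusterExample (dictOfSize, sampleClustering, random) that are not shown here. As a purely hypothetical reconstruction of the simplest one, dictOfSize presumably builds an Alphabet of dummy feature names; the real implementation may differ.

import cc.mallet.types.Alphabet;

// Hypothetical stand-in for the helper used above.
private Alphabet dictOfSize(int size) {
    Alphabet alphabet = new Alphabet();
    for (int i = 0; i < size; i++) {
        alphabet.lookupIndex("feature" + i);
    }
    return alphabet;
}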
Note: The cc.mallet.types.InfoGain examples in this article were collected from open-source projects hosted on GitHub and similar code/documentation platforms. Copyright of each snippet remains with its original author; consult the corresponding project's license before redistributing or reusing the code. Do not reproduce without permission.