This article collects typical usage examples of the Java class org.deeplearning4j.models.paragraphvectors.ParagraphVectors. If you have been wondering what ParagraphVectors is for and how to use it, the curated class examples below should help.
The ParagraphVectors class belongs to the org.deeplearning4j.models.paragraphvectors package. A total of 12 code examples of the class are shown below, sorted by popularity by default. You can upvote the examples you like or find useful; your feedback helps the system recommend better Java code examples.
Example 1: makeParagraphVectors
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
void makeParagraphVectors() throws Exception {
    // build an iterator for our dataset
    File dir = TYPE_LEARNING_DIR;
    dir.mkdirs();
    iterator = new FileLabelAwareIterator.Builder()
            .addSourceFolder(new File(dir, "corpus"))
            .build();
    tokenizerFactory = new DefaultTokenizerFactory();
    tokenizerFactory.setTokenPreProcessor(new CommonPreprocessor());
    // ParagraphVectors training configuration
    paragraphVectors = new ParagraphVectors.Builder()
            .learningRate(0.025)
            .minLearningRate(0.001)
            .batchSize(1000)
            .epochs(5)
            .iterate(iterator)
            .trainWordVectors(true)
            .tokenizerFactory(tokenizerFactory)
            .build();
    // Start model training
    paragraphVectors.fit();
}
Developer: sillelien, Project: dollar, Lines of code: 27, Source file: ParagraphVectorsClassifierExample.java
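As an aside (not part of the original project), once fit() returns the trained model can be queried directly. The sketch below uses hypothetical query text and label names; with FileLabelAwareIterator, each document is labelled after the sub-folder it was read from.
// Hedged sketch: querying the model trained above; the query text and labels are hypothetical.
// nearestLabels() infers a vector for the raw text and returns the closest document labels.
Collection<String> closestLabels = paragraphVectors.nearestLabels("some unseen piece of text", 3);
System.out.println("Closest labels: " + closestLabels);
// cosine similarity between two document labels assumed to exist under the corpus folder
double labelSimilarity = paragraphVectors.similarity("finance", "health");
System.out.println("Label similarity: " + labelSimilarity);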
Example 2: Par2Hier
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
public Par2Hier(ParagraphVectors paragraphVectors, Par2HierUtils.Method smoothing, int k) {
    this.smoothing = smoothing;
    this.k = k;
    this.labelsSource = paragraphVectors.getLabelsSource();
    this.labelAwareIterator = paragraphVectors.getLabelAwareIterator();
    this.lookupTable = paragraphVectors.getLookupTable();
    this.vocab = paragraphVectors.getVocab();
    this.tokenizerFactory = paragraphVectors.getTokenizerFactory();
}
Developer: tteofili, Project: par2hier, Lines of code: 12, Source file: Par2Hier.java
Example 3: loadParagraphVectors
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
private static ParagraphVectors loadParagraphVectors() {
    ParagraphVectors paragraphVectors = null;
    try {
        paragraphVectors = WordVectorSerializer.readParagraphVectors(PARAGRAPHVECTORMODELPATH);
        TokenizerFactory t = new DefaultTokenizerFactory();
        t.setTokenPreProcessor(new CommonPreprocessor());
        paragraphVectors.setTokenizerFactory(t);
        paragraphVectors.getConfiguration().setIterations(10); // note: iterations are limited to 10 here just to speed up inference
    } catch (IOException e) {
        e.printStackTrace();
    }
    return paragraphVectors;
}
Developer: gizemsogancioglu, Project: biosses, Lines of code: 15, Source file: SentenceVectorsBasedSimilarity.java
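Given the source file name (SentenceVectorsBasedSimilarity.java), a natural next step is to score two sentences with the loaded model. The following is a minimal sketch, not taken from the biosses project, assuming inferVector() and ND4J's Transforms.cosineSim() are available:
// Hedged sketch: cosine similarity between two sentences using the loaded model.
// Requires: import org.nd4j.linalg.api.ndarray.INDArray;
//           import org.nd4j.linalg.ops.transforms.Transforms;
public static double sentenceSimilarity(ParagraphVectors vectors, String s1, String s2) {
    // inferVector() needs the TokenizerFactory that loadParagraphVectors() attached above
    INDArray v1 = vectors.inferVector(s1);
    INDArray v2 = vectors.inferVector(s2);
    return Transforms.cosineSim(v1, v2);
}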
Example 4: trainParagraghVecModel
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
public void trainParagraghVecModel(String locationToSave) throws FileNotFoundException {
    ClassPathResource resource = new ClassPathResource("/paragraphVectors/paragraphVectorTraining.txt");
    File file = resource.getFile();
    SentenceIterator iter = new BasicLineIterator(file);
    AbstractCache<VocabWord> cache = new AbstractCache<VocabWord>();
    TokenizerFactory t = new DefaultTokenizerFactory();
    t.setTokenPreProcessor(new CommonPreprocessor());
    /*
     * If you don't have a LabelAwareIterator handy, you can use a synchronized labels generator:
     * it will be used to label each document/sequence/line with its own label.
     * If you do have a LabelAwareIterator ready, you can provide it instead to use your in-house labels.
     */
    LabelsSource source = new LabelsSource("DOC_");
    ParagraphVectors vec = new ParagraphVectors.Builder()
            .minWordFrequency(1)
            .iterations(100)
            .epochs(1)
            .layerSize(50)
            .learningRate(0.02)
            .labelsSource(source)
            .windowSize(5)
            .iterate(iter)
            .trainWordVectors(true)
            .vocabCache(cache)
            .tokenizerFactory(t)
            .sampling(0)
            .build();
    vec.fit();
    WordVectorSerializer.writeParagraphVectors(vec, locationToSave);
}
Developer: gizemsogancioglu, Project: biosses, Lines of code: 34, Source file: SentenceVectorsBasedSimilarity.java
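A brief sketch of how the model saved by this method could be reloaded and inspected later; the file path and query word are placeholders, and the calls used (readParagraphVectors, setTokenizerFactory, similarity) all appear elsewhere in this article:
// Hedged sketch: reloading the model written by trainParagraghVecModel().
// "paragraph-vectors.zip" is a placeholder for whatever locationToSave was used above.
ParagraphVectors restored = WordVectorSerializer.readParagraphVectors("paragraph-vectors.zip");
// the tokenizer is not serialized with the model, so re-attach one before any inference
TokenizerFactory t = new DefaultTokenizerFactory();
t.setTokenPreProcessor(new CommonPreprocessor());
restored.setTokenizerFactory(t);
// trainWordVectors(true) means word vectors were learned alongside the document vectors
System.out.println(restored.wordsNearest("protein", 5)); // placeholder query word
// document labels follow the "DOC_" LabelsSource pattern: DOC_0, DOC_1, ...
System.out.println(restored.similarity("DOC_0", "DOC_1"));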
Example 5: writeWordVectors
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
/**
 * This method saves paragraph vectors to the given file.
 *
 * @param vectors the ParagraphVectors model to save
 * @param path    the file to write to
 */
@Deprecated
public static void writeWordVectors(@NonNull ParagraphVectors vectors, @NonNull File path) {
    try (BufferedOutputStream fos = new BufferedOutputStream(new FileOutputStream(path))) {
        writeWordVectors(vectors, fos);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Developer: deeplearning4j, Project: deeplearning4j, Lines of code: 15, Source file: WordVectorSerializer.java
Example 6: writeParagraphVectors
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
/**
 * This method saves a ParagraphVectors model into a compressed zip file.
 *
 * @param vectors the ParagraphVectors model to save
 * @param file    the zip file to write to
 */
public static void writeParagraphVectors(ParagraphVectors vectors, File file) {
    try (FileOutputStream fos = new FileOutputStream(file);
         BufferedOutputStream stream = new BufferedOutputStream(fos)) {
        writeParagraphVectors(vectors, stream);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Developer: deeplearning4j, Project: deeplearning4j, Lines of code: 14, Source file: WordVectorSerializer.java
Example 7: readParagraphVectorsFromText
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
/**
 * Restores a previously serialized ParagraphVectors model.
 *
 * Deprecation note: please consider using the readParagraphVectors() method instead.
 *
 * @param file file that contains the previously serialized model
 * @return the restored ParagraphVectors model
 */
@Deprecated
public static ParagraphVectors readParagraphVectorsFromText(@NonNull File file) {
    try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
        return readParagraphVectorsFromText(bis);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Developer: deeplearning4j, Project: deeplearning4j, Lines of code: 17, Source file: WordVectorSerializer.java
Example 8: main
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
public static void main(String[] args) throws Exception {
    ClassPathResource srcFile = new ClassPathResource("/raw_sentences.txt");
    File file = srcFile.getFile();
    SentenceIterator iter = new BasicLineIterator(file);
    TokenizerFactory tFact = new DefaultTokenizerFactory();
    tFact.setTokenPreProcessor(new CommonPreprocessor());
    LabelsSource labelFormat = new LabelsSource("LINE_");
    ParagraphVectors vec = new ParagraphVectors.Builder()
            .minWordFrequency(1)
            .iterations(5)
            .epochs(1)
            .layerSize(100)
            .learningRate(0.025)
            .labelsSource(labelFormat)
            .windowSize(5)
            .iterate(iter)
            .trainWordVectors(false)
            .tokenizerFactory(tFact)
            .sampling(0)
            .build();
    vec.fit();
    double similar1 = vec.similarity("LINE_9835", "LINE_12492");
    out.println("Comparing lines 9836 & 12493 ('This is my house .'/'This is my world .') Similarity = " + similar1);
    double similar2 = vec.similarity("LINE_3720", "LINE_16392");
    out.println("Comparing lines 3721 & 16393 ('This is my way .'/'This is my work .') Similarity = " + similar2);
    double similar3 = vec.similarity("LINE_6347", "LINE_3720");
    out.println("Comparing lines 6348 & 3721 ('This is my case .'/'This is my way .') Similarity = " + similar3);
    double dissimilar1 = vec.similarity("LINE_3720", "LINE_9852");
    out.println("Comparing lines 3721 & 9853 ('This is my way .'/'We now have one .') Similarity = " + dissimilar1);
    double dissimilar2 = vec.similarity("LINE_3720", "LINE_3719");
    out.println("Comparing lines 3721 & 3720 ('This is my way .'/'At first he says no .') Similarity = " + dissimilar2);
}
Developer: PacktPublishing, Project: Machine-Learning-End-to-Endguide-for-Java-developers, Lines of code: 46, Source file: ClassifyBySimilarity.java
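If the vectors trained in this example are needed again later, they can be written to disk and restored with the serializer methods shown in Examples 6 and 12. A minimal sketch follows (the file name is a placeholder):
// Hedged sketch: persisting and restoring the model trained above.
File modelFile = new File("raw_sentences_paravec.zip"); // placeholder file name
WordVectorSerializer.writeParagraphVectors(vec, modelFile);
ParagraphVectors restored = WordVectorSerializer.readParagraphVectors(modelFile);
restored.setTokenizerFactory(tFact); // the tokenizer is not serialized, so re-attach it
System.out.println(restored.similarity("LINE_9835", "LINE_12492"));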
Example 9: main
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
public static void main(String[] args) throws Exception {
    ClassPathResource resource = new ClassPathResource("paravec/labeled");
    iter = new FileLabelAwareIterator.Builder()
            .addSourceFolder(resource.getFile())
            .build();
    tFact = new DefaultTokenizerFactory();
    tFact.setTokenPreProcessor(new CommonPreprocessor());
    pVect = new ParagraphVectors.Builder()
            .learningRate(0.025)
            .minLearningRate(0.001)
            .batchSize(1000)
            .epochs(20)
            .iterate(iter)
            .trainWordVectors(true)
            .tokenizerFactory(tFact)
            .build();
    pVect.fit();
    ClassPathResource unlabeledText = new ClassPathResource("paravec/unlabeled");
    FileLabelAwareIterator unlabeledIter = new FileLabelAwareIterator.Builder()
            .addSourceFolder(unlabeledText.getFile())
            .build();
    MeansBuilder mBuilder = new MeansBuilder(
            (InMemoryLookupTable<VocabWord>) pVect.getLookupTable(),
            tFact);
    LabelSeeker lSeeker = new LabelSeeker(iter.getLabelsSource().getLabels(),
            (InMemoryLookupTable<VocabWord>) pVect.getLookupTable());
    while (unlabeledIter.hasNextDocument()) {
        LabelledDocument doc = unlabeledIter.nextDocument();
        INDArray docCentroid = mBuilder.documentAsVector(doc);
        List<Pair<String, Double>> scores = lSeeker.getScores(docCentroid);
        out.println("Document '" + doc.getLabel() + "' falls into the following categories: ");
        for (Pair<String, Double> score : scores) {
            out.println(" " + score.getFirst() + ": " + score.getSecond());
        }
    }
}
Developer: PacktPublishing, Project: Machine-Learning-End-to-Endguide-for-Java-developers, Lines of code: 49, Source file: ParagraphVectorsClassifierExample.java
Example 10: testParaVecSerialization1
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
@Test
public void testParaVecSerialization1() throws Exception {
    VectorsConfiguration configuration = new VectorsConfiguration();
    configuration.setIterations(14123);
    configuration.setLayersSize(156);
    INDArray syn0 = Nd4j.rand(100, configuration.getLayersSize());
    INDArray syn1 = Nd4j.rand(100, configuration.getLayersSize());
    AbstractCache<VocabWord> cache = new AbstractCache.Builder<VocabWord>().build();
    for (int i = 0; i < 100; i++) {
        VocabWord word = new VocabWord((float) i, "word_" + i);
        List<Integer> points = new ArrayList<>();
        List<Byte> codes = new ArrayList<>();
        int num = org.apache.commons.lang3.RandomUtils.nextInt(1, 20);
        for (int x = 0; x < num; x++) {
            points.add(org.apache.commons.lang3.RandomUtils.nextInt(1, 100000));
            codes.add(org.apache.commons.lang3.RandomUtils.nextBytes(10)[0]);
        }
        if (RandomUtils.nextInt(10) < 3) {
            word.markAsLabel(true);
        }
        word.setIndex(i);
        word.setPoints(points);
        word.setCodes(codes);
        cache.addToken(word);
        cache.addWordToIndex(i, word.getLabel());
    }
    InMemoryLookupTable<VocabWord> lookupTable =
            (InMemoryLookupTable<VocabWord>) new InMemoryLookupTable.Builder<VocabWord>()
                    .vectorLength(configuration.getLayersSize()).cache(cache).build();
    lookupTable.setSyn0(syn0);
    lookupTable.setSyn1(syn1);
    ParagraphVectors originalVectors =
            new ParagraphVectors.Builder(configuration).vocabCache(cache).lookupTable(lookupTable).build();
    File tempFile = File.createTempFile("paravec", "tests");
    tempFile.deleteOnExit();
    WordVectorSerializer.writeParagraphVectors(originalVectors, tempFile);
    ParagraphVectors restoredVectors = WordVectorSerializer.readParagraphVectors(tempFile);
    InMemoryLookupTable<VocabWord> restoredLookupTable =
            (InMemoryLookupTable<VocabWord>) restoredVectors.getLookupTable();
    AbstractCache<VocabWord> restoredVocab = (AbstractCache<VocabWord>) restoredVectors.getVocab();
    assertEquals(restoredLookupTable.getSyn0(), lookupTable.getSyn0());
    assertEquals(restoredLookupTable.getSyn1(), lookupTable.getSyn1());
    for (int i = 0; i < cache.numWords(); i++) {
        assertEquals(cache.elementAtIndex(i).isLabel(), restoredVocab.elementAtIndex(i).isLabel());
        assertEquals(cache.wordAtIndex(i), restoredVocab.wordAtIndex(i));
        assertEquals(cache.elementAtIndex(i).getElementFrequency(),
                restoredVocab.elementAtIndex(i).getElementFrequency(), 0.1f);
        List<Integer> originalPoints = cache.elementAtIndex(i).getPoints();
        List<Integer> restoredPoints = restoredVocab.elementAtIndex(i).getPoints();
        assertEquals(originalPoints.size(), restoredPoints.size());
        for (int x = 0; x < originalPoints.size(); x++) {
            assertEquals(originalPoints.get(x), restoredPoints.get(x));
        }
        List<Byte> originalCodes = cache.elementAtIndex(i).getCodes();
        List<Byte> restoredCodes = restoredVocab.elementAtIndex(i).getCodes();
        assertEquals(originalCodes.size(), restoredCodes.size());
        for (int x = 0; x < originalCodes.size(); x++) {
            assertEquals(originalCodes.get(x), restoredCodes.get(x));
        }
    }
}
Developer: deeplearning4j, Project: deeplearning4j, Lines of code: 75, Source file: WordVectorSerializerTest.java
Example 11: testBiggerParavecLoader
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
@Ignore
@Test
public void testBiggerParavecLoader() throws Exception {
    ParagraphVectors vectors =
            WordVectorSerializer.readParagraphVectors("C:\\Users\\raver\\Downloads\\10kNews.zip");
}
Developer: deeplearning4j, Project: deeplearning4j, Lines of code: 7, Source file: WordVectorSerializerTest.java
Example 12: readParagraphVectors
import org.deeplearning4j.models.paragraphvectors.ParagraphVectors; // import the required package/class
/**
 * This method restores a ParagraphVectors model previously saved with writeParagraphVectors().
 *
 * @return the restored ParagraphVectors model
 */
public static ParagraphVectors readParagraphVectors(File file) throws IOException {
    File tmpFileL = File.createTempFile("paravec", "l");
    tmpFileL.deleteOnExit();
    Word2Vec w2v = readWord2Vec(file);
    // "convert" it to a ParagraphVectors model, optionally trying to restore label information
    ParagraphVectors vectors = new ParagraphVectors.Builder(w2v.getConfiguration()).vocabCache(w2v.getVocab())
            .lookupTable(w2v.getLookupTable()).resetModel(false).build();
    ZipFile zipFile = new ZipFile(file);
    // now we try to restore label information
    ZipEntry labels = zipFile.getEntry("labels.txt");
    if (labels != null) {
        InputStream stream = zipFile.getInputStream(labels);
        Files.copy(stream, Paths.get(tmpFileL.getAbsolutePath()), StandardCopyOption.REPLACE_EXISTING);
        try (BufferedReader reader = new BufferedReader(new FileReader(tmpFileL))) {
            String line;
            while ((line = reader.readLine()) != null) {
                VocabWord word = vectors.getVocab().tokenFor(decodeB64(line.trim()));
                if (word != null) {
                    word.markAsLabel(true);
                }
            }
        }
    }
    vectors.extractLabels();
    return vectors;
}
Developer: deeplearning4j, Project: deeplearning4j, Lines of code: 39, Source file: WordVectorSerializer.java
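For completeness, a brief usage sketch of readParagraphVectors(); the file path is a placeholder, and a TokenizerFactory has to be re-attached (as Example 3 does) before inferVector() can be called:
// Hedged usage sketch for readParagraphVectors(); the path below is a placeholder.
ParagraphVectors vectors = WordVectorSerializer.readParagraphVectors(new File("/path/to/paravec-model.zip"));
// the tokenizer is not part of the serialized model, so set one before inference (as in Example 3)
TokenizerFactory t = new DefaultTokenizerFactory();
t.setTokenPreProcessor(new CommonPreprocessor());
vectors.setTokenizerFactory(t);
// infer a vector for previously unseen text
INDArray docVector = vectors.inferVector("some previously unseen text");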
Note: the org.deeplearning4j.models.paragraphvectors.ParagraphVectors examples in this article were collected from GitHub, MSDocs, and other source-code and documentation hosting platforms, and the snippets were selected from open-source projects contributed by their authors. Copyright of the source code belongs to the original authors; please refer to each project's License before distributing or using it, and do not reproduce this article without permission.