This article collects typical usage examples of the Java class org.apache.parquet.avro.AvroParquetWriter. If you have been wondering what AvroParquetWriter is for, how to use it, or where to find concrete examples, the curated snippets below should help.
The AvroParquetWriter class belongs to the org.apache.parquet.avro package. Nine code examples are presented below, sorted by popularity by default. You can upvote the examples you like or find useful; your feedback helps the system recommend better Java code examples.
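Before diving into the collected snippets, here is a minimal, self-contained sketch of the typical builder-style usage. The schema, output path, and codec below are illustrative assumptions, not taken from any of the projects that follow:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;

public class AvroParquetWriterQuickStart {
    public static void main(String[] args) throws Exception {
        // A trivial two-field record schema, declared inline for the sketch.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Pair\",\"fields\":["
            + "{\"name\":\"left\",\"type\":\"string\"},"
            + "{\"name\":\"right\",\"type\":\"string\"}]}");
        // try-with-resources ensures the Parquet footer is written on close.
        try (ParquetWriter<GenericRecord> writer = AvroParquetWriter
                .<GenericRecord>builder(new Path("/tmp/pair.parquet"))
                .withSchema(schema)
                .withCompressionCodec(CompressionCodecName.SNAPPY)
                .build()) {
            GenericRecord record = new GenericData.Record(schema);
            record.put("left", "L");
            record.put("right", "R");
            writer.write(record);
        }
    }
}

Writing many records is just a loop over writer.write(...), as the examples below show.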
Example 1: initWriter
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
private ParquetWriterWrapper initWriter(Path file, int recordLimit, final FileSystem fileSystem,
final Configuration conf, final CompressionCodecName compression, final int blockSize,
final int pageSize) throws IOException {
numRecords.set(0);
String[] columnNames = converterDescriptor.getColumnConverters().stream()
    .map(columnConverterDescriptor -> columnConverterDescriptor.getColumnName())
    .toArray(String[]::new);
Schema[] columnTypes = converterDescriptor.getColumnConverters().stream()
    .map(columnConverterDescriptor -> columnConverterDescriptor.getTypeDescriptor())
    .toArray(Schema[]::new);
avroRecord = ParquetUtils.createAvroRecordSchema(getTableName(), columnNames, columnTypes);
// TODO: confirm that the writer can still target HDFS without the explicit FileSystem handle (the Path and conf determine the FileSystem)
writer = AvroParquetWriter.<GenericRecord>builder(file).withCompressionCodec(compression)
.withPageSize(pageSize).withConf(conf).withSchema(avroRecord).build();
return this;
}
Developer: ampool, Project: monarch, Lines: 21, Source: ParquetWriterWrapper.java
Example 2: generateAvroPrimitiveTypes
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
static File generateAvroPrimitiveTypes(File parentDir, String filename, int nrows, Date date) throws IOException {
File f = new File(parentDir, filename);
Schema schema = new Schema.Parser().parse(Resources.getResource("PrimitiveAvro.avsc").openStream());
AvroParquetWriter<GenericRecord> writer = new AvroParquetWriter<GenericRecord>(new Path(f.getPath()), schema);
try {
DateFormat format = new SimpleDateFormat("yy-MMM-dd:hh.mm.ss.SSS aaa");
for (int i = 0; i < nrows; i++) {
GenericData.Record record = new GenericRecordBuilder(schema)
.set("mynull", null)
.set("myboolean", i % 2 == 0)
.set("myint", 1 + i)
.set("mylong", 2L + i)
.set("myfloat", 3.1f + i)
.set("mydouble", 4.1 + i)
.set("mydate", format.format(new Date(date.getTime() - (i * 1000 * 3600))))
.set("myuuid", UUID.randomUUID())
.set("mystring", "hello world: " + i)
.set("myenum", i % 2 == 0 ? "a" : "b")
.build();
writer.write(record);
}
} finally {
writer.close();
}
return f;
}
Developer: h2oai, Project: h2o-3, Lines: 27, Source: ParseTestParquet.java
Example 3: createDataFile
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
private static Path createDataFile() throws IOException {
File parquetFile = File.createTempFile("test-", "." + FILE_EXTENSION);
readerSchema = new Schema.Parser().parse(
ParquetFileReaderTest.class.getResourceAsStream("/file/reader/schemas/people.avsc"));
projectionSchema = new Schema.Parser().parse(
ParquetFileReaderTest.class.getResourceAsStream("/file/reader/schemas/people_projection.avsc"));
try (ParquetWriter<GenericRecord> writer = AvroParquetWriter.<GenericRecord>builder(new Path(parquetFile.toURI()))
.withConf(fs.getConf()).withWriteMode(ParquetFileWriter.Mode.OVERWRITE).withSchema(readerSchema).build()) {
IntStream.range(0, NUM_RECORDS).forEach(index -> {
GenericRecord datum = new GenericData.Record(readerSchema);
datum.put(FIELD_INDEX, index);
datum.put(FIELD_NAME, String.format("%d_name_%s", index, UUID.randomUUID()));
datum.put(FIELD_SURNAME, String.format("%d_surname_%s", index, UUID.randomUUID()));
try {
OFFSETS_BY_INDEX.put(index, Long.valueOf(index));
writer.write(datum);
} catch (IOException ioe) {
throw new RuntimeException(ioe);
}
});
}
Path path = new Path(new Path(fsUri), parquetFile.getName());
fs.moveFromLocalFile(new Path(parquetFile.getAbsolutePath()), path);
return path;
}
Developer: mmolimar, Project: kafka-connect-fs, Lines: 28, Source: ParquetFileReaderTest.java
Example 4: getParquetFileStream
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
public ParquetWriter<GenericRecord> getParquetFileStream() throws IOException {
Schema avroSchema = getAvroSchema();
Path file = new Path("/tmp/data/EmployeeData" + fileIndex++ + ".parquet");
// build a Parquet writer for the Avro schema
ParquetWriter<GenericRecord> parquetWriter =
AvroParquetWriter.<GenericRecord>builder(file).withSchema(avroSchema).build();
return parquetWriter;
}
Developer: ampool, Project: monarch, Lines: 10, Source: MTableCDCParquetListener.java
Example 5: write
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
/**
 * Writes Avro-format data to a Parquet file.
 *
 * @param parquetPath path of the target Parquet file
 */
public void write(String parquetPath) {
Schema.Parser parser = new Schema.Parser();
try {
Schema schema = parser.parse(AvroParquetOperation.class.getClassLoader().getResourceAsStream("StringPair.avsc"));
GenericRecord datum = new GenericData.Record(schema);
datum.put("left", "L");
datum.put("right", "R");
Path path = new Path(parquetPath);
System.out.println(path);
AvroParquetWriter<GenericRecord> writer = new AvroParquetWriter<GenericRecord>(path, schema);
writer.write(datum);
writer.close();
} catch (IOException e) {
e.printStackTrace();
}
}
Developer: mumuhadoop, Project: mumu-parquet, Lines: 23, Source: AvroParquetOperation.java
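Examples 2 and 5 construct the writer through the direct constructor, new AvroParquetWriter<GenericRecord>(path, schema). In current parquet-mr releases these constructors are deprecated in favor of the builder API, and the writer is best closed with try-with-resources so the file footer is written even when an exception occurs. A builder-based sketch of Example 5's write path (same StringPair.avsc schema, path, and datum assumed):

// Builder-based equivalent of Example 5's writer (sketch, not the project's code).
try (ParquetWriter<GenericRecord> writer = AvroParquetWriter
        .<GenericRecord>builder(path)
        .withSchema(schema)
        .build()) {
    writer.write(datum); // closed automatically, even if write() throws
}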
Example 6: call
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
@Override
public Job call() throws Exception {
// We're explicitly disabling speculative execution
conf.set("mapreduce.map.speculative", "false");
conf.set("mapreduce.map.maxattempts", "1");
MapreduceUtils.addJarsToJob(conf,
SemanticVersion.class,
ParquetWriter.class,
AvroParquetWriter.class,
FsInput.class,
CompressionCodec.class,
ParquetProperties.class,
BytesInput.class
);
Job job = Job.getInstance(conf);
// IO formats
job.setInputFormatClass(AvroParquetInputFormat.class);
job.setOutputFormatClass(NullOutputFormat.class);
// Mapper & job output
job.setMapperClass(AvroParquetConvertMapper.class);
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(NullWritable.class);
// It's a map-only job
job.setNumReduceTasks(0);
// General configuration
job.setJarByClass(getClass());
return job;
}
Developer: streamsets, Project: datacollector, Lines: 35, Source: AvroParquetConvertCreator.java
Example 7: AvroParquetFileWriter
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
public AvroParquetFileWriter(LogFilePath logFilePath, CompressionCodec codec) throws IOException {
Path path = new Path(logFilePath.getLogFilePath());
LOG.debug("Creating Brand new Writer for path {}", path);
CompressionCodecName codecName = CompressionCodecName
.fromCompressionCodec(codec != null ? codec.getClass() : null);
topic = logFilePath.getTopic();
// Not setting blockSize, pageSize, enableDictionary, and validating
writer = AvroParquetWriter.builder(path)
.withSchema(schemaRegistryClient.getSchema(topic))
.withCompressionCodec(codecName)
.build();
}
Developer: pinterest, Project: secor, Lines: 13, Source: AvroParquetFileReaderWriterFactory.java
Example 8: run
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
@Override
@SuppressWarnings("unchecked")
public int run() throws IOException {
Preconditions.checkArgument(targets != null && targets.size() == 1,
"CSV path is required.");
if (header != null) {
// if a header is given on the command line, don't assume one is in the file
noHeader = true;
}
CSVProperties props = new CSVProperties.Builder()
.delimiter(delimiter)
.escape(escape)
.quote(quote)
.header(header)
.hasHeader(!noHeader)
.linesToSkip(linesToSkip)
.charset(charsetName)
.build();
String source = targets.get(0);
Schema csvSchema;
if (avroSchemaFile != null) {
csvSchema = Schemas.fromAvsc(open(avroSchemaFile));
} else {
Set<String> required = ImmutableSet.of();
if (requiredFields != null) {
required = ImmutableSet.copyOf(requiredFields);
}
String filename = new File(source).getName();
String recordName;
if (filename.contains(".")) {
recordName = filename.substring(0, filename.indexOf("."));
} else {
recordName = filename;
}
csvSchema = AvroCSV.inferNullableSchema(
recordName, open(source), props, required);
}
long count = 0;
try (AvroCSVReader<Record> reader = new AvroCSVReader<>(
open(source), props, csvSchema, Record.class, true)) {
CompressionCodecName codec = Codecs.parquetCodec(compressionCodecName);
try (ParquetWriter<Record> writer = AvroParquetWriter
.<Record>builder(qualifiedPath(outputPath))
.withWriterVersion(v2 ? PARQUET_2_0 : PARQUET_1_0)
.withWriteMode(overwrite ?
ParquetFileWriter.Mode.OVERWRITE : ParquetFileWriter.Mode.CREATE)
.withCompressionCodec(codec)
.withDictionaryEncoding(true)
.withDictionaryPageSize(dictionaryPageSize)
.withPageSize(pageSize)
.withRowGroupSize(rowGroupSize)
.withDataModel(GenericData.get())
.withConf(getConf())
.withSchema(csvSchema)
.build()) {
for (Record record : reader) {
writer.write(record);
count += 1;
}
} catch (RuntimeException e) {
throw new RuntimeException("Failed on record " + count, e);
}
}
return 0;
}
Developer: apache, Project: parquet-mr, Lines: 73, Source: ConvertCSVCommand.java
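For context, this class implements parquet-cli's CSV-to-Parquet conversion, invoked roughly as "parquet convert-csv sample.csv -o sample.parquet"; treat the exact command and flag spelling as an assumption and verify against your parquet-cli version.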
Example 9: run
import org.apache.parquet.avro.AvroParquetWriter; // import the required package/class
@Override
@SuppressWarnings("unchecked")
public int run() throws IOException {
Preconditions.checkArgument(targets != null && targets.size() == 1,
"A data file is required.");
String source = targets.get(0);
CompressionCodecName codec = Codecs.parquetCodec(compressionCodecName);
Schema schema;
if (avroSchemaFile != null) {
schema = Schemas.fromAvsc(open(avroSchemaFile));
} else {
schema = getAvroSchema(source);
}
Schema projection = filterSchema(schema, columns);
Path outPath = qualifiedPath(outputPath);
FileSystem outFS = outPath.getFileSystem(getConf());
if (overwrite && outFS.exists(outPath)) {
console.debug("Deleting output file {} (already exists)", outPath);
outFS.delete(outPath);
}
Iterable<Record> reader = openDataFile(source, projection);
boolean threw = true;
long count = 0;
try {
try (ParquetWriter<Record> writer = AvroParquetWriter
.<Record>builder(qualifiedPath(outputPath))
.withWriterVersion(v2 ? PARQUET_2_0 : PARQUET_1_0)
.withConf(getConf())
.withCompressionCodec(codec)
.withRowGroupSize(rowGroupSize)
.withDictionaryPageSize(dictionaryPageSize < 64 ? 64 : dictionaryPageSize)
.withDictionaryEncoding(dictionaryPageSize != 0)
.withPageSize(pageSize)
.withDataModel(GenericData.get())
.withSchema(projection)
.build()) {
for (Record record : reader) {
writer.write(record);
count += 1;
}
}
threw = false;
} catch (RuntimeException e) {
throw new RuntimeException("Failed on record " + count, e);
} finally {
if (reader instanceof Closeable) {
Closeables.close((Closeable) reader, threw);
}
}
return 0;
}
Developer: apache, Project: parquet-mr, Lines: 58, Source: ConvertCommand.java
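Similarly, this command backs parquet-cli's Avro-to-Parquet conversion; an invocation along the lines of "parquet convert sample.avro -o sample.parquet" is the assumed usage (again, confirm the exact command name and flags for your parquet-cli version).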
Note: the org.apache.parquet.avro.AvroParquetWriter examples in this article were collected from GitHub, MSDocs, and other source-code and documentation platforms, and the snippets were selected from open-source projects contributed by many developers. The source code is copyrighted by its original authors; consult each project's license before using or redistributing it. Do not reproduce without permission.