This article collects typical usage examples of the Java class parquet.hadoop.ParquetWriter. If you are wondering how the ParquetWriter class is used in practice, or are looking for working examples of it, the selected code samples below should help.
The ParquetWriter class belongs to the parquet.hadoop package. A total of 14 code examples are shown below, sorted by popularity by default. You can upvote the examples you find useful; your feedback helps the system recommend better Java code samples.
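Most of the examples below follow the same basic pattern: register a MessageType schema on a Hadoop Configuration, construct a ParquetWriter<Group> backed by a GroupWriteSupport, write Group records, and close the writer so the file footer gets written. For orientation, here is a minimal sketch of that pattern; the schema, field names, and method name are illustrative assumptions and are not taken from any of the quoted projects.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import parquet.column.ParquetProperties;
import parquet.example.data.Group;
import parquet.example.data.simple.SimpleGroupFactory;
import parquet.hadoop.ParquetWriter;
import parquet.hadoop.example.GroupWriteSupport;
import parquet.hadoop.metadata.CompressionCodecName;
import parquet.schema.MessageType;
import parquet.schema.MessageTypeParser;

public static void writeSample(Path file) throws IOException {
    Configuration conf = new Configuration();
    MessageType schema = MessageTypeParser.parseMessageType(
            "message sample { required binary name; required int64 ts; }");
    GroupWriteSupport.setSchema(schema, conf);       // GroupWriteSupport reads the schema from conf
    SimpleGroupFactory factory = new SimpleGroupFactory(schema);
    ParquetWriter<Group> writer = new ParquetWriter<Group>(
            file,                                    // output path (caller-supplied)
            new GroupWriteSupport(),
            CompressionCodecName.SNAPPY,
            ParquetWriter.DEFAULT_BLOCK_SIZE,        // row group size in bytes
            ParquetWriter.DEFAULT_PAGE_SIZE,         // page size in bytes
            ParquetWriter.DEFAULT_PAGE_SIZE,         // dictionary page size in bytes
            true,                                    // enable dictionary encoding
            false,                                   // skip write-time validation
            ParquetProperties.WriterVersion.PARQUET_1_0,
            conf);                                   // carries the schema registered above
    writer.write(factory.newGroup()
            .append("name", "example")
            .append("ts", System.currentTimeMillis()));
    writer.close();                                  // required: writes the Parquet footer
}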
Example 1: build
import parquet.hadoop.ParquetWriter; // import the required package/class
public PFileWriter build() {
    try {
        this.parquetWriter = new ParquetWriter<Group>(
                file,
                gws,
                CompressionCodecName.SNAPPY,
                1024,   // block (row group) size in bytes
                1024,   // page size in bytes
                512,    // dictionary page size in bytes
                true,   // enable dictionary encoding
                false,  // disable write-time validation
                ParquetProperties.WriterVersion.PARQUET_1_0,
                conf);
    } catch (IOException ioe) {
        LOG.error(ioe.toString());
    }
    return this;
}
Author: grokcoder, Project: pbase, Lines: 19, Source: PFileWriter.java
Example 2: initWriter
import parquet.hadoop.ParquetWriter; // import the required package/class
public static ParquetWriter<Group> initWriter(String fileName, Map<String, String> metas)
        throws IOException {
    GroupWriteSupport.setSchema(schema, conf);
    ParquetWriter<Group> writer = new ParquetWriter<Group>(
            initFile(fileName),
            new GroupWriteSupport(metas),
            CompressionCodecName.SNAPPY,
            1024,   // block (row group) size in bytes
            1024,   // page size in bytes
            512,    // dictionary page size in bytes
            true,   // enable dictionary encoding
            false,  // disable write-time validation
            ParquetProperties.WriterVersion.PARQUET_1_0,
            conf);
    return writer;
}
Author: grokcoder, Project: pbase, Lines: 22, Source: GenerateParquetFile.java
Example 3: run
import parquet.hadoop.ParquetWriter; // import the required package/class
/**
 * Write the file.
 *
 * @param args the command-line arguments
 * @return the process exit code
 * @throws Exception if something goes wrong
 */
public int run(final String[] args) throws Exception {
    Cli cli = Cli.builder().setArgs(args).addOptions(CliCommonOpts.IOFileOpts.values()).build();
    int result = cli.runCmd();
    if (result != 0) {
        return result;
    }
    File inputFile = new File(cli.getArgValueAsString(CliCommonOpts.IOFileOpts.INPUT));
    Path outputPath = new Path(cli.getArgValueAsString(CliCommonOpts.IOFileOpts.OUTPUT));
    AvroParquetWriter<Stock> writer =
            new AvroParquetWriter<Stock>(outputPath, Stock.SCHEMA$,
                    CompressionCodecName.SNAPPY,
                    ParquetWriter.DEFAULT_BLOCK_SIZE,
                    ParquetWriter.DEFAULT_PAGE_SIZE,
                    true);  // enable dictionary encoding
    for (Stock stock : AvroStockUtils.fromCsvFile(inputFile)) {
        writer.write(stock);
    }
    writer.close();
    return 0;
}
Author: Hanmourang, Project: hiped2, Lines: 35, Source: ParquetAvroStockWriter.java
Example 4: getWriter
import parquet.hadoop.ParquetWriter; // import the required package/class
/**
 * Build a {@link ParquetWriter} of {@link Group} records for a given file path and block size.
 * @param blockSize the row group size in bytes
 * @param stagingFile the staging file to write to
 * @return a configured writer
 * @throws IOException if the writer cannot be created
 */
public ParquetWriter<Group> getWriter(int blockSize, Path stagingFile)
        throws IOException {
    State state = this.destination.getProperties();
    int pageSize = state.getPropAsInt(getProperty(WRITER_PARQUET_PAGE_SIZE), DEFAULT_PAGE_SIZE);
    int dictPageSize = state.getPropAsInt(getProperty(WRITER_PARQUET_DICTIONARY_PAGE_SIZE), DEFAULT_BLOCK_SIZE);
    boolean enableDictionary =
            state.getPropAsBoolean(getProperty(WRITER_PARQUET_DICTIONARY), DEFAULT_IS_DICTIONARY_ENABLED);
    boolean validate = state.getPropAsBoolean(getProperty(WRITER_PARQUET_VALIDATE), DEFAULT_IS_VALIDATING_ENABLED);
    String rootURI = state.getProp(WRITER_FILE_SYSTEM_URI, LOCAL_FS_URI);
    Path absoluteStagingFile = new Path(rootURI, stagingFile);
    CompressionCodecName codec = getCodecFromConfig();
    GroupWriteSupport support = new GroupWriteSupport();
    Configuration conf = new Configuration();
    GroupWriteSupport.setSchema(this.schema, conf);
    ParquetProperties.WriterVersion writerVersion = getWriterVersion();
    return new ParquetWriter<>(absoluteStagingFile, support, codec, blockSize, pageSize, dictPageSize, enableDictionary,
            validate, writerVersion, conf);
}
Author: apache, Project: incubator-gobblin, Lines: 26, Source: ParquetDataWriterBuilder.java
Example 5: main
import parquet.hadoop.ParquetWriter; // import the required package/class
public static void main(String[] args) throws IOException {
    int fileNum = 10;        // number of files to generate
    int fileRecordNum = 50;  // number of records in each file
    int rowKey = 0;
    for (int i = 0; i < fileNum; ++i) {
        Map<String, String> metas = new HashMap<>();
        metas.put(HConstants.START_KEY, genRowKey("%10d", rowKey + 1));
        metas.put(HConstants.END_KEY, genRowKey("%10d", rowKey + fileRecordNum));
        ParquetWriter<Group> writer = initWriter("pfile/scanner_test_file" + i, metas);
        for (int j = 0; j < fileRecordNum; ++j) {
            rowKey++;
            Group group = sfg.newGroup().append("rowkey", genRowKey("%10d", rowKey))
                    .append("cf:name", "wangxiaoyi" + rowKey)
                    .append("cf:age", String.format("%10d", rowKey))
                    .append("cf:job", "student")
                    .append("timestamp", System.currentTimeMillis());
            writer.write(group);
        }
        writer.close();
    }
}
Author: grokcoder, Project: pbase, Lines: 28, Source: GenerateParquetFile.java
Example 6: main
import parquet.hadoop.ParquetWriter; // import the required package/class
public static void main(String[] args) throws IOException {
    Path root = new Path("hdfs://10.214.208.11:9000/parquet/");  // output directory
    Configuration configuration = new Configuration();
    MessageType schema = MessageTypeParser.parseMessageType(     // Parquet file schema
            " message people { " +
                "required binary rowkey;" +
                "required binary cf:name;" +
                "required binary cf:age;" +
                "required int64 timestamp;" +
            " }");
    GroupWriteSupport.setSchema(schema, configuration);
    SimpleGroupFactory sfg = new SimpleGroupFactory(schema);
    Path file = new Path(root, "people002.parquet");
    Map<String, String> meta = new HashMap<String, String>();
    meta.put("startkey", "1");
    meta.put("endkey", "2");
    ParquetWriter<Group> writer = new ParquetWriter<Group>(
            file,
            new GroupWriteSupport(meta),
            CompressionCodecName.UNCOMPRESSED,
            1024,   // block (row group) size in bytes
            1024,   // page size in bytes
            512,    // dictionary page size in bytes
            true,   // enable dictionary encoding
            false,  // disable write-time validation
            ParquetProperties.WriterVersion.PARQUET_1_0,
            configuration);
    for (int i = 0; i < 10000; ++i) {
        // every written record must supply all four required fields declared in the schema
        Group group = sfg.newGroup().append("rowkey", String.format("%10d", i))
                .append("cf:name", "wangxiaoyi" + i)
                .append("cf:age", Integer.toString(i))
                .append("timestamp", System.currentTimeMillis());
        writer.write(group);
    }
    writer.close();
}
Author: grokcoder, Project: pbase, Lines: 56, Source: TestParquetWrite.java
Example 7: writeToFile
import parquet.hadoop.ParquetWriter; // import the required package/class
public static void writeToFile(File f, List<User> users) throws IOException {
    Configuration conf = new Configuration();
    GroupWriteSupport.setSchema(schema, conf);
    ParquetWriter<Group> writer = new ParquetWriter<Group>(new Path(f.getAbsolutePath()), conf, new GroupWriteSupport(null));
    for (User u : users) {
        writer.write(groupFromUser(u));
    }
    writer.close();
}
Author: grokcoder, Project: pbase, Lines: 11, Source: PhoneBookWriter.java
Example 8: getDefaultBlockSize
import parquet.hadoop.ParquetWriter; // import the required package/class
public String getDefaultBlockSize() {
    int defaultBlockSize = ParquetWriter.DEFAULT_BLOCK_SIZE;
    boolean bigDataPluginInstalled = false;
    try {
        bigDataPluginInstalled = isBigDataPluginInstalled();
    } catch ( KettleException ex ) {
        logError( ex.getMessage() );
    }
    if ( bigDataPluginInstalled ) {
        /* try {
             int hdfsBlockSize = BigDataHelper.getHdfsConfiguredBlockSize();
             if ( hdfsBlockSize > 0 ) {
                 defaultBlockSize = hdfsBlockSize;
             }
           } catch ( KettleException ex ) {
             logDebug( "Error fetching Hadoop configuration " + ex.getMessage() );
           }
        */
    }
    int mbBlockSize = defaultBlockSize / 1024 / 1024;  // bytes -> megabytes
    return Integer.toString( mbBlockSize );
}
Author: inquidia, Project: ParquetPlugin, Lines: 28, Source: ParquetOutputMeta.java
Example 9: closeFile
import parquet.hadoop.ParquetWriter; // import the required package/class
private boolean closeFile() {
    boolean retval = false;
    if ( data.parquetWriters != null ) {
        Iterator<ParquetWriter> openFiles = data.parquetWriters.iterator();
        while ( openFiles.hasNext() ) {
            ParquetWriter writer = openFiles.next();
            if ( writer != null ) {
                try {
                    writer.close();
                } catch ( Exception e ) {
                    logBasic( "Error trying to close file. This may not be a problem." );
                    logDetailed( "Stack trace from error trying to close file:", e );
                }
                writer = null;
            }
        }
        if ( log.isDebug() ) {
            logDebug( "Closed all open parquet writers." );
        }
    }
    return retval;
}
Author: inquidia, Project: ParquetPlugin, Lines: 29, Source: ParquetOutput.java
Example 10: open
import parquet.hadoop.ParquetWriter; // import the required package/class
@Override
public void open() {
    Preconditions.checkState(state.equals(ReaderWriterState.NEW),
        "Unable to open a writer from state:%s", state);
    logger.debug(
        "Opening data file with pathTmp:{} (final path will be path:{})",
        pathTmp, path);
    try {
        CompressionCodecName codecName = CompressionCodecName.UNCOMPRESSED;
        if (enableCompression) {
            if (SnappyCodec.isNativeCodeLoaded()) {
                codecName = CompressionCodecName.SNAPPY;
            } else {
                logger.warn("Compression enabled, but Snappy native code not loaded. " +
                    "Parquet file will not be compressed.");
            }
        }
        avroParquetWriter = new AvroParquetWriter<E>(fileSystem.makeQualified(pathTmp),
            schema, codecName, DEFAULT_BLOCK_SIZE,
            ParquetWriter.DEFAULT_PAGE_SIZE);
    } catch (IOException e) {
        throw new DatasetWriterException("Unable to create writer to path:" + pathTmp, e);
    }
    state = ReaderWriterState.OPEN;
}
Author: cloudera, Project: cdk, Lines: 29, Source: ParquetFileSystemDatasetWriter.java
Example 11: main
import parquet.hadoop.ParquetWriter; // import the required package/class
public static void main(String[] args) {
    String inputFile = null;
    String outputFile = null;
    HelpFormatter formatter = new HelpFormatter();
    // create the Options object
    Options options = new Options();
    // register the -i (input) and -o (output) options
    options.addOption("i", true, "input avro file");
    options.addOption("o", true, "output Parquet file");
    CommandLineParser parser = new DefaultParser();
    CommandLine cmd;
    try {
        cmd = parser.parse(options, args);
        inputFile = cmd.getOptionValue("i");
        if (inputFile == null) {
            formatter.printHelp("AvroToParquet", options);
            return;
        }
        outputFile = cmd.getOptionValue("o");
    } catch (ParseException exc) {
        System.err.println("Problem with command line parameters: " + exc.getMessage());
        return;
    }
    File avroFile = new File(inputFile);
    if (!avroFile.exists()) {
        System.err.println("Could not open file: " + inputFile);
        return;
    }
    try {
        DatumReader<GenericRecord> datumReader = new GenericDatumReader<GenericRecord>();
        DataFileReader<GenericRecord> dataFileReader;
        dataFileReader = new DataFileReader<GenericRecord>(avroFile, datumReader);
        Schema avroSchema = dataFileReader.getSchema();
        // choose compression scheme
        CompressionCodecName compressionCodecName = CompressionCodecName.SNAPPY;
        // set Parquet file block size and page size values
        int blockSize = 256 * 1024 * 1024;
        int pageSize = 64 * 1024;
        String base = FilenameUtils.removeExtension(avroFile.getAbsolutePath()) + ".parquet";
        if (outputFile != null) {
            File file = new File(outputFile);
            base = file.getAbsolutePath();
        }
        Path outputPath = new Path("file:///" + base);
        // the ParquetWriter object that will consume Avro GenericRecords
        ParquetWriter<GenericRecord> parquetWriter;
        parquetWriter = new AvroParquetWriter<GenericRecord>(outputPath, avroSchema, compressionCodecName, blockSize, pageSize);
        for (GenericRecord record : dataFileReader) {
            parquetWriter.write(record);
        }
        dataFileReader.close();
        parquetWriter.close();
    } catch (IOException e) {
        System.err.println("Caught exception: " + e.getMessage());
    }
}
Author: CohesionForce, Project: avroToParquet, Lines: 68, Source: AvroToParquet.java
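Given the options registered above, the tool would be invoked along the lines of java AvroToParquet -i input.avro -o output.parquet (the file names here are placeholders); if -o is omitted, the output path defaults to the input path with its extension replaced by .parquet.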
Example 12: getDefaultPageSize
import parquet.hadoop.ParquetWriter; // import the required package/class
public String getDefaultPageSize() {
    int defaultPageSize = ParquetWriter.DEFAULT_PAGE_SIZE;  // 1 MB (1,048,576 bytes)
    int kbPageSize = defaultPageSize / 1024;                // bytes -> kilobytes
    return Integer.toString( kbPageSize );
}
Author: inquidia, Project: ParquetPlugin, Lines: 7, Source: ParquetOutputMeta.java
Example 13: getDefaultIsDictionaryEnabled
import parquet.hadoop.ParquetWriter; // import the required package/class
public boolean getDefaultIsDictionaryEnabled() {
    return ParquetWriter.DEFAULT_IS_DICTIONARY_ENABLED;
}
Author: inquidia, Project: ParquetPlugin, Lines: 5, Source: ParquetOutputMeta.java
Example 14: ParquetOutputData
import parquet.hadoop.ParquetWriter; // import the required package/class
public ParquetOutputData() {
    super();
    daf = new SimpleDateFormat();
    dafs = new DateFormatSymbols();
    defaultDateFormat = new SimpleDateFormat();
    defaultDateFormatSymbols = new DateFormatSymbols();
    openFiles = new ArrayList<String>();
    parquetWriters = new ArrayList<ParquetWriter>();
}
Author: inquidia, Project: ParquetPlugin, Lines: 14, Source: ParquetOutputData.java
Note: The parquet.hadoop.ParquetWriter examples in this article were collected from source-code and documentation hosting platforms such as GitHub and MSDocs. The snippets come from open-source projects contributed by their respective authors; copyright of the source code remains with the original authors, and any distribution or use should follow the corresponding project's license. Please do not republish without permission.