This article collects typical usage examples of the Java class org.apache.hadoop.mapred.MultiFileSplit. If you have been wondering what MultiFileSplit does, how to use it, or where to find examples of it, the curated class examples below may help.
MultiFileSplit belongs to the org.apache.hadoop.mapred package. Six code examples of the class are presented below, sorted by popularity by default. You can upvote the examples you like or find useful; your ratings help the system recommend better Java code examples.
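Before the examples, here is a minimal sketch (not taken from any of the projects below) of how MultiFileSplit typically appears in practice: the old-API MultiFileInputFormat packs many small files into each MultiFileSplit, so a subclass only needs to supply the record reader. MyMultiFileInputFormat and MyLineRecordReader are hypothetical placeholder names.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MultiFileInputFormat;
import org.apache.hadoop.mapred.MultiFileSplit;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class MyMultiFileInputFormat extends MultiFileInputFormat<LongWritable, Text> {
  @Override
  public RecordReader<LongWritable, Text> getRecordReader(InputSplit split, JobConf job,
      Reporter reporter) throws IOException {
    // getSplits() in the parent class hands back MultiFileSplit instances,
    // each covering several input files
    return new MyLineRecordReader(job, (MultiFileSplit) split); // MyLineRecordReader is hypothetical
  }
}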
Example 1: MultiFileLineRecordReader
import org.apache.hadoop.mapred.MultiFileSplit; // import the required package/class
public MultiFileLineRecordReader(Configuration conf, MultiFileSplit split)
    throws IOException {
  this.split = split;
  fs = FileSystem.get(conf);
  this.paths = split.getPaths();
  this.totLength = split.getLength();
  this.offset = 0;
  // open the first file
  Path file = paths[count];
  currentStream = fs.open(file);
  currentReader = new BufferedReader(new InputStreamReader(currentStream));
}
Developer ID: rhli, Project: hadoop-EAR, Lines of code: 15, Source: MultiFileWordCount.java
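The constructor above only opens the first file in the split. As a hedged sketch (paraphrased from the same MultiFileWordCount pattern, not copied from the project), the matching next() method reads a line from the current file and rolls over to the next path in the split when a stream is exhausted; the WordOffset setters are an assumption:

public boolean next(WordOffset key, Text value) throws IOException {
  if (count >= split.getNumPaths()) {
    return false;
  }
  String line;
  do {
    line = currentReader.readLine();
    if (line == null) {
      // current file exhausted: account for its length and open the next path
      currentReader.close();
      offset += split.getLength(count);
      if (++count >= split.getNumPaths()) {
        return false;
      }
      currentStream = fs.open(paths[count]);
      currentReader = new BufferedReader(new InputStreamReader(currentStream));
    }
  } while (line == null);
  // WordOffset is the custom key type from MultiFileWordCount; these setters are assumed
  key.setFileName(paths[count].getName());
  key.setOffset(offset);
  value.set(line);
  offset += line.length();
  return true;
}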
Example 2: WarcFileRecordReader
import org.apache.hadoop.mapred.MultiFileSplit; // import the required package/class
public WarcFileRecordReader(Configuration conf, InputSplit split) throws IOException {
  this.fs = FileSystem.get(conf);
  this.conf = conf;
  if (split instanceof FileSplit) {
    this.filePathList = new Path[1];
    this.filePathList[0] = ((FileSplit) split).getPath();
  } else if (split instanceof MultiFileSplit) {
    this.filePathList = ((MultiFileSplit) split).getPaths();
  } else {
    throw new IOException("InputSplit is not a file split or a multi-file split - aborting");
  }
  // get the total file sizes
  for (int i = 0; i < filePathList.length; i++) {
    totalFileSize += fs.getFileStatus(filePathList[i]).getLen();
  }
  Class<? extends CompressionCodec> codecClass = null;
  try {
    codecClass = conf.getClassByName("org.apache.hadoop.io.compress.GzipCodec")
        .asSubclass(CompressionCodec.class);
    compressionCodec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
  } catch (ClassNotFoundException cnfEx) {
    compressionCodec = null;
    LOG.info("!!! ClassNotFound Exception thrown setting Gzip codec");
  }
  openNextFile();
}
Developer ID: lucidworks, Project: solr-hadoop-common, Lines of code: 31, Source: WarcFileRecordReader.java
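The codec lookup above resolves GzipCodec by its class name and swallows the ClassNotFoundException. As an aside (not part of the original project), Hadoop's CompressionCodecFactory offers an extension-based alternative; a minimal sketch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class CodecLookup {
  public static CompressionCodec codecFor(Path file, Configuration conf) {
    // returns null when no registered codec matches the file extension (e.g. ".gz")
    return new CompressionCodecFactory(conf).getCodec(file);
  }
}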
Example 3: WarcFileRecordReader
import org.apache.hadoop.mapred.MultiFileSplit; // import the required package/class
public WarcFileRecordReader(Configuration conf, InputSplit split) throws IOException {
  if (split instanceof FileSplit) {
    this.filePathList = new Path[1];
    this.filePathList[0] = ((FileSplit) split).getPath();
  } else if (split instanceof MultiFileSplit) {
    this.filePathList = ((MultiFileSplit) split).getPaths();
  } else {
    throw new IOException("InputSplit is not a file split or a multi-file split - aborting");
  }
  // Use FileSystem.get to open Common Crawl URIs using the S3 protocol.
  URI uri = filePathList[0].toUri();
  this.fs = FileSystem.get(uri, conf);
  // get the total file sizes
  for (int i = 0; i < filePathList.length; i++) {
    totalFileSize += fs.getFileStatus(filePathList[i]).getLen();
  }
  Class<? extends CompressionCodec> codecClass = null;
  try {
    codecClass = conf.getClassByName("org.apache.hadoop.io.compress.GzipCodec")
        .asSubclass(CompressionCodec.class);
    compressionCodec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
  } catch (ClassNotFoundException cnfEx) {
    compressionCodec = null;
    LOG.info("!!! ClassNotFound Exception thrown setting Gzip codec");
  }
  openNextFile();
}
Developer ID: rossf7, Project: wikireverse, Lines of code: 32, Source: WarcFileRecordReader.java
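Examples 2 and 3 differ mainly in how they obtain the FileSystem: FileSystem.get(conf) returns the default filesystem, while FileSystem.get(uri, conf) resolves the filesystem from the path's scheme, which is what lets Example 3 read Common Crawl WARC files over s3://. A minimal sketch of the scheme-aware variant (the helper class and method names are placeholders):

import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsForPath {
  public static FileSystem fsFor(Path path, Configuration conf) throws IOException {
    URI uri = path.toUri();
    // s3://, hdfs:// and file:// schemes each resolve to their own FileSystem implementation
    return FileSystem.get(uri, conf);
  }
}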
Example 4: MultiFileLineRecordReader
import org.apache.hadoop.mapred.MultiFileSplit; // import the required package/class
public MultiFileLineRecordReader(Configuration conf, MultiFileSplit split)
    throws IOException {
  this.split = split;
  fs = FileSystem.get(conf);
  this.paths = split.getPaths();
  this.totLength = split.getLength();
  this.offset = 0;
  // open the first file
  Path file = paths[count];
  currentStream = fs.open(file);
  currentReader = new BufferedReader(new InputStreamReader(currentStream));
}
Developer ID: elephantscale, Project: hadoop-book, Lines of code: 15, Source: MultiFileWordCount.java
Example 5: getRecordReader
import org.apache.hadoop.mapred.MultiFileSplit; // import the required package/class
@Override
public RecordReader<WordOffset, Text> getRecordReader(InputSplit split,
    JobConf job, Reporter reporter) throws IOException {
  return new MultiFileLineRecordReader(job, (MultiFileSplit) split);
}
Developer ID: rhli, Project: hadoop-EAR, Lines of code: 6, Source: MultiFileWordCount.java
Example 6: getRecordReader
import org.apache.hadoop.mapred.MultiFileSplit; // import the required package/class
@Override
public RecordReader<WordOffset, Text> getRecordReader(InputSplit split, JobConf job,
    Reporter reporter) throws IOException {
  return new MultiFileLineRecordReader(job, (MultiFileSplit) split);
}
Developer ID: elephantscale, Project: hadoop-book, Lines of code: 5, Source: MultiFileWordCount.java
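For completeness, here is a hedged sketch (not from either project) of how such an input format is wired into an old-API job. MyMultiFileInputFormat refers to the hypothetical subclass sketched after the introduction; mapper and reducer setup is omitted for brevity:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class MultiFileDriver {
  public static void main(String[] args) throws Exception {
    JobConf job = new JobConf(MultiFileDriver.class);
    job.setJobName("multi-file example");
    job.setInputFormat(MyMultiFileInputFormat.class); // hypothetical subclass from the intro sketch
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // job.setMapperClass(...) and job.setReducerClass(...) omitted for brevity
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    JobClient.runJob(job); // submit the job and wait for completion
  }
}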
Note: the org.apache.hadoop.mapred.MultiFileSplit examples in this article were collected from GitHub, MSDocs, and other source-code and documentation hosting platforms; the snippets were selected from open-source projects contributed by their authors. Copyright of the source code remains with the original authors; consult each project's license before distributing or reusing the code. Do not republish without permission.