This article collects typical usage examples of the Java class org.kitesdk.data.View. If you are unsure what the View class does, how to use it, or what real-world usage looks like, the curated examples below should help.
The View class belongs to the org.kitesdk.data package. Twenty code examples are presented below, ordered by popularity by default.
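Before the individual examples, here is a minimal, self-contained sketch of the pattern most of them share: load a dataset URI as a View, iterate over its records with a DatasetReader, and close the reader when done. The dataset URI below is a placeholder for illustration, not taken from any of the examples.

import org.apache.avro.generic.GenericRecord;
import org.kitesdk.data.DatasetReader;
import org.kitesdk.data.Datasets;
import org.kitesdk.data.View;

public class ViewReadSketch {
  public static void main(String[] args) {
    // Hypothetical dataset URI; substitute a real one.
    View<GenericRecord> view =
        Datasets.load("dataset:hdfs:/tmp/data/events", GenericRecord.class);

    DatasetReader<GenericRecord> reader = view.newReader();
    try {
      // DatasetReader is Iterable, so a for-each loop works.
      for (GenericRecord record : reader) {
        System.out.println(record);
      }
    } finally {
      reader.close();
    }
  }
}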
Example 1: run
import org.kitesdk.data.View; // import the required package/class

public void run(@DataIn(name="source.events", type=StandardEvent.class) View<StandardEvent> input,
                @DataOut(name="target.events", type=StandardEvent.class) View<StandardEvent> output) {

  // Copy every event from the input view to the output view.
  DatasetReader<StandardEvent> reader = input.newReader();
  DatasetWriter<StandardEvent> writer = output.newWriter();

  try {
    while (reader.hasNext()) {
      writer.write(reader.next());
    }
  } finally {
    Closeables.closeQuietly(reader);
    Closeables.closeQuietly(writer);
  }
}

Author: rbrush | Project: kite-apps | Lines: 18 | Source: StandardEventsJob.java
Example 2: run
import org.kitesdk.data.View; // import the required package/class

public void run(@DataIn(name="source_users") View<GenericRecord> input,
                @DataOut(name="target_users") View<GenericRecord> output) {

  // Copy every record from the input view to the output view.
  DatasetReader<GenericRecord> reader = input.newReader();
  DatasetWriter<GenericRecord> writer = output.newWriter();

  try {
    while (reader.hasNext()) {
      writer.write(reader.next());
    }
  } finally {
    Closeables.closeQuietly(reader);
    Closeables.closeQuietly(writer);
  }
}

Author: rbrush | Project: kite-apps | Lines: 18 | Source: ScheduledInputOutputJob.java
Example 3: run
import org.kitesdk.data.View; // import the required package/class

public void run(@DataIn(name="example_events", type=ExampleEvent.class) View<ExampleEvent> input,
                @DataOut(name="odd_users", type=ExampleEvent.class) View<ExampleEvent> output) throws IOException {

  // Configure the Kite input and output formats to read from and write to the views.
  Job job = Job.getInstance(getJobContext().getHadoopConf());
  DatasetKeyInputFormat.configure(job).readFrom(input);
  DatasetKeyOutputFormat.configure(job).writeTo(output);

  JavaPairRDD<ExampleEvent, Void> inputData = getJobContext()
      .getSparkContext()
      .newAPIHadoopRDD(job.getConfiguration(), DatasetKeyInputFormat.class,
          ExampleEvent.class, Void.class);

  // Keep only events for odd-numbered users, then write them to the output view.
  JavaPairRDD<ExampleEvent, Void> filteredData = inputData.filter(new KeepOddUsers());

  filteredData.saveAsNewAPIHadoopDataset(job.getConfiguration());
}

Author: rbrush | Project: kite-apps | Lines: 17 | Source: SparkJob.java
Example 4: run
import org.kitesdk.data.View; // import the required package/class

@Override
public int run(List<String> args) throws Exception {
  Preconditions.checkState(!Datasets.exists(uri),
      "events dataset already exists");

  DatasetDescriptor descriptor = new DatasetDescriptor.Builder()
      .schema(StandardEvent.class).build();
  View<StandardEvent> events = Datasets.create(uri, descriptor, StandardEvent.class);

  DatasetWriter<StandardEvent> writer = events.newWriter();
  try {
    // Write random events for 36 seconds (36,000 ms).
    while (System.currentTimeMillis() - baseTimestamp < 36000) {
      writer.write(generateRandomEvent());
    }
  } finally {
    writer.close();
  }

  System.out.println("Generated " + counter + " events");
  return 0;
}

Author: kite-sdk | Project: kite-examples | Lines: 24 | Source: CreateEvents.java
Example 5: read
import org.kitesdk.data.View; // import the required package/class

public static <T> HashSet<T> read(View<T> view) {
  DatasetReader<T> reader = null;
  try {
    // Drain the view into a set; duplicates collapse by equality.
    reader = view.newReader();
    return Sets.newHashSet(reader.iterator());
  } finally {
    if (reader != null) {
      reader.close();
    }
  }
}

Author: moueimei | Project: flume-release-1.7.0 | Lines: 12 | Source: TestDatasetSink.java
Example 6: save
import org.kitesdk.data.View; // import the required package/class

/**
 * Save all RDDs in the given DStream to the given view.
 * @param dstream the stream whose RDDs should be saved
 * @param view the target view to write to
 */
public static <T> void save(JavaDStream<T> dstream, final View<T> view) {

  // Capture the view's URI as a String so the closure does not have to
  // serialize the View itself.
  final String uri = view.getUri().toString();

  dstream.foreachRDD(new Function2<JavaRDD<T>, Time, Void>() {
    @Override
    public Void call(JavaRDD<T> rdd, Time time) throws Exception {
      save(rdd, uri);
      return null;
    }
  });
}

Author: rbrush | Project: kite-apps | Lines: 20 | Source: SparkDatasets.java
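A hypothetical caller, for illustration only (the streaming context, names, and URI below are assumptions, not part of the original example): given any JavaDStream of records and a target View of a matching type, the helper above is invoked directly.

  // Illustrative usage; 'jssc' is an existing JavaStreamingContext and the URI is a placeholder.
  JavaDStream<StandardEvent> events = jssc.queueStream(new LinkedList<JavaRDD<StandardEvent>>());
  View<StandardEvent> target = Datasets.load("dataset:hdfs:/tmp/data/events", StandardEvent.class);
  SparkDatasets.save(events, target); // each arriving RDD is written to the view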
Example 7: run
import org.kitesdk.data.View; // import the required package/class

public void run(@DataIn(name="source.users") View<GenericRecord> input,
                @DataOut(name="target.users") View<GenericRecord> output) throws IOException {

  // Configure the Kite input and output formats against the given views.
  Job job = Job.getInstance();
  DatasetKeyInputFormat.configure(job).readFrom(input);
  DatasetKeyOutputFormat.configure(job).writeTo(output);

  @SuppressWarnings("unchecked")
  JavaPairRDD<GenericData.Record, Void> inputData = getJobContext()
      .getSparkContext()
      .newAPIHadoopRDD(job.getConfiguration(), DatasetKeyInputFormat.class,
          GenericData.Record.class, Void.class);

  // Pass the records through unchanged to the output view.
  inputData.saveAsNewAPIHadoopDataset(job.getConfiguration());
}

Author: rbrush | Project: kite-apps | Lines: 16 | Source: SimpleSparkJob.java
Example 8: loadWhenAvailable
import org.kitesdk.data.View; // import the required package/class

/**
 * Loads the expected number of items when they are available in the dataset.
 * @param view the view to read from
 * @param expected the number of items to wait for
 * @return the loaded items
 */
private <T> List<T> loadWhenAvailable(View<T> view, int expected) {

  for (int attempt = 0; attempt < 20; ++attempt) {

    List<T> items = Lists.newArrayList();
    DatasetReader<T> reader = view.newReader();
    int count = 0;

    try {
      for (; count < expected; ++count) {
        // Stop early if the dataset does not yet contain enough items;
        // we will retry after a short pause.
        if (!reader.hasNext()) {
          break;
        }
        items.add(reader.next());
      }
    } finally {
      reader.close();
    }

    if (count == expected) {
      return items;
    }

    try {
      Thread.sleep(1000);
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    }
  }

  Assert.fail("Unable to load the expected items.");
  return null;
}

Author: rbrush | Project: kite-apps | Lines: 44 | Source: TopicToDatasetJobTest.java
Example 9: testJob
import org.kitesdk.data.View; // import the required package/class

@Test
public void testJob() throws IOException {

  // Build ten small test events that share a session ID.
  List<SmallEvent> events = Lists.newArrayList();
  for (int i = 0; i < 10; ++i) {
    SmallEvent event = SmallEvent.newBuilder()
        .setSessionId("1234")
        .setUserId(i)
        .build();
    events.add(event);
  }

  harness.writeMessages(StreamingSparkApp.TOPIC_NAME, events);

  // Give the streaming job time to consume the messages.
  try {
    Thread.sleep(10000);
  } catch (InterruptedException e) {
    e.printStackTrace();
  }

  View<SmallEvent> view = Datasets.load(StreamingSparkApp.EVENTS_DS_URI, SmallEvent.class);
  List<SmallEvent> results = loadWhenAvailable(view, 10);

  Assert.assertEquals(events, results);
}

Author: rbrush | Project: kite-apps | Lines: 29 | Source: TopicToDatasetJobTest.java
Example 10: checkMessages
import org.kitesdk.data.View; // import the required package/class

/**
 * Checks if the given view contains the expected messages within a timeout period.
 */
static boolean checkMessages(View view, List<ExampleEvent> expected, int timeoutSeconds) throws InterruptedException {

  List<ExampleEvent> actual = loadWhenAvailable(view, expected.size(), timeoutSeconds);
  if (actual == null) {
    return false;
  }

  return expected.equals(actual);
}

Author: rbrush | Project: kite-apps | Lines: 14 | Source: DataUtil.java
Example 11: newWriter
import org.kitesdk.data.View; // import the required package/class

private DatasetWriter<GenericRecord> newWriter(
    final UserGroupInformation login, final URI uri) {

  // Load the dataset under the given Kerberos login.
  View<GenericRecord> view = KerberosUtil.runPrivileged(login,
      new PrivilegedExceptionAction<Dataset<GenericRecord>>() {
        @Override
        public Dataset<GenericRecord> run() {
          return Datasets.load(uri);
        }
      });

  DatasetDescriptor descriptor = view.getDataset().getDescriptor();
  String formatName = descriptor.getFormat().getName();
  Preconditions.checkArgument(allowedFormats().contains(formatName),
      "Unsupported format: " + formatName);

  Schema newSchema = descriptor.getSchema();
  if (targetSchema == null || !newSchema.equals(targetSchema)) {
    this.targetSchema = descriptor.getSchema();
    // target dataset schema has changed, invalidate all readers based on it
    readers.invalidateAll();
  }

  this.reuseDatum = !("parquet".equals(formatName));
  this.datasetName = view.getDataset().getName();

  return view.newWriter();
}

Author: kite-sdk | Project: kite-examples-integration-tests | Lines: 28 | Source: DatasetSink.java
Example 12: run
import org.kitesdk.data.View; // import the required package/class

@Override
public int run(String[] args) throws Exception {
  final long startOfToday = startOfDay();

  // the destination dataset
  Dataset<Record> persistent = Datasets.load(
      "dataset:file:/tmp/data/logs", Record.class);

  // the source: anything before today in the staging area
  Dataset<Record> staging = Datasets.load(
      "dataset:file:/tmp/data/logs_staging", Record.class);
  View<Record> ready = staging.toBefore("timestamp", startOfToday);

  ReadableSource<Record> source = CrunchDatasets.asSource(ready);
  PCollection<Record> stagedLogs = read(source);

  getPipeline().write(stagedLogs,
      CrunchDatasets.asTarget(persistent), Target.WriteMode.APPEND);

  PipelineResult result = run();

  if (result.succeeded()) {
    // remove the source data partition from staging
    ready.deleteAll();
    return 0;
  } else {
    return 1;
  }
}

Author: kite-sdk | Project: kite-examples | Lines: 31 | Source: StagingToPersistent.java
Example 13: signalOutputViews
import org.kitesdk.data.View; // import the required package/class

/**
 * Signal the produced views as ready for downstream processing.
 */
protected void signalOutputViews(Map<String,View> views) {

  // If the job specified output parameters,
  // signal them when we complete.
  JobParameters params = getJobParameters();
  if (params != null) {
    Set<String> outputNames = params.getOutputNames();

    for (String outputName: outputNames) {
      View view = views.get(outputName);
      if (view instanceof Signalable) {
        ((Signalable) view).signalReady();
      }
    }
  }
}

Author: rbrush | Project: kite-apps | Lines: 25 | Source: SchedulableJobManager.java
Example 14: runScheduledJobs
import org.kitesdk.data.View; // import the required package/class

/**
 * Runs all scheduled jobs in the application using the given
 * nominal time.
 */
public void runScheduledJobs(Instant nominalTime) {

  Map<String,View> views = Maps.newHashMap();

  for (Schedule schedule: app.getSchedules()) {
    for (Schedule.ViewTemplate input: schedule.getViewTemplates().values()) {
      // Resolve the URI template for the nominal time and load the matching view.
      String uri = resolveTemplate(input.getUriTemplate(), nominalTime);
      View view = Datasets.load(uri, input.getInputType());
      views.put(input.getName(), view);
    }
  }

  runScheduledJobs(nominalTime, views);
}

Author: rbrush | Project: kite-apps | Lines: 23 | Source: TestScheduler.java
Example 15: run
import org.kitesdk.data.View; // import the required package/class

public void run(@DataOut(name="kv-output", type= KeyValues.class) View<KeyValues> output) {

  DatasetWriter<KeyValues> writer = output.newWriter();

  try {
    JobContext context = getJobContext();

    // Record the job settings and the output settings as a single KeyValues entry.
    KeyValues kv = KeyValues.newBuilder()
        .setJobsettings(context.getSettings())
        .setOutputsettings(context.getOutputSettings("kv-output"))
        .build();

    writer.write(kv);
  } finally {
    Closeables.closeQuietly(writer);
  }
}

Author: rbrush | Project: kite-apps | Lines: 23 | Source: WriteConfigOutputJob.java
Example 16: run
import org.kitesdk.data.View; // import the required package/class

public void run() {

  // TODO: Switch to parameterized views.
  View<ExampleEvent> view = Datasets.load(ScheduledReportApp.EXAMPLE_DS_URI,
      ExampleEvent.class);

  RefinableView<GenericRecord> target = Datasets.load(ScheduledReportApp.REPORT_DS_URI,
      GenericRecord.class);

  // Get the view into which this report will be written.
  DateTime dateTime = getNominalTime().toDateTime(DateTimeZone.UTC);

  View<GenericRecord> output = target
      .with("year", dateTime.getYear())
      .with("month", dateTime.getMonthOfYear())
      .with("day", dateTime.getDayOfMonth())
      .with("hour", dateTime.getHourOfDay())
      .with("minute", dateTime.getMinuteOfHour());

  Pipeline pipeline = getPipeline();
  PCollection<ExampleEvent> events = pipeline.read(CrunchDatasets.asSource(view));

  PTable<Long, ExampleEvent> eventsByUser = events.by(new GetEventId(), Avros.longs());

  // Count of events by user ID.
  PTable<Long, Long> userEventCounts = eventsByUser.keys().count();

  PCollection<GenericData.Record> report = userEventCounts.parallelDo(
      new ToUserReport(),
      Avros.generics(SCHEMA));

  pipeline.write(report, CrunchDatasets.asTarget(output));

  pipeline.run();
}

Author: rbrush | Project: kite-apps | Lines: 37 | Source: ScheduledReportJob.java
Example 17: testTriggeredApp
import org.kitesdk.data.View; // import the required package/class

@Test
public void testTriggeredApp() {

  Map<String,String> settings = ImmutableMap.<String,String>builder()
      .put("spark.master", "local[3]")
      .put("spark.app.name", "spark-test")
      .build();

  AppContext appContext = new AppContext(settings, getConfiguration());

  TestScheduler generatorRunner = TestScheduler.load(DataGeneratorApp.class, appContext);
  TestScheduler triggeredRunner = TestScheduler.load(SparkApp.class, appContext);

  DateTime firstNominalTime = new DateTime(2015, 5, 7, 12, 0, 0);

  // Run the generator job at each minute.
  for (int i = 0; i < 2; ++i) {

    Instant nominalTime = firstNominalTime.plusMinutes(i).toInstant();
    DateTime dateTime = nominalTime.toDateTime(DateTimeZone.UTC);

    // Generate the data and then run the triggered job, which should read it.
    generatorRunner.runScheduledJobs(nominalTime);
    triggeredRunner.runScheduledJobs(nominalTime);

    // Get the output at the expected time and read its contents.
    Dataset<ExampleEvent> oddUserDataset = Datasets.load(SparkApp.ODD_USER_DS_URI, ExampleEvent.class);

    View<ExampleEvent> output = oddUserDataset.with("year", dateTime.getYear())
        .with("month", dateTime.getMonthOfYear())
        .with("day", dateTime.getDayOfMonth())
        .with("hour", dateTime.getHourOfDay())
        .with("minute", dateTime.getMinuteOfHour());

    // Verify the output contains only the expected user IDs.
    DatasetReader<ExampleEvent> reader = output.newReader();

    try {
      int count = 0;
      for (ExampleEvent event: reader) {
        System.out.println(event.getUserId());
        Assert.assertTrue(event.getUserId() % 2 == 1);
        ++count;
      }
    } finally {
      reader.close();
    }
  }
}

Author: rbrush | Project: kite-apps | Lines: 57 | Source: SparkAppTest.java
Example 18: main
import org.kitesdk.data.View; // import the required package/class

public static int main(String[] args) throws IOException {

  if (args.length != 4) {
    System.out.println("Usage: <brokerList> <topic> <count> <timeout>");
    return 1;
  }

  String brokerList = args[0];
  String topic = args[1];
  int count = Integer.parseInt(args[2]);
  int timeout = Integer.parseInt(args[3]);

  // Create a random session ID for testing.
  String sessionId = UUID.randomUUID().toString();

  List<ExampleEvent> events = DataUtil.createEvents(count, sessionId);
  Producer producer = DataUtil.createProducer(brokerList);

  System.out.println("Generating " + count + " messages with session ID " + sessionId);
  DataUtil.sendMessages(topic, producer, events);

  // Restrict the dataset to this test's session so only our own messages are checked.
  RefinableView<ExampleEvent> dataset = Datasets.load(TopicToDatasetApp.EVENTS_DS_URI, ExampleEvent.class);
  View<ExampleEvent> view = dataset.with("session_id", sessionId);

  try {
    System.out.println("Checking for expected output.");
    boolean foundMessages = DataUtil.checkMessages(view, events, timeout);
    if (foundMessages) {
      System.out.println("Found test messages.");
      return 0;
    } else {
      System.out.println("Expected test messages not found");
      System.out.println("Dataset: " + TopicToDatasetApp.EVENTS_DS_URI);
      System.out.println("Session ID: " + sessionId);
      return 1;
    }
  } catch (InterruptedException e) {
    return 1;
  }
}

Author: rbrush | Project: kite-apps | Lines: 47 | Source: TopicToDataset.java
Example 19: run
import org.kitesdk.data.View; // import the required package/class

@Override
public int run(String[] args) throws Exception {

  // Turn debug on while in development.
  getPipeline().enableDebug();
  getPipeline().getConfiguration().set("crunch.log.job.progress", "true");

  Dataset<StandardEvent> eventsDataset = Datasets.load(
      "dataset:hdfs:/tmp/data/default/events", StandardEvent.class);

  View<StandardEvent> eventsToProcess;
  if (args.length == 0 || (args.length == 1 && args[0].equals("LATEST"))) {
    // get the current minute
    Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
    cal.set(Calendar.SECOND, 0);
    cal.set(Calendar.MILLISECOND, 0);
    long currentMinute = cal.getTimeInMillis();
    // restrict events to before the current minute
    // in the workflow, this also has a lower bound for the timestamp
    eventsToProcess = eventsDataset.toBefore("timestamp", currentMinute);
  } else if (isView(args[0])) {
    eventsToProcess = Datasets.load(args[0], StandardEvent.class);
  } else {
    eventsToProcess = FileSystemDatasets.viewForPath(eventsDataset, new Path(args[0]));
  }

  if (eventsToProcess.isEmpty()) {
    LOG.info("No records to process.");
    return 0;
  }

  // Create a parallel collection from the working partition
  PCollection<StandardEvent> events = read(
      CrunchDatasets.asSource(eventsToProcess));

  // Group events by user and cookie id, then create a session for each group
  PCollection<Session> sessions = events
      .by(new GetSessionKey(), Avros.strings())
      .groupByKey()
      .parallelDo(new MakeSession(), Avros.specifics(Session.class));

  // Write the sessions to the "sessions" Dataset
  getPipeline().write(sessions,
      CrunchDatasets.asTarget("dataset:hive:/tmp/data/default/sessions"),
      Target.WriteMode.APPEND);

  return run().succeeded() ? 0 : 1;
}

Author: kite-sdk | Project: kite-examples | Lines: 49 | Source: CreateSessions.java
Example 20: run
import org.kitesdk.data.View; // import the required package/class

@Override
public void run(Instant nominalTime, Map<String,View> views) {
  // Run the underlying job, then signal the produced views as ready.
  job.runJob(views, getJobContext(), nominalTime);
  signalOutputViews(views);
}

Author: rbrush | Project: kite-apps | Lines: 8 | Source: SparkJobManager.java
Note: The org.kitesdk.data.View examples in this article were collected from source code and documentation platforms such as GitHub/MSDocs; the snippets were selected from open-source projects contributed by their authors. Copyright in the source code belongs to the original authors; consult each project's License before distributing or reusing it. Do not republish without permission.