
stream-dm: streamDM, open-sourced by Huawei Noah's Ark Lab, uses Spark Streaming to mine big data ...


Open source software name:

stream-dm

Open source software URL:

https://gitee.com/mirrors/stream-dm

Open source software description:

streamDM for Spark Streaming

streamDM is open source software for mining big data streams using Spark Streaming, started at Huawei Noah's Ark Lab. streamDM is licensed under the Apache Software License v2.0.

Big Data Stream Learning

Big Data stream learning is more challenging than batch or offline learning, since the data may not keep the same distribution over the lifetime of the stream. Moreover, each example coming in a stream can only be processed once, or the examples need to be summarized with a small memory footprint, and the learning algorithms must be very efficient.
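The single-pass, small-memory constraint can be illustrated with a minimal sketch in plain Scala. It uses Welford's online algorithm to keep a running mean and variance; the names RunningStats and RunningStatsDemo are hypothetical and are not part of streamDM.

```scala
// Constant-memory summary of a numeric stream (Welford's online algorithm).
// Each value is seen exactly once and then discarded.
// Hypothetical illustration; not a streamDM class.
final case class RunningStats(count: Long = 0L, mean: Double = 0.0, m2: Double = 0.0) {
  // Fold one new observation into the summary.
  def update(x: Double): RunningStats = {
    val n = count + 1
    val delta = x - mean
    val newMean = mean + delta / n
    RunningStats(n, newMean, m2 + delta * (x - newMean))
  }

  // Sample variance; reported as 0.0 until at least two values have been seen.
  def variance: Double = if (count > 1) m2 / (count - 1) else 0.0
}

object RunningStatsDemo {
  // Consume the stream in a single pass, keeping only three numbers in memory.
  def summarize(stream: Iterator[Double]): RunningStats =
    stream.foldLeft(RunningStats())(_ update _)

  def main(args: Array[String]): Unit = {
    val stats = summarize(Iterator(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0))
    println(s"n=${stats.count} mean=${stats.mean} variance=${stats.variance}")
  }
}
```

For the eight values in main, the mean is 5 and the sample variance is 32/7, up to floating-point rounding. The summary never stores the examples themselves, which is the property stream learners depend on.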

Spark Streaming

Spark Streaming is an extension of the core Spark API that enables stream processing from a variety of sources. Spark is an extensible and programmable framework for massive distributed processing of datasets, called Resilient Distributed Datasets (RDDs). Spark Streaming receives input data streams and divides the data into batches, which are then processed by the Spark engine to generate the results.

Spark Streaming data is organized into a sequence of DStreams, represented internally as a sequence of RDDs.
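The micro-batch idea can be sketched without any Spark dependency. In the plain-Scala illustration below, each fixed-size batch plays the role of one RDD in a DStream, and the same computation runs on every batch. Spark Streaming batches by time interval, so the count-based split is an assumption made only to keep the sketch self-contained; MicroBatchSketch and its methods are hypothetical names, not Spark or streamDM APIs.

```scala
object MicroBatchSketch {
  // Split an incoming stream into fixed-size batches. Each batch stands in
  // for one RDD of a DStream. (Assumption: count-based batches instead of
  // Spark's time-based batch interval.)
  def toBatches[A](stream: Iterator[A], batchSize: Int): Iterator[Seq[A]] =
    stream.grouped(batchSize)

  // Apply the same word-count computation to every micro-batch, in arrival order.
  def wordCountsPerBatch(lines: Iterator[String], batchSize: Int): List[Map[String, Int]] =
    toBatches(lines, batchSize).map { batch =>
      batch
        .flatMap(_.split("\\s+"))
        .filter(_.nonEmpty)
        .groupBy(identity)
        .map { case (word, occurrences) => word -> occurrences.size }
    }.toList

  def main(args: Array[String]): Unit =
    wordCountsPerBatch(Iterator("a b", "a", "c c"), batchSize = 2).foreach(println)
}
```

Each batch is processed independently, so results are produced per batch rather than for the stream as a whole.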

Included Methods

In this current release of StreamDM v0.2, we have implemented:

We also implemented the following data generators:

  • HyperplaneGenerator
  • RandomTreeGenerator
  • RandomRBFGenerator
  • RandomRBFEventsGenerator

We have also implemented SampleDataWriter, which can call data generators to create sample data for simulation or testing.
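As a rough illustration of what a synthetic data generator does, the sketch below labels random points by the side of a fixed hyperplane on which they fall. It is a simplified stand-in under stated assumptions (uniform features in [0,1), a static hyperplane, no noise or drift) and is not streamDM's HyperplaneGenerator; the names Example and HyperplaneSketch are hypothetical.

```scala
import scala.util.Random

// One labelled example: a feature vector and a binary class label.
final case class Example(features: Vector[Double], label: Int)

// Hypothetical sketch of a hyperplane-style generator; not streamDM code.
final class HyperplaneSketch(weights: Vector[Double], seed: Long) {
  private val rng = new Random(seed)

  // With features uniform on [0,1), the weighted sum is symmetric around
  // half the sum of the weights, so this threshold balances the two classes.
  private val threshold = weights.sum / 2.0

  // Draw one point uniformly from [0,1)^d and label it by its side of the hyperplane.
  def next(): Example = {
    val x = Vector.fill(weights.length)(rng.nextDouble())
    val score = x.zip(weights).map { case (xi, wi) => xi * wi }.sum
    Example(x, if (score >= threshold) 1 else 0)
  }

  // An unbounded stream of examples; callers take as many as they need.
  def stream: Iterator[Example] = Iterator.continually(next())
}

object HyperplaneSketchDemo {
  def main(args: Array[String]): Unit =
    new HyperplaneSketch(Vector(0.5, 1.0, 2.0), seed = 42L).stream.take(5).foreach(println)
}
```

Fixing the seed makes the generated data reproducible, which is convenient when the output is written to a file for simulation or testing.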

In the next release of streamDM, we are going to add:

  • Classification: Random Forests
  • Multi-label: Hoeffding Tree ML, Random Forests ML
  • Frequent Itemset Miner: IncMine

For future work, we are considering:

  • Regression: Hoeffding Regression Tree, Bagging, Random Forests
  • Clustering: Clustree, DenStream
  • Frequent Itemset Miner: IncSecMine

Going Further

For a quick introduction to running StreamDM, refer to the Getting Started document. The StreamDM Programming Guide presents a detailed view of StreamDM. The full API documentation can be consulted here.

Environment

  • Spark 2.3.2
  • Scala 2.11
  • SBT 0.13
  • Java 8+
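A project that targets this environment might pin the versions in build.sbt roughly as follows. This is an illustrative fragment, not the build file shipped with streamDM: the project name, the Scala patch version, the choice of Spark modules, and the "provided" scope are all assumptions.

```scala
// build.sbt (illustrative sketch; not streamDM's own build file)

// Hypothetical project name.
name := "streamdm-example"

// Any Scala 2.11.x release should match; 2.11.12 is an assumed patch version.
scalaVersion := "2.11.12"

// Spark 2.3.2 artifacts; "provided" assumes the cluster supplies Spark at run time.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "2.3.2" % "provided",
  "org.apache.spark" %% "spark-streaming" % "2.3.2" % "provided"
)
```

The SBT version itself is normally set separately, through the sbt.version property in project/build.properties.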

Mailing lists

User support and questions mailing list:

[email protected]

Development related discussions:

[email protected]

