Kraken: Uber's open-source peer-to-peer (P2P) Docker registry


Project name:

Kraken

Project URL:

https://gitee.com/mirrors/Kraken

Project description:

Kraken is a P2P-powered Docker registry that focuses on scalability and availability. It is designed for Docker image management, replication, and distribution in a hybrid cloud environment. With pluggable backend support, Kraken can easily integrate into existing Docker registry setups as the distribution layer.

Kraken has been in production at Uber since early 2018. In our busiest cluster, Kraken distributes more than 1 million blobs per day, including 100k 1G+ blobs. At its peak production load, Kraken distributes 20K 100MB-1G blobs in under 30 sec.

Below is the visualization of a small Kraken cluster at work:


Features

Following are some highlights of Kraken:

  • Highly scalable. Kraken is capable of distributing Docker images at more than 50% of the maximum download speed limit on every host. Cluster size and image size do not have a significant impact on download speed.
    • Supports at least 15k hosts per cluster.
    • Supports arbitrarily large blobs/layers. We normally limit max size to 20G for the best performance.
  • Highly available. No component is a single point of failure.
  • Secure. Supports uploader authentication and data integrity protection through TLS.
  • Pluggable storage options. Instead of managing data, Kraken plugs into reliable blob storage options, like S3, GCS, ECR, HDFS or another registry. The storage interface is simple and new options are easy to add.
  • Lossless cross-cluster replication. Kraken supports rule-based async replication between clusters.
  • Minimal dependencies. Other than pluggable storage, Kraken only has an optional dependency on DNS.

Design

The high-level idea of Kraken is to have a small number of dedicated hosts seeding content to a network of agents running on each host in the cluster.

A central component, the tracker, will orchestrate all participants in the network to form a pseudo-random regular graph.

Such a graph has high connectivity and a small diameter. As a result, even with only one seeder and thousands of peers joining in the same second, all participants can reach a minimum of 80% of max upload/download speed in theory (60% with the current implementation), and performance doesn't degrade much as the blob size and cluster size increase. For more details, see the team's tech talk at KubeCon + CloudNativeCon.
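To make the "pseudo-random regular graph" idea concrete, here is a minimal sketch in Python. It is not Kraken's tracker logic: the pairing-model construction, the function names, and the parameters (n peers, d connections per peer, a fixed seed) are all assumptions chosen for illustration. It builds a graph in which every peer has exactly d neighbours and then measures hop counts from one node with a breadth-first search.

```python
import random
from collections import deque


def random_regular_graph(n, d, seed=0):
    """Return {node: set_of_neighbours} for a random d-regular graph.

    Pairing model: give every node d "stubs", shuffle all stubs, pair them
    up, and start over if a self-loop or duplicate edge appears.
    (Illustrative only; Kraken's tracker does not necessarily work this way.)
    """
    if (n * d) % 2 != 0 or d >= n:
        raise ValueError("need n * d even and d < n")
    rng = random.Random(seed)
    while True:
        stubs = [node for node in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        adj = {node: set() for node in range(n)}
        ok = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            if a == b or b in adj[a]:
                ok = False  # invalid pairing, retry with a fresh shuffle
                break
            adj[a].add(b)
            adj[b].add(a)
        if ok:
            return adj


def hop_counts(adj, source):
    """Breadth-first search: hops from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist
```

For example, with adj = random_regular_graph(200, 3), the value max(hop_counts(adj, 0).values()) is the number of hops from node 0 (think of it as the seeder) to the farthest reachable peer. Trying larger n gives a feel for how slowly that distance grows with cluster size.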

Architecture

  • Agent
    • Deployed on every host
    • Implements Docker registry interface
    • Announces available content to tracker
    • Connects to peers returned by the tracker to download content
  • Origin
    • Dedicated seeders
    • Stores blobs as files on disk backed by pluggable storage (e.g. S3, GCS, ECR)
    • Forms a self-healing hash ring to distribute the load
  • Tracker
    • Tracks which peers have what content (both in-progress and completed)
    • Provides ordered lists of peers to connect to for any given blob
  • Proxy
    • Implements Docker registry interface
    • Uploads each image layer to the responsible origin (remember, origins form a hash ring)
    • Uploads tags to build-index
  • Build-Index
    • Mapping of the human-readable tag to blob digest
    • No consistency guarantees: the client should use unique tags
    • Powers image replication between clusters (simple duplicated queues with retry)
    • Stores tags as files on disk backed by pluggable storage (e.g. S3, GCS, ECR)
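The origin bullets above mention a hash ring that decides which origin is responsible for a blob. The following is a generic consistent-hashing sketch, not Kraken's actual implementation: the class name HashRing, the virtual-node count, and the origin names are hypothetical. It shows the property that makes such a ring "self-healing": when one origin leaves, only the blobs it owned are reassigned.

```python
import bisect
import hashlib


def _position(key):
    # Stable 64-bit ring position derived from SHA-256.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node is placed on the ring at `vnodes` pseudo-random positions.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_position(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def owner(self, digest):
        """Return the node whose point follows the digest's ring position."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._points, (_position(digest), ""))
        return self._points[idx % len(self._points)][1]  # wrap around
```

A proxy-like caller would ask ring.owner(layer_digest) to pick the origin for an upload. After ring.remove("origin-2"), digests previously owned by the other origins still map to the same origin.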

Benchmark

The following data is from a test where a 3G Docker image with 2 layers is downloaded by 2600 hosts concurrently (5200 blob downloads), with a 300MB/s speed limit on all agents (using 5 trackers and 5 origins):

  • p50 = 10s (at speed limit)
  • p99 = 18s
  • p99.9 = 22s
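The "at speed limit" note on p50 follows from simple arithmetic, sketched below. The only assumption is that 3G is read as 3000 MB; the function names are made up for this example.

```python
def blob_downloads(hosts, layers):
    # Every host downloads every layer once.
    return hosts * layers


def min_download_seconds(image_mb, limit_mb_per_s):
    # Lower bound on download time for a host capped at the speed limit.
    return image_mb / limit_mb_per_s


print(blob_downloads(2600, 2))          # 5200
print(min_download_seconds(3000, 300))  # 10.0
```

A host limited to 300MB/s cannot fetch the image in less than 10s, so a p50 of 10s means at least half of the hosts were bounded by the configured limit rather than by the network of peers.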

Usage

All Kraken components can be deployed as Docker containers. To build the Docker images:

$ make images

For information about how to configure and use Kraken, please refer to the documentation.

Kraken on Kubernetes

You can use our example Helm chart to deploy Kraken (with an example HTTP fileserver backend) on your k8s cluster:

$ helm install --name=kraken-demo ./helm

Once deployed, every node will have a Docker registry API exposed on localhost:30081. For an example pod spec that pulls images from the Kraken agent, see example.

For more information on k8s setup, see README.
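As a small illustration of what "exposed on localhost:30081" means for image names, the sketch below builds an image reference that points at the node-local agent. The helper name and the repository and tag values are hypothetical; only the address localhost:30081 comes from the text above, and the registry/repository:tag layout is standard Docker reference syntax.

```python
def agent_image_ref(repository, tag, registry="localhost:30081"):
    """Build an image reference that resolves to the node-local Kraken agent."""
    return f"{registry}/{repository}:{tag}"


# Hypothetical repository and tag, for illustration only.
print(agent_image_ref("library/busybox", "latest"))
# localhost:30081/library/busybox:latest
```

A reference of this form is what would go in a pod spec's image field, or after docker pull, so that the pull is served by the agent on the same node.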

Devcluster

To start a herd container (which contains origin, tracker, build-index and proxy) and two agent containers with development configuration:

$ make devcluster

Docker-for-Mac is required to run the devcluster on your laptop. For more information on devcluster, please check out the devcluster README.

Comparison With Other Projects

Dragonfly from Alibaba

A Dragonfly cluster has one or a few "supernodes" that coordinate the transfer of every 4MB chunk of data in the cluster.

While the supernode would be able to make optimal decisions, the throughput of the whole cluster is limited by the processing power of one or a few hosts, and the performance would degrade linearly as either blob size or cluster size increases.

Kraken's tracker only helps orchestrate the connection graph and leaves the negotiation of actual data transfer to individual peers, so Kraken scales better with large blobs. On top of that, Kraken is HA and supports cross-cluster replication, both of which are required for a reliable hybrid cloud setup.

BitTorrent

Kraken was initially built with a BitTorrent driver. However, we ended up implementing our own P2P driver based on the BitTorrent protocol to allow for tighter integration with storage solutions and more control over performance optimizations.

Kraken's problem space is slightly different from what BitTorrent was designed for. Kraken's goal is to reduce global max download time and communication overhead in a stable environment, while BitTorrent was designed for an unpredictable and adversarial environment, so it needs to preserve more copies of scarce data and defend against malicious or badly behaving peers.

Despite the differences, we re-examine Kraken's protocol from time to time, and if it's feasible, we hope to make it compatible with BitTorrent again.

Limitations

  • If Docker registry throughput is not the bottleneck in your deployment workflow, switching to Kraken will not magically speed up your docker pull. To speed up docker pull, consider switching to Makisu to improve layer reusability at build time, or tweak compression ratios, as docker pull spends most of its time on data decompression.
  • Mutating tags (e.g. updating a latest tag) is allowed. However, a few things will not work: tag lookups immediately afterwards will still return the old value due to Nginx caching, and replication probably won't trigger. We are working on supporting this functionality better. If you need tag mutation support right now, please reduce the cache interval of the build-index component. If you also need replication in a multi-cluster setup, please consider setting up another Docker registry as Kraken's backend.
  • Theoretically, Kraken should distribute blobs of any size without significant performance degradation, but at Uber, we enforce a 20G limit and cannot endorse the production use of ultra-large blobs (i.e. 100G+). Peers enforce connection limits on a per-blob basis, and new peers might be starved for connections if no peers become seeders relatively soon. If you have ultra-large blobs you'd like to distribute, we recommend breaking them into <10G chunks first.
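The last recommendation, breaking ultra-large blobs into smaller chunks before distribution, could look roughly like the sketch below. This is a generic file-splitting helper and not part of Kraken: the function names, the .partNNNNN naming scheme, and the buffer size are assumptions for illustration. It streams data in small buffers so that a multi-gigabyte chunk is never held in memory.

```python
import os
import shutil


def split_file(path, chunk_bytes, out_dir, buf_bytes=1 << 20):
    """Split `path` into parts of at most `chunk_bytes`; return part paths in order."""
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    parts = []
    with open(path, "rb") as src:
        while True:
            part_path = os.path.join(out_dir, f"{base}.part{len(parts):05d}")
            written = 0
            with open(part_path, "wb") as dst:
                while written < chunk_bytes:
                    data = src.read(min(buf_bytes, chunk_bytes - written))
                    if not data:
                        break  # end of the source file
                    dst.write(data)
                    written += len(data)
            if written == 0:
                os.remove(part_path)  # nothing left to copy
                break
            parts.append(part_path)
    return parts


def join_files(parts, out_path, buf_bytes=1 << 20):
    """Concatenate the parts, in order, back into a single file."""
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, dst, buf_bytes)
```

With chunk_bytes set below 10 * 1024**3, each part stays under the suggested 10G bound and can be distributed as its own blob, then reassembled on the receiving side with join_files.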

Contributing

Please check out our guide.

Contact

To contact us, please join our Slack channel.

