Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

How do I limit the crawl rate of a Node crawler?

asked Jan 29, 2021 in Technique[技术] by 深蓝 (71.8m points)

request, cheerio

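How do I set the crawl speed of a Node crawler?
The target site is slow to open and its data takes a fairly long time to load, so some pages are skipped before they have finished loading.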

How can I limit the crawler's concurrency and crawl rate?


1 Answer

answered Jan 29, 2021 by 深蓝 (71.8m points)

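Use a queue.
For distributed crawling, use a dedicated message queue component.
For simple single-machine crawling, there are also open-source single-machine message queues available online.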

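With a message queue, controlling concurrency and the interval between requests is easy.
For example, if I consume 10 queue items at a time, concurrency is capped at 10.
If you want to crawl 10 pages per minute, that can also be done with a rate-limiting algorithm (there are many tutorials online).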
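The "consume 10 items at a time" idea can be sketched for the single-machine case without any library. This is a minimal in-memory sketch, not a replacement for a real message queue: `runWithConcurrency` and `fakeFetch` are hypothetical names introduced here, and `fakeFetch` only stands in for whatever request + cheerio code does the actual fetching.

```javascript
// Run async tasks with at most `concurrency` of them in flight at once.
// `tasks` is an array of functions that each return a Promise.
async function runWithConcurrency(tasks, concurrency) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task to take from the queue

  // Each worker keeps pulling the next task until the queue is empty.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Start up to `concurrency` workers; together they drain the queue.
  const workerCount = Math.min(concurrency, tasks.length);
  const workers = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results; // results are in the same order as `tasks`
}

// Hypothetical stand-in for a real page fetch (assumption, not a real API).
function fakeFetch(url) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`body of ${url}`), 20);
  });
}
```

Calling `runWithConcurrency(urls.map((u) => () => fakeFetch(u)), 10)` would keep at most 10 fetches running at a time, so a slow page only holds up its own worker. In this sketch a rejected task rejects the whole call, so real crawling code would probably catch errors inside each task.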

In short, once you structure it as a message queue, everything becomes simple.
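For the "10 pages per minute" case, one possible rate-limiting algorithm is a sliding window: remember when recent tasks started, and hold the next one back until fewer than `limit` of them fall inside the last `windowMs` milliseconds. The sketch below is only one of many such algorithms, and `createRateLimiter` and `schedule` are hypothetical names chosen for illustration.

```javascript
// Sliding-window rate limiter: allows at most `limit` task starts per `windowMs`.
function createRateLimiter(limit, windowMs) {
  const startTimes = []; // timestamps of recent task starts, oldest first
  let chain = Promise.resolve(); // serializes callers so they are admitted in order

  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  // Resolve once a slot is free in the current window.
  async function acquire() {
    for (;;) {
      const now = Date.now();
      // Drop timestamps that have fallen out of the window.
      while (startTimes.length > 0 && now - startTimes[0] >= windowMs) {
        startTimes.shift();
      }
      if (startTimes.length < limit) {
        startTimes.push(now);
        return;
      }
      // Window is full: wait until the oldest start expires, then re-check.
      await sleep(windowMs - (now - startTimes[0]));
    }
  }

  // Schedule `task` (a function returning a Promise) to start once a slot is free.
  return function schedule(task) {
    const admitted = chain.then(acquire);
    chain = admitted; // acquire never rejects, so the chain stays usable
    return admitted.then(task);
  };
}
```

With `const schedule = createRateLimiter(10, 60000)`, each `schedule(() => fetchPage(url))` call (where `fetchPage` is whatever fetch function the crawler already has) would start no more than 10 fetches in any 60-second window. This limits how often tasks start, not how many run at once, so it can be combined with a concurrency cap when the target site is slow.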
