Recent questions tagged scrapy

0 votes

1.0k views

1 answer

scrapy - Get variables from spider in pipelines.py

I need to store intermediate data, so in the spider's parse method I create a variable that stores it. ... .com/questions/65884897/get-variables-from-spider-in-pipelines-py...

asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.1k views

1 answer

scrapy FilesPipeline change file_path based on item property

I'd like to override the file_path method in the FilesPipeline based on an item property. I use scrapy ... /65617459/scrapy-filespipeline-change-file-path-based-in-item-property...

asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.0k views

1 answer

scrapy - Web scraper add extra field but scraper will not scrape field

I have a web scraper that was coded for me using scrapy. I wish to add an extra field from the website the ... wants me to add more text, so please ignore this) asked by Davey Boy, translated from so...

asked Mar 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

977 views

1 answer

scrapy - Trying to loop until reach last page in web scraping using scrapy_splash

I am trying to keep looping until the last page, where the next button is no longer present on the web page. CODE: import scrapy ... .start_requests) but couldn't retrieve the data. Assistance required....

asked Feb 19, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.2k views

1 answer

scrapy - I would like to get the "2 001&nbsp"

I want to scrape the info on this line and get the 2 001&nbsp. 2 001 € This is the image I put this line in my ... extract() The result of what I did is here: The result Thank you...

asked Feb 19, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.0k views

1 answer

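scrapy shell returns error 521

I wanted to debug with scrapy shell url, but after entering IPython no page content was fetched and it reported error 521. How can I fix this? Thanks!...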

asked Feb 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.3k views

1 answer

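scrapy shell url returns error 521

I wanted to use scrapy shell to debug the tag content extracted by response.xpath, but found that response has no content and response.status shows 521. Later I opened the page directly in the browser and it showed an internal server error, so it seems my IP was blocked. Later I tried several other urls, response.status ... change the cookie, but I don't know where to find the cookie or which file's settings to change. Sorry for rambling, and thanks to all the experts...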

asked Feb 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.1k views

1 answer

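scrapy url deduplication

Does scrapy deduplicate urls automatically? For example, in the code below, why are the duplicate urls in start_urls crawled repeatedly at run time? class TestSpider(scrapy.Spider): name = "test" ... sel.xpath('div[@class="list"]/a/@href')[0].extract() yield item...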

asked Feb 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.2k views

1 answer

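scrapy: iterating over a url list, the request loop only runs once

1. Problem description: I build a list of urls as the entry points for the spider's requests (formed from 'http://www.bjev520.com/jsp/beiqi/pcmap/do/pcMap.jsp?cityName=省市名'). After requesting an entry address, each object also contains a nested sub-collection with urls (' ... my approach with scrapy's methods is wrong, so the program only iterated over one object and the loop did not continue. Please help me with this, thank you!...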

asked Feb 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.0k views

1 answer

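scrapy: handling paginated article content

Suppose an article has 2-3 pages, and I want to crawl these content pages, join them into a single page, and then store it in the database. Article urls look like: article_1.html, article_2.html. The item has: item['title'], item['content'], where item['content'] is the content joined into one page. Roughly how should this be written?...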

asked Feb 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

1.1k views

1 answer

scrapy Re format urls

I got my hands on an Instagram spider and it's working like a charm for posts. I want to change the url ... /@content').extract_first() item['videoURL'] = video_url yield item...

asked Feb 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

2.2k views

1 answer

scrapy Pipeline TypeError: can only concatenate str (not "dict") to str

I have a scrapy spider that I want to connect with a pipeline. My scrapy items are: def parse(self, response): x = response. ... (finished) Can anyone help me see what I am doing wrong?...

asked Jan 27, 2021 in Technique[技术] by 深蓝 (71.8m points)

0 votes

4.4k views

1 answer

scrapy - Scrapyd bug in combination with git tags

I use git for version control and tag my releases of scrapy crawlers. Since adding git tags (v1.3.1 or v1_3_3), ... running. Does somebody have an idea on how to fix this?...

asked Jan 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
