
An open source and collaborative framework for extracting the data you need from websites.

In a fast, simple, yet extensible way.

Maintained by Zyte and many other contributors.

# Create a spider and save it to myspider.py
cat > myspider.py <<EOF
import scrapy

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://www.zyte.com/blog/']

    def parse(self, response):
        for title in response.css('.oxy-post-title'):
            yield {'title': title.css('::text').get()}

        for next_page in response.css('a.next'):
            yield response.follow(next_page, self.parse)
EOF

# Run the spider
scrapy runspider myspider.py

# Schedule the spider for execution on Zyte Scrapy Cloud
shub schedule blogspider
Spider blogspider scheduled, watch it running here:
https://app.zyte.com/p/26731/job/1/8

# Retrieve the scraped data
shub items 26731/1/8
{"title": "Improved Frontera: Web Crawling at Scale with Python 3 Support"}
{"title": "How to Crawl the Web Politely with Scrapy"}

Deploy your spiders to Zyte Scrapy Cloud, or use Scrapyd to host them on your own server.
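Scrapyd exposes a plain HTTP JSON API, so scheduling a run from code is a single POST to its `schedule.json` endpoint. The sketch below only builds the request (it does not contact a server); `myproject` is a hypothetical project name, and `localhost:6800` is Scrapyd's default address.

```python
from urllib.parse import urlencode

# Scrapyd's schedule.json endpoint starts a spider run.
# 'myproject' is a hypothetical project name for illustration.
endpoint = "http://localhost:6800/schedule.json"
payload = urlencode({"project": "myproject", "spider": "blogspider"})

# Equivalent curl invocation against a running Scrapyd instance:
print(f"curl {endpoint} -d '{payload}'")
```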