Google Image Scraper – 谷歌吧

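A scraper that downloads Google image search results for the keywords you configure. For example, with the keywords "cat" and "dog" configured, running the code saves the Google search images for "cat" and "dog" into separate directories. While the code runs, a Chrome browser window pops up; this is the scraper simulating browser access, and you can simply ignore it.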
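To make the "one directory per keyword" idea concrete, here is a minimal sketch of how such a layout could be built with the standard library. The helper name `keyword_dir` is hypothetical and not part of the project, and the exact folder structure the scraper produces is an assumption here.

```python
import os
import tempfile


def keyword_dir(image_root: str, search_key: str) -> str:
    """Return (and create) the folder that would hold images for one keyword.

    Hypothetical helper for illustration; not taken from the project.
    """
    path = os.path.normpath(os.path.join(image_root, search_key))
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path


# Demo in a throwaway directory so nothing is written next to the script.
with tempfile.TemporaryDirectory() as root:
    for key in ["cat", "dog"]:
        print(keyword_dir(root, key))
    created = sorted(os.listdir(root))  # one sub-folder per keyword
```

Each keyword ends up with its own sub-folder under the image root, which keeps the results of different searches from mixing.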

Open source: https://github.com/ohyicong/Google-Image-Scraper

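Installation and usage are simple. Once your Python environment is set up, clone the repository, install the dependencies, and run the script:

git clone https://github.com/ohyicong/Google-Image-Scraper
cd Google-Image-Scraper
pip install selenium requests pillow
python main.py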

The sample code, which is the content of main.py, is also simple:

#Import libraries (Don't change)
from GoogleImageScrapper import GoogleImageScraper
import os
from patch import webdriver_executable

#Define file path (Don't change)
webdriver_path = os.path.normpath(os.path.join(os.getcwd(), 'webdriver', webdriver_executable()))
image_path = os.path.normpath(os.path.join(os.getcwd(), 'photos'))

#Add new search key into array ["cat","t-shirt","apple","orange","pear","fish"]
search_keys = ["cat","t-shirt"]

#Parameters
number_of_images = 10
headless = True
min_resolution = (0,0)
max_resolution = (1920,1080)

#Main program
for search_key in search_keys:
    image_scrapper = GoogleImageScraper(webdriver_path,image_path,search_key,number_of_images,headless,min_resolution,max_resolution)
    image_urls = image_scrapper.find_image_urls()
    image_scrapper.save_images(image_urls)
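The `min_resolution` and `max_resolution` parameters suggest that images outside a size range are skipped. The following is only a sketch of what such a check might look like, not the project's actual implementation: the function name `within_resolution` is hypothetical, and treating both bounds as inclusive and comparing width and height separately are assumptions.

```python
from typing import Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def within_resolution(size: Resolution,
                      min_resolution: Resolution,
                      max_resolution: Resolution) -> bool:
    """Return True if size lies between the two bounds (inclusive, assumed)."""
    width, height = size
    min_w, min_h = min_resolution
    max_w, max_h = max_resolution
    return min_w <= width <= max_w and min_h <= height <= max_h


# Same bounds as in the sample main.py above.
min_resolution = (0, 0)
max_resolution = (1920, 1080)

# Made-up file names and sizes, purely for illustration.
candidates = {
    "small.jpg": (640, 480),
    "full_hd.jpg": (1920, 1080),
    "4k.jpg": (3840, 2160),
}
kept = [name for name, size in candidates.items()
        if within_resolution(size, min_resolution, max_resolution)]
print(kept)
```

With Pillow installed, the `(width, height)` tuple for a downloaded file can be read from `Image.open(path).size` and passed to a check like this one.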
