Example: scraping Photoshop video tutorials. URL: http://www.mxiaobei.com/?id=424
import re
import time

import requests
from bs4 import BeautifulSoup

dicts = {}      # video title -> mp4 URL
list1 = set()   # page ids where no usable video was found
print('start')
ua = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36'
urls = 'http://www.mxiaobei.com/?id='

# Step 1: visit each page and collect the title and the mp4 address.
for index in range(451, 565):
    r = requests.get(urls + str(index), headers={'User-Agent': ua})
    r.encoding = 'utf-8'
    soup = BeautifulSoup(r.text, 'lxml')
    title = soup.find(name='h2')
    mp4url = soup.find('div', id='cuplayer')
    if title is None or mp4url is None:
        list1.add(index)
        continue
    mpurl = re.search('http.*?mp4', mp4url.text)
    if mpurl is None:
        # The player block exists but contains no mp4 link.
        list1.add(index)
        continue
    dicts[title.text] = mpurl.group()
    # time.sleep(1)
    # print(title.text + ' : ' + dicts[title.text])
print(dicts)
print(list1)

# Step 2: download each video in 1 MB chunks so it is never held fully in memory.
for temp in dicts.items():
    # time.sleep(1)
    r = requests.get(temp[1], stream=True)
    with open(temp[0] + '.mp4', 'wb') as mp4:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            if chunk:
                mp4.write(chunk)
    print(temp[0] + ' download complete')
print('end!')
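The script relies on two small steps that can be tried out without touching the network: pulling the mp4 address out of the player block's text with a non-greedy regular expression, and using the page title as a file name. The sketch below isolates both. The names `extract_mp4_url` and `safe_filename` are hypothetical helpers and are not part of the original script. The file name cleanup is an added assumption, since a title could contain characters such as `/` or `:` that some file systems reject.

```python
import re

# Same non-greedy pattern as the script: the shortest run from "http" to "mp4".
# Caveat: a host name that itself contains "mp4" would cut the match short.
MP4_PATTERN = re.compile(r'http.*?mp4')


def extract_mp4_url(text):
    """Return the first mp4 URL found in text, or None if there is none."""
    match = MP4_PATTERN.search(text)
    return match.group() if match else None


def safe_filename(title):
    """Replace characters that are not allowed in file names on common systems."""
    return re.sub(r'[\\/:*?"<>|]', '_', title).strip()
```

With helpers like these, the download loop could open `safe_filename(title) + '.mp4'` instead of the raw title, and pages for which `extract_mp4_url` returns `None` could simply be added to the skipped set.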