Crawling a site that follows a fixed URL rule with single-threaded Python

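Site rule: the verification code has 6 characters. Positions 1, 3, and 5 are lowercase letters, and positions 2, 4, and 6 are uppercase letters. Positions 1-2 and 5-6 mirror each other with the case swapped: position 5 is the lowercase form of position 2, and position 6 is the uppercase form of position 1, for example aBcDbA. The crawl results are written to a log file.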
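To make the rule concrete, here is a minimal sketch of a checker for it. The name `matches_rule` and the regular expression are illustrative assumptions, not part of the original script; the check simply restates the rule above.

```python
import re

# Hypothetical helper: lowercase at positions 1, 3, 5 and uppercase at 2, 4, 6.
CODE_PATTERN = re.compile(r'[a-z][A-Z][a-z][A-Z][a-z][A-Z]')

def matches_rule(code):
    """Return True if `code` satisfies the 6-character rule described above."""
    if not CODE_PATTERN.fullmatch(code):
        return False
    # Position 5 is position 2 lowercased; position 6 is position 1 uppercased.
    return code[4] == code[1].lower() and code[5] == code[0].upper()

print(matches_rule('aBcDbA'))  # the example from the rule
print(matches_rule('aBcDeF'))  # right letter cases, but positions 5-6 do not mirror 1-2
```

Only four of the six characters are free, so there are 26^4 = 456,976 valid codes in total.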

This really comes down to generating the random verification codes. The script uses Python 3.6 and the requests module.

# vim 666.py 
import random
import requests
import time

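def verification_code():
    """Build one random 6-character code that follows the site rule."""
    lowa = chr(random.randint(97, 122))    # random lowercase letter a-z
    lowb = chr(random.randint(97, 122))
    capa = chr(random.randint(65, 90))     # random uppercase letter A-Z
    capb = chr(random.randint(65, 90))
    # Positions 5 and 6 mirror positions 2 and 1 with the case swapped
    vercode = lowa + capa + lowb + capb + capa.lower() + lowa.upper()
    return vercode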

with open('/root/luyouli.com.log', 'a') as luyoulidate:    # append results to the .log file
    for i in range(10000):    # loop 10,000 times
        req = requests.get('https://www.luyouli.com/' + verification_code(), timeout=10)
        print(req.text, req.url, file=luyoulidate, flush=True)    # write the response body and URL
        time.sleep(1)    # one request per second (to avoid an IP ban)
# python36 666.py

The code is fairly crude, and I have not found a particularly good approach yet. If I find a better one later, I will add it here.
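One possible direction, offered only as a sketch: random sampling can request the same code more than once, while the rule leaves just four free characters, so the whole space can be enumerated without repeats. The generator name `all_codes` is hypothetical, and this is not the author's method.

```python
from itertools import product
from string import ascii_lowercase, ascii_uppercase

def all_codes():
    """Yield every code allowed by the rule exactly once (hypothetical helper)."""
    for low_a, cap_a, low_b, cap_b in product(
            ascii_lowercase, ascii_uppercase, ascii_lowercase, ascii_uppercase):
        # Same construction as verification_code(), but exhaustive instead of random
        yield low_a + cap_a + low_b + cap_b + cap_a.lower() + low_a.upper()

codes = all_codes()
print(next(codes))  # first candidate in enumeration order
```

The request loop could then iterate over `all_codes()` instead of calling `verification_code()`. At one request per second, covering all 456,976 candidates would take a little over five days.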
