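You can actually skip the CAPTCHA step entirely. Inspired by the author's approach: first log in through weibo.com's CAPTCHA-free entry point (in Weibo's account security settings, marking your location as a frequent login location exempts you from the CAPTCHA), then open weibo.cn directly in the same PhantomJS session. weibo.cn will already be logged in, and you can read the cookies at that point.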
I have implemented this myself; the code is below, for reference only:
    from selenium import webdriver  # requires Selenium 3.x; PhantomJS support was removed in Selenium 4


    def init_phantomjs_driver():
        headers = {
            # weibo.com cookie captured while not logged in
            'Cookie': 'YF-Ugrow-G0=b02489d329584fca03ad6347fc915997; SUB=_2AkMvgPj2dcPxrAFYnPgWyGvkZYpH-jycVZEAAn7uJhMyOhgv7nBSqSVOKynW2PbhU4768kfRGZgNPwXeRA..; SUBP=0033WrSXqPxfM72wWs9jqgMF55529P9D9WWEFXHsNpvgJdQjr1GM.e765JpVF020SKM7e0571hMc',
        }
        # Send the custom headers with every page PhantomJS loads
        for key, value in headers.items():
            webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = value

        # Present a desktop Chrome user agent
        useragent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.110 Safari/537.36'
        webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.settings.userAgent'] = useragent

        # Placeholder: replace with the local path to the phantomjs executable
        driver = webdriver.PhantomJS(executable_path='xxxxxxxphantomjs路径xxxxxxx')
        driver.set_window_size(1366, 768)
        return driver
    import logging
    import time


    # The original snippet omits the enclosing function; the name
    # get_weibo_cn_cookies is hypothetical. `account` is a dict with the
    # keys 'usn' (username) and 'pwd' (password).
    def get_weibo_cn_cookies(account):
        browser = weibo_auto_handle.init_phantomjs_driver()
        browser.get("http://weibo.com")
        time.sleep(3)

        # While the title is still that of the logged-out page, retry the login (at most 5 times)
        failure = 0
        while "微博-随时随地发现新鲜事" == browser.title and failure < 5:
            failure += 1
            username = browser.find_element_by_name("username")
            pwd = browser.find_element_by_name("password")
            login_submit = browser.find_element_by_class_name('W_btn_a')
            username.clear()
            username.send_keys(account['usn'])
            pwd.clear()
            pwd.send_keys(account['pwd'])
            login_submit.click()
            time.sleep(5)
            # if browser.find_element_by_class_name('verify').is_displayed():
            #     logging.error("Verify code is needed! (Account: %s)" % account)

        # Logged in on weibo.com: open weibo.cn, which shares the login state
        if "我的首页 微博-随时随地发现新鲜事" in browser.title:
            browser.get('http://weibo.cn/')

        # Collect the cookies only if weibo.cn also shows the logged-in home page
        cookie = dict()
        if "我的首页" in browser.title:
            for elem in browser.get_cookies():
                cookie[elem["name"]] = elem["value"]
            # p2 = persist_iics.Persist()
            # p2.save_account_cookies(accounts[0][0], cookie, datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
            logging.error('Account cookies updated! (Account_id: %s)' % account['usn'])
        return cookie
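As a small aside, the cookie-collection step above can be isolated and checked without a browser. The following is only a sketch: the helper names `cookies_to_dict` and `cookie_header` are hypothetical (they are not part of the snippet above), and the sample cookie values are made up. It assumes the input has the shape Selenium's `get_cookies()` returns, a list of dicts that each carry `name` and `value` keys.

```python
def cookies_to_dict(selenium_cookies):
    """Flatten a get_cookies()-style list into a {name: value} dict.

    Hypothetical helper; mirrors the loop in the snippet above.
    """
    return {c["name"]: c["value"] for c in selenium_cookies}


def cookie_header(cookie_dict):
    """Join a {name: value} dict into a single Cookie header string."""
    return "; ".join("{}={}".format(k, v) for k, v in cookie_dict.items())


# Made-up sample data in the shape Selenium returns (extra keys are ignored)
sample = [
    {"name": "SUB", "value": "abc", "domain": ".weibo.cn"},
    {"name": "SUBP", "value": "xyz", "domain": ".weibo.cn"},
]
cookies = cookies_to_dict(sample)
header = cookie_header(cookies)
```

The resulting string could then be sent as the `Cookie` header by whatever HTTP client does the crawling, so the PhantomJS session is needed only for the login itself.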
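Nice idea; this works for small jobs. But for large-scale crawling you need many logged-in accounts, and configuring each one by hand is not feasible. Weibo also rate-limits by IP, so fast crawling requires proxies, which this approach does not fit either.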