2016-06-07 174 views
1

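Python 2.7, requests, login to the onlydomains.com website

For a few days I have been trying to log in to www.onlydomains.com so that a script can retrieve my list of domains. This is what I have so far: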

#!/usr/bin/env python 
# -*- coding: utf-8 -*- 

import requests, sys, re, whois 
from bs4 import BeautifulSoup 

def onlydomains(): 
    with requests.Session() as c: 
        PASSWORD = 'my%password' 
        USERNAME = 'my_username' 
        URL = 'https://www.onlydomains.com/account/login' 
        c.get(URL) 
        soup = BeautifulSoup(c.get(URL).text, "lxml") 

        csrf = soup.find("input", value=True)["value"] 

        login_data = { 
            'csrfToken' : csrf, 
            'username' : USERNAME, 
            'password' : PASSWORD, 
            'submit' : 'Submit',} 

        r = c.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'}) 
        r = c.get('https://onlydomains.secure-admin.com/domain/index') 
        print r.text 

onlydomains() 

It does not work for me, because I always get the login page back:

> ./onlydomains.py 

    <!DOCTYPE html><html lang="en"><head><meta charset="utf-8" /><title>Login/Sign Up - OnlyDomains</title> 

Any idea what I am doing wrong?

Answers

1

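If you look at what comes back from the POST, you can see that it contains a window.location = some_url assignment. The site redirects with JavaScript rather than with an HTTP redirect, and requests does not execute scripts, so the session never reaches the dashboard on its own: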

<script type="text/javascript"> 
       $(document).ready(function(){ 

        setTimeout(function(){ 

          window.location = 'https://onlydomains.secure-admin.com/dashboard/index?_srs_=v42oadi4cAuxIM4PHc5IdgU%5CdXd3AjswsOraTLjynso%3D';; 


        },1000); 
       }); 
      </script> 

You can extract that URL and request the page yourself:

import re 
import requests 
from bs4 import BeautifulSoup 

# Capture the URL assigned to window.location in the login response. 
patt = re.compile(r"window\.location\s+=\s+'(http.*?)'") 

with requests.Session() as s: 
    PASSWORD = 'pass' 
    USERNAME = 'user' 
    URL = 'https://www.onlydomains.com/account/login' 
    # Fetch the login page and read the CSRF token from the form. 
    soup = BeautifulSoup(s.get(URL).text, "lxml") 
    csrf = soup.select_one("input[name=csrfToken]")["value"] 

    login_data = { 
        'csrfToken' : csrf, 
        'username' : USERNAME, 
        'password' : PASSWORD} 

    r = s.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'}) 

    # Follow the JavaScript redirect by hand, reusing the same session. 
    url = patt.search(r.text).group(1) 
    r = s.get(url).text 
    print(r) 
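
The extraction step can be checked on its own, without touching the network. The following is a minimal sketch, assuming the login response embeds the target in a single-quoted `window.location = '...'` assignment like the script shown earlier. The helper name `extract_js_redirect`, the sample HTML, and the shortened `_srs_` value are made up for illustration. Unlike a bare `patt.search(...).group(1)`, it returns `None` when the pattern is missing, which is the case when the login fails and the login page comes back again.

```python
import re

# Matches: window.location = 'https://...' (single-quoted URL only).
REDIRECT_RE = re.compile(r"window\.location\s*=\s*'(https?://[^']+)'")


def extract_js_redirect(html):
    """Return the URL assigned to window.location in html, or None if absent."""
    match = REDIRECT_RE.search(html)
    return match.group(1) if match else None


# Hypothetical response body modelled on the script in the answer.
sample = """
<script type="text/javascript">
  setTimeout(function(){
    window.location = 'https://onlydomains.secure-admin.com/dashboard/index?_srs_=abc%3D';;
  },1000);
</script>
"""

print(extract_js_redirect(sample))
print(extract_js_redirect("<title>Login/Sign Up - OnlyDomains</title>"))
```

Checking for `None` before calling `s.get(url)` gives a clear "login failed" branch instead of an `AttributeError`.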

If we run the code and print the data-original-title attribute from the main content, you can see that we are on the dashboard page:

In [5]: with requests.Session() as s: 
    ...:   PASSWORD = 'xxxxxx' 
    ...:   USERNAME = "xxxxxxxxxx" 
    ...:   URL = 'https://www.onlydomains.com/account/login' 
    ...:   soup = BeautifulSoup(s.get(URL).text, "lxml") 
    ...:   csrf = soup.select_one("input[name=csrfToken]")["value"] 
    ...:   login_data = { 
    ...:   'csrfToken' : csrf, 
    ...:   'username' : USERNAME, 
    ...:   'password' : PASSWORD} 
    ...:   r = s.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'}) 
    ...:   url = patt.search(r.text).group(1) 
    ...:   r = s.get(url).text 
    ...:   soup = BeautifulSoup(r,"lxml") 
    ...:   print(soup.select_one("h1.PageTitle.visible-xs i.fa.fa-info-circle")["data-original-title"]) 
    ...:  

Welcome to your Dashboard! Here you have a general overview of what's happening and how to manage your domain assets. 
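
One difference from the question's code is worth spelling out: `soup.find("input", value=True)` returns the first input that has any value, whereas `input[name=csrfToken]` selects the token field by name. The sketch below illustrates the by-name lookup using the standard library's `html.parser` in place of BeautifulSoup, so that it runs on Python 3 without third-party packages. The class `InputValueFinder`, the helper `find_input_value`, and the sample form are hypothetical and not taken from the site.

```python
from html.parser import HTMLParser


class InputValueFinder(HTMLParser):
    """Remember the value of the first <input> whose name attribute matches."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.value = None

    def handle_starttag(self, tag, attrs):
        # Skip non-input tags and stop looking once a value has been found.
        if tag != "input" or self.value is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.name:
            self.value = attributes.get("value")


def find_input_value(html, name):
    """Return the value of the named input field, or None if not found."""
    finder = InputValueFinder(name)
    finder.feed(html)
    return finder.value


# Hypothetical login form: the token is not the first input with a value.
sample_form = """
<form method="post">
  <input type="hidden" name="redirect" value="/account">
  <input type="hidden" name="csrfToken" value="token-123">
  <input type="text" name="username">
</form>
"""

print(find_input_value(sample_form, "csrfToken"))
```

In this sample a "first input with a value" lookup would pick up the `redirect` field, so selecting by name is the safer choice whenever the form has more than one pre-filled input.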
+0

Sorry, but I am not familiar with 'patt.search(r.text).group(1)'. I get: 'url = patt.search(r.text).group(1) NameError: global name 'patt' is not defined' –

+0

Sorry, I forgot to add the regex. I will edit it in –

+0

Thanks for the edit! I'm in. :) –

-1

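I think the best way to solve this would be with Selenium (I remember doing something like what you are attempting with BeautifulSoup, but I cannot recall how right now):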

from selenium import webdriver 

chromedriver = 'C:\\chromedriver.exe' 
browser = webdriver.Chrome(chromedriver) 
browser.get('http://www.example.com') 

username = browser.find_element_by_name('username') 
username.send_keys('user1') 

password = browser.find_element_by_name('password') 
password.send_keys('secret') 

form = browser.find_element_by_id('loginForm') 
form.submit() 

This will let you load the next page, which should contain the information you want :)

+0

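Selenium works; I tried it successfully yesterday. But I don't want to open Firefox/Chrome every time. This will be a server script, and I work on Linux. ;) –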