python
  • web-scraping
  • beautifulsoup
  • 2017-07-19

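    Python - BeautifulSoup: getting a string from player_data

    I am working on a simple project and have run into a problem. I want to extract a string from the player_data attribute of a div. Here is the div: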

    <div id="mediaplayer60597053" 
        player_data='{ 
         "id": "mediaplayer60597053", 
         "ads": { 
         "schedule": [{ 
          "enabled": true, 
          "counter": false, 
          "skip": true, 
          "click": true, 
          "key": "", 
          "tag": "https:\/\/www.cda.pl\/xml.php?type=g_embed&get=pool&ts=1500453286", 
          "repeat": 1, 
          "time": 0, 
          "type": "pool", 
          "displayAs": "prerol" 
         }] 
         }, 
         "video": { 
         "id": "60597053", 
         "file": "http:\/\/vrbx072.cda.pl\/dYXEHM8Nw3y_TZTmTs4e0g\/1500496486\/vl9afb2190473cc908d0c33cdb15bb212994083ca30c797154058bc8717c4ca746.mp4", 
         "manifest": null, 
         "duration": "6115", 
         "durationFull": "01:41:55", 
         "poster": "\/\/static.cda.pl\/v001\/img\/mobile\/poster16x9.png", 
         "type": "plain", 
         "width": 1920, 
         "height": 816, 
         "content_rating": null, 
         "quality": "vl", 
         "ts": 1500453286, 
         "hash": "26be0bc36e8575c32ff32f4329a301889d1f6f7a" 
         }, 
         "nextVideo": null, 
         "autoplay": false, 
         "seekTo": 0, 
         "premium": false, 
         "api": { 
         "client": "json_client", 
         "ts": "1500453286_60686", 
         "key": "9a3859a86e909430bd379badfa68d0d712603626", 
         "method": "" 
         }, 
         "user": { 
         "role": "guest" 
         } 
        }' 
        tabindex="1"> 
    </div> 
    

    This is the string I want:

    "http:\/\/vrbx072.cda.pl\/dYXEHM8Nw3y_TZTmTs4e0g\/1500496486\/vl9afb2190473cc908d0c33cdb15bb212994083ca30c797154058bc8717c4ca746.mp4"
    

    Thanks for your help.

    Answers

    1

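    It looks like you need to get the div first and then extract the JSON object from it. You can use soup.find to locate the div, then json.loads to convert the JSON string in its player_data attribute into a Python dictionary.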

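    import json

    # `soup` is the BeautifulSoup object already built from the page's HTML
    div = soup.find('div', {'id': 'mediaplayer60597053'})
    # The attribute value is a JSON string; parse it into a Python dict
    data = json.loads(div['player_data'])

    print(data['video']['file'])
    # http://vrbx072.cda.pl/dYXEHM8Nw3y_TZTmTs4e0g/1500496486/vl9afb2190473cc908d0c33cdb15bb212994083ca30c797154058bc8717c4ca746.mp4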
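
    For anyone who wants to try this without fetching the page, here is a minimal self-contained sketch of the same approach. The HTML is a trimmed copy of the div from the question, and the helper name extract_video_file is hypothetical. As an assumption beyond the answer, it matches any div that carries a player_data attribute instead of hard-coding the id, since the id appears to contain the video number.

```python
import json

from bs4 import BeautifulSoup

# Trimmed-down stand-in for the div in the question (most keys omitted).
# A raw string keeps the JSON "\/" escapes exactly as they appear in the page.
HTML = r"""
<div id="mediaplayer60597053"
     player_data='{
       "id": "mediaplayer60597053",
       "video": {
         "id": "60597053",
         "file": "http:\/\/vrbx072.cda.pl\/dYXEHM8Nw3y_TZTmTs4e0g\/1500496486\/vl9afb2190473cc908d0c33cdb15bb212994083ca30c797154058bc8717c4ca746.mp4",
         "duration": "6115"
       }
     }'
     tabindex="1">
</div>
"""


def extract_video_file(html):
    """Return video.file from the first div with a player_data attribute, or None.

    Hypothetical helper for illustration; not part of the original answer.
    """
    soup = BeautifulSoup(html, "html.parser")
    # attrs={"player_data": True} matches any div that has the attribute,
    # whatever its value, so the id does not need to be known in advance.
    div = soup.find("div", attrs={"player_data": True})
    if div is None:
        return None
    # json.loads turns the escaped "\/" sequences into plain "/".
    data = json.loads(div["player_data"])
    # "video" may be missing or null; fall back to an empty dict in both cases.
    return (data.get("video") or {}).get("file")


url = extract_video_file(HTML)
print(url)
```

    Returning None when the div or the key is missing lets the caller decide how to handle a page that has no player on it, rather than failing with a TypeError or KeyError.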
    
    +0

    Thanks for the answer, but it gave me this: 'uggcf://ieok056.pqn.cy/0r_FFJVYyyttw9jq-BHXmD/1500497686/uq9nso2190473pp908q0p33pqo15oo212994083pn30p797154058op8717p4pn746nqp.zc4' – jestembotem

    +0

    @jestembotem I have edited the answer. Check it now. –

    +0

    This code is correct. I made a mistake. Thanks – jestembotem
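
    A side note on the string quoted in the first comment: its letters look ROT13-shifted ('uggcf' maps to 'https'). The thread does not say where that encoding came from, so the following is only a sketch of how such a value could be checked with the standard library, not an explanation of what went wrong for the commenter.

```python
import codecs

# The string from the comment above, copied without its trailing space.
garbled = (
    "uggcf://ieok056.pqn.cy/0r_FFJVYyyttw9jq-BHXmD/1500497686/"
    "uq9nso2190473pp908q0p33pqo15oo212994083pn30p797154058op8717p4pn746nqp.zc4"
)

# rot_13 is a text-to-text codec: it shifts ASCII letters by 13 places
# and leaves digits and punctuation untouched.
decoded = codecs.decode(garbled, "rot_13")
print(decoded)
```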
