Python requests.get() with a URL that contains multiple "."


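I am writing a script that scrapes a website and downloads MP3 files. It collects all the links on an artist's page using BeautifulSoup with Python 3.6.3, but some of the scraped URLs contain more than one "." and when I call

requests.get(url, headers=<random header using fake_useragent>)

on such a URL, the file is not downloaded. How can I fix this?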

For example:

URL > abc.com/mp3/artist/songs/song.com.mp3 (not downloading) 
URL > abc.com/mp3/artist/songs/song.mp3 (downloading)

Code:

import os
import random
import time

import fake_useragent
import requests


def download_mp3(url_list_file, download_dir):
    ua = fake_useragent.UserAgent(verify_ssl=False)
    with open(url_list_file, 'r', encoding='utf-8') as urls:
        for url in urls:
            # Drop the trailing newline so it is not passed to requests.get()
            url = url.strip()
            if not url:
                continue
            print(url)
            parts = url.split('/')
            artist_dir = parts[4]
            song_name = parts[6]
            save_path = os.path.join(download_dir, artist_dir)
            # Create the artist folder if it does not exist yet
            if not os.path.isdir(save_path):
                os.mkdir(save_path)
            # Build a clean, lower-case file name
            file_name = song_name.lower().replace("www.", "").replace("[", "").replace("]", "")
            mp3_file_name = os.path.join(save_path, file_name)
            # Use a random User-Agent for every request
            header = {'User-Agent': str(ua.random)}
            temp_file = requests.get(url, headers=header)
            with open(mp3_file_name, "wb") as mp3_file:
                mp3_file.write(temp_file.content)
            # Wait a few seconds between downloads
            time.sleep(random.randint(5, 10))

Source

2017-10-13 Chamisxs

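`requests` doesn't care about URL structure at that level of detail. Can you post some actual, runnable code that demonstrates the problem? That would make it much easier for us to help. – larsks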
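To illustrate the point in this comment, here is a minimal sketch using only the standard library's `urllib.parse`. It assumes the two example URLs from the question with an `http://` scheme added (the question omits the scheme). An extra dot in the file name is just another character in the path; it does not change the host or how the URL is split.

```python
from urllib.parse import urlsplit

# The two example URLs from the question; the http:// scheme is an assumption.
urls = [
    "http://abc.com/mp3/artist/songs/song.com.mp3",
    "http://abc.com/mp3/artist/songs/song.mp3",
]

parsed = [urlsplit(u) for u in urls]
for p in parsed:
    # Same host for both; only the last path segment differs.
    print(p.netloc, p.path)
```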

Hi, I have updated the question – Chamisxs

When you say it is "not downloading", do you get an error? What error? –
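One way to answer that question is to inspect the response before writing it to disk. The sketch below is only an illustration: `check_download` is a hypothetical helper, not part of `requests`, and it assumes a response object exposing `status_code`, `headers` and `content` the way a `requests` response does. A stand-in object is used here so the example runs without network access.

```python
from types import SimpleNamespace


def check_download(response):
    """Return a list of likely problems in a requests-style response (hypothetical helper)."""
    problems = []
    if response.status_code != 200:
        problems.append(f"unexpected status {response.status_code}")
    content_type = response.headers.get("Content-Type", "")
    if content_type.startswith("text/"):
        # A text body usually means an error or redirect page, not an MP3.
        problems.append(f"text response ({content_type}) instead of audio")
    if not response.content:
        problems.append("empty body")
    return problems


# Stand-in for a failed download; with requests you would pass the real response.
fake = SimpleNamespace(
    status_code=404,
    headers={"Content-Type": "text/html"},
    content=b"<html>Not Found</html>",
)
print(check_download(fake))
```

With a real response, calling `response.raise_for_status()` is another option; it raises an exception for 4xx and 5xx status codes instead of silently saving the error page as an `.mp3` file.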

I got it fixed by replacing every "." in the URL except the last one (the one in ".mp3") with its percent-encoded form "%2E", using string.replace(".", "%2E", string.count(".") - 1).
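The replacement described above can be sketched as follows. This is a variant under stated assumptions, not the exact code from the answer: `encode_extra_dots` is a hypothetical helper, the `http://` scheme is added for illustration, and only the last path segment is touched so the dots in the host name stay intact. Note that "." is an unreserved character in RFC 3986, so an HTTP client may normalize "%2E" back to "." before sending the request; whether this changes what reaches the server depends on the client.

```python
from urllib.parse import urlsplit, urlunsplit


def encode_extra_dots(url):
    """Percent-encode every '.' in the file name except the last one (hypothetical helper)."""
    parts = urlsplit(url)
    head, sep, name = parts.path.rpartition("/")
    extra = name.count(".") - 1
    # Guard against a non-positive count: str.replace treats a negative count as "replace all".
    if extra > 0:
        name = name.replace(".", "%2E", extra)
    return urlunsplit(parts._replace(path=head + sep + name))


print(encode_extra_dots("http://abc.com/mp3/artist/songs/song.com.mp3"))
print(encode_extra_dots("http://abc.com/mp3/artist/songs/song.mp3"))
```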

Source

2017-10-13 15:11:51 Chamisxs
