2017-06-05 615 views
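os.scandir gives [WinError 3] The system cannot find the path specified

I have a (large) set of XML files, and I want to find the files in which a given set of strings are all present. I am trying to do this with the following Python code: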

import collections

thestrings = []
with open('Strings.txt') as f:
    for line in f:
        text = line.strip()
        thestrings.append(text)

print('Searching for:')
print(thestrings)
print('Results:')

try:
    from os import scandir
except ImportError:
    from scandir import scandir

def scantree(path):
    """Recursively yield DirEntry objects for given directory."""
    for entry in scandir(path):
        if entry.is_dir(follow_symlinks=False) and (not entry.name.startswith('.')):
            yield from scantree(entry.path)
        else:
            yield entry

if __name__ == '__main__':
    for entry in scantree('//path/to/folder'):
        if ('.xml' in entry.name) and ('.zip' not in entry.name):
            with open(entry.path) as f:
                data = f.readline()
                if (thestrings[0] in data):
                    print('')
                    print('****** Schema found in: ', entry.name)
                    print('')
                    data = f.read()
                    if (thestrings[1] in data) and (thestrings[2] in data) and (thestrings[3] in data):
                        print('Hit at:', entry.path)

    print("Done!")

where Strings.txt is a file containing the strings I am interested in finding, and its first line is the schema URI.

This seems to run fine at first, but after a few seconds it gives me:

FileNotFoundError: [WinError 3] The system cannot find the path specified: //some/path 

This confuses me, since the path is built at run time?

Note that if I instrument the code, changing this:

with open(entry.path) as f: 
    data = f.readline() 
    if (thestrings[0] in data): 

to this:

with open(entry.path) as f: 
    print(entry.name) 
    data = f.readline() 
    if (thestrings[0] in data): 

then I see that some candidate files are found before the error occurs.

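Could the double slash at the start of the path, '//some/path', be interpreted as a remote SMB path, like '\\server\shared'? In that case you would be failing to reach a server named 'some' or a share named 'path'. – rodrigo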
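To illustrate the point in this comment: under Windows path rules a leading double slash starts a UNC name, so the first two components are read as a server and a share. The following is only a small sketch using the standard `ntpath` module (the implementation behind `os.path` on Windows); the `dir/file.xml` tail is made up for the illustration.

```python
import ntpath  # Windows flavour of os.path; importable on any platform

# '//some/path' is the placeholder from the error message; the tail
# 'dir/file.xml' is a hypothetical addition for this illustration.
drive, rest = ntpath.splitdrive('//some/path/dir/file.xml')

print(drive)  # the UNC "drive": server 'some', share 'path'
print(rest)   # everything below the share
```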

Side note: your 'scantree' function is reinventing 'os.walk'.

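@rodrigo, as shown it is a UNC path, but the likely error in that case would be 'ERROR_BAD_NET_NAME' (67). Getting 'ERROR_PATH_NOT_FOUND' (3) means either that a local device name doesn't exist (e.g. listing an unmapped drive letter) or that a path component wasn't found. The exception is the final component: if that is the one not found, the error is 'ERROR_FILE_NOT_FOUND' (2). – eryksun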
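One way to tell these cases apart from Python is to look at the `winerror` attribute of the exception, which is only set on Windows. This is a rough sketch, not something from the thread: `describe_os_error` and `WINERROR_NAMES` are hypothetical names, and the table holds only the three codes mentioned in the comment above.

```python
import os
import tempfile

# Windows system error codes named in the comment above.
WINERROR_NAMES = {
    2: 'ERROR_FILE_NOT_FOUND',
    3: 'ERROR_PATH_NOT_FOUND',
    67: 'ERROR_BAD_NET_NAME',
}

def describe_os_error(exc):
    """Return a short label for an OSError (hypothetical helper)."""
    # The winerror attribute exists only on Windows; fall back to errno elsewhere.
    code = getattr(exc, 'winerror', None)
    if code is None:
        return 'errno {}'.format(exc.errno)
    return 'WinError {} ({})'.format(code, WINERROR_NAMES.get(code, 'unknown'))

# Demonstration: scanning a directory that does not exist raises an OSError.
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, 'no-such-dir')
    try:
        list(os.scandir(missing))
    except OSError as exc:
        print(describe_os_error(exc), exc.filename)
```

Printing `exc.filename` alongside the code shows which path the failing call was given, which is often the quickest clue.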

Answer

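I realised that my script was finding some very long UNC path names, too long for Windows it seems, so I now also check the path length before trying to open the file, as follows: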

if name.endswith('.xml'):
    fullpath = os.path.join(root, name)
    if (len(fullpath) > 255): ##Too long for Windows!
        print('File-extension-based candidate: ', fullpath)
    else:
        if os.path.isfile(fullpath):
            with open(fullpath) as f:
                data = f.readline()
                if (thestrings[0] in data):
                    print('Schema-based candidate: ', fullpath)

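Note that I also decided to check that the file really is a file, and I changed my code to use os.walk, as suggested above. I also simplified the check for the .xml file extension by using .endswith().

Everything now seems to work OK...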
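For context, the fragment above sits inside an os.walk loop that the answer does not show. Below is a minimal sketch of how the whole search might look, under some assumptions: the function name `find_candidates`, its parameters, and the three returned lists are hypothetical, and the 255-character limit is simply the value used in the snippet. Paths are collected in lists instead of printed so that the result is easier to inspect.

```python
import os

MAX_LEN = 255  # the limit used in the answer's snippet

def find_candidates(top, schema_uri, max_len=MAX_LEN):
    """Walk `top` and sort .xml files into three lists (hypothetical helper)."""
    too_long, matches, errors = [], [], []
    for root, dirs, files in os.walk(top):
        # Skip hidden directories, as the original scantree() did.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for name in files:
            if not name.endswith('.xml'):
                continue
            fullpath = os.path.join(root, name)
            if len(fullpath) > max_len:
                # Not opened at all; reported by file extension only.
                too_long.append(fullpath)
                continue
            try:
                with open(fullpath, errors='replace') as f:
                    first_line = f.readline()
            except OSError as exc:
                # Record the failure and keep going.
                errors.append((fullpath, exc))
                continue
            if schema_uri in first_line:
                matches.append(fullpath)
    return too_long, matches, errors
```

It would be called as `too_long, matches, errors = find_candidates('//path/to/folder', thestrings[0])`. One design point worth knowing: by default os.walk ignores errors raised while listing a directory (its `onerror` argument is None), so an unreadable or over-long directory is skipped silently instead of ending the run.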
