Find broken symlinks with Python

20

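A common Python saying is that it's easier to ask forgiveness than permission. While I'm not a fan of this statement in real life, it does apply in a lot of cases. Usually you want to avoid code that chains two system calls on the same file, because you never know what will happen to the file between the two calls.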

A typical mistake is to write something like:

if os.path.exists(path): 
    os.unlink(path)

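The second call (os.unlink) may fail if something else deleted the file after your if test, raising an exception and stopping the rest of your function from executing. (You might think this doesn't happen in real life, but we fished another bug like that out of our codebase just last week - and it was the kind of bug that had left a few programmers scratching their heads and claiming 'Heisenbug' for the past few months.)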
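For comparison, here is a minimal sketch of the "ask forgiveness" version of that snippet. It assumes Python 3, where FileNotFoundError is the OSError subclass raised for ENOENT; the helper name remove_if_present is made up for illustration.

```python
import os


def remove_if_present(path):
    """Delete path, tolerating the case where it is already gone.

    Hypothetical helper: returns True if this call removed the file,
    False if there was nothing left to remove.
    """
    try:
        os.unlink(path)
    except FileNotFoundError:
        # Someone else removed it first, or it never existed.
        return False
    return True
```

There is only one system call on the file, so there is no window between a check and the delete for another process to slip into.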

So, in your particular case, I would probably do:

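import errno
import os

try:
    # stat() follows symlinks, so it fails on a dangling link too
    os.stat(path)
except OSError as e:
    if e.errno == errno.ENOENT:
        print('path %s does not exist or is a broken symlink' % path)
    else:
        # anything else (permissions, I/O error, ...) is not ours to handle
        raise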

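The annoyance here is that stat returns the same error code for a symlink that simply isn't there and for a broken symlink.

So, I guess you have no choice but to break the atomicity and do something like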

if not os.path.exists(os.readlink(path)):
    print('path %s is a broken symlink' % path)


2008-08-25 21:32:20

+1

readlink may also set errno == ENOTDIR if the symlink misuses a file as a directory. – 2010-07-20 23:41:37

+3

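If the link "path" was given a relative path to its target, os.readlink(path) may not give you the actual path. For example, if path links to '../target' and you run the script from a directory other than the one containing the link, os.path.exists(os.readlink(path)) will return False, because the parent of your script's working directory has no file or folder called 'target'. A safe way to avoid this is to use os.path.exists(os.path.realpath(path)). – AplusG 2014-05-09 05:38:22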
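To illustrate the relative-target problem described in the comment above, here is a rough sketch that resolves the readlink result against the directory holding the link instead of the current working directory. The helper name link_target_exists is invented for this example, and it assumes path really is a symlink on a POSIX-style filesystem.

```python
import os


def link_target_exists(path):
    """Return True if the symlink at path points at something that exists.

    Hypothetical helper; assumes path itself is a symlink.
    """
    target = os.readlink(path)
    # A relative target is relative to the directory holding the link,
    # not to the current working directory. os.path.join discards its
    # first argument when target is absolute, so both cases are covered.
    resolved = os.path.join(os.path.dirname(path), target)
    return os.path.exists(resolved)
```

This is still a check followed by a separate use, so it is no more atomic than the readlink version in the answer.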

3

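May I mention testing for hard links without Python? /bin/test has the FILE1 -ef FILE2 condition, which is true when the two files share an inode.

Therefore, something like find . -type f -exec test \{} -ef /path/to/file \; -print works for finding hard links to a specific file.

Which brings me to reading man test and its mentions of -L and -h, which both work on a single file and return true if that file is a symbolic link. However, that doesn't tell you whether the target is missing.

I did find that head -0 FILE1 returns an exit code of 0 if the file can be opened and 1 if it cannot, which in the case of a symbolic link to a regular file works as a test of whether its target can be read.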
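For anyone who does want the hard-link search in Python, a rough equivalent of the find/test one-liner might look like the sketch below. os.path.samefile compares device and inode numbers, which is the same idea as -ef; the function name find_hard_links is just something I picked for the example.

```python
import os


def find_hard_links(root, reference):
    """Yield regular files under root that share an inode with reference.

    Hypothetical helper mirroring `find . -type f -exec test {} -ef ...`.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            candidate = os.path.join(dirpath, name)
            # Like -type f: skip symlinks and anything that is not a
            # regular file.
            if os.path.islink(candidate) or not os.path.isfile(candidate):
                continue
            # samefile() compares device and inode numbers.
            if os.path.samefile(candidate, reference):
                yield candidate
```

As with the find version, the reference file itself shows up in the results if it lives under root.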


2008-08-21 19:13:46 dlamblin

1

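I'm not a Python guy, but it looks like os.readlink()? The logic I would use in Perl is to use readlink() to find the target and then use stat() to test whether the target exists.

Edit: I banged out some Perl that demos readlink. I believe Perl's stat and readlink and Python's os.stat() and os.readlink() are both wrappers around the system calls, so this should translate reasonably well as proof-of-concept code: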

wembley 0 /home/jj33/swap > cat p 
my $f = shift; 

while (my $l = readlink($f)) { 
    print "$f -> $l\n"; 
    $f = $l; 
} 

if (!-e $f) { 
    print "$f doesn't exist\n"; 
} 
wembley 0 /home/jj33/swap > ls -l | grep ^l 
lrwxrwxrwx 1 jj33 users   17 Aug 21 14:30 link -> non-existant-file 
lrwxrwxrwx 1 root  users   31 Oct 10 2007 mm -> ../systems/mm/20071009-rewrite// 
lrwxrwxrwx 1 jj33 users   2 Aug 21 14:34 mmm -> mm/ 
wembley 0 /home/jj33/swap > perl p mm 
mm -> ../systems/mm/20071009-rewrite/ 
wembley 0 /home/jj33/swap > perl p mmm 
mmm -> mm 
mm -> ../systems/mm/20071009-rewrite/ 
wembley 0 /home/jj33/swap > perl p link 
link -> non-existant-file 
non-existant-file doesn't exist 
wembley 0 /home/jj33/swap >
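A possible Python translation of the Perl loop above, offered as a sketch rather than a tested port. Two things are my own additions and not in the Perl: relative targets are resolved against the directory of the link, and max_hops (an arbitrary number) guards against symlink loops. The name follow_links is also made up.

```python
import os


def follow_links(path, max_hops=40):
    """Print each hop of a symlink chain and return the final path.

    Hypothetical helper. max_hops is an arbitrary guard against loops.
    """
    for _ in range(max_hops):
        if not os.path.islink(path):
            break
        target = os.readlink(path)
        print('%s -> %s' % (path, target))
        # Resolve relative targets against the directory of the link.
        path = os.path.join(os.path.dirname(path), target)
    if not os.path.exists(path):
        print("%s doesn't exist" % path)
    return path
```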


2008-08-21 19:14:01 jj33

11

os.lstat() may be helpful. If lstat() succeeds and stat() fails, then it's probably a broken link.
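Spelled out, the lstat/stat idea could look something like this. It is only a sketch: the function name is_broken_symlink is invented, and it treats any OSError from stat() as "broken", which also covers things like permission problems on the target's directory.

```python
import os


def is_broken_symlink(path):
    """Return True if lstat() succeeds on path but stat() fails."""
    try:
        os.lstat(path)   # does not follow symlinks
    except OSError:
        return False     # nothing at path at all
    try:
        os.stat(path)    # follows symlinks
    except OSError:
        return True      # the link is there, its target is not reachable
    return False
```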


2008-08-21 19:15:33

2

os.path

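You may try using realpath() to get what the symlink points to, then try to determine whether it's a valid file using isfile().

(I'm not able to try that out at the moment, so you'll have to play around with it and see what you get)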
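Here is a small sketch of what that might look like in practice. The helper name resolve_and_check is hypothetical; swap os.path.isfile for os.path.exists if a link to a directory should also count as valid.

```python
import os


def resolve_and_check(path):
    """Return (real_path, is_file) for path.

    Hypothetical helper; real_path has every symlink resolved.
    """
    real_path = os.path.realpath(path)
    return real_path, os.path.isfile(real_path)
```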


2008-08-21 19:19:24

8

This is not atomic, but it works.

os.path.islink(filename) and not os.path.exists(filename)

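Indeed, by RTFM (reading the fantastic manual) we see

os.path.exists(path)

Return True if path refers to an existing path. Returns False for broken symbolic links.

It also says:

On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.

So if you are worried about permissions, you should add other clauses.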
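One way such an extra clause might look, as a sketch only: call os.stat() directly and tell the exception types apart instead of relying on the True/False from os.path.exists(). The function name symlink_status and the returned strings are my own invention; any other OSError (for example a symlink loop) is deliberately left to propagate.

```python
import os


def symlink_status(path):
    """Return 'not a symlink', 'ok', 'broken' or 'unknown' for path."""
    if not os.path.islink(path):
        return 'not a symlink'
    try:
        os.stat(path)  # follows the link
    except (FileNotFoundError, NotADirectoryError):
        # ENOENT or ENOTDIR: the target really is not there.
        return 'broken'
    except PermissionError:
        # The target may exist; we are just not allowed to look.
        return 'unknown'
    return 'ok'
```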


2015-06-28 16:49:13 am70

0

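I had a similar problem: how do you catch broken symlinks, even when they occur in some parent directory? I also wanted to log all of them (in an application dealing with a fairly large number of files), but without too many repeats.

Here is what I came up with, including unit tests.

fileutil.py: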

import os 
from functools import lru_cache 
import logging 

logger = logging.getLogger(__name__) 

@lru_cache(maxsize=2000) 
def check_broken_link(filename): 
    """ 
    Check for broken symlinks, either at the file level, or in the 
    hierarchy of parent dirs. 
    If it finds a broken link, an ERROR message is logged. 
    The function is cached, so that the same error messages are not repeated. 

    Args: 
     filename: file to check 

    Returns: 
     True if the file (or one of its parents) is a broken symlink. 
     False otherwise (i.e. either it exists or not, but no element 
     on its path is a broken link). 

    """ 
    if os.path.isfile(filename) or os.path.isdir(filename):
        return False
    if os.path.islink(filename):
        # there is a symlink, but it is dead (pointing nowhere)
        link = os.readlink(filename)
        logger.error('broken symlink: {} -> {}'.format(filename, link))
        return True
    # ok, we have either:
    # 1. a filename that simply doesn't exist (but the containing dir
    #    does exist), or
    # 2. a broken link in some parent dir
    parent = os.path.dirname(filename)
    if parent == filename:
        # reached root
        return False
    return check_broken_link(parent)

Unit tests:

import logging 
import shutil 
import tempfile 
import os 

import unittest 
from ..util import fileutil 


class TestFile(unittest.TestCase): 

    def _mkdir(self, path, create=True): 
     d = os.path.join(self.test_dir, path) 
     if create: 
      os.makedirs(d, exist_ok=True) 
     return d 

    def _mkfile(self, path, create=True): 
     f = os.path.join(self.test_dir, path) 
     if create: 
      d = os.path.dirname(f) 
      os.makedirs(d, exist_ok=True) 
      with open(f, mode='w') as fp: 
       fp.write('hello') 
     return f 

    def _mklink(self, target, path): 
     f = os.path.join(self.test_dir, path) 
     d = os.path.dirname(f) 
     os.makedirs(d, exist_ok=True) 
     os.symlink(target, f) 
     return f 

    def setUp(self): 
     # reset the lru_cache of check_broken_link 
     fileutil.check_broken_link.cache_clear() 

     # create a temporary directory for our tests 
     self.test_dir = tempfile.mkdtemp() 

     # create a small tree of dirs, files, and symlinks 
     self._mkfile('a/b/c/foo.txt') 
     self._mklink('b', 'a/x') 
     self._mklink('b/c/foo.txt', 'a/f') 
     self._mklink('../..', 'a/b/c/y') 
     self._mklink('not_exist.txt', 'a/b/c/bad_link.txt') 
     bad_path = self._mkfile('a/XXX/c/foo.txt', create=False) 
     self._mklink(bad_path, 'a/b/c/bad_path.txt') 
     self._mklink('not_a_dir', 'a/bad_dir') 

    def tearDown(self): 
     # Remove the directory after the test 
     shutil.rmtree(self.test_dir) 

    def catch_check_broken_link(self, expected_errors, expected_result, path): 
     filename = self._mkfile(path, create=False) 
     with self.assertLogs(level='ERROR') as cm: 
      result = fileutil.check_broken_link(filename) 
      logging.critical('nothing') # trick: emit one extra message, so the with assertLogs block doesn't fail 
     error_logs = [r for r in cm.records if r.levelname == 'ERROR'] 
     actual_errors = len(error_logs) 
     self.assertEqual(expected_result, result, msg=path) 
     self.assertEqual(expected_errors, actual_errors, msg=path) 

    def test_check_broken_link_exists(self): 
     self.catch_check_broken_link(0, False, 'a/b/c/foo.txt') 
     self.catch_check_broken_link(0, False, 'a/x/c/foo.txt') 
     self.catch_check_broken_link(0, False, 'a/f') 
     self.catch_check_broken_link(0, False, 'a/b/c/y/b/c/y/b/c/foo.txt') 

    def test_check_broken_link_notfound(self): 
     self.catch_check_broken_link(0, False, 'a/b/c/not_found.txt') 

    def test_check_broken_link_badlink(self): 
     self.catch_check_broken_link(1, True, 'a/b/c/bad_link.txt') 
     self.catch_check_broken_link(0, True, 'a/b/c/bad_link.txt') 

    def test_check_broken_link_badpath(self): 
     self.catch_check_broken_link(1, True, 'a/b/c/bad_path.txt') 
     self.catch_check_broken_link(0, True, 'a/b/c/bad_path.txt') 

    def test_check_broken_link_badparent(self): 
     self.catch_check_broken_link(1, True, 'a/bad_dir/c/foo.txt') 
     self.catch_check_broken_link(0, True, 'a/bad_dir/c/foo.txt') 
     # bad link, but shouldn't log a new error: 
     self.catch_check_broken_link(0, True, 'a/bad_dir/c') 
     # bad link, but shouldn't log a new error: 
     self.catch_check_broken_link(0, True, 'a/bad_dir') 

if __name__ == '__main__': 
    unittest.main()


2016-10-27 01:49:11
