How to parse a hierarchy based on indentation with Python

I have an accounting tree that is stored with indents/spaces in the source:

Income 
    Revenue 
     IAP 
     Ads 
    Other-Income 
Expenses 
    Developers 
     In-house 
     Contractors 
    Advertising 
    Other Expenses

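There is a fixed number of levels, so I would like to flatten the hierarchy into 3 fields (the actual data has 6 levels; this example is simplified). Here is what I have so far: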

# ws is the worksheet and w is the CSV writer, both defined elsewhere
for rownum in range(6, ws.max_row + 1):
    accountName = str(ws.cell(row=rownum, column=1).value)
    # number of leading spaces in the account name
    indent = len(accountName) - len(accountName.lstrip(' '))
    if indent == 0:
        l1 = accountName
        l2 = ''
        l3 = ''
    elif indent == 3:
        l2 = accountName
        l3 = ''
    else:
        l3 = accountName

    w.writerow([l1, l2, l3])

The output I am after:

L1        L2            L3
Income
Income    Revenue
Income    Revenue       IAP
Income    Revenue       Ads
Income    Other-Income
Expenses  Developers    In-house
... etc

I can do this by checking the number of spaces before the account name, as the loop above does.

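Is there a more flexible way to achieve this, based on the indentation of the current row compared with the previous row, rather than assuming each level is always 3 spaces? L1 will always have no indent, and we can trust that lower levels are indented further than their parent, but perhaps not always by 3 spaces per level.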

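Update: I ended up with this as the meat of the logic. Since I ultimately wanted the list of accounts along with the content, the easiest way seemed to be to use the indent to decide whether to reset, append to, or pop the list: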

# requires: from operator import itemgetter
if indent == 0:
    accountList = []
    accountList.append((indent, accountName))
elif indent > prev_indent:
    accountList.append((indent, accountName))
elif indent <= prev_indent:
    # pop every entry that is indented at least as far as the current row
    max_indent = int(max(accountList, key=itemgetter(0))[0])
    while max_indent >= indent:
        accountList.pop()
        max_indent = int(max(accountList, key=itemgetter(0))[0])
    accountList.append((indent, accountName))

So at every row of output, accountList is complete.
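For reference, the reset/append/pop idea can be packed into a small function that also pads each row to a fixed number of columns. This is only a sketch: `flatten_accounts`, `levels` and `sample` are names made up for illustration, the rows are assumed to be plain strings indented with spaces, and the CSV is written to an in-memory buffer rather than a real file. Because the indents kept in the list always increase, the last entry is the deepest one, so no `max()` scan is needed.

```python
import csv
import io


def flatten_accounts(names, levels=3):
    """Return one row per account, padded with '' up to `levels` columns."""
    path = []  # (indent, name) pairs from the top level down to the current row
    rows = []
    for raw in names:
        name = raw.strip()
        if not name:
            continue  # ignore blank cells
        indent = len(raw) - len(raw.lstrip(" "))
        # Indents in `path` strictly increase, so the last entry is the deepest.
        # Drop every entry that is not a strict ancestor of the current row.
        while path and path[-1][0] >= indent:
            path.pop()
        path.append((indent, name))
        cells = [n for _, n in path]
        rows.append(cells + [""] * (levels - len(cells)))
    return rows


# Hypothetical input with deliberately uneven indentation.
sample = [
    "Income",
    "  Revenue",
    "      IAP",
    "      Ads",
    "  Other-Income",
    "Expenses",
    "   Developers",
    "       In-house",
]

out = io.StringIO()
csv.writer(out).writerows(flatten_accounts(sample))
print(out.getvalue())
```

A row nested deeper than `levels` is not truncated; it simply comes out with more columns than the rest.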


2017-08-30 Hart CO

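You can mimic the way Python actually parses indentation. First, create a stack that will hold the indentation levels. On each line:

If the indentation is greater than the top of the stack, push it and increase the depth level.
If it is the same, continue at the same level.
If it is lower, pop the top of the stack while it is higher than the new indentation. If you reach a lower indentation level before finding exactly the same one, there is an indentation error.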

indentation = [0]   # stack of the indentation widths currently open
depth = 0

with open("test.txt", 'r') as f:
    for line in f:
        line = line.rstrip()   # drop the newline and any trailing spaces
        if not line:
            continue           # skip blank lines

        content = line.lstrip()
        indent = len(line) - len(content)
        if indent > indentation[-1]:
            depth += 1
            indentation.append(indent)

        elif indent < indentation[-1]:
            while indent < indentation[-1]:
                depth -= 1
                indentation.pop()

            if indent != indentation[-1]:
                raise RuntimeError("Bad formatting")

        print(f"{content} (depth: {depth})")

With a file "test.txt" whose content is the one you provided:

Income 
    Revenue 
     IAP 
     Ads 
    Other-Income 
Expenses 
    Developers 
     In-house 
     Contractors 
    Advertising 
    Other Expenses

Here is the output:

Income (depth: 0) 
Revenue (depth: 1) 
IAP (depth: 2) 
Ads (depth: 2) 
Other-Income (depth: 1) 
Expenses (depth: 0) 
Developers (depth: 1) 
In-house (depth: 2) 
Contractors (depth: 2) 
Advertising (depth: 1) 
Other Expenses (depth: 1)
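The same stack idea can be wrapped in a generator, which also makes the "indentation error" case easy to try out. This is a sketch under a few assumptions: `iter_depths` is a made-up name, the input is any iterable of strings indented with spaces, and the depth is taken from the size of the stack instead of a separate counter.

```python
def iter_depths(lines):
    """Yield (depth, content) for each non-blank line."""
    indentation = [0]  # stack of the indentation widths currently open
    for raw in lines:
        line = raw.rstrip()
        if not line:
            continue  # skip blank lines
        content = line.lstrip(" ")
        indent = len(line) - len(content)
        if indent > indentation[-1]:
            indentation.append(indent)
        else:
            # Close levels until the top of the stack is no deeper than this line.
            while indent < indentation[-1]:
                indentation.pop()
            if indent != indentation[-1]:
                raise ValueError(f"inconsistent dedent at {content!r}")
        yield len(indentation) - 1, content


good = ["Income", "    Revenue", "        IAP", "    Other-Income"]
print(list(iter_depths(good)))

# Two spaces match neither open level (0 or 4), so this input is rejected.
bad = ["Income", "    Revenue", "  Other-Income"]
try:
    list(iter_depths(bad))
except ValueError as exc:
    print("rejected:", exc)
```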

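So, what can you do with this? Suppose you want to build nested lists. First, create a data stack.

When you find an indentation, append a new list at the end of the data stack.
When you find an unindentation, pop the top list and append it to the new top.

And, in any case, for each line, append the content to the list at the top of the data stack.

Here is the corresponding implementation: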

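indentation = [0]
depth = 0
data = [[]]   # data stack; the finished nested list ends up in data[0]

with open("test.txt", 'r') as f:
    for line in f:
        line = line.rstrip()   # drop the newline and any trailing spaces
        if not line:
            continue           # skip blank lines

        content = line.lstrip()
        indent = len(line) - len(content)
        if indent > indentation[-1]:
            depth += 1
            indentation.append(indent)
            data.append([])

        elif indent < indentation[-1]:
            while indent < indentation[-1]:
                depth -= 1
                indentation.pop()
                top = data.pop()
                data[-1].append(top)

            if indent != indentation[-1]:
                raise RuntimeError("Bad formatting")

        data[-1].append(content)

# close the levels that are still open at the end of the file
while len(data) > 1:
    top = data.pop()
    data[-1].append(top)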

Your nested list is at the top of your data stack. The output for the same file is:

['Income',
    ['Revenue',
        ['IAP',
         'Ads'
        ],
     'Other-Income'
    ],
 'Expenses',
    ['Developers',
        ['In-house',
         'Contractors'
        ],
     'Advertising',
     'Other Expenses'
    ]
]

This is rather easy to manipulate, although quite deeply nested. You can access the data by chaining item accesses:

>>> l = data[0] 
>>> l 
['Income', ['Revenue', ['IAP', 'Ads'], 'Other-Income'], 'Expenses', ['Developers', ['In-house', 'Contractors'], 'Advertising', 'Other Expenses']]
>>> l[1] 
['Revenue', ['IAP', 'Ads'], 'Other-Income'] 
>>> l[1][1] 
['IAP', 'Ads'] 
>>> l[1][1][0] 
'IAP'
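If what you need in the end is one full path per account rather than the nested list itself, a short recursive walk can produce it. This is a sketch, not part of the code above: `iter_paths` and `tree` are made-up names, and it assumes the layout shown here, where a sub-list always holds the children of the string just before it.

```python
def iter_paths(nested, prefix=()):
    """Yield a tuple of names (top level first) for every entry in `nested`."""
    parent = None
    for item in nested:
        if isinstance(item, list):
            # A sub-list holds the children of the string just before it.
            yield from iter_paths(item, prefix + (parent,))
        else:
            parent = item
            yield prefix + (item,)


tree = ['Income', ['Revenue', ['IAP', 'Ads'], 'Other-Income'],
        'Expenses', ['Developers', ['In-house', 'Contractors'], 'Advertising']]

for path in iter_paths(tree):
    print("\t".join(path))
```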


2017-08-30 15:54:39

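Thanks for this. I ultimately wanted to output the hierarchy alongside the content on every row, so I modified it slightly, but this got me heading in the right direction.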

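If the indentation is a fixed number of spaces (here 3 spaces), you can simplify the calculation of the indentation level.

Note: I use StringIO to simulate a file.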

import io
import itertools

content = u"""\
Income
   Revenue
      IAP
      Ads
   Other-Income
Expenses
   Developers
      In-house
      Contractors
   Advertising
   Other Expenses
"""

stack = []
for line in io.StringIO(content):
    text = line.rstrip()  # drop \n
    row = text.split("   ")  # one empty field per 3-space indentation level
    stack[:] = stack[:len(row) - 1] + [row[-1]]
    print("\t".join(stack))

You get:

Income 
Income Revenue 
Income Revenue IAP 
Income Revenue Ads 
Income Other-Income 
Expenses 
Expenses Developers 
Expenses Developers In-house 
Expenses Developers Contractors 
Expenses Advertising 
Expenses Other Expenses

EDIT: indentation is not fixed

If the indentation is not fixed (you do not always have 3 spaces), as in the example below:

content = u"""\
Income
   Revenue
    IAP
    Ads
   Other-Income
Expenses
   Developers
      In-house
      Contractors
   Advertising
   Other Expenses
"""

You need to estimate the shift at each new line:

stack = []
last_indent = u""
for line in io.StringIO(content):
    indent = "".join(itertools.takewhile(lambda c: c == " ", line))
    # Compare with the previous line only: +1 deeper, 0 same level, -1 shallower.
    # Note: this assumes a dedent never closes more than one level at a time.
    shift = 0 if indent == last_indent else (-1 if len(indent) < len(last_indent) else 1)
    index = len(stack) + shift
    stack[:] = stack[:index - 1] + [line.strip()]
    last_indent = indent
    print("\t".join(stack))
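One caveat: comparing only with the previous line moves the stack by at most one level, so a line that closes two levels at once (a third-level account followed directly by a top-level one) ends up one level too deep. A possible variant, sketched here with made-up names (`iter_account_paths`, `sample`), remembers the indentation width of every open level and pops until the current line fits:

```python
import io


def iter_account_paths(text):
    """Yield the tab-joined path of every non-blank line in `text`."""
    names = []   # account names on the current path
    widths = []  # indentation width of each name in `names`
    for line in io.StringIO(text):
        name = line.strip()
        if not name:
            continue  # skip blank lines
        width = len(line) - len(line.lstrip(" "))
        # Close every open level that is at least as deep as this line.
        while widths and widths[-1] >= width:
            widths.pop()
            names.pop()
        widths.append(width)
        names.append(name)
        yield "\t".join(names)


# "Expenses" follows a third-level line, so two levels close at once.
sample = """\
Income
   Revenue
    IAP
Expenses
   Developers
"""

for path in iter_account_paths(sample):
    print(path)
```

Unlike the stricter stack check in the other answer, this version never raises on an odd dedent; it attaches the line to the nearest less-indented ancestor.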


2017-08-30 16:17:12
