Selecting specific columns in a text file by column name and extracting their contents

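I am a Python beginner and I am finding it hard to come up with a proper solution to this problem. I have looked through all the similar posts and could not find one.
I have a ".ext" file. I need to skip the first two lines. The third line holds the column names of the table.
I need to find the columns named omega(n,n) and sigma(n,n), where n can be any number (for example sigma(1,1), omega(2,2)). For those columns I then need to check the values in the rows starting at '-1000000000'. If a value is < 0.001, output "true".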

My code is:

import numpy as np 
array=[] 
array1=[] 
b = np.genfromtxt(r'C:/nm73/proj/one.ext', delimiter=' ', names=True,dtype=None)[3:,:] 
for n in range(len(b)-1): 
    array=b['Sigma(n,n)'] 
    array1=b['omega(n,n)']

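I don't know how to check the elements.

The one.ext file looks like this. I apologize if the file is not formatted correctly; I am new to Stack Overflow. Any help is highly appreciated.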

TABLE NO.  1: First Order Conditional Estimation with Interaction: Goal  Function=MINIMUM VALUE OF OBJECTIVE FUNCTION: Problem=1 Subproblem=0 Superproblem1=0  Iteration1=0 Superproblem2=0 Iteration2=0 
ITERATION THETA1  THETA2  SIGMA(1,1) SIGMA(2,1) SIGMA(2,2) OMEGA(1,1) OMEGA(2,1) OMEGA(2,2) OBJ 
      0 2.50000E-01 1.00000E+01 1.00000E-01 0.00000E+00 1.00000E-01  1.00000E-01 0.00000E+00 1.00000E-01 9436.65314342255 
      5 2.34948E-01 3.67675E+00 9.04159E-02 0.00000E+00 2.74933E+00 1.98686E-01 0.00000E+00 1.75724E-01 8745.97204613658 
      10 2.11090E-01 4.30565E+00 1.34312E-01 0.00000E+00 1.12619E+00 1.32484E-01 0.00000E+00 1.36824E-02 8595.43106384756 
      15 2.10696E-01 4.35495E+00 1.23897E-01 0.00000E+00 1.29124E+00 1.28600E-01 0.00000E+00 1.24441E-02 8591.51400321872 
      20 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770 
    -1000000000 2.11129E-01 4.36325E+00 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50022332770 
    -1000000001 8.07565E-03 6.97861E-02 5.28558E-03 1.00000E+10 4.20370E-01 1.78706E-02 1.00000E+10 3.15324E-03 0.000000000000000E+000 
    -1000000004 0.00000E+00 0.00000E+00 3.52538E-01 0.00000E+00 1.13460E+00 3.58908E-01 0.00000E+00 1.11450E-01 0.000000000000000E+000 
    -1000000005 0.00000E+00 0.00000E+00 7.49648E-03 1.00000E+10 1.85250E-01 2.48957E-02 1.00000E+10 1.41465E-02 0.000000000000000E+000

2014-09-02 user3923643

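I think you may need to pass 'skip_header=1' in your call to genfromtxt. That skips the first line, and you can then drop the array slicing as well. It should give you the matrix you expect, where the 'SIGMA(1,1)' value of the first row is 'b[0]['SIGMA(1,1)']' and that of the second row is 'b[1]['SIGMA(1,1)']'. I can't test this at the moment, so I'm not 100% sure. – 2014-09-02 21:31:54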

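If you do not specify delimiter, any run of consecutive whitespace is treated as a single separator. If you specify delimiter=' ', then every single space is literally a separator. That leads to a ValueError, because genfromtxt ends up expecting the wrong number of columns.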
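As a rough analogy only (this uses plain str.split, not genfromtxt itself), the difference between splitting on a literal space and splitting on any whitespace can be seen on a line shaped like the data rows above. The sample line is a shortened, made-up row with leading spaces and one double space.

```python
# Hypothetical data line: leading spaces and a double space, as in one.ext.
line = "      0 2.50000E-01 1.00000E+01 1.00000E-01  1.00000E-01"

# Splitting on a literal space yields an empty field for every extra space.
by_space = line.split(" ")

# Splitting on whitespace collapses runs of spaces into one separator.
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```

The first list contains empty strings wherever two spaces are adjacent, so the field count varies from line to line, while the second list has exactly one entry per value.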

So if you instead use:

In [396]: b = np.genfromtxt(filename, names=True, dtype=None, skip_header=1)

then you end up with a structured array like this:

In [397]: b 
Out[397]: 
array([(0, 0.25, 10.0, 0.1, 0.0, 0.1, 0.1, 0.0, 0.1, 9436.65314342255), 
     (5, 0.234948, 3.67675, 0.0904159, 0.0, 2.74933, 0.198686, 0.0, 0.175724, 8745.97204613658), 
     (10, 0.21109, 4.30565, 0.134312, 0.0, 1.12619, 0.132484, 0.0, 0.0136824, 8595.43106384756), 
     (15, 0.210696, 4.35495, 0.123897, 0.0, 1.29124, 0.1286, 0.0, 0.0124441, 8591.51400321872), 
     (20, 0.211129, 4.36325, 0.124283, 0.0, 1.28733, 0.128815, 0.0, 0.0124211, 8591.5002233277), 
     (-1000000000, 0.211129, 4.36325, 0.124283, 0.0, 1.28733, 0.128815, 0.0, 0.0124211, 8591.5002233277), 
     (-1000000001, 0.00807565, 0.0697861, 0.00528558, 10000000000.0, 0.42037, 0.0178706, 10000000000.0, 0.00315324, 0.0), 
     (-1000000004, 0.0, 0.0, 0.352538, 0.0, 1.1346, 0.358908, 0.0, 0.11145, 0.0), 
     (-1000000005, 0.0, 0.0, 0.00749648, 10000000000.0, 0.18525, 0.0248957, 10000000000.0, 0.0141465, 0.0)], 
     dtype=[('ITERATION', '<i4'), ('THETA1', '<f8'), ('THETA2', '<f8'), ('SIGMA11', '<f8'), ('SIGMA21', '<f8'), ('SIGMA22', '<f8'), ('OMEGA11', '<f8'), ('OMEGA21', '<f8'), ('OMEGA22', '<f8'), ('OBJ', '<f8')])

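Notice the dtype at the bottom. The column names do not contain parentheses or commas, so instead of SIGMA(1,1) you have SIGMA11. You can access this column like so: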

In [398]: b['SIGMA11'] 
Out[398]: 
array([ 0.1  , 0.0904159 , 0.134312 , 0.123897 , 0.124283 , 
     0.124283 , 0.00528558, 0.352538 , 0.00749648])
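To get from here to the check asked about in the question, one possible sketch follows. It assumes the check applies to the row whose ITERATION value is -1000000000, and that "diagonal" columns are those named SIGMA(n,n) or OMEGA(n,n) with both indices equal. Because genfromtxt strips the parentheses and commas, the original header line is read separately and columns are matched by position. The function name diagonal_flags and the shortened SAMPLE text are made up for illustration.

```python
import io
import re

import numpy as np

# Shortened, hypothetical stand-in for one.ext (one title line, then the header).
SAMPLE = """\
TABLE NO.  1: example title line
ITERATION THETA1 SIGMA(1,1) SIGMA(2,1) SIGMA(2,2) OMEGA(1,1) OMEGA(2,1) OMEGA(2,2) OBJ
0 2.50000E-01 1.00000E-01 0.00000E+00 1.00000E-01 1.00000E-01 0.00000E+00 1.00000E-01 9436.65
-1000000000 2.11129E-01 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50
-1000000001 8.07565E-03 5.28558E-03 1.00000E+10 4.20370E-01 1.78706E-02 1.00000E+10 3.15324E-03 0.0
"""

# Matches SIGMA(n,n) / OMEGA(n,n) where both indices are the same number.
DIAGONAL = re.compile(r"^(SIGMA|OMEGA)\((\d+),\2\)$", re.IGNORECASE)


def diagonal_flags(text, row_id=-1000000000, threshold=0.001):
    """Return {original column name: value < threshold} for the chosen row."""
    # Original header names, taken from the second line of the file.
    header = text.splitlines()[1].split()
    b = np.atleast_1d(
        np.genfromtxt(io.StringIO(text), names=True, dtype=None,
                      skip_header=1, encoding="utf-8")
    )
    # Assumption: exactly one row carries the requested ITERATION value.
    row = b[b["ITERATION"] == row_id][0]
    flags = {}
    for position, name in enumerate(header):
        if DIAGONAL.match(name):
            # Map the original name to genfromtxt's sanitized name by position.
            flags[name] = bool(row[b.dtype.names[position]] < threshold)
    return flags


for column, is_small in diagonal_flags(SAMPLE).items():
    print(column, "true" if is_small else "false")
```

For a real file, the same idea applies after reading its text with open(...).read(); the header line index would need adjusting if the file has more than one title line.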

2014-09-02 21:55:27 unutbu

Have you tried pandas?
This example may illustrate what you are looking for:

import pandas as p

# raw string so the backslashes in the Windows path are not treated as escapes
f = r'C:\Documents and Settings\Joaquin\Escritorio\one.ext'

# read your table and set the first column as index
table = p.read_csv(f, sep=' ', header=1, skipinitialspace=True)
table = table.set_index('ITERATION')

# get the two cells corresponding to the columns you want at row -1000000000
print(table.xs(-1000000000)[['SIGMA(1,1)', 'OMEGA(1,1)']])

which gives:

SIGMA(1,1) 0.124283 
OMEGA(1,1) 0.128815 
Name: -1000000000, dtype: float64
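Building on the same idea, here is a minimal sketch that selects every diagonal SIGMA(n,n) and OMEGA(n,n) column by pattern instead of naming them one by one, then applies the < 0.001 test from the question. It assumes the row of interest is the one with ITERATION equal to -1000000000; the function name small_diagonals and the shortened SAMPLE text are made up for illustration, and whitespace-separated parsing (sep=r"\s+") is swapped in for sep=' '.

```python
import io

import pandas as pd

# Shortened, hypothetical stand-in for one.ext (one title line, then the header).
SAMPLE = """\
TABLE NO.  1: example title line
ITERATION THETA1 SIGMA(1,1) SIGMA(2,1) SIGMA(2,2) OMEGA(1,1) OMEGA(2,1) OMEGA(2,2) OBJ
0 2.50000E-01 1.00000E-01 0.00000E+00 1.00000E-01 1.00000E-01 0.00000E+00 1.00000E-01 9436.65
-1000000000 2.11129E-01 1.24283E-01 0.00000E+00 1.28733E+00 1.28815E-01 0.00000E+00 1.24211E-02 8591.50
-1000000001 8.07565E-03 5.28558E-03 1.00000E+10 4.20370E-01 1.78706E-02 1.00000E+10 3.15324E-03 0.0
"""


def small_diagonals(text, row_id=-1000000000, threshold=0.001):
    """Return a boolean Series: is each diagonal value in the row below threshold?"""
    # Skip the title line; split on any run of whitespace.
    table = pd.read_csv(io.StringIO(text), sep=r"\s+", skiprows=1,
                        index_col="ITERATION")
    # Keep only SIGMA(n,n) / OMEGA(n,n) columns whose two indices are equal.
    diagonal = table.filter(regex=r"^(SIGMA|OMEGA)\((\d+),\2\)$")
    return diagonal.loc[row_id] < threshold


print(small_diagonals(SAMPLE))
```

The returned Series is indexed by the original column names, so small_diagonals(SAMPLE).any() would tell whether any diagonal value falls under the threshold.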

2014-09-02 21:59:30 joaquin
