2015-10-20

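Retrieving data via MultiIndex

I have a dataframe with a MultiIndex. I need to work with various subsets of the data depending on schema and/or script (the index levels are schema and script). The dataframe looks like this: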

                        tx_id     step  step_id  start_time
schema_10  cmc_v2_file  19-3        10      279  2015-09-04 00:46:30
           cmc_v2_file  2-7         10      423  2015-09-04 00:46:22
           cmc_v2_file  29-1        10       20  2015-09-04 00:46:34
           cmc_v2_file  35-1         4       63  2015-09-04 00:46:51
           cmc_v2_file  31-2        10       79  2015-09-04 00:46:54
           cmc_v2_file  5-8         10      536  2015-09-04 00:46:57
           cmc_v2_file  5-9         10      610  2015-09-04 00:47:13
           cmc_v2_file  39-1        10      178  2015-09-04 00:47:12
           cmc_v2_file  41-1        10      211  2015-09-04 00:47:22
           cmc_v2_file  21-4        10      678  2015-09-04 00:47:28
           cmc_v2_file  23-4        10      698  2015-09-04 00:47:31
           cmc_v2_file  31-5        10      399  2015-09-04 00:47:45
           cmc_v2_file  35-4         3      453  2015-09-04 00:47:54
           cmc_v2_file  29-5         4      461  2015-09-04 00:47:54
           cmc_v2_file  29-5         8      465  2015-09-04 00:47:55
           cmc_v2_file  42-3         1      467  2015-09-04 00:47:57
           cmc_v2_file  22-5         8      866  2015-09-04 00:47:53
           cmc_v2_file  16-6         8      893  2015-09-04 00:47:51
           cmc_v2_file  17-6         4      938  2015-09-04 00:47:54
           cmc_v2_file  17-6         8      942  2015-09-04 00:47:55
           cmc_v2_file  6-2         10      707  2015-09-04 00:47:50
           cmc_v2_file  4-11        10      730  2015-09-04 00:47:54
           cmc_v2_file  6-3          2      745  2015-09-04 00:47:53
           cmc_v2_file  5-11         1      762  2015-09-04 00:47:55
           cmc_v2_file  4-12         1      763  2015-09-04 00:47:56
           cmc_v2_file  5-12        10      782  2015-09-04 00:48:16
           cmc_v2_file  31-6         4      471  2015-09-04 00:47:55
           cmc_v2_file  38-3         4      520  2015-09-04 00:47:51
           cmc_v2_file  39-3         4      551  2015-09-04 00:47:55
           cmc_v2_file  31-7        10      570  2015-09-04 00:48:20
...        ...          ...        ...      ...
schema_9   hcs-vbu      1332-132    14   197542  2015-09-04 00:29:46
           hcs-vbu      515-143      5   196309  2015-09-04 00:29:01
           hcs-vbu      552-126     13   196333  2015-09-04 00:29:19
           hcs-vbu      559-116     12   197068  2015-09-04 00:29:33
           hcs-vbu      566-115     13   197201  2015-09-04 00:29:47
           hcs-vbu      523-152      3   197443  2015-09-04 00:29:33
           hcs-vbu      790-136      2   200774  2015-09-04 00:28:46
           hcs-vbu      790-136      4   200776  2015-09-04 00:28:56
           hcs-vbu      790-136     12   200784  2015-09-04 00:29:13
           hcs-vbu      206-148      5   198213  2015-09-04 00:29:04

To get the data for a specific script, I do this:

df.loc(axis=0)[:,[script]] 

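When I print out the resulting dataframe, it looks correct. The problem is that I am also writing unit tests for all of this, and in one of them I want to verify that the data contains only a single script: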

scripts = df.index.levels[df.index.names.index('script')] 

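However, instead of the one-item list I expected, I get a list of 6 items, which is the number of scripts in the original, unfiltered data. Is there another way to retrieve the script index values after filtering the dataframe with .loc?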

Answer


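Your second statement, df.index.levels, returns all the levels stored in the index. Indexing it with df.index.names.index('script') then asks for every value of the second level (the one named 'script'). A MultiIndex does not prune its level values when rows are filtered with .loc, so you still see all 6 scripts from the unfiltered data.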
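To make the behaviour above concrete, here is a minimal sketch on a made-up three-row frame (the data and the names `stale_levels` and `pruned_levels` are illustrative, not taken from the question). It assumes pandas 0.20 or later, where `MultiIndex.remove_unused_levels()` is available.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question (values are made up).
index = pd.MultiIndex.from_tuples(
    [
        ("schema_10", "cmc_v2_file"),
        ("schema_10", "cmc_v2_file"),
        ("schema_9", "hcs-vbu"),
    ],
    names=["schema", "script"],
)
df = pd.DataFrame({"step": [10, 4, 14]}, index=index).sort_index()

# Same filter as in the question: keep only one script.
filtered = df.loc(axis=0)[:, ["cmc_v2_file"]]

# .levels still holds every script from the unfiltered frame.
script_pos = filtered.index.names.index("script")
stale_levels = list(filtered.index.levels[script_pos])

# remove_unused_levels() rebuilds the index with only the labels in use.
pruned_levels = list(filtered.index.remove_unused_levels().levels[script_pos])

print(stale_levels)
print(pruned_levels)
```

In this sketch `stale_levels` keeps both scripts while `pruned_levels` contains only the one that survived the filter, which is the difference the question ran into.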

I think what you want is something like this: for the index level named 'script', give me the rows with one specific value.

import pandas as pd

## here we set the specific value you want to filter with

specific_script_value = 'cmc_v2_file'

## and then we filter on the second level of the index.
## pd.IndexSlice helps slice across several levels

idx = pd.IndexSlice
df.loc[idx[:, specific_script_value], :]
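For the unit-test check the question asks about, one possible approach is to read the labels that actually occur in the rows with `get_level_values` rather than `levels`. The sketch below uses a made-up three-row frame, and `scripts_present` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question (values are made up).
index = pd.MultiIndex.from_tuples(
    [
        ("schema_10", "cmc_v2_file"),
        ("schema_10", "cmc_v2_file"),
        ("schema_9", "hcs-vbu"),
    ],
    names=["schema", "script"],
)
df = pd.DataFrame({"step": [10, 4, 14]}, index=index).sort_index()

# Filter on the 'script' level, as in the answer above.
idx = pd.IndexSlice
filtered = df.loc[idx[:, "cmc_v2_file"], :]


def scripts_present(frame):
    """Return the distinct 'script' labels that occur in the frame's rows."""
    return list(frame.index.get_level_values("script").unique())


# The kind of assertion a unit test could make on the filtered data.
assert scripts_present(filtered) == ["cmc_v2_file"]
```

Because `get_level_values` is computed from the rows themselves, it reflects the filter, whereas `levels` reflects what the index was originally built from.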