2013-03-03 56 views
1

I am reading a file in C# and need to extract the words from it.
The input file has this format:

 id   PosScore NegScore  Word        SynSet 

     00002098 0   0.75   unable#1       (usually followed by `to') not having the necessary means or skill or know-how; "unable to get to town without a car"; "unable to obtain funds" 
     00002312 0.23  0.43   dorsal#2 abaxial#1    facing away from the axis of an organ or organism; "the abaxial surface of a leaf is the underside or side facing away from the stem" 
     00002527 0.14  0.26   ventral#2 adaxial#1    nearest to or facing toward the axis of an organ or organism; "the upper side of a leaf is known as the adaxial surface" 
     00002730 0.45  0.32   acroscopic#1      facing or on the side toward the apex 
     00002843 0.91  0.87   basiscopic#1      facing or on the side toward the base 
     00002956 0.43  0.73   abducting#1 abducent#1   especially of muscles; drawing away from the midline of the body or from an adjacent part 
     00003131 0.15  0.67   adductive#1 adducting#1 adducent#1 especially of muscles; bringing together or drawing toward the midline of the body or toward an adjacent part  
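In this file, the SynSet column should be removed. Second, if the Word column contains several words, the row should be repeated once per word, with id, PosScore, and NegScore the same on every repeated row. For the file above I want this output: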

id   PosScore  NegScore    Word  
00002098 0    0.75    unable#1  
00002312 0.23   0.43    dorsal#2  
00002312 0.23   0.43    abaxial#1  
00002527 0.14   0.26    ventral#2  
00002527 0.14   0.26    adaxial#1  
00002730 0.45   0.32    acroscopic#1  
00002843 0.91   0.87    basiscopic#1  
00002956 0.43   0.73    abducting#1  
00002956 0.43   0.73    abducent#1  
00003131 0.15   0.67    adductive#1  
00003131 0.15   0.67    adducting#1  
00003131 0.15   0.67    adducent#1  

I wrote the code below to produce the output above, but it gives unexpected results.

TextWriter tw = new StreamWriter("D:\\output.txt");  
private void button1_Click(object sender, EventArgs e) 
     { 

       StreamReader reader = new StreamReader(@"C:\Users\Zia Ur Rehman\Desktop\records.txt"); 
       string line; 
       String lines = ""; 
       while ((line = reader.ReadLine()) != null) 
       { 

        String[] str = line.Split('\t'); 

        String[] words = str[4].Split(' '); 
        for (int k = 0; k < words.Length; k++) 
        { 
         for (int i = 0; i < str.Length; i++) 
         { 
          if (i + 1 != str.Length) 
          { 
           lines = lines + str[i] + ","; 
          } 
          else 
          { 
           lines = lines + words[k] + "\r\n"; 

          } 
         } 
        } 
       } 
      tw.Write(lines); 
      tw.Close(); 
      reader.Close();  
     } 

This code gives the wrong result:

00002098,0,0.75,unable#1,unable#1 
00002312,0,0,dorsal#2 abaxial#1,dorsal#2 
00002312,0,0,dorsal#2 abaxial#1,abaxial#1 
00002527,0,0,ventral#2 adaxial#1,ventral#2 
00002527,0,0,ventral#2 adaxial#1,adaxial#1 
00002730,0,0,acroscopic#1,acroscopic#1 
00002843,0,0,basiscopic#1,basiscopic#1 
00002956,0,0,abducting#1 abducent#1,abducting#1 
00002956,0,0,abducting#1 abducent#1,abducent#1 
00003131,0,0,adductive#1 adducting#1 adducent#1,adductive#1 
00003131,0,0,adductive#1 adducting#1 adducent#1,adducting#1 
00003131,0,0,adductive#1 adducting#1 adducent#1,adducent#1 

What is "unexpected"? – 2013-03-03 18:53:59


How do you know when a new column starts? – Magnus 2013-03-03 18:54:37


@Magnus I have edited the question now, so you can see. – 2013-03-03 19:01:20

Answers

1

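I simplified your code and made it work. It still lacks validation, and it could perform better by using StringBuilder or, better still, by writing each line to the file instead of appending it to a string. It also lacks exception handling.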

using (TextWriter tw = File.CreateText(@"c:\temp\result.txt")) 
using (StreamReader reader = new StreamReader(@"stackov1.txt")) 
{ 
    string line; 
    String lines = ""; 
    while ((line = reader.ReadLine()) != null) 
    { 

     String[] str = line.Split('\t'); 

     String[] words = str[3].Split(' '); 
     for (int k = 0; k < words.Length; k++) 
     { 
      lines = lines + str[0] + "\t" + str[1] + "\t" + str[2] + "\t" + words[k] + "\r\n"; 
     } 
    } 
    tw.Write(lines); 
} 
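
The answer mentions that writing each line straight to the file would perform better than appending to one large string. A minimal sketch of that variant follows, assuming the same tab-separated layout (id, PosScore, NegScore, Word, SynSet). The names `SynsetFlattener` and `Flatten`, and the check that skips rows with fewer than four fields, are illustrative additions and not part of the original answer.

```csharp
using System;
using System.IO;

static class SynsetFlattener
{
    // Reads tab-separated rows (id, PosScore, NegScore, Word, SynSet) and
    // writes one output row per word, dropping the SynSet column.
    public static void Flatten(TextReader reader, TextWriter writer)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] fields = line.Split('\t');
            if (fields.Length < 4)
                continue; // assumption: blank or malformed rows are skipped

            // The Word column may hold several space-separated words.
            string[] words = fields[3].Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

            foreach (string word in words)
            {
                // Each row is written immediately, so no large string is built up.
                writer.WriteLine(
                    fields[0] + "\t" + fields[1] + "\t" + fields[2] + "\t" + word);
            }
        }
    }
}
```

It could be called with a `StreamReader` on the input file and the writer returned by `File.CreateText`, both wrapped in `using` blocks as in the code above. Taking `TextReader` and `TextWriter` as parameters also lets the method be tried with `StringReader` and `StringWriter`.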

This code does not work. – 2013-03-03 20:00:20


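I added the writer and reader code, so my example is now complete (before that it needed the IO initialization). Please post details if there is a problem. – MichalMa 2013-03-03 20:10:53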


Sir, I added the IO code, but it does not write to the file. – 2013-03-03 20:22:36

2

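So, this is working now, after a long effort.
Note: if the input file does not use real tab characters between columns, the result will be incorrect. Do not overlook the tabs.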

TextWriter tw = new StreamWriter("D:\\output.txt");  
    private void button1_Click(object sender, EventArgs e) 
    { 
     StreamReader reader = new StreamReader(@"C:\Users\Mohsin\Desktop\records.txt"); 
     string line; 
     String lines = ""; 
     while ((line = reader.ReadLine()) != null) 
     { 

      String[] str = line.Split('\t'); 

      String[] words = str[3].Split(' '); 
      for (int k = 0; k < words.Length; k++) 
      { 
       for (int i = 0; i < 4; i++) 
       { 
        if (i + 1 != 4) 
        { 
         lines = lines + str[i] + "\t"; 
        } 
        else 
        { 
         lines = lines + words[k] + "\r\n"; 

        } 
       } 
      } 
     } 
     tw.Write(lines); 
     tw.Close(); 
     reader.Close(); 
    }
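
Because the code above depends on real tab characters between the columns, it may help to check the input before processing it. The following is only a sketch under that assumption: `TabCheck` and `FindMalformedLines` are hypothetical names, and treating blank lines as acceptable is a choice made here for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TabCheck
{
    // Returns the 1-based numbers of non-blank lines that have fewer than
    // minFields tab-separated fields.
    public static List<int> FindMalformedLines(TextReader reader, int minFields)
    {
        var bad = new List<int>();
        string line;
        int lineNumber = 0;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            if (line.Trim().Length == 0)
                continue; // blank lines are ignored, not reported

            if (line.Split('\t').Length < minFields)
                bad.Add(lineNumber);
        }
        return bad;
    }
}
```

Calling it with `minFields` set to 4 reports every row on which `str[3]` would not exist, which is the case where columns were separated by spaces instead of tabs.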