2011-02-09 66 views

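I have written Java code to read the contents of a file, but it only works for small files of no more than 1000 lines. Please tell me what mistake I have made in the program below. Why can't my program read the complete file?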

Program:

import java.io.DataInputStream; 
import java.io.DataOutputStream; 
import java.io.File; 
import java.io.FileInputStream; 
import java.io.FileNotFoundException; 
import java.io.FileOutputStream; 
import java.util.regex.Matcher; 
import java.util.regex.Pattern; 

public class aaru 
{ 
    public static void main(String args[]) throws FileNotFoundException 
    { 
    File sourceFile = new File("E:\\parser\\parse3.txt"); 
    File destinationFile = new File("E:\\parser\\new.txt"); 
    FileInputStream fileIn = new FileInputStream(sourceFile); 
    FileOutputStream fileOut = new FileOutputStream(destinationFile); 
    DataInputStream dataIn = new DataInputStream(fileIn); 
    DataOutputStream dataOut = new DataOutputStream(fileOut); 

    String str = ""; 
    String[] st; 
    String sub[] = null; 
    String word = ""; 
    String contents = ""; 
    String total = ""; 

    String stri = ""; 
    try 
    { 
     while ((contents = dataIn.readLine()) != null) 
     { 
     total = contents.replaceAll(",", ""); 
     String str1 = total.replaceAll("--", ""); 
     String str2 = str1.replaceAll(";", ""); 
     String str3 = str2.replaceAll("&", ""); 
     String str4 = str3.replaceAll("^", ""); 
     String str5 = str4.replaceAll("#", ""); 
     String str6 = str5.replaceAll("!", ""); 
     String str7 = str6.replaceAll("/", ""); 
     String str8 = str7.replaceAll(":", ""); 
     String str9 = str8.replaceAll("]", ""); 
     String str10 = str9.replaceAll("\\?", ""); 
     String str11 = str10.replaceAll("\\*", ""); 
     String str12 = str11.replaceAll("\\'", ""); 


     Pattern pattern = 
      Pattern.compile("\\s+", Pattern.CASE_INSENSITIVE | Pattern.DOTALL | Pattern.MULTILINE); 
     Matcher matcher = pattern.matcher(str12); 
     //boolean check = matcher.find(); 
     String result = str12; 
     Pattern p = Pattern.compile("^www\\.|\\@"); 
     Matcher m = p.matcher(result); 
     stri = m.replaceAll(" "); 

     int i; 
     int j; 

     st = stri.split("\\."); 

     for (i = 0; i < st.length; i++) 
     { 
      st[i] = st[i].trim(); 
      /*if(st[i].startsWith(" ")) 
      st[i]=st[i].substring(1,st[i].length);*/ 
      sub = st[i].split(" "); 

      if (sub.length > 1) 
      { 
      for (j = 0; j < sub.length - 1; j++) 
      { 
       word = word + sub[j] + "," + sub[j + 1] + "\r\n"; 

      } 
      } 
      else 
      { 
      word = word + st[i] + "\r\n"; 
      } 
     } 
     } 

     System.out.println(word); 
     dataOut.writeBytes(word + "\r\n"); 

     fileIn.close(); 
     fileOut.close(); 
     dataIn.close(); 
     dataOut.close(); 
    } catch (Exception e) 
    { 
     System.out.print(e); 
    } 
    } 
} 

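Welcome to Stack Overflow! Could you add some comments or a description to the code to make it easier to follow? I can't really tell what the big picture is here. – templatetypedef 2011-02-09 07:26:08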

Answers

3

It isn't immediately obvious why your code doesn't read the full file, but here are two tips:

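First: don't use DataInputStream to read full lines. Instead, wrap your FileInputStream in an InputStreamReader (ideally specifying the encoding) and a BufferedReader, as the JavaDoc of DataInputStream.readLine() recommends.

Like this:

BufferedReader reader = new BufferedReader(new InputStreamReader(fileIn, "UTF-8")); 

Second: when you don't know how to handle an exception, at least print its stack trace, like this: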

catch(Exception e) 
{ 
    e.printStackTrace(); 
} 
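
Putting the two tips together, a minimal sketch might look like the following. The class name LineReaderSketch and the helper readLines are made up for illustration, and an in-memory stream stands in for the FileInputStream from the question so the example runs without any file on disk. It also uses try-with-resources (Java 7 or later) to close the reader, which is an addition beyond the two tips above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class LineReaderSketch {

    // Reads every line from the stream, decoding the bytes as UTF-8.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader (and the wrapped stream) on exit.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for: new FileInputStream(sourceFile)
        InputStream in = new ByteArrayInputStream(
                "first line\nsecond line\r\nthird line".getBytes(StandardCharsets.UTF_8));
        try {
            List<String> lines = readLines(in);
            System.out.println(lines.size() + " lines read");
        } catch (IOException e) {
            // Tip two: at the very least, print the full stack trace.
            e.printStackTrace();
        }
    }
}
```

To use this with the program from the question, pass new FileInputStream(sourceFile) to readLines in place of the in-memory stream. If an exception is cutting the read short, the stack trace will then show the line where it happens.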