2015-11-05

How can Univocity Parsers read a .csv file when the header is not on the first line?

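If the first line of the .csv file is not the header row, parsing fails with an error.

The code and stack trace are below.

Any help would be appreciated.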

import com.univocity.parsers.common.processor.RowListProcessor;
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;
import java.util.List;


public class UnivocityParsers { 

public Reader getReader(String relativePath) { 
    try { 
     return new InputStreamReader(this.getClass().getResourceAsStream(relativePath), "Windows-1252"); 
    } catch (UnsupportedEncodingException e) { 
     throw new IllegalStateException("Unable to read input", e); 
    } 
} 


public void columnSelection() { 
    RowListProcessor rowProcessor = new RowListProcessor(); 
    CsvParserSettings parserSettings = new CsvParserSettings(); 

    parserSettings.setRowProcessor(rowProcessor); 
    parserSettings.setHeaderExtractionEnabled(true); 
    parserSettings.setLineSeparatorDetectionEnabled(true); 
    parserSettings.setSkipEmptyLines(true); 

    // Here we select only the columns "AUTHOR" and "ISBN".
    // The parser just skips the other fields
    parserSettings.selectFields("AUTHOR", "ISBN"); 

    CsvParser parser = new CsvParser(parserSettings); 
    parser.parse(getReader("list2.csv")); 

    List<String[]> rows = rowProcessor.getRows(); 

    String[] strings = rows.get(0); 

    System.out.print(strings[0]); 

} 


public static void main(String arg[]) { 

    UnivocityParsers univocityParsers = new UnivocityParsers(); 

    univocityParsers.columnSelection(); 


} 


} 

Stack trace:

Exception in thread "main" com.univocity.parsers.common.TextParsingException: Error processing input: java.lang.IllegalStateException - Unknown field names: [author, isbn]. Available fields are: [list of books by author - created today] 

Here is the file being parsed:

List of books by Author - Created today 
"REVIEW_DATE","AUTHOR","ISBN","DISCOUNTED_PRICE" 
"1985/01/21","Douglas Adams",0345391802,5.95 
"1990/01/12","Douglas Hofstadter",0465026567,9.95 
"1998/07/15","Timothy ""The Parser"" Campbell",0968411304,18.99 
"1999/12/03","Richard Friedman",0060630353,5.95 
"2001/09/19","Karen Armstrong",0345384563,9.95 
"2002/06/23","David Jones",0198504691,9.95 
"2002/06/23","Julian Jaynes",0618057072,12.50 
"2003/09/30","Scott Adams",0740721909,4.95 
"2004/10/04","Benjamin Radcliff",0804818088,4.95 
"2004/10/04","Randel Helms",0879755725,4.50 

Answers

1

As of today, in 2.0.0-SNAPSHOT you can do this:

settings.setNumberOfRowsToSkip(1); 

On version 1.5.6, you can do the following to skip the first line and pick up the headers correctly:

RowListProcessor rowProcessor = new RowListProcessor(){ 
     @Override 
     public void processStarted(ParsingContext context) { 
      super.processStarted(context); 
      context.skipLines(1); 
     } 
    }; 
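If neither option works in your setup, a library-independent workaround is to consume the leading lines from the Reader yourself before handing it to the parser. The following is only a sketch using the JDK alone; the class and method names (PreambleSkipper, skipLines) are hypothetical and are not part of univocity-parsers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class PreambleSkipper {

    // Hypothetical helper (not a univocity-parsers API): consumes the first
    // `count` lines and returns a reader positioned at the start of the next line.
    public static BufferedReader skipLines(Reader source, int count) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        for (int i = 0; i < count; i++) {
            if (reader.readLine() == null) {
                break; // the input has fewer lines than requested
            }
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // Sample input taken from the first lines of the file in the question.
        String csv = "List of books by Author - Created today\n"
                + "\"REVIEW_DATE\",\"AUTHOR\",\"ISBN\",\"DISCOUNTED_PRICE\"\n"
                + "\"1985/01/21\",\"Douglas Adams\",0345391802,5.95\n";

        BufferedReader reader = skipLines(new StringReader(csv), 1);

        // The next line read is now the real header row.
        System.out.println(reader.readLine());
    }
}
```

The returned reader could then be passed to parser.parse(...) in place of getReader("list2.csv"), so that header extraction sees the real header row first.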

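Another option, if you have control over how the input file is generated, is to comment out the first line by adding a # at the beginning of the line you want to discard: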

#List of books by Author - Created today 
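If the file is produced by your own code, the title line can be commented out at generation time. The sketch below is one way to do that with the JDK alone; the class and method names (CommentedTitle, commentOut) are hypothetical, and it assumes the parser is left configured to treat '#' as the comment marker, as this answer relies on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommentedTitle {

    // Hypothetical helper: prefixes the first `count` lines with '#'
    // so that a parser treating '#' as a comment marker ignores them.
    public static List<String> commentOut(List<String> lines, int count) {
        List<String> result = new ArrayList<>(lines.size());
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            // Only touch preamble lines that are not already commented out.
            boolean needsPrefix = i < count && !line.startsWith("#");
            result.add(needsPrefix ? "#" + line : line);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input taken from the first lines of the file in the question.
        List<String> lines = Arrays.asList(
                "List of books by Author - Created today",
                "\"REVIEW_DATE\",\"AUTHOR\",\"ISBN\",\"DISCOUNTED_PRICE\"",
                "\"1985/01/21\",\"Douglas Adams\",0345391802,5.95");

        for (String line : commentOut(lines, 1)) {
            System.out.println(line);
        }
    }
}
```

This only helps when you own the step that writes the file; for files you receive from elsewhere, skipping the line at read time is the less intrusive choice.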
+0

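When I try the code from the solution above with univocity-parsers-1.5.6.jar, I get an error: 'java.lang.IllegalStateException: Unknown field names: [author, isbn, review_date]. Available fields are: [1985/01/21, douglas adams, 0345391802, 5.95]'. When I try the 2.0.0-SNAPSHOT solution by adding the pom.xml from that snapshot to my IntelliJ project, I get: 'Error:(53, 23) java: cannot find symbol symbol: method setNumberOfRowsToSkip(int) location: variable parserSettings of type com.univocity.parsers.csv.CsvParserSettings'. In addition, IntelliJ cannot find the plugin 'maven-gpg-plugin' referenced in that pom.xml. – 65535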

+0

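The 1.5.6 solution works (i.e. overriding the 'processStarted' method); I tested it myself with the code you posted. Apparently you have applied other changes as well, and the parser is now taking the third line as the header instead of the second. For the 2.0.0-SNAPSHOT version, you don't need to copy the pom file; that will never work. You can either update your own pom.xml or download the jar from https://oss.sonatype.org/content/repositories/snapshots/com/univocity/univocity-parsers/2.0.0-SNAPSHOT/univocity-parsers-2.0.0-20151111.095007-18.jar –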

+0

I put 'univocity-parsers-2.0.0-20151111.095007-18.jar' into the project libraries, but when I call the new method with 'parserSettings.setNumberOfRowsToSkip(1);' I get the error: 'Cannot resolve method 'setNumberOfRowsToSkip(int)''. – 65535