2010-09-15 70 views
0

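Parsing a multi-line fixed-width file

I have a fixed-width flat file. To make matters worse, each line can be either a new record or a sub-record of the line above it, identified by the first character of the line: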

A0020SOME DESCRIPTION MORE DESCRIPTION 922 2321  # Separate 
A0021ANOTHER DESCRIPTIONMORE DESCRIPTION 23111442  # records 
B0021ANOTHER DESCRIPTION THIS TIME IN ANOTHER FORMAT # sub-record of record "0021" 

I am using Flatworm, which seems to be a good library for parsing fixed-width data. Unfortunately, its documentation states:

"Repeating segments are supported only for delimited files" 

(ibid., "Repeating segments").

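I would rather not write a custom parser. (1) Is it possible to do this in Flatworm? (2) Is there a library that provides this capability (multiple lines, multiple sub-records)?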
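
For reference, the grouping rule described above (an `A` line opens a record, a `B` line belongs to the record above it) can be sketched in plain Java. This is only an illustration of the structure being asked about, not a Flatworm feature. The names `RecordGrouper` and `Group` are hypothetical, and the sketch assumes `A` and `B` are the only record types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups raw lines into records and their sub-records.
final class RecordGrouper {

    /** One top-level record line plus the sub-record lines that follow it. */
    static final class Group {
        final String master;
        final List<String> details = new ArrayList<>();

        Group(String master) {
            this.master = master;
        }
    }

    /**
     * Each line starting with 'A' opens a new record; each line starting
     * with 'B' is attached to the most recent 'A' line above it.
     */
    static List<Group> group(List<String> lines) {
        List<Group> groups = new ArrayList<>();
        Group current = null;
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            char type = line.charAt(0);
            if (type == 'A') {
                current = new Group(line);
                groups.add(current);
            } else if (type == 'B') {
                if (current == null) {
                    throw new IllegalArgumentException(
                            "Sub-record without a preceding record: " + line);
                }
                current.details.add(line);
            } else {
                // Assumption: only the two record types shown in the sample exist.
                throw new IllegalArgumentException("Unknown record type: " + line);
            }
        }
        return groups;
    }
}
```

Calling `RecordGrouper.group(...)` on the three sample lines yields two groups: the `A0020` record with no sub-records and the `A0021` record with one.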

Answers

2

Have you looked at JRecordBind?

http://jrecordbind.org/

"JRecordBind supports hierarchical fixed-length files: records of some type that are 'sons' of other record types."

0

With uniVocity-parsers you can read not only fixed-width input but also master-detail rows (where one row has sub-rows).

Here is an example:

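import java.io.FileReader;
import java.util.List;
import com.univocity.parsers.common.*;
import com.univocity.parsers.common.processor.*;
import com.univocity.parsers.fixed.*;

// 1st, use a RowProcessor for the "detail" rows.
ObjectRowListProcessor detailProcessor = new ObjectRowListProcessor();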

// 2nd, create a MasterDetailListProcessor to identify whether or not a row is the master row.
// The row placement argument indicates whether the master row occurs before or after a sequence of "detail" rows.
MasterDetailListProcessor masterRowProcessor = new MasterDetailListProcessor(RowPlacement.TOP, detailProcessor) {
    @Override
    protected boolean isMasterRecord(String[] row, ParsingContext context) {
        // Returns true if the parsed row is the master row.
        // In the question's sample, "A" lines are records and "B" lines are their sub-records.
        return row[0].startsWith("A");
    }
};

FixedWidthParserSettings parserSettings = new FixedWidthParserSettings(new FixedWidthFieldLengths(4, 5, 40, 40, 8)); 

// Set the RowProcessor to the masterRowProcessor. 
parserSettings.setRowProcessor(masterRowProcessor); 

FixedWidthParser parser = new FixedWidthParser(parserSettings); 
parser.parse(new FileReader(yourFile)); 

// Here we get the MasterDetailRecord elements.
List<MasterDetailRecord> rows = masterRowProcessor.getRecords();
for (MasterDetailRecord masterRecord : rows) {
    // The master record has one master row and multiple detail rows.
    Object[] masterRow = masterRecord.getMasterRow();
    List<Object[]> detailRows = masterRecord.getDetailRows();
}
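
To make the field-length argument concrete: a list of lengths such as (4, 5, 40, 40, 8) describes consecutive column widths, so each field is the slice of the line that starts where the previous one ended. The sketch below shows that idea in plain Java. It is not the library's implementation, `FixedWidthSlicer` is a hypothetical name, and the trimming and short-line handling are assumptions made for the illustration. The lengths in the example above look illustrative as well; for the sample in the question the first two columns appear to be 1 and 4 characters wide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: slices one fixed-width line into fields by column width.
final class FixedWidthSlicer {

    /** Splits a line into consecutive fields of the given widths, trimming padding. */
    static List<String> slice(String line, int... widths) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            // Clamp both ends to the line length so short lines yield empty fields
            // instead of throwing StringIndexOutOfBoundsException.
            int from = Math.min(start, line.length());
            int to = Math.min(start + width, line.length());
            fields.add(line.substring(from, to).trim());
            start += width;
        }
        return fields;
    }
}
```

For instance, `FixedWidthSlicer.slice("A0020SOME DESCRIPTION", 1, 4, 16)` returns the record type `A`, the id `0020`, and the text `SOME DESCRIPTION` as three separate fields.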

Disclosure: I am the author of this library. It is open source and free (Apache V2.0 license).