2017-09-27
2

How to concatenate strings after CSV parsing based on some rules, line by line

I am using the univocity parser to read a CSV file (https://www.univocity.com/pages/parsers-tutorial). Here is what test.csv looks like:

Active;3189;Active on this date 2015-03-15-17.03.06.000000 

Catalog;3189;This is for date 2015-04-21-11.04.11.000000 

Master;3190;It happens on this date 2016-04-22-09.04.27.000000 

InActive;3190;Inactive on this date 2016-04-23-09.04.46.000000 

The code below does the parsing:

List<String[]> allRows = parser.parseAll(new FileReader("E:/test.csv")); 

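After parsing, how can I compare the rows one by one and concatenate their text for each unique value in the 2nd column?

Expected output:

For the 3189 records: String x = Active on this date 2015-03-15-17.03.06.000000 and This is for date 2015-04-21-11.04.11.000000

For the 3190 records: String x = It happens on this date 2016-04-22-09.04.27.000000 and Inactive on this date 2016-04-23-09.04.46.000000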

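I can think of a somewhat dirty approach (not a good design!): you could create two separate lists for the 'Active' and 'Inactive' values and compare them by 'id' (say 3189 or 3190). If the ids match, concatenate the string values. – procrastinator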

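Appreciate your response. The first column is dynamic; it can be any string, not just Active or Inactive. We have to decide based on the second column, not the first column's value. Updated the question. – Sks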

Answers

2

This is only an example, and you will have to be more careful about exceptions that may occur, but you could do something like this:

// Needs java.util.LinkedHashMap, java.util.Map, java.util.regex.Matcher and java.util.regex.Pattern.
// Assumes row[0] holds the whole line, i.e. the parser did not split on ';'.
Pattern pattern = Pattern.compile("^(Active|Inactive);([^;]*);(.*)$");
Map<String, Record> records = new LinkedHashMap<>();
for (String[] row : allRows) {
    Matcher m = pattern.matcher(row[0]);
    if (!m.matches()) {
        System.out.println("NO MATCH");
        continue;
    }
    // Reuse the record already stored for this id, or create one the first time the id is seen.
    Record record = records.computeIfAbsent(m.group(2), id -> new Record());
    record.setId(m.group(2));
    if (m.group(1).equals("Active")) {
        record.setActiveComment(m.group(3));
    } else {
        record.setInactiveComment(m.group(3));
    }
}

for (Entry<String, Record> rec : records.entrySet()) { 
    System.out.println(rec.getValue().getActiveComment() + " and " + rec.getValue().getInactiveComment()); 
} 
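The pattern above only accepts Active or Inactive in the first column, and it is matched against row[0], which presumably holds the whole line (as it would if the parser were left on its default comma delimiter). Since the comments point out that the first column can be any string, here is a minimal sketch with the first group relaxed to [^;]*. LinePatternDemo and split are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinePatternDemo {

    // Relaxed pattern (an assumption, not the answer's original):
    // group 1 = any first column, group 2 = id, group 3 = the remaining text.
    static final Pattern LINE = Pattern.compile("^([^;]*);([^;]*);(.*)$");

    /** Returns {firstColumn, id, comment}, or null if the line has fewer than three ';'-separated parts. */
    static String[] split(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("Catalog;3189;This is for date 2015-04-21-11.04.11.000000");
        System.out.println(parts[1] + " -> " + parts[2]);
    }
}
```

With a pattern like this the decision can be made on group 2 alone, whatever the first column contains.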

And the Record class:

public class Record { 

    private String id; 

    private String activeComment; 

    private String inactiveComment; 

    //add setters getters 

    //hashcode equals and toString. 

} 

hashCode and equals compare only the id.
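As a rough illustration, the skeleton could be filled in along these lines. The field names come from the answer; the accessor bodies, the equals/hashCode implementation via java.util.Objects, and the toString format are assumptions about what was intended.

```java
import java.util.Objects;

public class Record {

    private String id;
    private String activeComment;
    private String inactiveComment;

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getActiveComment() { return activeComment; }
    public void setActiveComment(String activeComment) { this.activeComment = activeComment; }

    public String getInactiveComment() { return inactiveComment; }
    public void setInactiveComment(String inactiveComment) { this.inactiveComment = inactiveComment; }

    // Two records are equal when their ids are equal; the comments are ignored.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Record)) {
            return false;
        }
        return Objects.equals(id, ((Record) o).id);
    }

    // Must agree with equals, so it is derived from the id only.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }

    @Override
    public String toString() {
        return "Record{id=" + id + ", activeComment=" + activeComment
                + ", inactiveComment=" + inactiveComment + "}";
    }
}
```

Because the hash depends on a mutable field, the id should not be changed after a Record has been added to a hash-based collection.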

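Appreciate your response. The first column is dynamic; it can be any string, not just Active or Inactive. We have to decide based on the second column, not the first column's value. – Sks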

Updated the question to remove any confusion. – Sks

No confusion! You can edit the posted code as needed. – ddarellis

1

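I tried an approach that solves your problem in some way, but I am not sure whether it is a good design. You can try adding the following code to your method: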

for (int i = 0; i < allRows.size(); i++) {
    // Skip rows that lack the id (index 1) or the text (index 2) column.
    if (allRows.get(i).length < 3)
        continue;
    for (int j = i + 1; j < allRows.size(); j++) {
        if (allRows.get(j).length < 3)
            continue;
        // Compare the second column of row i with that of every later row.
        if (allRows.get(i)[1].equals(allRows.get(j)[1])) {
            System.out.println("for " + allRows.get(i)[1] + " records- String X=" + allRows.get(i)[2] + " and " + allRows.get(j)[2]);
            // Note: if an id such as 3189 occurs more than twice, this line is printed once per matching pair.
        }
    }
}

Output:

for 3189 records- String X=Active on this date 2015-03-15-17.03.06.000000 and This is for date 2015-04-21-11.04.11.000000 
for 3190 records- String X=It happens on this date 2016-04-22-09.04.27.000000 and Inactive on this date 2016-04-23-09.04.46.000000 
2

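I hope I understood your requirement correctly. Just use a map to store the "key" values, and when you find a pre-existing value, concatenate the strings: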

public static void main(String... args) { 
    CsvParserSettings settings = new CsvParserSettings(); 
    settings.getFormat().setDelimiter(';'); 

    //looks like you are not interested in the first column. 
    //select the columns you actually need - faster and ensures all rows will come out with 2 columns 
    settings.selectIndexes(1, 2); 

    CsvParser parser = new CsvParser(settings); 

    //linked hashmap to keep the original order if that's important 
    Map<String, String[]> rows = new LinkedHashMap<String, String[]>(); 
    for (String[] row : parser.iterate(new File("E:/test.csv"))) {

        String key = row[0];
        String[] existing = rows.get(key);
        if (existing == null) {
            rows.put(key, row);
        } else {
            existing[1] += " and " + row[1];
        }
    }

    //print the result
    for (String[] row : rows.values()) {
        System.out.println(row[0] + " - " + row[1]);
    }
} 

This prints:

3189 - Active on this date 2015-03-15-17.03.06.000000 and This is for date 2015-04-21-11.04.11.000000 
3190 - It happens on this date 2016-04-22-09.04.27.000000 and Inactive on this date 2016-04-23-09.04.46.000000 
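The same grouping can probably also be written with the Java 8 stream collectors, which avoids modifying the parsed row arrays in place. The sketch below is one way to do it, not part of the answer above: it assumes the rows have already been reduced to {id, text} pairs, as selectIndexes(1, 2) would produce, and uses hard-coded sample rows in place of the parser so it runs on its own. GroupByIdSketch and joinById are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByIdSketch {

    /** Groups {id, text} rows by id, keeping first-seen order, and joins the texts with " and ". */
    static Map<String, String> joinById(List<String[]> rows) {
        return rows.stream()
                // Ignore incomplete rows instead of failing on them.
                .filter(row -> row.length >= 2 && row[0] != null && row[1] != null)
                .collect(Collectors.groupingBy(
                        row -> row[0],
                        LinkedHashMap::new,
                        Collectors.mapping(row -> row[1], Collectors.joining(" and "))));
    }

    public static void main(String[] args) {
        // Sample rows standing in for the parser output.
        List<String[]> rows = Arrays.asList(
                new String[] { "3189", "Active on this date 2015-03-15-17.03.06.000000" },
                new String[] { "3189", "This is for date 2015-04-21-11.04.11.000000" },
                new String[] { "3190", "It happens on this date 2016-04-22-09.04.27.000000" },
                new String[] { "3190", "Inactive on this date 2016-04-23-09.04.46.000000" });

        joinById(rows).forEach((id, text) -> System.out.println(id + " - " + text));
    }
}
```

An id that occurs three or more times simply gets a longer joined string, so each id still appears once in the result.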

Hope it helps.