2016-02-14 56 views
1

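Split method vs substring and indexOf

So I'm writing a program that parses CSV. I use the split method to separate the values into a String array, but I've read some articles saying that substring and indexOf are faster. I basically wrote what I would do with those two methods, and it seems like split would be better. Can someone explain how substring and indexOf are better, or tell me whether I'm using these methods incorrectly? Here's what I wrote: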

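int indexOne = 0, indexTwo;
for (int i = 0; i < 4 && indexOne <= line.length(); i++) // there are four different values in one line
{
    indexTwo = line.indexOf(",", indexOne); // search once and reuse the result
    if (indexTwo == -1)
    {
        indexTwo = line.length(); // no comma left: the last value runs to the end of the line
    }
    lineArr[i] = line.substring(indexOne, indexTwo);
    indexOne = indexTwo + 1;
}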
+0

Could you link some of those articles? –

+0

Consider using lodash or underscore or something similar to handle things like this. – Michael

+1

@AustinD Here's a link: http://demeranville.com/battle-of-the-tokenizers-delimited-text-parser-performance/ Someone posted it in a comment on Stack Exchange; here is that thread: http://programmers.stackexchange.com/questions/221997/quickest-way-to-split-a-delimited-string-in-java – trevalexandro

Answers

1

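The code at the end of this answer is taken from the source that ships with Oracle JDK 8 update 73. As you can see in the "fastpath" case, when you pass in a one-character string, split drops into a loop that uses indexOf, much like your logic.

The short answer is that your code is a bit faster, but I'll leave it to you to decide whether the difference is enough to avoid split in your use case.

Personally, I tend to agree with @pczeus's comment: use split unless you actually have evidence that it is causing a problem.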
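One way to gather that kind of evidence is a rough timing comparison. The sketch below is only an illustration: the class `SplitTiming`, the helpers `manualSplit` and `timeNanos`, and the sample row are all made up for this example, and a hand-rolled `System.nanoTime()` loop is far less reliable than a proper harness such as JMH. Note also that `manualSplit` keeps trailing empty fields, whereas `split(",")` drops them, so the two are not strictly equivalent.

```java
import java.util.ArrayList;
import java.util.List;

// Rough timing sketch; class and method names are hypothetical.
final class SplitTiming {

    // Manual split on a single delimiter using indexOf and substring.
    static String[] manualSplit(String line, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = line.indexOf(delimiter, start)) != -1) {
            parts.add(line.substring(start, end));
            start = end + 1;
        }
        parts.add(line.substring(start)); // last field, after the final delimiter
        return parts.toArray(new String[0]);
    }

    // Runs the task the given number of times and returns elapsed nanoseconds.
    static long timeNanos(Runnable task, int iterations) {
        long begin = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        String line = "alpha,beta,gamma,delta"; // made-up sample row
        int iterations = 1_000_000;
        long[] sink = new long[1]; // consume results so the JIT cannot discard the work

        // Warm-up pass so both variants are compiled before measuring.
        timeNanos(() -> sink[0] += line.split(",").length, iterations);
        timeNanos(() -> sink[0] += manualSplit(line, ',').length, iterations);

        long splitNs = timeNanos(() -> sink[0] += line.split(",").length, iterations);
        long manualNs = timeNanos(() -> sink[0] += manualSplit(line, ',').length, iterations);

        System.out.println("split:  " + splitNs / iterations + " ns/op");
        System.out.println("manual: " + manualNs / iterations + " ns/op");
        System.out.println("(sink = " + sink[0] + ")");
    }
}
```

The numbers will vary by JVM, hardware, and input shape, so treat them as a hint rather than a verdict; measuring with your real CSV lines is what matters.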

public String[] split(String regex, int limit) { 
    /* fastpath if the regex is a 
    (1)one-char String and this character is not one of the 
     RegEx's meta characters ".$|()[{^?*+\\", or 
    (2)two-char String and the first char is the backslash and 
     the second is not the ascii digit or ascii letter. 
    */ 
    char ch = 0; 
    if (((regex.value.length == 1 && 
     ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) || 
     (regex.length() == 2 && 
      regex.charAt(0) == '\\' && 
      (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 && 
      ((ch-'a')|('z'-ch)) < 0 && 
      ((ch-'A')|('Z'-ch)) < 0)) && 
     (ch < Character.MIN_HIGH_SURROGATE || 
     ch > Character.MAX_LOW_SURROGATE)) 
    { 
     int off = 0; 
     int next = 0; 
     boolean limited = limit > 0; 
     ArrayList<String> list = new ArrayList<>(); 
     while ((next = indexOf(ch, off)) != -1) { 
      if (!limited || list.size() < limit - 1) { 
       list.add(substring(off, next)); 
       off = next + 1; 
      } else { // last one 
       //assert (list.size() == limit - 1); 
       list.add(substring(off, value.length)); 
       off = value.length; 
       break; 
      } 
     } 
     // If no match was found, return this 
     if (off == 0) 
      return new String[]{this}; 

     // Add remaining segment 
     if (!limited || list.size() < limit) 
      list.add(substring(off, value.length)); 

     // Construct result 
     int resultSize = list.size(); 
     if (limit == 0) { 
      while (resultSize > 0 && list.get(resultSize - 1).length() == 0) { 
       resultSize--; 
      } 
     } 
     String[] result = new String[resultSize]; 
     return list.subList(0, resultSize).toArray(result); 
    } 
    return Pattern.compile(regex).split(this, limit); 
}
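
A few consequences of the listing above matter for CSV parsing in particular, especially how the `limit` argument treats trailing empty fields. The following is a small sketch of that behavior; the class `SplitBehavior` and the helper `show` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

// Demonstrates how the limit argument changes the result of String.split.
final class SplitBehavior {

    // Hypothetical helper: splits and renders the result as a readable string.
    static String show(String input, String regex, int limit) {
        return Arrays.toString(input.split(regex, limit));
    }

    public static void main(String[] args) {
        // limit == 0 (what split(regex) uses): trailing empty strings are removed.
        System.out.println(show("a,b,,", ",", 0));    // [a, b]

        // Negative limit: trailing empty strings are kept.
        System.out.println(show("a,b,,", ",", -1));   // [a, b, , ]

        // Positive limit: at most that many pieces; the last holds the rest.
        System.out.println(show("a,b,c,d", ",", 2));  // [a, b,c,d]

        // An escaped metacharacter is a two-char regex that still takes the fastpath.
        System.out.println(show("a|b", "\\|", 0));    // [a, b]

        // No delimiter found: the original string comes back as the only element.
        System.out.println(show("abc", ",", 0));      // [abc]
    }
}
```

If every row must yield the same number of columns even when the last ones are empty, calling `split(",", -1)` is likely the safer choice.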