2015-03-31 47 views

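Duplicates in ArrayList when I split it

I have an ArrayList<Dico> that I am trying to split into multiple ArrayLists, but this produces some duplicates.

This is the Dico class: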

public class Dico implements Comparable {
    private final String m_term;
    private double m_weight;
    private final int m_Id_doc;

    public Dico(int Id_Doc, String Term, double tf_ief) {
        this.m_Id_doc = Id_Doc;
        this.m_term = Term;
        this.m_weight = tf_ief;
    }

    public String getTerm() {
        return this.m_term;
    }

    public double getWeight() {
        return this.m_weight;
    }

    public void setWeight(double weight) {
        this.m_weight = weight;
    }

    public int getDocId() {
        return this.m_Id_doc;
    }

    @Override
    public int compareTo(Object another) throws ClassCastException {
        if (!(another instanceof Dico))
            throw new ClassCastException("A Dico object expected.");
        int anotherDocid = ((Dico) another).getDocId();
        return this.getDocId() - anotherDocid;
    }

    @Override
    public String toString() {
        return "id" + getDocId() + "term" + getTerm() + "weight" + getWeight() + "";
    }
}

And this is the split_dico function I use to do the split:

public static void split_dico(List<Dico> list) {
    int[] changes = new int[list.size() + 1]; // allow for max changes--> contain index of subList
    Arrays.fill(changes, -1); // if an index is not used, will remain -1
    changes[0] = 0;
    int change = 1;
    int id = list.get(0).getDocId();
    for (int i = 1; i < list.size(); i++) {
        Dico dic_entry = list.get(i);
        if (id != dic_entry.getDocId()) {
            changes[change++] = i;
            id = dic_entry.getDocId();
        }
    }
    changes[change] = list.size(); // end of last change segment
    List<List<Dico>> sublists = new ArrayList<>(change);
    for (int i = 0; i < change; i++) {
        sublists.add(list.subList(changes[i], changes[i + 1]));
        System.out.println(sublists);
    }
}

Test:

List<Dico> list_new = Arrays.asList(new Dico(1, "foo", 1),
    new Dico(7, "zoo", 5),
    new Dico(2, "foo", 1),
    new Dico(3, "foo", 1),
    new Dico(1, "bar", 2),
    new Dico(4, "zoo", 0.5),
    new Dico(2, "bar", 2),
    new Dico(3, "baz", 3));
Collections.sort(list_new);
split_dico(list_new);

Output:

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6]] 

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6]] 

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6], [doc id : 3 term : foo weight : 2.2]] 

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6], [doc id : 3 term : foo weight : 2.2], [doc id : 4 term : zoo weight : 0.15]] 

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6], [doc id : 3 term : foo weight : 2.2], [doc id : 4 term : zoo weight : 0.15], [doc id : 7 term : zoo weight : 1.5]] 

I don't understand what is wrong with this function.

Don't use the raw type `Comparable`. Use `Comparable<Dico>` instead. – Tom 2015-04-01 22:12:08
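
To illustrate the comment above, here is a minimal sketch of a typed comparison. The class names `ComparableSketch` and `TypedDico` and the helper `sortedIds` are hypothetical stand-ins, not part of the question's code; only the doc id and term fields are kept. With a type parameter the compiler rejects non-matching arguments, so the `instanceof` check and the cast are no longer needed. The sketch also swaps the subtraction for `Integer.compare`, which cannot overflow for extreme ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ComparableSketch {
    // Hypothetical, trimmed-down stand-in for the question's Dico class
    static final class TypedDico implements Comparable<TypedDico> {
        private final int docId;
        private final String term;

        TypedDico(int docId, String term) {
            this.docId = docId;
            this.term = term;
        }

        int getDocId() {
            return docId;
        }

        String getTerm() {
            return term;
        }

        @Override
        public int compareTo(TypedDico other) {
            // Integer.compare avoids the overflow that "a - b" can produce
            return Integer.compare(docId, other.docId);
        }
    }

    // Sorts a copy of the entries and returns their doc ids in ascending order
    static List<Integer> sortedIds(List<TypedDico> entries) {
        List<TypedDico> copy = new ArrayList<>(entries);
        Collections.sort(copy);
        List<Integer> ids = new ArrayList<>();
        for (TypedDico d : copy) {
            ids.add(d.getDocId());
        }
        return ids;
    }

    public static void main(String[] args) {
        List<TypedDico> entries = Arrays.asList(
            new TypedDico(7, "zoo"),
            new TypedDico(1, "foo"),
            new TypedDico(2, "bar"));
        System.out.println(sortedIds(entries)); // prints [1, 2, 7]
    }
}
```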

Answers

1

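In your printing loop, you print the whole list of sublists each time you add a new sublist.

Instead, as per your requirement, you should print only once you are done populating the sublists.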
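
The difference can be sketched with plain integers standing in for the `Dico` entries. The class `PrintTimingSketch` and both method names are hypothetical, and the methods collect the lines they would print so the two behaviours can be compared side by side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrintTimingSketch {
    // Mimics printing the whole outer list after every add:
    // each line repeats all sublists added so far
    static List<String> printInsideLoop(List<List<Integer>> groups) {
        List<List<Integer>> sublists = new ArrayList<>();
        List<String> printed = new ArrayList<>();
        for (List<Integer> group : groups) {
            sublists.add(group);
            printed.add(sublists.toString()); // snapshot of the growing outer list
        }
        return printed;
    }

    // Fills the outer list first, then emits one line per sublist
    static List<String> printAfterLoop(List<List<Integer>> groups) {
        List<List<Integer>> sublists = new ArrayList<>(groups);
        List<String> printed = new ArrayList<>();
        for (List<Integer> sublist : sublists) {
            printed.add(sublist.toString());
        }
        return printed;
    }

    public static void main(String[] args) {
        List<List<Integer>> groups = Arrays.asList(
            Arrays.asList(1, 1), Arrays.asList(2, 2), Arrays.asList(3));
        printInsideLoop(groups).forEach(System.out::println);
        // [[1, 1]]
        // [[1, 1], [2, 2]]
        // [[1, 1], [2, 2], [3]]
        printAfterLoop(groups).forEach(System.out::println);
        // [1, 1]
        // [2, 2]
        // [3]
    }
}
```

The sublists themselves contain no duplicates in either case; the repetition comes only from printing the growing outer list.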

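If I do that, I will print all the sublists, but I want to use each one separately; each list must contain the documents that have the same id. I have a solution that splits the ArrayList, but its complexity is too high for 1 million terms. I am looking for the fastest solution. – tommy 2015-04-01 08:47:11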
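
For the goal described in this comment, one possible alternative is to group the entries by doc id in a single pass over the unsorted list. This is only a sketch, not the approach used in the thread: the class `GroupByDocIdSketch`, the simplified `Entry` type and the method `groupByDocId` are hypothetical. A `TreeMap` keeps the ids in ascending order; a `HashMap` could be used instead if the order of the groups does not matter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupByDocIdSketch {
    // Hypothetical simplified entry: only the doc id and the term
    static final class Entry {
        final int docId;
        final String term;

        Entry(int docId, String term) {
            this.docId = docId;
            this.term = term;
        }

        @Override
        public String toString() {
            return docId + ":" + term;
        }
    }

    // One pass over the input; each entry is appended to the list for its doc id
    static Map<Integer, List<Entry>> groupByDocId(List<Entry> entries) {
        Map<Integer, List<Entry>> groups = new TreeMap<>();
        for (Entry e : entries) {
            groups.computeIfAbsent(e.docId, k -> new ArrayList<>()).add(e);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry(1, "foo"), new Entry(7, "zoo"),
            new Entry(2, "foo"), new Entry(1, "bar"));
        for (Map.Entry<Integer, List<Entry>> group : groupByDocId(entries).entrySet()) {
            System.out.println(group.getKey() + " -> " + group.getValue());
        }
        // 1 -> [1:foo, 1:bar]
        // 2 -> [2:foo]
        // 7 -> [7:zoo]
    }
}
```

Each group is an independent `ArrayList`, so it can be processed separately. Whether this is faster than sorting and scanning would have to be measured on the real data.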

The solution is the following: for (List<Dico> sublist : sublists) { System.out.println(sublist); } thanks @micklesh – tommy 2015-04-01 08:51:45

0

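Sorry for this silly question, it is so ridiculous; I was thinking more about a faster solution. Here the printing is done once the sublists are complete: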

public static void split_dico(List<Dico> list) {
    if (list.isEmpty()) {
        return; // nothing to split
    }
    // changes[k] holds the start index of the k-th run of equal doc ids
    int[] changes = new int[list.size() + 1];
    Arrays.fill(changes, -1); // if an index is not used, will remain -1
    changes[0] = 0;
    int change = 1;
    int id = list.get(0).getDocId();
    for (int i = 1; i < list.size(); i++) {
        Dico dic_entry = list.get(i);
        if (id != dic_entry.getDocId()) {
            changes[change++] = i;
            id = dic_entry.getDocId();
        }
    }
    changes[change] = list.size(); // end of last change segment
    List<List<Dico>> sublists = new ArrayList<>(change);
    for (int i = 0; i < change; i++) {
        sublists.add(list.subList(changes[i], changes[i + 1]));
    }
    // print each sublist once, after all sublists have been built
    for (List<Dico> sublist : sublists) {
        System.out.println(sublist);
    }
}

OUTPUT:

[[doc id : 1 term : foo weight : 2.2, doc id : 1 term : bar weight : 6.6], [doc id : 2 term : foo weight : 2.2, doc id : 2 term : bar weight : 6.6], [doc id : 3 term : foo weight : 2.2], [doc id : 4 term : zoo weight : 0.15], [doc id : 7 term : zoo weight : 1.5]]
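
One detail worth keeping in mind when using each sublist separately: `List.subList` returns a view backed by the original list, not an independent `ArrayList`. The sketch below, with the hypothetical names `SubListViewSketch` and `copySublists` and plain integers instead of `Dico` objects, shows one way to copy each range into its own list, on the assumption that independent lists are what is wanted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubListViewSketch {
    // bounds holds consecutive start indices followed by the end index,
    // e.g. {0, 2, 3, 5} describes the ranges [0,2), [2,3) and [3,5)
    static List<List<Integer>> copySublists(List<Integer> source, int[] bounds) {
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i + 1 < bounds.length; i++) {
            // new ArrayList<>(view) copies the elements out of the view
            result.add(new ArrayList<>(source.subList(bounds[i], bounds[i + 1])));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>(Arrays.asList(1, 1, 2, 3, 3));
        List<Integer> view = source.subList(0, 2);
        List<List<Integer>> copies = copySublists(source, new int[] {0, 2, 3, 5});

        source.set(0, 99); // non-structural change to the backing list

        System.out.println(view);          // prints [99, 1]
        System.out.println(copies.get(0)); // prints [1, 1]
    }
}
```

Copying costs extra memory, so for a large read-only list the views may well be the better choice.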