2011-05-23 44 views
2

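Memory-conserving GroupBy

Running the LINQ to Objects GroupBy() method over a very large number of items (gigabytes) can consume a lot of memory. If the IEnumerable<T> is already ordered by the key, we could write a GroupBy that does not consume as much memory.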

Where can I find a library that has such a method?

Answers

3

There's nothing in the framework to do this. If you don't need an actual IGrouping<,>, you could use this:

static IEnumerable<IList<TElement>> GroupByChanges<TElement, TKey>
    (this IEnumerable<TElement> source,
    Func<TElement, TKey> projection)
{
    // TODO: Argument validation, splitting this into two methods
    // to achieve eager validation.
    // TODO: Allow a custom comparer to be used, possibly even
    // an IComparer<T> instead of an IEqualityComparer<T>
    IEqualityComparer<TKey> comparer = EqualityComparer<TKey>.Default;

    using (IEnumerator<TElement> iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
            yield break;
        }
        TKey currentKey = projection(iterator.Current);
        IList<TElement> currentList = new List<TElement> { iterator.Current };
        while (iterator.MoveNext())
        {
            TKey key = projection(iterator.Current);
            if (!comparer.Equals(currentKey, key))
            {
                // Key changed: emit the finished group and start a new one.
                yield return currentList;
                currentList = new List<TElement>();
                // Track the new key so later items are compared against it.
                currentKey = key;
            }
            currentList.Add(iterator.Current);
        }
        // Emit the final group.
        yield return currentList;
    }
}
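One caveat is worth illustrating: grouping by key changes only matches GroupBy when the input really is ordered by the key. The following is a minimal sketch under that assumption; `ChunkByKey` and `ContiguousGroupingDemo` are hypothetical names introduced here for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContiguousGroupingDemo
{
    // Hypothetical helper (not from the answer): collects adjacent items
    // whose keys are equal into one list per run.
    public static IEnumerable<List<T>> ChunkByKey<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> current = null;
        TKey currentKey = default(TKey);

        foreach (T item in source)
        {
            TKey key = keySelector(item);
            if (current != null && !comparer.Equals(currentKey, key))
            {
                // The run ended: hand it to the caller before starting the next.
                yield return current;
                current = null;
            }
            if (current == null)
            {
                current = new List<T>();
                currentKey = key;
            }
            current.Add(item);
        }

        if (current != null)
        {
            yield return current;
        }
    }

    public static void Main()
    {
        int[] sorted = { 1, 1, 2, 2, 3 };
        int[] unsorted = { 1, 2, 1 };

        // Sorted input: one run per distinct key, the same count as GroupBy.
        Console.WriteLine(ChunkByKey(sorted, x => x).Count());   // 3
        Console.WriteLine(sorted.GroupBy(x => x).Count());       // 3

        // Unsorted input: key 1 shows up in two separate runs.
        Console.WriteLine(ChunkByKey(unsorted, x => x).Count()); // 3
        Console.WriteLine(unsorted.GroupBy(x => x).Count());     // 2
    }
}
```

Only one run is held in memory at a time, which is where the saving comes from, but a single very large group is still buffered in full.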

If you need a full IGrouping<,> implementation, it will be slightly harder, but you can always grab my Edulinq implementation.

The implementation of GroupByChanges would change very little: just change the currentList assignment to pass the key to the Grouping constructor:

Grouping<TKey, TElement> currentGroup = new Grouping<TKey, TElement>(currentKey) 
    { iterator.Current }; 
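For the collection initializer above to compile, the grouping type needs a constructor that takes only the key and a public Add method. Here is a minimal sketch of what such a class might look like; it is an assumption written for illustration and is not the actual Edulinq source.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of a grouping type; not the actual Edulinq class.
public sealed class Grouping<TKey, TElement> : IGrouping<TKey, TElement>
{
    private readonly List<TElement> _elements = new List<TElement>();

    public Grouping(TKey key)
    {
        Key = key;
    }

    public TKey Key { get; private set; }

    // A public Add method (on a type implementing IEnumerable) is what lets
    // the collection initializer `new Grouping<K, E>(key) { item }` compile.
    public void Add(TElement element)
    {
        _elements.Add(element);
    }

    public IEnumerator<TElement> GetEnumerator()
    {
        return _elements.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With a class like this, `new Grouping<string, int>("a") { 1, 2 }` produces a group whose Key is "a" and whose elements are 1 and 2.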
1

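Your question is very specific. You will be hard pressed to find a library that already does this. If your items are ordered by the key you group on, "grouping" the list yourself is a near-trivial task.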
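As a rough illustration of grouping by hand, and assuming only one aggregate per group is needed (a count or a sum, say), the groups do not have to be stored at all. `AggregateRuns` and `RunAggregation` below are hypothetical names used for this sketch only.

```csharp
using System;
using System.Collections.Generic;

public static class RunAggregation
{
    // Hypothetical helper: folds each run of equal adjacent keys into a single
    // result, keeping only the current key and accumulator in memory.
    public static IEnumerable<KeyValuePair<TKey, TAcc>> AggregateRuns<T, TKey, TAcc>(
        IEnumerable<T> source,
        Func<T, TKey> keySelector,
        Func<TAcc> seedFactory,
        Func<TAcc, T, TAcc> fold)
    {
        var comparer = EqualityComparer<TKey>.Default;
        bool hasRun = false;
        TKey currentKey = default(TKey);
        TAcc acc = default(TAcc);

        foreach (T item in source)
        {
            TKey key = keySelector(item);
            if (hasRun && !comparer.Equals(currentKey, key))
            {
                // Key changed: emit the finished run's aggregate.
                yield return new KeyValuePair<TKey, TAcc>(currentKey, acc);
                hasRun = false;
            }
            if (!hasRun)
            {
                currentKey = key;
                acc = seedFactory();
                hasRun = true;
            }
            acc = fold(acc, item);
        }

        if (hasRun)
        {
            yield return new KeyValuePair<TKey, TAcc>(currentKey, acc);
        }
    }
}
```

For example, `RunAggregation.AggregateRuns<string, string, int>(new[] { "a", "a", "b" }, s => s, () => 0, (n, s) => n + 1)` yields the pairs ("a", 2) and ("b", 1). As with the other approaches in this thread, the result only matches a true GroupBy when the input is ordered by the key.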

+0

I don't think it is that specific. – 2011-05-23 12:19:53

1

You can easily implement it yourself:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class Extensions
{
    public static IEnumerable<IGrouping<TKey, TSource>> GroupByAlreadyOrdered<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        return source.GroupByAlreadyOrdered(keySelector, null);
    }

    public static IEnumerable<IGrouping<TKey, TSource>> GroupByAlreadyOrdered<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
    {
        TKey currentKey = default(TKey);
        bool first = true;
        List<TSource> currentGroup = null;
        comparer = comparer ?? EqualityComparer<TKey>.Default;

        foreach (var item in source)
        {
            TKey key = keySelector(item);
            if (first || !comparer.Equals(key, currentKey))
            {
                // Key changed: emit the previous group, if there is one.
                // A non-null group always holds at least one item.
                if (currentGroup != null)
                {
                    yield return new Grouping<TKey, TSource>(currentKey, currentGroup);
                }
                currentGroup = new List<TSource>();
            }

            currentGroup.Add(item);
            first = false;
            currentKey = key;
        }

        // Last group
        if (currentGroup != null)
        {
            yield return new Grouping<TKey, TSource>(currentKey, currentGroup);
        }
    }

    // Minimal IGrouping implementation wrapping an already-built sequence.
    private class Grouping<TKey, TElement> : IGrouping<TKey, TElement>
    {
        private readonly TKey _key;
        private readonly IEnumerable<TElement> _elements;

        public Grouping(TKey key, IEnumerable<TElement> elements)
        {
            _key = key;
            _elements = elements;
        }

        public TKey Key
        {
            get { return _key; }
        }

        public IEnumerator<TElement> GetEnumerator()
        {
            return _elements.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}
+0

Your code works. It is slightly slower than 'GroupBy', while my version is faster than both yours and .NET's. – 2011-05-23 12:56:12

0

Like Thomas's, but slightly faster:

// Assumes the Grouping<TKey, TElement> class from Thomas's answer is accessible;
// it is a private nested class there, so this method would go in the same class.
public static IEnumerable<IGrouping<TKey, TSource>> FastGroupBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector)
{
    // Using the default comparer instead of key.Equals(...) avoids a
    // NullReferenceException when a key is null.
    var comparer = EqualityComparer<TKey>.Default;

    using (var enumerator = source.GetEnumerator())
    {
        if (enumerator.MoveNext())
        {
            List<TSource> list = new List<TSource>();
            TKey key = keySelector(enumerator.Current);
            list.Add(enumerator.Current);
            while (enumerator.MoveNext())
            {
                var currentKey = keySelector(enumerator.Current);
                if (comparer.Equals(key, currentKey))
                {
                    list.Add(enumerator.Current);
                    continue;
                }

                // Key changed: emit the finished group and start the next one.
                yield return new Grouping<TKey, TSource>(key, list);

                key = currentKey;
                list = new List<TSource>();
                list.Add(enumerator.Current);
            }

            // Emit the final group.
            yield return new Grouping<TKey, TSource>(key, list);
        }
    }
}