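Several Linq.Enumerable functions take an IEqualityComparer<T>. Is there a convenient wrapper class that adapts a delegate(T,T)=>bool to implement IEqualityComparer<T>? It's easy enough to write one (if you ignore the problem of defining a correct hash code), but I'd like to know whether there is an out-of-the-box solution.
Specifically, I want to do set operations on Dictionarys, using only the Keys to define membership (while retaining the values according to different rules).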
GetHashCode
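Others have already commented on the fact that any custom IEqualityComparer<T> implementation should really include a GetHashCode method; but nobody has bothered to explain why in any detail.
Here's why. Your question specifically mentions the LINQ extension methods; nearly all of these rely on hash codes to work properly, because they use hash tables internally for efficiency.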
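Take Distinct, for example. Consider the implications for this extension method if all it used were an Equals method. How do you determine whether an item has already been seen in a sequence if you only have Equals? You enumerate over the entire collection of values you've already looked at and check for a match. This would make Distinct a worst-case O(N^2) algorithm instead of an O(N) one!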
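To make the difference concrete, here is a rough sketch of the two strategies. The names DistinctSketch, DistinctByEqualsOnly and DistinctByHash are made up for illustration; this is not how LINQ is actually implemented, just the general idea under the assumption that a HashSet<T> does the bookkeeping.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctSketch
{
    // Equals-only version: every new item is compared against every item
    // kept so far, so the worst case is O(N^2) comparisons.
    public static List<T> DistinctByEqualsOnly<T>(IEnumerable<T> source, Func<T, T, bool> equals)
    {
        var seen = new List<T>();
        foreach (T item in source)
        {
            bool found = false;
            foreach (T other in seen)
            {
                if (equals(item, other)) { found = true; break; }
            }
            if (!found) seen.Add(item);
        }
        return seen;
    }

    // Hash-based version: HashSet<T>.Add uses GetHashCode to pick a bucket
    // and only calls Equals on candidates in that bucket, so the expected
    // total cost is O(N) when the hash codes are well distributed.
    public static List<T> DistinctByHash<T>(IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        var seen = new HashSet<T>(comparer);
        var result = new List<T>();
        foreach (T item in source)
        {
            if (seen.Add(item)) result.Add(item);
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var words = new[] { "abc", "de", "DE", "ABC" };

        List<string> slow = DistinctSketch.DistinctByEqualsOnly<string>(
            words, (x, y) => string.Equals(x, y, StringComparison.OrdinalIgnoreCase));
        List<string> fast = DistinctSketch.DistinctByHash<string>(
            words, StringComparer.OrdinalIgnoreCase);

        Console.WriteLine(string.Join(",", slow)); // abc,de
        Console.WriteLine(string.Join(",", fast)); // abc,de
    }
}
```

Both versions return the same items; the second one only gets its speed from the comparer's GetHashCode, which is why a delegate that supplies Equals alone is not enough.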
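Fortunately, this isn't the case. Distinct doesn't just use Equals; it uses GetHashCode as well. In fact, it absolutely does not work properly without an IEqualityComparer<T> that supplies a proper GetHashCode. Below is a contrived example illustrating this.
Say I have the following type: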
    class Value
    {
        public string Name { get; private set; }
        public int Number { get; private set; }

        public Value(string name, int number)
        {
            Name = name;
            Number = number;
        }

        public override string ToString()
        {
            return string.Format("{0}: {1}", Name, Number);
        }
    }
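Now say I have a List<Value> and I want to find all of the elements with a distinct name. This is a perfect use case for Distinct with a custom equality comparer. So let's use the Comparer<T> class from Aku's answer: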
    var comparer = new Comparer<Value>((x, y) => x.Name == y.Name);
Now, if we have a bunch of Value elements with the same Name property, they should all collapse into one value returned by Distinct, right? Let's see...
    var values = new List<Value>();

    var random = new Random();
    for (int i = 0; i < 10; ++i)
    {
        // Every element gets the same Name but a different Number.
        values.Add(new Value("x", random.Next()));
    }

    var distinct = values.Distinct(comparer);

    foreach (Value x in distinct)
    {
        Console.WriteLine(x);
    }
Output:
    x: 1346013431
    x: 1388845717
    x: 1576754134
    x: 1104067189
    x: 1144789201
    x: 1862076501
    x: 1573781440
    x: 646797592
    x: 655632802
    x: 1206819377
Hmm, that didn't work, did it?
What about GroupBy? Let's try that:
    var grouped = values.GroupBy(x => x, comparer);

    foreach (IGrouping<Value, Value> g in grouped)
    {
        // Print the group key, then every element in the group.
        Console.WriteLine("[KEY = '{0}']", g.Key);
        foreach (Value x in g)
        {
            Console.WriteLine(x);
        }
    }
Output:
    [KEY = 'x: 1346013431']
    x: 1346013431
    [KEY = 'x: 1388845717']
    x: 1388845717
    [KEY = 'x: 1576754134']
    x: 1576754134
    [KEY = 'x: 1104067189']
    x: 1104067189
    [KEY = 'x: 1144789201']
    x: 1144789201
    [KEY = 'x: 1862076501']
    x: 1862076501
    [KEY = 'x: 1573781440']
    x: 1573781440
    [KEY = 'x: 646797592']
    x: 646797592
    [KEY = 'x: 655632802']
    x: 655632802
    [KEY = 'x: 1206819377']
    x: 1206819377
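Again: didn't work.
If you think about it, it would make sense for Distinct to use a HashSet<T> (or equivalent) internally, and for GroupBy to use something like a Dictionary<TKey, List<T>> internally. Could this explain why these methods don't work? Let's try this: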
    var uniqueValues = new HashSet<Value>(values, comparer);

    foreach (Value x in uniqueValues)
    {
        Console.WriteLine(x);
    }
Output:
    x: 1346013431
    x: 1388845717
    x: 1576754134
    x: 1104067189
    x: 1144789201
    x: 1862076501
    x: 1573781440
    x: 646797592
    x: 655632802
    x: 1206819377
Yeah... starting to make sense?
Hopefully these examples make it clear why including an appropriate GetHashCode in any IEqualityComparer<T> implementation is so important.
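For contrast, here is a minimal sketch of a comparer whose hash code is derived from the same property that Equals looks at. NameComparer and CountDistinctNames are hypothetical names introduced only for this illustration, and Value is a trimmed copy of the type above so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Value
{
    public string Name { get; private set; }
    public int Number { get; private set; }

    public Value(string name, int number)
    {
        Name = name;
        Number = number;
    }
}

// Hypothetical comparer: equality and hash code are both derived from Name,
// so two values that compare equal always produce the same hash code.
class NameComparer : IEqualityComparer<Value>
{
    public bool Equals(Value x, Value y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Name == y.Name;
    }

    public int GetHashCode(Value obj)
    {
        // Null objects and null names hash to 0.
        return obj == null || obj.Name == null ? 0 : obj.Name.GetHashCode();
    }
}

static class Program
{
    public static int CountDistinctNames(IEnumerable<Value> values)
    {
        return values.Distinct(new NameComparer()).Count();
    }

    static void Main()
    {
        var values = new List<Value>();
        for (int i = 0; i < 10; ++i)
        {
            values.Add(new Value("x", i));
        }

        Console.WriteLine(CountDistinctNames(values)); // 1
    }
}
```

With the hash code tied to Name, all ten elements fall into the same bucket and Equals gets a chance to collapse them.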
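Expanding on orip's answer:
There are a couple of improvements that can be made here.
First, I'd take a Func<T, TKey> instead of a Func<T, object>; this prevents boxing of value-type keys in the keyExtractor itself.
Second, I'd add a where TKey : IEquatable<TKey> constraint; this prevents boxing in the Equals call (object.Equals takes an object parameter; you need an IEquatable<TKey> implementation to take a TKey parameter without boxing it). Clearly this may be too severe a restriction, so you could make a base class without the constraint and a derived class with it.
Here's what the resulting code might look like: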
    public class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
    {
        protected readonly Func<T, TKey> keyExtractor;

        public KeyEqualityComparer(Func<T, TKey> keyExtractor)
        {
            this.keyExtractor = keyExtractor;
        }

        public virtual bool Equals(T x, T y)
        {
            return this.keyExtractor(x).Equals(this.keyExtractor(y));
        }

        public int GetHashCode(T obj)
        {
            return this.keyExtractor(obj).GetHashCode();
        }
    }

    public class StrictKeyEqualityComparer<T, TKey> : KeyEqualityComparer<T, TKey>
        where TKey : IEquatable<TKey>
    {
        public StrictKeyEqualityComparer(Func<T, TKey> keyExtractor)
            : base(keyExtractor)
        { }

        public override bool Equals(T x, T y)
        {
            // This will use the overload that accepts a TKey parameter
            // instead of an object parameter.
            return this.keyExtractor(x).Equals(this.keyExtractor(y));
        }
    }
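When you want to customize equality checking, 99% of the time you're interested in defining the keys to compare by, not the comparison itself.
This could be an elegant solution (the concept comes from Python's list sort method).
Usage: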
    var foo = new List<string> { "abc", "de", "DE" };

    // case-insensitive distinct
    var distinct = foo.Distinct(new KeyEqualityComparer<string>(x => x.ToLower()));
The KeyEqualityComparer class:
    public class KeyEqualityComparer<T> : IEqualityComparer<T>
    {
        private readonly Func<T, object> keyExtractor;

        public KeyEqualityComparer(Func<T, object> keyExtractor)
        {
            this.keyExtractor = keyExtractor;
        }

        public bool Equals(T x, T y)
        {
            return this.keyExtractor(x).Equals(this.keyExtractor(y));
        }

        public int GetHashCode(T obj)
        {
            return this.keyExtractor(obj).GetHashCode();
        }
    }
I'm afraid there is no such wrapper out of the box. However, it's not hard to create one:
    class Comparer<T> : IEqualityComparer<T>
    {
        private readonly Func<T, T, bool> _comparer;

        public Comparer(Func<T, T, bool> comparer)
        {
            if (comparer == null)
                throw new ArgumentNullException("comparer");

            _comparer = comparer;
        }

        public bool Equals(T x, T y)
        {
            return _comparer(x, y);
        }

        public int GetHashCode(T obj)
        {
            return obj.ToString().ToLower().GetHashCode();
        }
    }

    ...

    Func<int, int, bool> f = (x, y) => x == y;
    var comparer = new Comparer<int>(f);
    Console.WriteLine(comparer.Equals(1, 1));
    Console.WriteLine(comparer.Equals(1, 2));
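Ordinarily, I'd get this resolved by commenting on @Sam's answer. (I've done some editing on the original post to clean it up a bit without altering the behavior.)
The following is my riff on @Sam's answer, with an [IMNSHO] critical fix to the default hashing policy: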
    class FuncEqualityComparer<T> : IEqualityComparer<T>
    {
        readonly Func<T, T, bool> _comparer;
        readonly Func<T, int> _hash;

        public FuncEqualityComparer(Func<T, T, bool> comparer)
            : this(comparer, t => 0) // NB Cannot assume anything about how e.g., t.GetHashCode() interacts with the comparer's behavior
        {
        }

        public FuncEqualityComparer(Func<T, T, bool> comparer, Func<T, int> hash)
        {
            _comparer = comparer;
            _hash = hash;
        }

        public bool Equals(T x, T y)
        {
            return _comparer(x, y);
        }

        public int GetHashCode(T obj)
        {
            return _hash(obj);
        }
    }
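A small sketch of what the constant default buys you. The wrapper is repeated so the snippet compiles on its own, and DistinctIgnoringCase is a hypothetical helper used only here. With t => 0 every item lands in the same bucket, so the result is still correct but each lookup falls back to comparing against the items already stored; passing a hash that agrees with the comparer keeps the result and restores the speed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Copy of the wrapper above so this example is self-contained.
class FuncEqualityComparer<T> : IEqualityComparer<T>
{
    readonly Func<T, T, bool> _comparer;
    readonly Func<T, int> _hash;

    public FuncEqualityComparer(Func<T, T, bool> comparer)
        : this(comparer, t => 0) // constant hash: always safe, never fast
    {
    }

    public FuncEqualityComparer(Func<T, T, bool> comparer, Func<T, int> hash)
    {
        _comparer = comparer;
        _hash = hash;
    }

    public bool Equals(T x, T y) { return _comparer(x, y); }
    public int GetHashCode(T obj) { return _hash(obj); }
}

static class Program
{
    // Hypothetical helper: case-insensitive distinct, with or without
    // a hash function that matches the comparison delegate.
    public static string[] DistinctIgnoringCase(IEnumerable<string> words, bool withHash)
    {
        Func<string, string, bool> eq =
            (x, y) => string.Equals(x, y, StringComparison.OrdinalIgnoreCase);

        FuncEqualityComparer<string> comparer = withHash
            ? new FuncEqualityComparer<string>(eq, s => StringComparer.OrdinalIgnoreCase.GetHashCode(s))
            : new FuncEqualityComparer<string>(eq);

        return words.Distinct(comparer).ToArray();
    }

    static void Main()
    {
        var words = new[] { "abc", "de", "DE", "ABC" };
        Console.WriteLine(DistinctIgnoringCase(words, false).Length); // 2
        Console.WriteLine(DistinctIgnoringCase(words, true).Length);  // 2
    }
}
```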
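Same as Dan Tao's answer, but with a few improvements:
- Relies on EqualityComparer<>.Default to do the actual comparison, so it avoids boxing for value types (structs) that implement IEquatable<>.
- Since EqualityComparer<>.Default is used, it doesn't blow up on null.Equals(something).
- Provides a static wrapper around IEqualityComparer<> with a static method that creates the comparer instance, which simplifies calling. Compare
      Equality<Person>.CreateComparer(p => p.ID);
  with
      new EqualityComparer<Person, int>(p => p.ID);
- Adds an overload for specifying an IEqualityComparer<> for the key.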
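The null-handling point can be shown in a few lines. Item and KeysEqual are hypothetical names used only for this sketch; the only library call involved is EqualityComparer<TKey>.Default.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type with a key that may be null.
class Item
{
    public string Name { get; set; }
}

static class Program
{
    // Key-based equality routed through EqualityComparer<TKey>.Default,
    // which handles null keys itself instead of calling a method on null.
    public static bool KeysEqual<T, TKey>(T x, T y, Func<T, TKey> keySelector)
    {
        return EqualityComparer<TKey>.Default.Equals(keySelector(x), keySelector(y));
    }

    static void Main()
    {
        var unnamed = new Item { Name = null };
        var bob = new Item { Name = "Bob" };

        Console.WriteLine(KeysEqual(unnamed, bob, i => i.Name));     // False
        Console.WriteLine(KeysEqual(unnamed, unnamed, i => i.Name)); // True

        // By contrast, unnamed.Name.Equals(bob.Name) would throw
        // a NullReferenceException because Name is null.
    }
}
```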
The class:
    public static class Equality<T>
    {
        public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector)
        {
            return CreateComparer(keySelector, null);
        }

        public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector,
                                                             IEqualityComparer<V> comparer)
        {
            return new KeyEqualityComparer<V>(keySelector, comparer);
        }

        class KeyEqualityComparer<V> : IEqualityComparer<T>
        {
            readonly Func<T, V> keySelector;
            readonly IEqualityComparer<V> comparer;

            public KeyEqualityComparer(Func<T, V> keySelector,
                                       IEqualityComparer<V> comparer)
            {
                if (keySelector == null)
                    throw new ArgumentNullException("keySelector");

                this.keySelector = keySelector;
                // Fall back to the default comparer for the key type.
                this.comparer = comparer ?? EqualityComparer<V>.Default;
            }

            public bool Equals(T x, T y)
            {
                return comparer.Equals(keySelector(x), keySelector(y));
            }

            public int GetHashCode(T obj)
            {
                return comparer.GetHashCode(keySelector(obj));
            }
        }
    }
You may use it like this:
    var comparer1 = Equality<Person>.CreateComparer(p => p.ID);
    var comparer2 = Equality<Person>.CreateComparer(p => p.Name);
    var comparer3 = Equality<Person>.CreateComparer(p => p.Birthday.Year);
    var comparer4 = Equality<Person>.CreateComparer(p => p.Name, StringComparer.CurrentCultureIgnoreCase);
Person is a simple class:
    class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }
        public DateTime Birthday { get; set; }
    }
    public class FuncEqualityComparer<T> : IEqualityComparer<T>
    {
        readonly Func<T, T, bool> _comparer;
        readonly Func<T, int> _hash;

        public FuncEqualityComparer(Func<T, T, bool> comparer)
            : this(comparer, t => t.GetHashCode())
        {
        }

        public FuncEqualityComparer(Func<T, T, bool> comparer, Func<T, int> hash)
        {
            _comparer = comparer;
            _hash = hash;
        }

        public bool Equals(T x, T y)
        {
            return _comparer(x, y);
        }

        public int GetHashCode(T obj)
        {
            return _hash(obj);
        }
    }
With extensions:
    public static class SequenceExtensions
    {
        public static bool SequenceEqual<T>(this IEnumerable<T> first, IEnumerable<T> second, Func<T, T, bool> comparer)
        {
            return first.SequenceEqual(second, new FuncEqualityComparer<T>(comparer));
        }

        public static bool SequenceEqual<T>(this IEnumerable<T> first, IEnumerable<T> second, Func<T, T, bool> comparer, Func<T, int> hash)
        {
            return first.SequenceEqual(second, new FuncEqualityComparer<T>(comparer, hash));
        }
    }
orip's answer is great.
Here is a little extension method to make it even easier:
    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> list, Func<T, object> keyExtractor)
    {
        return list.Distinct(new KeyEqualityComparer<T>(keyExtractor));
    }

    var distinct = foo.Distinct(x => x.ToLower());