在System.Linq
命名空间,我们现在可以扩展我们IEnumerable
的有Any()
和Count()
扩展方法.
最近我被告知如果我想检查一个集合中是否包含一个或多个项目,我应该使用.Any()
扩展方法而不是.Count() > 0
扩展方法,因为.Count()
扩展方法必须迭代所有项目.
其次,一些集合具有属性(未扩展方法),其是Count
或Length
.使用它们会更好吗,而不是.Any()
或.Count()
?
是啊/是?
如果你开始的东西,有一个.Length
或.Count
(如ICollection
,IList
,List
,等) -那么这将是最快的选择,因为它不需要去通过GetEnumerator()
/ MoveNext()
/ Dispose()
所要求的顺序Any()
,检查是否有非空IEnumerable
序列.
对于刚刚IEnumerable
,然后Any()
将通常更快,因为它只有看一次迭代.但是,请注意LINQ-to-Objects实现Count()
确实检查ICollection
(.Count
用作优化) - 因此,如果您的基础数据源直接是列表/集合,则不会有太大的区别.不要问我为什么不使用非通用的ICollection
......
当然,如果您使用LINQ来过滤它等(Where
等),您将有一个基于迭代器块的序列,因此这种ICollection
优化是无用的.
一般用IEnumerable
:坚持Any()
;-p
注意:当实体框架4是实际的时候,我写了这个答案.这个答案的要点是不要进入琐碎.Any()
VS .Count()
性能测试.关键是要表明EF远非完美.较新的版本更好......但是如果您的代码部分速度很慢并且使用EF,请使用直接TSQL进行测试并比较性能而不是依赖于假设(.Any()
总是比这更快.Count() > 0
).
虽然我同意大多数最高投票的答案和评论 - 尤其是关于开发者意图更好的点Any
信号- 我的情况是,SQL在SQL Server(EntityFramework 4)上的数量级更快.Count() > 0
这是查询Any
超时异常(在~200.000记录上):
con = db.Contacts. Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated && !a.NewsletterLogs.Any(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr) ).OrderBy(a => a.ContactId). Skip(position - 1). Take(1).FirstOrDefault();
Count
版本以毫秒为单位执行:
con = db.Contacts. Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated && a.NewsletterLogs.Count(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr) == 0 ).OrderBy(a => a.ContactId). Skip(position - 1). Take(1).FirstOrDefault();
我需要找到一种方法,同时看到LINQs产生什么确切的SQL -但它显然之间存在着巨大的性能差异Count
,并Any
在某些情况下,不幸的是,似乎你不能只坚持Any
在所有情况下.
编辑:这是生成的SQL.你可以看到美女;)
ANY
:
exec sp_executesql N'SELECT TOP (1) [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created] FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number] FROM ( SELECT [Extent1].[ContactId] AS [ContactId], [Extent1].[CompanyId] AS [CompanyId], [Extent1].[ContactName] AS [ContactName], [Extent1].[FullName] AS [FullName], [Extent1].[ContactStatusId] AS [ContactStatusId], [Extent1].[Created] AS [Created] FROM [dbo].[Contact] AS [Extent1] WHERE ([Extent1].[CompanyId] = @p__linq__0) AND ([Extent1].[ContactStatusId] <= 3) AND ( NOT EXISTS (SELECT 1 AS [C1] FROM [dbo].[NewsletterLog] AS [Extent2] WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId]) )) ) AS [Project2] ) AS [Project2] WHERE [Project2].[row_number] > 99 ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4
COUNT
:
exec sp_executesql N'SELECT TOP (1) [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created] FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number] FROM ( SELECT [Project1].[ContactId] AS [ContactId], [Project1].[CompanyId] AS [CompanyId], [Project1].[ContactName] AS [ContactName], [Project1].[FullName] AS [FullName], [Project1].[ContactStatusId] AS [ContactStatusId], [Project1].[Created] AS [Created] FROM ( SELECT [Extent1].[ContactId] AS [ContactId], [Extent1].[CompanyId] AS [CompanyId], [Extent1].[ContactName] AS [ContactName], [Extent1].[FullName] AS [FullName], [Extent1].[ContactStatusId] AS [ContactStatusId], [Extent1].[Created] AS [Created], (SELECT COUNT(1) AS [A1] FROM [dbo].[NewsletterLog] AS [Extent2] WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])) AS [C1] FROM [dbo].[Contact] AS [Extent1] ) AS [Project1] WHERE ([Project1].[CompanyId] = @p__linq__0) AND ([Project1].[ContactStatusId] <= 3) AND (0 = [Project1].[C1]) ) AS [Project2] ) AS [Project2] WHERE [Project2].[row_number] > 99 ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4
似乎纯粹的EXISTS工作比计算Count更糟糕,然后用Count == 0做Where.
如果你们在我的调查结果中看到一些错误,请告诉我.无论Any vs Count讨论如何,可以从所有这些中解脱出来的是,当重写为存储过程时,任何更复杂的LINQ都会更好;).
由于这是相当受欢迎的主题和答案不同,我不得不重新审视问题.
测试环境: EF 6.1.3,SQL Server,300k记录
表型号:
class TestTable { [Key] public int Id { get; set; } public string Name { get; set; } public string Surname { get; set; } }
测试代码:
class Program { static void Main() { using (var context = new TestContext()) { context.Database.Log = Console.WriteLine; context.TestTables.Where(x => x.Surname.Contains("Surname")).Any(x => x.Id > 1000); context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Any(x => x.Id > 1000); context.TestTables.Where(x => x.Surname.Contains("Surname")).Count(x => x.Id > 1000); context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Count(x => x.Id > 1000); Console.ReadLine(); } } }
结果:
任何()~3ms
第一次查询的Count()~230ms,第二次查询的约400ms
备注:
对于我的情况,EF没有生成像他在帖子中提到的@Ben那样的SQL.
编辑:它已在EF版本6.1.1中修复.这个答案不再是实际的
对于SQL Server和EF4-6,Count()的执行速度比Any()快两倍.
当你运行Table.Any()时,它会产生类似的东西(警告:不要伤害大脑试图理解它)
SELECT CASE WHEN ( EXISTS (SELECT 1 AS [C1] FROM [Table] AS [Extent1] )) THEN cast(1 as bit) WHEN ( NOT EXISTS (SELECT 1 AS [C1] FROM [Table] AS [Extent2] )) THEN cast(0 as bit) END AS [C1] FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
这需要根据您的条件进行2次扫描.
我不喜欢写,Count() > 0
因为它隐藏了我的意图.我更喜欢使用自定义谓词:
public static class QueryExtensions { public static bool Exists(this IQueryable source, Expression > predicate) { return source.Count(predicate) > 0; } }
这取决于数据集有多大以及您的性能要求是什么?
如果它没有什么巨大的使用最可读的形式,这对我来说是任何形式,因为它更短,更易读,而不是方程式.