当前位置:  开发笔记 > 编程语言 > 正文

哪种方法表现更好:.Any()vs .Count()> 0?

如何解决《哪种方法表现更好:.Any()vs.Count()>0?》经验,为你挑选了5个好方法。

System.Linq命名空间,我们现在可以扩展我们IEnumerable的有Any()Count() 扩展方法.

最近我被告知如果我想检查一个集合中是否包含一个或多个项目,我应该使用.Any()扩展方法而不是.Count() > 0扩展方法,因为.Count()扩展方法必须迭代所有项目.

其次,一些集合具有属性(未扩展方法),其是CountLength.使用它们会更好吗,而不是.Any().Count()

是啊/是?



1> Marc Gravell..:

如果你开始的东西,有一个.Length.Count(如ICollection,IList,List,等) -那么这将是最快的选择,因为它不需要去通过GetEnumerator()/ MoveNext()/ Dispose()所要求的顺序Any(),检查是否有非空IEnumerable序列.

对于刚刚IEnumerable,然后Any()通常更快,因为它只有看一次迭代.但是,请注意LINQ-to-Objects实现Count()确实检查ICollection(.Count用作优化) - 因此,如果您的基础数据源直接是列表/集合,则不会有太大的区别.不要问我为什么不使用非通用的ICollection......

当然,如果您使用LINQ来过滤它等(Where等),您将有一个基于迭代器块的序列,因此这种ICollection优化是无用的.

一般用IEnumerable:坚持Any();-p


我认为在大多数情况下使用Any()还有另一个原因.它标志着开发人员的精确意图.如果你不知道项目的数量,但只有有一些,那么somecollection.Any()比somecollection更简单和清晰.Count> 0
@JLRishe - 我仍然认为`someCollection.Count> 0`和`someCollection.Any()`一样清晰,并且具有更高性能和不需要LINQ的额外好处.当然,这是一个非常简单的案例,使用LINQ运算符的其他构造将传达开发人员的意图比等效的非LINQ选项更清晰.
@huttelihut - 你知道有多少开发人员真的被语句`(somecollection.Count> 0)`搞糊涂了?在引入LINQ的.Any()方法之前我们的所有代码都难以理解吗?
Marc:ICollection 实际上并非来自ICollection.我也很惊讶,但Reflector不会撒谎.
是否()实现检查ICollection接口并检查Count属性?
不,它没有(检查反射器)
一个更好的选择可能是创建自己的IsEmpty()扩展方法,尝试强制转换为集合或数组(使用Count或Length),然后返回Any().在所有情况下都进行了优化,并使意图明确.
@CraigTP,不要将属性`Count`与`Count()`方法混淆.两个不同的事情和问题是关于方法,而不是属性......

2> nikib3ro..:

注意:当实体框架4是实际的时候,我写了这个答案.这个答案的要点是不要进入琐碎.Any()VS .Count()性能测试.关键是要表明EF远非完美.较新的版本更好......但是如果您的代码部分速度很慢并且使用EF,请使用直接TSQL进行测试并比较性能而不是依赖于假设(.Any()总是比这更快.Count() > 0).


虽然我同意大多数最高投票的答案和评论 - 尤其是关于开发者意图更好的点Any信号- 我的情况是,SQL在SQL Server(EntityFramework 4)上的数量级更快.Count() > 0

这是查询Any超时异常(在~200.000记录上):

con = db.Contacts.
    Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
        && !a.NewsletterLogs.Any(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr)
    ).OrderBy(a => a.ContactId).
    Skip(position - 1).
    Take(1).FirstOrDefault();

Count 版本以毫秒为单位执行:

con = db.Contacts.
    Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
        && a.NewsletterLogs.Count(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr) == 0
    ).OrderBy(a => a.ContactId).
    Skip(position - 1).
    Take(1).FirstOrDefault();

我需要找到一种方法,同时看到LINQs产生什么确切的SQL -但它显然之间存在着巨大的性能差异Count,并Any在某些情况下,不幸的是,似乎你不能只坚持Any在所有情况下.

编辑:这是生成的SQL.你可以看到美女;)

ANY:

exec sp_executesql N'SELECT TOP (1) 
[Project2].[ContactId] AS [ContactId], 
[Project2].[CompanyId] AS [CompanyId], 
[Project2].[ContactName] AS [ContactName], 
[Project2].[FullName] AS [FullName], 
[Project2].[ContactStatusId] AS [ContactStatusId], 
[Project2].[Created] AS [Created]
FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
    FROM ( SELECT 
        [Extent1].[ContactId] AS [ContactId], 
        [Extent1].[CompanyId] AS [CompanyId], 
        [Extent1].[ContactName] AS [ContactName], 
        [Extent1].[FullName] AS [FullName], 
        [Extent1].[ContactStatusId] AS [ContactStatusId], 
        [Extent1].[Created] AS [Created]
        FROM [dbo].[Contact] AS [Extent1]
        WHERE ([Extent1].[CompanyId] = @p__linq__0) AND ([Extent1].[ContactStatusId] <= 3) AND ( NOT EXISTS (SELECT 
            1 AS [C1]
            FROM [dbo].[NewsletterLog] AS [Extent2]
            WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])
        ))
    )  AS [Project2]
)  AS [Project2]
WHERE [Project2].[row_number] > 99
ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

COUNT:

exec sp_executesql N'SELECT TOP (1) 
[Project2].[ContactId] AS [ContactId], 
[Project2].[CompanyId] AS [CompanyId], 
[Project2].[ContactName] AS [ContactName], 
[Project2].[FullName] AS [FullName], 
[Project2].[ContactStatusId] AS [ContactStatusId], 
[Project2].[Created] AS [Created]
FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
    FROM ( SELECT 
        [Project1].[ContactId] AS [ContactId], 
        [Project1].[CompanyId] AS [CompanyId], 
        [Project1].[ContactName] AS [ContactName], 
        [Project1].[FullName] AS [FullName], 
        [Project1].[ContactStatusId] AS [ContactStatusId], 
        [Project1].[Created] AS [Created]
        FROM ( SELECT 
            [Extent1].[ContactId] AS [ContactId], 
            [Extent1].[CompanyId] AS [CompanyId], 
            [Extent1].[ContactName] AS [ContactName], 
            [Extent1].[FullName] AS [FullName], 
            [Extent1].[ContactStatusId] AS [ContactStatusId], 
            [Extent1].[Created] AS [Created], 
            (SELECT 
                COUNT(1) AS [A1]
                FROM [dbo].[NewsletterLog] AS [Extent2]
                WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])) AS [C1]
            FROM [dbo].[Contact] AS [Extent1]
        )  AS [Project1]
        WHERE ([Project1].[CompanyId] = @p__linq__0) AND ([Project1].[ContactStatusId] <= 3) AND (0 = [Project1].[C1])
    )  AS [Project2]
)  AS [Project2]
WHERE [Project2].[row_number] > 99
ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

似乎纯粹的EXISTS工作比计算Count更糟糕,然后用Count == 0做Where.

如果你们在我的调查结果中看到一些错误,请告诉我.无论Any vs Count讨论如何,可以从所有这些中解脱出来的是,当重写为存储过程时,任何更复杂的LINQ都会更好;).


基于SQL,我只能说:两个查询看起来都很糟糕.我知道有一个原因我通常会编写自己的TSQL ...
很想看到每个场景的每个linq查询生成的一些Sql Query计划.

3> kamil-mrzygl..:

由于这是相当受欢迎的主题和答案不同,我不得不重新审视问题.

测试环境: EF 6.1.3,SQL Server,300k记录

表型号:

class TestTable
{
    [Key]
    public int Id { get; set; }

    public string Name { get; set; }

    public string Surname { get; set; }
}

测试代码:

class Program
{
    static void Main()
    {
        using (var context = new TestContext())
        {
            context.Database.Log = Console.WriteLine;

            context.TestTables.Where(x => x.Surname.Contains("Surname")).Any(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Any(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname")).Count(x => x.Id > 1000);
            context.TestTables.Where(x => x.Surname.Contains("Surname") && x.Name.Contains("Name")).Count(x => x.Id > 1000);

            Console.ReadLine();
        }
    }
}

结果:

任何()~3ms

第一次查询的Count()~230ms,第二次查询的约400ms

备注:

对于我的情况,EF没有生成像他在帖子中提到的@Ben那样的SQL.


为了进行适当的比较,你应该做`Count()> 0`.:d

4> Ben..:

编辑:它已在EF版本6.1.1中修复.这个答案不再是实际的

对于SQL Server和EF4-6,Count()的执行速度比Any()快两倍.

当你运行Table.Any()时,它会产生类似的东西(警告:不要伤害大脑试图理解它)

SELECT 
CASE WHEN ( EXISTS (SELECT 
    1 AS [C1]
    FROM [Table] AS [Extent1]
)) THEN cast(1 as bit) WHEN ( NOT EXISTS (SELECT 
    1 AS [C1]
    FROM [Table] AS [Extent2]
)) THEN cast(0 as bit) END AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]

这需要根据您的条件进行2次扫描.

我不喜欢写,Count() > 0因为它隐藏了我的意图.我更喜欢使用自定义谓词:

public static class QueryExtensions
{
    public static bool Exists(this IQueryable source, Expression> predicate)
    {
        return source.Count(predicate) > 0;
    }
}



5> Timothy Gonz..:

这取决于数据集有多大以及您的性能要求是什么?

如果它没有什么巨大的使用最可读的形式,这对我来说是任何形式,因为它更短,更易读,而不是方程式.

推荐阅读
oDavid_仔o_880
这个屌丝很懒,什么也没留下!
DevBox开发工具箱 | 专业的在线开发工具网站    京公网安备 11010802040832号  |  京ICP备19059560号-6
Copyright © 1998 - 2020 DevBox.CN. All Rights Reserved devBox.cn 开发工具箱 版权所有