String Benchmarks in C# - Refactoring for Speed/Maintainability

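I've been tinkering with small functions in my own time, trying to find ways to refactor them (I recently read Martin Fowler's book "Refactoring: Improving the Design of Existing Code"). I found the following function, MakeNiceString(), while updating another part of the codebase near it, and it looked like a good candidate to experiment with. As it stands, there is no real reason to replace it, but it is small enough and does something small enough that it is easy to follow, yet it still makes for good practice.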

private static string MakeNiceString(string str)
        {
            char[] ca = str.ToCharArray();
            string result = null;
            int i = 0;
            result += System.Convert.ToString(ca[0]);
            for (i = 1; i <= ca.Length - 1; i++)
            {
                if (!(char.IsLower(ca[i])))
                {
                    result += " ";
                }
                result += System.Convert.ToString(ca[i]);
            }
            return result;
        }


static string SplitCamelCase(string str)
        {
            string[] temp = Regex.Split(str, @"(?
            [the rest of this listing is truncated in the source]

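The first function, MakeNiceString(), is one I found in some code I was updating at work. Its purpose is to turn ThisIsAString into This Is A String. It is used in half a dozen places in the code and is pretty insignificant in the overall scheme of things.

I built the second function purely as an academic exercise, to see whether using a regular expression would take longer.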
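
The regex listing above breaks off after the opening "(?" in this copy, so the original pattern cannot be recovered from the page. As a rough sketch of the approach, a regex-based splitter could look like the following. The pattern is an assumption chosen for illustration (a zero-width match before every uppercase ASCII letter except at the start of the string), and the class name CamelCaseRegexSketch is made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CamelCaseRegexSketch
{
    // Assumed pattern, not taken from the source: a zero-width match before each
    // uppercase ASCII letter, except at the very start of the string.
    private const string SplitPattern = @"(?<!^)(?=[A-Z])";

    public static string SplitCamelCase(string str)
    {
        // Regex.Split cuts the input at every zero-width match position.
        string[] parts = Regex.Split(str, SplitPattern);
        return string.Join(" ", parts);
    }
}
```

With this pattern, CamelCaseRegexSketch.SplitCamelCase("ThisIsAString") gives "This Is A String". Note that it only splits before A-Z, whereas MakeNiceString() inserts a space before any character that is not lowercase (digits included), so the two are not strictly equivalent.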

Well, here are the results:

With 10 iterations:

MakeNiceString took 2649 ticks
SplitCamelCase took 2502 ticks

However, it changes drastically over the long haul:

With 10,000 iterations:

MakeNiceString took 121625 ticks
SplitCamelCase took 443001 ticks

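Refactoring MakeNiceString()

The process of refactoring MakeNiceString() started with simply removing the conversions that were taking place. Doing that yielded the following results:

MakeNiceString took 124716 ticks
ImprovedMakeNiceString took 118486 ticks

Here is the code after Refactor #1: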

private static string ImprovedMakeNiceString(string str)
        { //Removed Convert.ToString()
            char[] ca = str.ToCharArray();
            string result = null;
            int i = 0;
            result += ca[0];
            for (i = 1; i <= ca.Length - 1; i++)
            {
                if (!(char.IsLower(ca[i])))
                {
                    result += " ";
                }
                result += ca[i];
            }
            return result;
        }

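Refactor #2 - Using StringBuilder

My second task was to use StringBuilder instead of String. Since String is immutable, every += in the loop was creating an unnecessary copy of the whole result so far. The code is below, followed by the benchmark: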

static string RefactoredMakeNiceString(string str)
        {
            char[] ca = str.ToCharArray();
            StringBuilder sb = new StringBuilder((str.Length * 5 / 4));
            int i = 0;
            sb.Append(ca[0]);
            for (i = 1; i <= ca.Length - 1; i++)
            {
                if (!(char.IsLower(ca[i])))
                {
                    sb.Append(" ");
                }
                sb.Append(ca[i]);
            }
            return sb.ToString();
        }

This results in the following benchmark:

MakeNiceString Took:           124497 Ticks   //Original
SplitCamelCase Took:           464459 Ticks   //Regex
ImprovedMakeNiceString Took:   117369 Ticks   //Remove Conversion
RefactoredMakeNiceString Took:  38542 Ticks   //Using StringBuilder

Changing the for loop to a foreach loop yields the following code and benchmark result:

static string RefactoredForEachMakeNiceString(string str)
        {
            char[] ca = str.ToCharArray();
            StringBuilder sb1 = new StringBuilder((str.Length * 5 / 4));
            foreach (char c in ca)
            {
                // Only add a separator once something has been written; appending
                // ca[0] up front and then iterating over all of ca would emit the
                // first character twice.
                if (sb1.Length > 0 && !(char.IsLower(c)))
                {
                    sb1.Append(" ");
                }
                sb1.Append(c);
            }
            return sb1.ToString();
        }

RefactoredForEachMakeNiceString    Took:  45163 Ticks

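As you can see, from a maintenance point of view the foreach loop is the easiest to maintain and has the "cleanest" look. It is slightly slower than the for loop, but much easier to follow.

Alternate Refactor: Using a Compiled Regex

I moved the Regex to just before the loop begins, hoping that since it would only be constructed once, it would execute faster. What I found (and I'm sure I have a bug somewhere) is that this does not help as much as it ought to: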

static void runTest5()
        {
            Regex rg = new Regex(@"(?
            [the rest of this listing is truncated in the source]

Final benchmark results:

MakeNiceString Took                   139363 Ticks
SplitCamelCase Took                   489174 Ticks
ImprovedMakeNiceString Took           115478 Ticks
RefactoredMakeNiceString Took          38819 Ticks
RefactoredForEachMakeNiceString Took   44700 Ticks
CompiledRegex Took                    227021 Ticks

Or, if you prefer milliseconds:

MakeNiceString Took                  38 ms
SplitCamelCase Took                 123 ms
ImprovedMakeNiceString Took          33 ms
RefactoredMakeNiceString Took        11 ms
RefactoredForEachMakeNiceString Took 12 ms
CompiledRegex Took                   63 ms

So the percentage gains, relative to the baseline, are:

MakeNiceString                   38 ms   Baseline
SplitCamelCase                  123 ms   223.68% slower
ImprovedMakeNiceString           33 ms   13.16% faster
RefactoredMakeNiceString         11 ms   71.05% faster
RefactoredForEachMakeNiceString  12 ms   68.42% faster
CompiledRegex                    63 ms   65.79% slower

(Please check my math)
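
The percentages follow from (measured - baseline) / baseline. A small sketch for rechecking them is shown here; the class and method names are hypothetical, and the sign convention (negative means faster than the baseline) is a choice made for this example.

```csharp
using System;

public static class PercentChangeSketch
{
    // Relative change versus the baseline in percent, rounded to two decimals.
    // Negative means faster than the baseline, positive means slower.
    public static double RelativeToBaseline(double baselineMs, double measuredMs)
    {
        return Math.Round((measuredMs - baselineMs) / baselineMs * 100.0, 2);
    }
}
```

For example, RelativeToBaseline(38, 11) evaluates to -71.05 (71.05% faster) and RelativeToBaseline(38, 123) to 223.68 (223.68% slower).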

In the end, I'm going to replace what is there with RefactoredForEachMakeNiceString(), and while I'm at it, I'm going to rename it to something useful, like SplitStringOnUpperCase.

Benchmark Test:

To benchmark, I simply create a new Stopwatch for each test method call:

       string myString = "ThisIsAUpperCaseString";
       Stopwatch sw = new Stopwatch();
       sw.Start();
       runTest();
       sw.Stop();

     static void runTest()
        {

            for (int i = 0; i < 10000; i++)
            {
                MakeNiceString(myString);
            }


        }
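
A 10-iteration run like the first measurement is likely dominated by one-off costs (JIT compilation of each method, the first construction of a regex) rather than by steady-state work, which may be part of why the short and long runs rank the functions differently. A minimal sketch of a harness that performs an untimed warm-up before measuring is shown here; the names BenchmarkSketch, Measure and warmupIterations are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class BenchmarkSketch
{
    // Times `iterations` calls of `function` after an untimed warm-up phase,
    // so that one-off costs such as JIT compilation are not counted.
    public static TimeSpan Measure(Func<string, string> function, string input,
                                   int iterations, int warmupIterations = 100)
    {
        for (int i = 0; i < warmupIterations; i++)
        {
            function(input);
        }

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            function(input);
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

A call such as BenchmarkSketch.Measure(MakeNiceString, "ThisIsAUpperCaseString", 10000) would then return the elapsed time for the timed loop only.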

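Questions

What causes these functions to behave so differently "over the long haul", and

How can I improve this function to be a) more maintainable or b) faster?

How would I do memory benchmarks on these to see which one uses less memory?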
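
For the memory question, one rough option is to compare how many bytes each variant allocates rather than how much memory the process holds. The sketch below assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread() (it is present on recent .NET versions; older frameworks may not have it). AllocationSketch and its methods are hypothetical names, and the two variants are simplified stand-ins for the string-concatenation and StringBuilder versions above.

```csharp
using System;
using System.Text;

public static class AllocationSketch
{
    // Returns the number of bytes allocated on the current thread while
    // calling `function` `iterations` times.
    public static long MeasureAllocatedBytes(Func<string, string> function,
                                             string input, int iterations)
    {
        function(input); // warm-up so first-call allocations are not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            function(input);
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // String concatenation variant: each += allocates a new string.
    public static string WithConcatenation(string str)
    {
        string result = string.Empty;
        for (int i = 0; i < str.Length; i++)
        {
            if (i > 0 && !char.IsLower(str[i])) result += " ";
            result += str[i];
        }
        return result;
    }

    // StringBuilder variant: appends into a buffer whose initial capacity
    // is 5/4 of the input length.
    public static string WithStringBuilder(string str)
    {
        StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
        for (int i = 0; i < str.Length; i++)
        {
            if (i > 0 && !char.IsLower(str[i])) sb.Append(' ');
            sb.Append(str[i]);
        }
        return sb.ToString();
    }
}
```

Comparing MeasureAllocatedBytes(AllocationSketch.WithConcatenation, input, n) with the same call for WithStringBuilder gives a relative picture of allocation pressure; the absolute numbers depend on the runtime.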


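Thanks for the responses so far. I've incorporated all of the suggestions made by @Jon Skeet, and would like feedback on the updated question that resulted.

NB: This question is meant to explore ways of refactoring string-handling functions in C#. I copied and pasted the first code as is. I'm well aware that you can remove the System.Convert.ToString() calls in the first method, and I did just that. If anyone knows of any implications of removing System.Convert.ToString(), that would also be helpful to know.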

Jon Skeet.. 17

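1) Use a StringBuilder, preferably with a reasonable initial capacity (e.g. string length * 5/4, to allow one extra space per four characters).

2) Try using a foreach loop instead of a for loop - it may well be simpler.

3) You don't need to convert the string to a char array first - foreach already works over a string, or you can use the indexer.

4) Don't do extra string conversions everywhere - calling Convert.ToString(char) and then appending that string is pointless; there is no need for the single-character string.

5) For the second option, just build the regex once, outside the method. Try it with RegexOptions.Compiled as well.

EDIT: Okay, full benchmark results. I've tried a few more things, and also executed the code with rather more iterations to get a more accurate result. This was only run on an Eee PC, so no doubt it will run faster on "real" PCs, but I suspect the broad results still hold. First the code: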

using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Reflection;
using System.Text;
using System.Text.RegularExpressions;

class Benchmark
{
    const string TestData = "ThisIsAUpperCaseString";
    const string ValidResult = "This Is A Upper Case String";
    const int Iterations = 1000000;

    static void Main(string[] args)
    {
        Test(BenchmarkOverhead);
        Test(MakeNiceString);
        Test(ImprovedMakeNiceString);
        Test(RefactoredMakeNiceString);
        Test(MakeNiceStringWithStringIndexer);
        Test(MakeNiceStringWithForeach);
        Test(MakeNiceStringWithForeachAndLinqSkip);
        Test(MakeNiceStringWithForeachAndCustomSkip);
        Test(SplitCamelCase);
        Test(SplitCamelCaseCachedRegex);
        Test(SplitCamelCaseCompiledRegex);        
    }

    static void Test(Func<string, string> function)
    {
        Console.Write("{0}... ", function.Method.Name);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i=0; i < Iterations; i++)
        {
            string result = function(TestData);
            if (result.Length != ValidResult.Length)
            {
                throw new Exception("Bad result: " + result);
            }
        }
        sw.Stop();
        Console.WriteLine(" {0}ms", sw.ElapsedMilliseconds);
        GC.Collect();
    }

    private static string BenchmarkOverhead(string str)
    {
        return ValidResult;
    }

    private static string MakeNiceString(string str)
    {
        char[] ca = str.ToCharArray();
        string result = null;
        int i = 0;
        result += System.Convert.ToString(ca[0]);
        for (i = 1; i <= ca.Length - 1; i++)
        {
            if (!(char.IsLower(ca[i])))
            {
                result += " ";
            }
            result += System.Convert.ToString(ca[i]);
        }
        return result;
    }

    private static string ImprovedMakeNiceString(string str)
    { //Removed Convert.ToString()
        char[] ca = str.ToCharArray();
        string result = null;
        int i = 0;
        result += ca[0];
        for (i = 1; i <= ca.Length - 1; i++)
        {
            if (!(char.IsLower(ca[i])))
            {
                result += " ";
            }
            result += ca[i];
        }
        return result;
    }

    private static string RefactoredMakeNiceString(string str)
    {
        char[] ca = str.ToCharArray();
        StringBuilder sb = new StringBuilder((str.Length * 5 / 4));
        int i = 0;
        sb.Append(ca[0]);
        for (i = 1; i <= ca.Length - 1; i++)
        {
            if (!(char.IsLower(ca[i])))
            {
                sb.Append(" ");
            }
            sb.Append(ca[i]);
        }
        return sb.ToString();
    }

    private static string MakeNiceStringWithStringIndexer(string str)
    {
        StringBuilder sb = new StringBuilder((str.Length * 5 / 4));
        sb.Append(str[0]);
        for (int i = 1; i < str.Length; i++)
        {
            char c = str[i];
            if (!(char.IsLower(c)))
            {
                sb.Append(" ");
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    private static string MakeNiceStringWithForeach(string str)
    {
        StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
        bool first = true;      
        foreach (char c in str)
        {
            if (!first && char.IsUpper(c))
            {
                sb.Append(" ");
            }
            sb.Append(c);
            first = false;
        }
        return sb.ToString();
    }

    private static string MakeNiceStringWithForeachAndLinqSkip(string str)
    {
        StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
        sb.Append(str[0]);
        foreach (char c in str.Skip(1))
        {
            if (char.IsUpper(c))
            {
                sb.Append(" ");
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    private static string MakeNiceStringWithForeachAndCustomSkip(string str)
    {
        StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
        sb.Append(str[0]);
        foreach (char c in new SkipEnumerable<char>(str, 1))
        {
            if (char.IsUpper(c))
            {
                sb.Append(" ");
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    private static string SplitCamelCase(string str)
    {
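        string[] temp = Regex.Split(str, @"(?
        // [Truncated in the source: the rest of SplitCamelCase and the
        // SplitCamelCaseCachedRegex and SplitCamelCaseCompiledRegex methods are missing.]

    private class SkipEnumerable<T> : IEnumerable<T>
    {
        private readonly IEnumerable<T> original;
        private readonly int skip;

        public SkipEnumerable(IEnumerable<T> original, int skip)
        {
            this.original = original;
            this.skip = skip;
        }

        public IEnumerator<T> GetEnumerator()
        {
            IEnumerator<T> ret = original.GetEnumerator();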
            for (int i=0; i < skip; i++)
            {
                ret.MoveNext();
            }
            return ret;
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}
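
Main in the listing above calls SplitCamelCase, SplitCamelCaseCachedRegex and SplitCamelCaseCompiledRegex, but their bodies are cut off in this copy. The following is only a sketch of what tip 5 could look like in practice, not the answer's actual code: the two method names come from the listing, while the pattern, the class name and the field names are assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexVariantsSketch
{
    // Assumed pattern (the original is truncated in the source): split before
    // every uppercase ASCII letter except at the start of the string.
    private const string Pattern = @"(?<!^)(?=[A-Z])";

    // Built once and reused by every call.
    private static readonly Regex CachedRegex = new Regex(Pattern);

    // Built once with RegexOptions.Compiled: more expensive to construct,
    // usually faster per match afterwards.
    private static readonly Regex CompiledRegex =
        new Regex(Pattern, RegexOptions.Compiled);

    public static string SplitCamelCaseCachedRegex(string str)
    {
        return string.Join(" ", CachedRegex.Split(str));
    }

    public static string SplitCamelCaseCompiledRegex(string str)
    {
        return string.Join(" ", CompiledRegex.Split(str));
    }
}
```

Keeping the Regex in a static readonly field means the construction cost is paid once per process rather than once per call, which is the point of building it outside the method.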

Here are the results:

BenchmarkOverhead...  22ms
MakeNiceString...  10062ms
ImprovedMakeNiceString...  12367ms
RefactoredMakeNiceString...  3489ms
MakeNiceStringWithStringIndexer...  3115ms
MakeNiceStringWithForeach...  3292ms
MakeNiceStringWithForeachAndLinqSkip...  5702ms
MakeNiceStringWithForeachAndCustomSkip...  4490ms
SplitCamelCase...  68267ms
SplitCamelCaseCachedRegex...  52529ms
SplitCamelCaseCompiledRegex...  26806ms

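As you can see, the string indexer version is the winner - it is also pretty simple code.

Hope this helps... and don't forget, there are bound to be other options I haven't thought of!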


