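In my own time I've been tinkering with small functions, trying to find ways to refactor them (I recently read Martin Fowler's book "Refactoring: Improving the Design of Existing Code"). I found the following function, MakeNiceString(), while updating another part of the codebase near it, and it looked like a good candidate to experiment with. As it stands there's no real reason to replace it, but it's small enough and does little enough that it's easy to follow while still being a worthwhile exercise.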
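
    private static string MakeNiceString(string str)
    {
        char[] ca = str.ToCharArray();
        string result = null;
        int i = 0;
        // The first character is copied as-is, with no leading space.
        result += System.Convert.ToString(ca[0]);
        for (i = 1; i <= ca.Length - 1; i++)
        {
            // Any character that is not lower case starts a new word.
            if (!(char.IsLower(ca[i])))
            {
                result += " ";
            }
            result += System.Convert.ToString(ca[i]);
        }
        return result;
    }

    static string SplitCamelCase(string str)
    {
        string[] temp = Regex.Split(str, @"(?

The first function, MakeNiceString(), is one I found in some code I was updating at work. Its purpose is to convert ThisIsAString to This Is A String. It's used in half a dozen places in the code and is fairly insignificant in the grand scheme of things. I built the second function purely as an academic exercise, to see whether using a regular expression would take longer.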
Well, here are the results:

With 10 iterations:

    MakeNiceString took 2649 ticks
    SplitCamelCase took 2502 ticks

However, it changes drastically over the long haul:

With 10,000 iterations:

    MakeNiceString took 121625 ticks
    SplitCamelCase took 443001 ticks

Refactoring MakeNiceString()

The process of refactoring MakeNiceString() started with simply removing the conversions that were taking place. Doing that yielded the following results:

    MakeNiceString took 124716 ticks
    ImprovedMakeNiceString took 118486 ticks

Here's the code after Refactor #1:

    private static string ImprovedMakeNiceString(string str)
    {
        //Removed Convert.ToString()
        char[] ca = str.ToCharArray();
        string result = null;
        int i = 0;
        result += ca[0];
        for (i = 1; i <= ca.Length - 1; i++)
        {
            if (!(char.IsLower(ca[i])))
            {
                result += " ";
            }
            result += ca[i];
        }
        return result;
    }

Refactor #2 - Use StringBuilder

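My second task was to use StringBuilder instead of String. Since String is immutable, every concatenation in the loop was creating an unnecessary copy. The code for that version follows, then its benchmark:

    static string RefactoredMakeNiceString(string str)
    {
        char[] ca = str.ToCharArray();
        // Reserve room for roughly one extra space per four characters.
        StringBuilder sb = new StringBuilder((str.Length * 5 / 4));
        int i = 0;
        sb.Append(ca[0]);
        for (i = 1; i <= ca.Length - 1; i++)
        {
            if (!(char.IsLower(ca[i])))
            {
                sb.Append(" ");
            }
            sb.Append(ca[i]);
        }
        return sb.ToString();
    }

This resulted in the following benchmark:

    MakeNiceString Took:           124497 Ticks   //Original
    SplitCamelCase Took:           464459 Ticks   //Regex
    ImprovedMakeNiceString Took:   117369 Ticks   //Remove Conversion
    RefactoredMakeNiceString Took:  38542 Ticks   //Using StringBuilder

Changing the for loop to a foreach loop resulted in the following benchmark result:

    static string RefactoredForEachMakeNiceString(string str)
    {
        char[] ca = str.ToCharArray();
        StringBuilder sb1 = new StringBuilder((str.Length * 5 / 4));
        // Note: ca[0] is appended here and again by the loop below,
        // so the first character appears twice in the result.
        sb1.Append(ca[0]);
        foreach (char c in ca)
        {
            if (!(char.IsLower(c)))
            {
                sb1.Append(" ");
            }
            sb1.Append(c);
        }
        return sb1.ToString();
    }

    RefactoredForEachMakeNiceString Took: 45163 Ticks

As you can see, from a maintenance point of view the foreach loop is the easiest to maintain and has the "cleanest" look. It is slightly slower than the for loop, but it is easier to follow.

Alternative refactoring: use a compiled Regex
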
I moved the Regex to just before the loop begins, hoping that since it would only be compiled once it would execute faster. What I found (and I'm sure I have a bug somewhere) is that this doesn't happen the way it should:

    static void runTest5()
    {
        Regex rg = new Regex(@"(?

Final benchmark results:

    MakeNiceString Took                   139363 Ticks
    SplitCamelCase Took                   489174 Ticks
    ImprovedMakeNiceString Took           115478 Ticks
    RefactoredMakeNiceString Took          38819 Ticks
    RefactoredForEachMakeNiceString Took   44700 Ticks
    CompiledRegex Took                    227021 Ticks

Or, if you prefer milliseconds:

    MakeNiceString Took                    38 ms
    SplitCamelCase Took                   123 ms
    ImprovedMakeNiceString Took            33 ms
    RefactoredMakeNiceString Took          11 ms
    RefactoredForEachMakeNiceString Took   12 ms
    CompiledRegex Took                     63 ms

So the percentage gains are:

    MakeNiceString                    38 ms   Baseline
    SplitCamelCase                   123 ms   223% slower
    ImprovedMakeNiceString            33 ms   13.15% faster
    RefactoredMakeNiceString          11 ms   71.05% faster
    RefactoredForEachMakeNiceString   12 ms   68.42% faster
    CompiledRegex                     63 ms   65.79% slower

(Please check my math)

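Since the post asks for the math to be checked, here is a minimal sketch that recomputes the percentages from the millisecond figures above. The class and method names (PercentChange, RelativeToBaseline, Describe) are made up for this illustration. It assumes "X% slower/faster" means the change in elapsed time relative to the 38 ms baseline.

```csharp
using System;
using System.Globalization;

public static class PercentChange
{
    // Change of measuredMs relative to baselineMs, in percent.
    // Positive means more time was taken (slower), negative means less (faster).
    public static double RelativeToBaseline(double baselineMs, double measuredMs)
    {
        return (measuredMs - baselineMs) / baselineMs * 100.0;
    }

    // Formats the change the way the table above does, e.g. "71.05% faster".
    public static string Describe(double baselineMs, double measuredMs)
    {
        double change = RelativeToBaseline(baselineMs, measuredMs);
        string direction = change > 0 ? "slower" : "faster";
        return string.Format(CultureInfo.InvariantCulture,
                             "{0:F2}% {1}", Math.Abs(change), direction);
    }
}
```

Under that assumption the table holds up: Describe(38, 11) gives "71.05% faster", Describe(38, 12) gives "68.42% faster" and Describe(38, 63) gives "65.79% slower". The only differences are rounding: 123 ms works out to 223.68% slower and 33 ms to 13.16% faster, so the posted 223% and 13.15% are truncated rather than rounded.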
In the end, I'm going to replace what's there with RefactoredForEachMakeNiceString(), and while I'm at it, I'll rename it to something useful, like SplitStringOnUpperCase.
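As a rough idea of what the renamed function might look like, here is a hedged sketch. Only the name SplitStringOnUpperCase comes from the post. The wrapper class StringFormatting, the null/empty guard and the `first` flag are assumptions added for illustration. It keeps the original `!char.IsLower` test, so digits and punctuation also start a new word, exactly as in MakeNiceString().

```csharp
using System.Text;

public static class StringFormatting
{
    // Hypothetical final version: inserts a space before every character
    // that is not lower case, except the first character of the string.
    public static string SplitStringOnUpperCase(string str)
    {
        // Assumption: null or empty input is returned unchanged rather than
        // throwing, which the original would do when reading ca[0].
        if (string.IsNullOrEmpty(str))
        {
            return str;
        }

        // Room for roughly one extra space per four characters.
        StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
        bool first = true;
        foreach (char c in str)
        {
            if (!first && !char.IsLower(c))
            {
                sb.Append(' ');
            }
            sb.Append(c);
            first = false;
        }
        return sb.ToString();
    }
}
```

Calling StringFormatting.SplitStringOnUpperCase("ThisIsAUpperCaseString") returns "This Is A Upper Case String". The `first` flag means the loop can run over the whole string without the first character being appended separately beforehand.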

Benchmarking:

To benchmark, I simply create a new Stopwatch for each method call:

    string myString = "ThisIsAUpperCaseString";
    Stopwatch sw = new Stopwatch();
    sw.Start();
    runTest();
    sw.Stop();

    static void runTest()
    {
        for (int i = 0; i < 10000; i++)
        {
            MakeNiceString(myString);
        }
    }

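One thing this setup does not separate out is one-time cost. The first call to a method includes JIT compilation, and the first use of a regular expression includes parsing the pattern, which may account for part of the gap between the 10-iteration and 10,000-iteration figures. The sketch below is one possible way to exclude that cost, not the harness the post used. TimingHarness, Measure and TicksToMilliseconds are names invented here. It also converts raw ticks to milliseconds, since Stopwatch ticks are measured in units of 1/Stopwatch.Frequency seconds.

```csharp
using System;
using System.Diagnostics;

public static class TimingHarness
{
    // Calls action once untimed, so JIT compilation and other one-time
    // setup are excluded, then times `iterations` further calls.
    public static Stopwatch Measure(Action action, int iterations)
    {
        action(); // warm-up call, not timed

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw;
    }

    // Converts raw Stopwatch ticks to milliseconds.
    // Stopwatch.Frequency is the number of ticks per second on this machine.
    public static double TicksToMilliseconds(long stopwatchTicks)
    {
        return stopwatchTicks * 1000.0 / Stopwatch.Frequency;
    }
}
```

A call such as TimingHarness.Measure(() => MakeNiceString(myString), 10000) returns the stopped Stopwatch, from which ElapsedTicks or ElapsedMilliseconds can be read as before.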
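Questions

What causes these functions to be so different over the "long haul", and
How can I improve this function to be a) more maintainable or b) faster?
How can I benchmark the memory use of these to see which uses less memory?
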
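For the memory question, one rough approach is to count how many generation-0 garbage collections happen while a function runs repeatedly: a version that allocates more temporary strings should trigger more of them. The sketch below assumes that this indirect measure is good enough for a comparison. AllocationProbe and Gen0CollectionsDuring are hypothetical names, and a memory profiler would give far more precise numbers.

```csharp
using System;

public static class AllocationProbe
{
    // Returns the number of generation-0 collections that occurred while
    // `action` was executed `iterations` times. More collections suggests
    // more short-lived allocations; this is a coarse indicator only.
    public static int Gen0CollectionsDuring(Action action, int iterations)
    {
        // Start from a settled heap so earlier garbage does not
        // count against the code under test.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        int before = GC.CollectionCount(0);
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        return GC.CollectionCount(0) - before;
    }
}
```

Running it once per candidate with the same input and iteration count, for example AllocationProbe.Gen0CollectionsDuring(() => MakeNiceString(myString), 1000000), gives numbers that can be compared side by side. Other activity in the process can also trigger collections, so small differences should not be over-interpreted.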
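Thank you for your responses so far. I've incorporated all of the suggestions made by @Jon Skeet and would like feedback on the updated questions I've asked as a result.

Note: this question is meant to explore ways of refactoring string-handling functions in C#. I copied and pasted the first piece of code as is. I'm well aware that you can remove the System.Convert.ToString() in the first method, and I did just that. If anyone knows of any implications of removing System.Convert.ToString(), that would also be helpful to know.
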
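On the Convert.ToString() point, the result of the concatenation is the same either way, because Convert.ToString(char) simply returns the character as a one-character string. The sketch below only demonstrates that equivalence. CharAppendDemo and its methods are names made up for the example. What the compiler emits for `prefix + c` depends on the compiler version (older C# compilers are understood to box the char and call String.Concat(object, object)), so any speed difference is best measured rather than assumed.

```csharp
using System;

public static class CharAppendDemo
{
    // Appends the character via an explicit conversion, as in MakeNiceString().
    public static string WithConvert(string prefix, char c)
    {
        return prefix + Convert.ToString(c);
    }

    // Appends the character directly, as in ImprovedMakeNiceString().
    public static string WithoutConvert(string prefix, char c)
    {
        return prefix + c;
    }
}
```

Both methods return "ab" for ("a", 'b'), and both return "b" when the prefix is null, which is why `string result = null; result += ca[0];` works in the original code.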
Jon Skeet.. 17
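1) Use a StringBuilder, preferably with a reasonable initial capacity (e.g. string length * 5/4, to allow one extra space per four characters).
2) Try using a foreach loop instead of a for loop - it may well be simpler.
3) You don't need to convert the string into a char array first - foreach works over a string already, or you can use the indexer.
4) Don't do extra string conversions everywhere - calling Convert.ToString(char) and then appending that string is pointless; there's no need for the single-character string.
5) For the second option, just build the regex once, outside the method. Try it with RegexOptions.Compiled.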
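Point 5 can be sketched roughly as follows. The regular expression from the original post is cut off in this copy, so the pattern used here, `(?<!^)(?=[A-Z])`, is an assumption chosen for illustration: it matches the empty position before any upper-case ASCII letter that is not at the start of the string. The class name CamelCaseSplitter and the field name UpperCaseBoundary are likewise hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class CamelCaseSplitter
{
    // Built once for the lifetime of the type instead of on every call.
    // ASSUMPTION: this pattern is illustrative, not the one from the post.
    private static readonly Regex UpperCaseBoundary =
        new Regex(@"(?<!^)(?=[A-Z])", RegexOptions.Compiled);

    // Splits at each upper-case boundary and joins the pieces with spaces.
    public static string SplitCamelCase(string str)
    {
        string[] parts = UpperCaseBoundary.Split(str);
        return string.Join(" ", parts);
    }
}
```

With this pattern, CamelCaseSplitter.SplitCamelCase("ThisIsAUpperCaseString") returns "This Is A Upper Case String". RegexOptions.Compiled trades a higher one-time construction cost for faster matching, so it tends to pay off only when the same instance is reused many times.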

EDIT: Okay, full benchmark results. I've tried a few more things, and also executed the code with rather more iterations to get a more accurate result. This is only running on an Eee PC, so no doubt it'll run faster on "real" PCs, but I suspect the broad results are appropriate. First the code:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Reflection;
    using System.Text;
    using System.Text.RegularExpressions;

    class Benchmark
    {
        const string TestData = "ThisIsAUpperCaseString";
        const string ValidResult = "This Is A Upper Case String";
        const int Iterations = 1000000;

        static void Main(string[] args)
        {
            Test(BenchmarkOverhead);
            Test(MakeNiceString);
            Test(ImprovedMakeNiceString);
            Test(RefactoredMakeNiceString);
            Test(MakeNiceStringWithStringIndexer);
            Test(MakeNiceStringWithForeach);
            Test(MakeNiceStringWithForeachAndLinqSkip);
            Test(MakeNiceStringWithForeachAndCustomSkip);
            Test(SplitCamelCase);
            Test(SplitCamelCaseCachedRegex);
            Test(SplitCamelCaseCompiledRegex);
        }

        // Times one candidate and sanity-checks the length of every result.
        static void Test(Func<string, string> function)
        {
            Console.Write("{0}... ", function.Method.Name);
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < Iterations; i++)
            {
                string result = function(TestData);
                if (result.Length != ValidResult.Length)
                {
                    throw new Exception("Bad result: " + result);
                }
            }
            sw.Stop();
            Console.WriteLine(" {0}ms", sw.ElapsedMilliseconds);
            GC.Collect();
        }

        // Measures the cost of the harness itself.
        private static string BenchmarkOverhead(string str)
        {
            return ValidResult;
        }

        private static string MakeNiceString(string str)
        {
            char[] ca = str.ToCharArray();
            string result = null;
            int i = 0;
            result += System.Convert.ToString(ca[0]);
            for (i = 1; i <= ca.Length - 1; i++)
            {
                if (!(char.IsLower(ca[i])))
                {
                    result += " ";
                }
                result += System.Convert.ToString(ca[i]);
            }
            return result;
        }

        private static string ImprovedMakeNiceString(string str)
        {
            //Removed Convert.ToString()
            char[] ca = str.ToCharArray();
            string result = null;
            int i = 0;
            result += ca[0];
            for (i = 1; i <= ca.Length - 1; i++)
            {
                if (!(char.IsLower(ca[i])))
                {
                    result += " ";
                }
                result += ca[i];
            }
            return result;
        }

        private static string RefactoredMakeNiceString(string str)
        {
            char[] ca = str.ToCharArray();
            StringBuilder sb = new StringBuilder((str.Length * 5 / 4));
            int i = 0;
            sb.Append(ca[0]);
            for (i = 1; i <= ca.Length - 1; i++)
            {
                if (!(char.IsLower(ca[i])))
                {
                    sb.Append(" ");
                }
                sb.Append(ca[i]);
            }
            return sb.ToString();
        }

        // Reads characters through the string indexer; no char[] copy.
        private static string MakeNiceStringWithStringIndexer(string str)
        {
            StringBuilder sb = new StringBuilder((str.Length * 5 / 4));
            sb.Append(str[0]);
            for (int i = 1; i < str.Length; i++)
            {
                char c = str[i];
                if (!(char.IsLower(c)))
                {
                    sb.Append(" ");
                }
                sb.Append(c);
            }
            return sb.ToString();
        }

        // Iterates the string directly; a flag suppresses the leading space.
        private static string MakeNiceStringWithForeach(string str)
        {
            StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
            bool first = true;
            foreach (char c in str)
            {
                if (!first && char.IsUpper(c))
                {
                    sb.Append(" ");
                }
                sb.Append(c);
                first = false;
            }
            return sb.ToString();
        }

        private static string MakeNiceStringWithForeachAndLinqSkip(string str)
        {
            StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
            sb.Append(str[0]);
            foreach (char c in str.Skip(1))
            {
                if (char.IsUpper(c))
                {
                    sb.Append(" ");
                }
                sb.Append(c);
            }
            return sb.ToString();
        }

        private static string MakeNiceStringWithForeachAndCustomSkip(string str)
        {
            StringBuilder sb = new StringBuilder(str.Length * 5 / 4);
            sb.Append(str[0]);
            foreach (char c in new SkipEnumerable<char>(str, 1))
            {
                if (char.IsUpper(c))
                {
                    sb.Append(" ");
                }
                sb.Append(c);
            }
            return sb.ToString();
        }

        private static string SplitCamelCase(string str)
        {
            string[] temp = Regex.Split(str, @"(? : IEnumerable<T>
        {
            private readonly IEnumerable<T> original;
            private readonly int skip;

            public SkipEnumerable(IEnumerable<T> original, int skip)
            {
                this.original = original;
                this.skip = skip;
            }

            // Advances past the first `skip` items before handing the enumerator out.
            public IEnumerator<T> GetEnumerator()
            {
                IEnumerator<T> ret = original.GetEnumerator();
                for (int i = 0; i < skip; i++)
                {
                    ret.MoveNext();
                }
                return ret;
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }
        }
    }

Here are the results:

    BenchmarkOverhead...                          22ms
    MakeNiceString...                          10062ms
    ImprovedMakeNiceString...                  12367ms
    RefactoredMakeNiceString...                 3489ms
    MakeNiceStringWithStringIndexer...          3115ms
    MakeNiceStringWithForeach...                3292ms
    MakeNiceStringWithForeachAndLinqSkip...     5702ms
    MakeNiceStringWithForeachAndCustomSkip...   4490ms
    SplitCamelCase...                          68267ms
    SplitCamelCaseCachedRegex...               52529ms
    SplitCamelCaseCompiledRegex...             26806ms

As you can see, the string indexer version is the winner - it's also pretty simple code.

Hope this helps... and don't forget, there are bound to be other options I haven't thought of!