任何人都拥有良好的资源或在C#中为FileInfo
数组提供自然顺序排序的样本?我正在实现IComparer
我的各种界面.
最简单的方法是在Windows中调用P/Invoke内置函数,并将其作为比较函数用于IComparer
:
[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)] private static extern int StrCmpLogicalW(string psz1, string psz2);
迈克尔·卡普兰(Michael Kaplan)提供了一些关于此功能如何工作的示例,以及为使Vista更直观地工作所做的更改.此功能的优点是它将与其运行的Windows版本具有相同的行为,但这确实意味着它在Windows版本之间有所不同,因此您需要考虑这是否对您来说是个问题.
所以完整的实现将是这样的:
[SuppressUnmanagedCodeSecurity] internal static class SafeNativeMethods { [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)] public static extern int StrCmpLogicalW(string psz1, string psz2); } public sealed class NaturalStringComparer : IComparer{ public int Compare(string a, string b) { return SafeNativeMethods.StrCmpLogicalW(a, b); } } public sealed class NaturalFileInfoNameComparer : IComparer { public int Compare(FileInfo a, FileInfo b) { return SafeNativeMethods.StrCmpLogicalW(a.Name, b.Name); } }
只是想我会加入这个(用我能找到的最简洁的解决方案):
public static IOrderedEnumerableOrderByAlphaNumeric (this IEnumerable source, Func selector) { int max = source .SelectMany(i => Regex.Matches(selector(i), @"\d+").Cast ().Select(m => (int?)m.Value.Length)) .Max() ?? 0; return source.OrderBy(i => Regex.Replace(selector(i), @"\d+", m => m.Value.PadLeft(max, '0'))); }
上面将字符串中的任何数字填充到所有字符串中所有数字的最大长度,并使用生成的字符串进行排序.
转换为(int?
)是为了允许没有任何数字的字符串集合(.Max()
在空的可枚举抛出上InvalidOperationException
).
现有的实现都不是很好,所以我自己编写了.结果几乎与现代版本的Windows资源管理器(Windows 7/8)使用的排序相同.我见过的唯一区别是1)虽然Windows曾经(例如XP)处理任意长度的数字,但它现在限制为19位 - 我的是无限制的,2)Windows与某些Unicode数字集合给出不一致的结果 - 我的工作罚款(虽然它没有在数字上比较来自代理对的数字;也没有Windows),3)我不能区分不同类型的非主要排序权重,如果它们出现在不同的部分(例如"e-1é"vs" é1e-" - 数字之前和之后的部分具有变音符号和标点符号权重差异).
public static int CompareNatural(string strA, string strB) { return CompareNatural(strA, strB, CultureInfo.CurrentCulture, CompareOptions.IgnoreCase); } public static int CompareNatural(string strA, string strB, CultureInfo culture, CompareOptions options) { CompareInfo cmp = culture.CompareInfo; int iA = 0; int iB = 0; int softResult = 0; int softResultWeight = 0; while (iA < strA.Length && iB < strB.Length) { bool isDigitA = Char.IsDigit(strA[iA]); bool isDigitB = Char.IsDigit(strB[iB]); if (isDigitA != isDigitB) { return cmp.Compare(strA, iA, strB, iB, options); } else if (!isDigitA && !isDigitB) { int jA = iA + 1; int jB = iB + 1; while (jA < strA.Length && !Char.IsDigit(strA[jA])) jA++; while (jB < strB.Length && !Char.IsDigit(strB[jB])) jB++; int cmpResult = cmp.Compare(strA, iA, jA - iA, strB, iB, jB - iB, options); if (cmpResult != 0) { // Certain strings may be considered different due to "soft" differences that are // ignored if more significant differences follow, e.g. a hyphen only affects the // comparison if no other differences follow string sectionA = strA.Substring(iA, jA - iA); string sectionB = strB.Substring(iB, jB - iB); if (cmp.Compare(sectionA + "1", sectionB + "2", options) == cmp.Compare(sectionA + "2", sectionB + "1", options)) { return cmp.Compare(strA, iA, strB, iB, options); } else if (softResultWeight < 1) { softResult = cmpResult; softResultWeight = 1; } } iA = jA; iB = jB; } else { char zeroA = (char)(strA[iA] - (int)Char.GetNumericValue(strA[iA])); char zeroB = (char)(strB[iB] - (int)Char.GetNumericValue(strB[iB])); int jA = iA; int jB = iB; while (jA < strA.Length && strA[jA] == zeroA) jA++; while (jB < strB.Length && strB[jB] == zeroB) jB++; int resultIfSameLength = 0; do { isDigitA = jA < strA.Length && Char.IsDigit(strA[jA]); isDigitB = jB < strB.Length && Char.IsDigit(strB[jB]); int numA = isDigitA ? (int)Char.GetNumericValue(strA[jA]) : 0; int numB = isDigitB ? (int)Char.GetNumericValue(strB[jB]) : 0; if (isDigitA && (char)(strA[jA] - numA) != zeroA) isDigitA = false; if (isDigitB && (char)(strB[jB] - numB) != zeroB) isDigitB = false; if (isDigitA && isDigitB) { if (numA != numB && resultIfSameLength == 0) { resultIfSameLength = numA < numB ? -1 : 1; } jA++; jB++; } } while (isDigitA && isDigitB); if (isDigitA != isDigitB) { // One number has more digits than the other (ignoring leading zeros) - the longer // number must be larger return isDigitA ? 1 : -1; } else if (resultIfSameLength != 0) { // Both numbers are the same length (ignoring leading zeros) and at least one of // the digits differed - the first difference determines the result return resultIfSameLength; } int lA = jA - iA; int lB = jB - iB; if (lA != lB) { // Both numbers are equivalent but one has more leading zeros return lA > lB ? -1 : 1; } else if (zeroA != zeroB && softResultWeight < 2) { softResult = cmp.Compare(strA, iA, 1, strB, iB, 1, options); softResultWeight = 2; } iA = jA; iB = jB; } } if (iA < strA.Length || iB < strB.Length) { return iA < strA.Length ? 1 : -1; } else if (softResult != 0) { return softResult; } return 0; }
签名与Comparison
代表匹配:
string[] files = Directory.GetFiles(@"C:\"); Array.Sort(files, CompareNatural);
这是一个包装类,用作IComparer
:
public class CustomComparer: IComparer { private Comparison _comparison; public CustomComparer(Comparison comparison) { _comparison = comparison; } public int Compare(T x, T y) { return _comparison(x, y); } }
例:
string[] files = Directory.EnumerateFiles(@"C:\") .OrderBy(f => f, new CustomComparer(CompareNatural)) .ToArray();
这是我用于测试的一组很好的文件名:
Funcexpand = (s) => { int o; while ((o = s.IndexOf('\\')) != -1) { int p = o + 1; int z = 1; while (s[p] == '0') { z++; p++; } int c = Int32.Parse(s.Substring(p, z)); s = s.Substring(0, o) + new string(s[o - 1], c) + s.Substring(p + z); } return s; }; string encodedFileNames = "KDEqLW4xMiotbjEzKjAwMDFcMDY2KjAwMlwwMTcqMDA5XDAxNyowMlwwMTcqMDlcMDE3KjEhKjEtISox" + "LWEqMS4yNT8xLjI1KjEuNT8xLjUqMSoxXDAxNyoxXDAxOCoxXDAxOSoxXDA2NioxXDA2NyoxYSoyXDAx" + "NyoyXDAxOCo5XDAxNyo5XDAxOCo5XDA2Nio9MSphMDAxdGVzdDAxKmEwMDF0ZXN0aW5nYTBcMzEqYTAw" + "Mj9hMDAyIGE/YTAwMiBhKmEwMDIqYTAwMmE/YTAwMmEqYTAxdGVzdGluZ2EwMDEqYTAxdnNmcyphMSph" + "MWEqYTF6KmEyKmIwMDAzcTYqYjAwM3E0KmIwM3E1KmMtZSpjZCpjZipmIDEqZipnP2cgMT9oLW4qaG8t" + "bipJKmljZS1jcmVhbT9pY2VjcmVhbT9pY2VjcmVhbS0/ajBcNDE/ajAwMWE/ajAxP2shKmsnKmstKmsx" + "KmthKmxpc3QqbTAwMDNhMDA1YSptMDAzYTAwMDVhKm0wMDNhMDA1Km0wMDNhMDA1YSpuMTIqbjEzKm8t" + "bjAxMypvLW4xMipvLW40P28tbjQhP28tbjR6P28tbjlhLWI1Km8tbjlhYjUqb24wMTMqb24xMipvbjQ/" + "b240IT9vbjR6P29uOWEtYjUqb245YWI1Km/CrW4wMTMqb8KtbjEyKnAwMCpwMDEqcDAxwr0hKnAwMcK9" + "KnAwMcK9YSpwMDHCvcK+KnAwMipwMMK9KnEtbjAxMypxLW4xMipxbjAxMypxbjEyKnItMDAhKnItMDAh" + "NSpyLTAwIe+8lSpyLTAwYSpyLe+8kFwxIS01KnIt77yQXDEhLe+8lSpyLe+8kFwxISpyLe+8kFwxITUq" + "ci3vvJBcMSHvvJUqci3vvJBcMWEqci3vvJBcMyE1KnIwMCEqcjAwLTUqcjAwLjUqcjAwNSpyMDBhKnIw" + "NSpyMDYqcjQqcjUqctmg2aYqctmkKnLZpSpy27Dbtipy27Qqctu1KnLfgN+GKnLfhCpy34UqcuClpuCl" + "rCpy4KWqKnLgpasqcuCnpuCnrCpy4KeqKnLgp6sqcuCppuCprCpy4KmqKnLgqasqcuCrpuCrrCpy4Kuq" + "KnLgq6sqcuCtpuCtrCpy4K2qKnLgrasqcuCvpuCvrCpy4K+qKnLgr6sqcuCxpuCxrCpy4LGqKnLgsasq" + "cuCzpuCzrCpy4LOqKnLgs6sqcuC1puC1rCpy4LWqKnLgtasqcuC5kOC5lipy4LmUKnLguZUqcuC7kOC7" + "lipy4LuUKnLgu5UqcuC8oOC8pipy4LykKnLgvKUqcuGBgOGBhipy4YGEKnLhgYUqcuGCkOGClipy4YKU" + "KnLhgpUqcuGfoOGfpipy4Z+kKnLhn6UqcuGgkOGglipy4aCUKnLhoJUqcuGlhuGljCpy4aWKKnLhpYsq" + "cuGnkOGnlipy4aeUKnLhp5UqcuGtkOGtlipy4a2UKnLhrZUqcuGusOGutipy4a60KnLhrrUqcuGxgOGx" + "hipy4bGEKnLhsYUqcuGxkOGxlipy4bGUKnLhsZUqcuqYoFwx6pilKnLqmKDqmKUqcuqYoOqYpipy6pik" + "KnLqmKUqcuqjkOqjlipy6qOUKnLqo5UqcuqkgOqkhipy6qSEKnLqpIUqcuqpkOqplipy6qmUKnLqqZUq" + "cvCQkqAqcvCQkqUqcvCdn5gqcvCdn50qcu+8kFwxISpy77yQXDEt77yVKnLvvJBcMS7vvJUqcu+8kFwx" + "YSpy77yQXDHqmKUqcu+8kFwx77yO77yVKnLvvJBcMe+8lSpy77yQ77yVKnLvvJDvvJYqcu+8lCpy77yV" + "KnNpKnPEsSp0ZXN02aIqdGVzdNmi2aAqdGVzdNmjKnVBZS0qdWFlKnViZS0qdUJlKnVjZS0xw6kqdWNl" + "McOpLSp1Y2Uxw6kqdWPDqS0xZSp1Y8OpMWUtKnVjw6kxZSp3ZWlhMSp3ZWlhMip3ZWlzczEqd2Vpc3My" + "KndlaXoxKndlaXoyKndlacOfMSp3ZWnDnzIqeSBhMyp5IGE0KnknYTMqeSdhNCp5K2EzKnkrYTQqeS1h" + "Myp5LWE0KnlhMyp5YTQqej96IDA1MD96IDIxP3ohMjE/ejIwP3oyMj96YTIxP3rCqTIxP1sxKl8xKsKt" + "bjEyKsKtbjEzKsSwKg=="; string[] fileNames = Encoding.UTF8.GetString(Convert.FromBase64String(encodedFileNames)) .Replace("*", ".txt?").Split(new[] { "?" }, StringSplitOptions.RemoveEmptyEntries) .Select(n => expand(n)).ToArray();
适用于linq orderby的纯C#解决方案:
http://zootfroot.blogspot.com/2009/09/natural-sort-compare-with-linq-orderby.html
public class NaturalSortComparer: IComparer , IDisposable { private bool isAscending; public NaturalSortComparer(bool inAscendingOrder = true) { this.isAscending = inAscendingOrder; } #region IComparer Members public int Compare(string x, string y) { throw new NotImplementedException(); } #endregion #region IComparer Members int IComparer .Compare(string x, string y) { if (x == y) return 0; string[] x1, y1; if (!table.TryGetValue(x, out x1)) { x1 = Regex.Split(x.Replace(" ", ""), "([0-9]+)"); table.Add(x, x1); } if (!table.TryGetValue(y, out y1)) { y1 = Regex.Split(y.Replace(" ", ""), "([0-9]+)"); table.Add(y, y1); } int returnVal; for (int i = 0; i < x1.Length && i < y1.Length; i++) { if (x1[i] != y1[i]) { returnVal = PartCompare(x1[i], y1[i]); return isAscending ? returnVal : -returnVal; } } if (y1.Length > x1.Length) { returnVal = 1; } else if (x1.Length > y1.Length) { returnVal = -1; } else { returnVal = 0; } return isAscending ? returnVal : -returnVal; } private static int PartCompare(string left, string right) { int x, y; if (!int.TryParse(left, out x)) return left.CompareTo(right); if (!int.TryParse(right, out y)) return left.CompareTo(right); return x.CompareTo(y); } #endregion private Dictionary table = new Dictionary (); public void Dispose() { table.Clear(); table = null; } }
我的解决方案
void Main() { new[] {"a4","a3","a2","a10","b5","b4","b400","1","C1d","c1d2"}.OrderBy(x => x, new NaturalStringComparer()).Dump(); } public class NaturalStringComparer : IComparer{ private static readonly Regex _re = new Regex(@"(?<=\D)(?=\d)|(?<=\d)(?=\D)", RegexOptions.Compiled); public int Compare(string x, string y) { x = x.ToLower(); y = y.ToLower(); if(string.Compare(x, 0, y, 0, Math.Min(x.Length, y.Length)) == 0) { if(x.Length == y.Length) return 0; return x.Length < y.Length ? -1 : 1; } var a = _re.Split(x); var b = _re.Split(y); int i = 0; while(true) { int r = PartCompare(a[i], b[i]); if(r != 0) return r; ++i; } } private static int PartCompare(string x, string y) { int a, b; if(int.TryParse(x, out a) && int.TryParse(y, out b)) return a.CompareTo(b); return x.CompareTo(y); } }
结果:
1 a2 a3 a4 a10 b4 b5 b400 C1d c1d2
Matthews Horsley的答案是最快的方法,它不会改变行为,具体取决于您的程序运行的Windows版本.但是,通过创建一次正则表达式并使用RegexOptions.Compiled可以更快.我还添加了插入字符串比较器的选项,以便您可以根据需要忽略大小写,并提高可读性.
public static IEnumerableOrderByNatural (this IEnumerable items, Func selector, StringComparer stringComparer = null) { var regex = new Regex(@"\d+", RegexOptions.Compiled); int maxDigits = items .SelectMany(i => regex.Matches(selector(i)).Cast ().Select(digitChunk => (int?)digitChunk.Value.Length)) .Max() ?? 0; return items.OrderBy(i => regex.Replace(selector(i), match => match.Value.PadLeft(maxDigits, '0')), stringComparer ?? StringComparer.CurrentCulture); }
使用方式
var sortedEmployees = employees.OrderByNatural(emp => emp.Name);
这需要450毫秒来排序100,000个字符串,而默认的.net字符串比较需要300毫秒 - 相当快!
你需要小心 - 我隐约回想起StrCmpLogicalW,或类似的东西,并不严格传递,我观察到.NET的排序方法有时会陷入无限循环,如果比较函数打破了这个规则.
传递性比较将始终报告
这是我的代码,用于对包含字母和数字字符的字符串进行排序.
首先,这个扩展方法:
public static IEnumerableAlphanumericSort(this IEnumerable me) { return me.OrderBy(x => Regex.Replace(x, @"\d+", m => m.Value.PadLeft(50, '0'))); }
然后,只需在代码中的任何位置使用它,如下所示:
Listtest = new List () { "The 1st", "The 12th", "The 2nd" }; test = test.AlphanumericSort();
它是如何工作的?用零替换:
Original | Regex Replace | The | Returned List | Apply PadLeft | Sorting | List | | | "The 1st" | "The 001st" | "The 001st" | "The 1st" "The 12th" | "The 012th" | "The 002nd" | "The 2nd" "The 2nd" | "The 002nd" | "The 012th" | "The 12th"
使用倍数:
Alphabetical Sorting | Alphanumeric Sorting | "Page 21, Line 42" | "Page 3, Line 7" "Page 21, Line 5" | "Page 3, Line 32" "Page 3, Line 32" | "Page 21, Line 5" "Page 3, Line 7" | "Page 21, Line 42"
希望这会有所帮助.
添加到Greg Beech的答案(因为我一直在寻找它),如果你想从Linq使用它,你可以使用OrderBy
它需要一个IComparer
.例如:
var items = new List(); // fill items var sorted = items.OrderBy(item => item.Name, new NaturalStringComparer());