当前位置:  开发笔记 > 编程语言 > 正文

C#中的自然排序顺序

如何解决《C#中的自然排序顺序》经验,为你挑选了9个好方法。

任何人都拥有良好的资源或在C#中为FileInfo数组提供自然顺序排序的样本?我正在实现IComparer我的各种界面.



1> Greg Beech..:

最简单的方法是在Windows中调用P/Invoke内置函数,并将其作为比较函数用于IComparer:

[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
private static extern int StrCmpLogicalW(string psz1, string psz2);

迈克尔·卡普兰(Michael Kaplan)提供了一些关于此功能如何工作的示例,以及为使Vista更直观地工作所做的更改.此功能的优点是它将与其运行的Windows版本具有相同的行为,但这确实意味着它在Windows版本之间有所不同,因此您需要考虑这是否对您来说是个问题.

所以完整的实现将是这样的:

[SuppressUnmanagedCodeSecurity]
internal static class SafeNativeMethods
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    public static extern int StrCmpLogicalW(string psz1, string psz2);
}

public sealed class NaturalStringComparer : IComparer
{
    public int Compare(string a, string b)
    {
        return SafeNativeMethods.StrCmpLogicalW(a, b);
    }
}

public sealed class NaturalFileInfoNameComparer : IComparer
{
    public int Compare(FileInfo a, FileInfo b)
    {
        return SafeNativeMethods.StrCmpLogicalW(a.Name, b.Name);
    }
}


@linquize - 他说.NET不是Mono,所以Linux/OSX并不是真正的问题.发布此答案时,2008年Windows Phone/Metro不存在.你经常在Silverlight中进行文件操作?因此对OP而言,可能是大多数其他人,这是一个合适的答案.无论如何,你可以自由地提供更好的答案; 这就是这个网站的工作方式.
这不是便携式的,只能在Win32中使用,但在Linux/MacOS/Silverlight/Windows Phone/Metro中不起作用
很好的答案.警告:这对于Win2000来说是行不通的,因为那些操作系统仍在运行的人很少.另一方面,Kaplan的博客和MSDN文档之间有足够的提示来创建类似的功能.
这并不意味着原来的答案是错误的.我只是添加了最新信息的附加信息
仅供参考,如果您继承自Comparer <T>而不是实现IComparer <T>,则将获得IComparer(非泛型)接口的内置实现,该接口调用您的泛型方法,供API使用用那个代替。基本上也是免费的:只需删除“ I”并将“ public int Compare(...)”更改为“ public overlay int Compare(...)”。与IEqualityComparer <T>和EqualityComparer <T>相同。

2> 小智..:

只是想我会加入这个(用我能找到的最简洁的解决方案):

public static IOrderedEnumerable OrderByAlphaNumeric(this IEnumerable source, Func selector)
{
    int max = source
        .SelectMany(i => Regex.Matches(selector(i), @"\d+").Cast().Select(m => (int?)m.Value.Length))
        .Max() ?? 0;

    return source.OrderBy(i => Regex.Replace(selector(i), @"\d+", m => m.Value.PadLeft(max, '0')));
}

上面将字符串中的任何数字填充到所有字符串中所有数字的最大长度,并使用生成的字符串进行排序.

转换为(int?)是为了允许没有任何数字的字符串集合(.Max()在空的可枚举抛出上InvalidOperationException).


这既美观又无法阅读.我认为Linq的好处将意味着(至少)最佳的平均和最佳表现,所以我想我会选择它.尽管缺乏清晰度.非常感谢@Matthew Horsley
我想你可以使用`.DefaultIfEmpty().Max()`而不是强制转换为`int?`.另外值得做一个`source.ToList()`以避免重新枚举可枚举.

3> J.D...:

现有的实现都不是很好,所以我自己编写了.结果几乎与现代版本的Windows资源管理器(Windows 7/8)使用的排序相同.我见过的唯一区别是1)虽然Windows曾经(例如XP)处理任意长度的数字,但它现在限制为19位 - 我的是无限制的,2)Windows与某些Unicode数字集合给出不一致的结果 - 我的工作罚款(虽然它没有在数字上比较来自代理对的数字;也没有Windows),3)我不能区分不同类型的非主要排序权重,如果它们出现在不同的部分(例如"e-1é"vs" é1e-" - 数字之前和之后的部分具有变音符号和标点符号权重差异).

public static int CompareNatural(string strA, string strB) {
    return CompareNatural(strA, strB, CultureInfo.CurrentCulture, CompareOptions.IgnoreCase);
}

public static int CompareNatural(string strA, string strB, CultureInfo culture, CompareOptions options) {
    CompareInfo cmp = culture.CompareInfo;
    int iA = 0;
    int iB = 0;
    int softResult = 0;
    int softResultWeight = 0;
    while (iA < strA.Length && iB < strB.Length) {
        bool isDigitA = Char.IsDigit(strA[iA]);
        bool isDigitB = Char.IsDigit(strB[iB]);
        if (isDigitA != isDigitB) {
            return cmp.Compare(strA, iA, strB, iB, options);
        }
        else if (!isDigitA && !isDigitB) {
            int jA = iA + 1;
            int jB = iB + 1;
            while (jA < strA.Length && !Char.IsDigit(strA[jA])) jA++;
            while (jB < strB.Length && !Char.IsDigit(strB[jB])) jB++;
            int cmpResult = cmp.Compare(strA, iA, jA - iA, strB, iB, jB - iB, options);
            if (cmpResult != 0) {
                // Certain strings may be considered different due to "soft" differences that are
                // ignored if more significant differences follow, e.g. a hyphen only affects the
                // comparison if no other differences follow
                string sectionA = strA.Substring(iA, jA - iA);
                string sectionB = strB.Substring(iB, jB - iB);
                if (cmp.Compare(sectionA + "1", sectionB + "2", options) ==
                    cmp.Compare(sectionA + "2", sectionB + "1", options))
                {
                    return cmp.Compare(strA, iA, strB, iB, options);
                }
                else if (softResultWeight < 1) {
                    softResult = cmpResult;
                    softResultWeight = 1;
                }
            }
            iA = jA;
            iB = jB;
        }
        else {
            char zeroA = (char)(strA[iA] - (int)Char.GetNumericValue(strA[iA]));
            char zeroB = (char)(strB[iB] - (int)Char.GetNumericValue(strB[iB]));
            int jA = iA;
            int jB = iB;
            while (jA < strA.Length && strA[jA] == zeroA) jA++;
            while (jB < strB.Length && strB[jB] == zeroB) jB++;
            int resultIfSameLength = 0;
            do {
                isDigitA = jA < strA.Length && Char.IsDigit(strA[jA]);
                isDigitB = jB < strB.Length && Char.IsDigit(strB[jB]);
                int numA = isDigitA ? (int)Char.GetNumericValue(strA[jA]) : 0;
                int numB = isDigitB ? (int)Char.GetNumericValue(strB[jB]) : 0;
                if (isDigitA && (char)(strA[jA] - numA) != zeroA) isDigitA = false;
                if (isDigitB && (char)(strB[jB] - numB) != zeroB) isDigitB = false;
                if (isDigitA && isDigitB) {
                    if (numA != numB && resultIfSameLength == 0) {
                        resultIfSameLength = numA < numB ? -1 : 1;
                    }
                    jA++;
                    jB++;
                }
            }
            while (isDigitA && isDigitB);
            if (isDigitA != isDigitB) {
                // One number has more digits than the other (ignoring leading zeros) - the longer
                // number must be larger
                return isDigitA ? 1 : -1;
            }
            else if (resultIfSameLength != 0) {
                // Both numbers are the same length (ignoring leading zeros) and at least one of
                // the digits differed - the first difference determines the result
                return resultIfSameLength;
            }
            int lA = jA - iA;
            int lB = jB - iB;
            if (lA != lB) {
                // Both numbers are equivalent but one has more leading zeros
                return lA > lB ? -1 : 1;
            }
            else if (zeroA != zeroB && softResultWeight < 2) {
                softResult = cmp.Compare(strA, iA, 1, strB, iB, 1, options);
                softResultWeight = 2;
            }
            iA = jA;
            iB = jB;
        }
    }
    if (iA < strA.Length || iB < strB.Length) {
        return iA < strA.Length ? 1 : -1;
    }
    else if (softResult != 0) {
        return softResult;
    }
    return 0;
}

签名与Comparison代表匹配:

string[] files = Directory.GetFiles(@"C:\");
Array.Sort(files, CompareNatural);

这是一个包装类,用作IComparer:

public class CustomComparer : IComparer {
    private Comparison _comparison;

    public CustomComparer(Comparison comparison) {
        _comparison = comparison;
    }

    public int Compare(T x, T y) {
        return _comparison(x, y);
    }
}

例:

string[] files = Directory.EnumerateFiles(@"C:\")
    .OrderBy(f => f, new CustomComparer(CompareNatural))
    .ToArray();

这是我用于测试的一组很好的文件名:

Func expand = (s) => { int o; while ((o = s.IndexOf('\\')) != -1) { int p = o + 1;
    int z = 1; while (s[p] == '0') { z++; p++; } int c = Int32.Parse(s.Substring(p, z));
    s = s.Substring(0, o) + new string(s[o - 1], c) + s.Substring(p + z); } return s; };
string encodedFileNames =
    "KDEqLW4xMiotbjEzKjAwMDFcMDY2KjAwMlwwMTcqMDA5XDAxNyowMlwwMTcqMDlcMDE3KjEhKjEtISox" +
    "LWEqMS4yNT8xLjI1KjEuNT8xLjUqMSoxXDAxNyoxXDAxOCoxXDAxOSoxXDA2NioxXDA2NyoxYSoyXDAx" +
    "NyoyXDAxOCo5XDAxNyo5XDAxOCo5XDA2Nio9MSphMDAxdGVzdDAxKmEwMDF0ZXN0aW5nYTBcMzEqYTAw" +
    "Mj9hMDAyIGE/YTAwMiBhKmEwMDIqYTAwMmE/YTAwMmEqYTAxdGVzdGluZ2EwMDEqYTAxdnNmcyphMSph" +
    "MWEqYTF6KmEyKmIwMDAzcTYqYjAwM3E0KmIwM3E1KmMtZSpjZCpjZipmIDEqZipnP2cgMT9oLW4qaG8t" +
    "bipJKmljZS1jcmVhbT9pY2VjcmVhbT9pY2VjcmVhbS0/ajBcNDE/ajAwMWE/ajAxP2shKmsnKmstKmsx" +
    "KmthKmxpc3QqbTAwMDNhMDA1YSptMDAzYTAwMDVhKm0wMDNhMDA1Km0wMDNhMDA1YSpuMTIqbjEzKm8t" +
    "bjAxMypvLW4xMipvLW40P28tbjQhP28tbjR6P28tbjlhLWI1Km8tbjlhYjUqb24wMTMqb24xMipvbjQ/" +
    "b240IT9vbjR6P29uOWEtYjUqb245YWI1Km/CrW4wMTMqb8KtbjEyKnAwMCpwMDEqcDAxwr0hKnAwMcK9" +
    "KnAwMcK9YSpwMDHCvcK+KnAwMipwMMK9KnEtbjAxMypxLW4xMipxbjAxMypxbjEyKnItMDAhKnItMDAh" +
    "NSpyLTAwIe+8lSpyLTAwYSpyLe+8kFwxIS01KnIt77yQXDEhLe+8lSpyLe+8kFwxISpyLe+8kFwxITUq" +
    "ci3vvJBcMSHvvJUqci3vvJBcMWEqci3vvJBcMyE1KnIwMCEqcjAwLTUqcjAwLjUqcjAwNSpyMDBhKnIw" +
    "NSpyMDYqcjQqcjUqctmg2aYqctmkKnLZpSpy27Dbtipy27Qqctu1KnLfgN+GKnLfhCpy34UqcuClpuCl" +
    "rCpy4KWqKnLgpasqcuCnpuCnrCpy4KeqKnLgp6sqcuCppuCprCpy4KmqKnLgqasqcuCrpuCrrCpy4Kuq" +
    "KnLgq6sqcuCtpuCtrCpy4K2qKnLgrasqcuCvpuCvrCpy4K+qKnLgr6sqcuCxpuCxrCpy4LGqKnLgsasq" +
    "cuCzpuCzrCpy4LOqKnLgs6sqcuC1puC1rCpy4LWqKnLgtasqcuC5kOC5lipy4LmUKnLguZUqcuC7kOC7" +
    "lipy4LuUKnLgu5UqcuC8oOC8pipy4LykKnLgvKUqcuGBgOGBhipy4YGEKnLhgYUqcuGCkOGClipy4YKU" +
    "KnLhgpUqcuGfoOGfpipy4Z+kKnLhn6UqcuGgkOGglipy4aCUKnLhoJUqcuGlhuGljCpy4aWKKnLhpYsq" +
    "cuGnkOGnlipy4aeUKnLhp5UqcuGtkOGtlipy4a2UKnLhrZUqcuGusOGutipy4a60KnLhrrUqcuGxgOGx" +
    "hipy4bGEKnLhsYUqcuGxkOGxlipy4bGUKnLhsZUqcuqYoFwx6pilKnLqmKDqmKUqcuqYoOqYpipy6pik" +
    "KnLqmKUqcuqjkOqjlipy6qOUKnLqo5UqcuqkgOqkhipy6qSEKnLqpIUqcuqpkOqplipy6qmUKnLqqZUq" +
    "cvCQkqAqcvCQkqUqcvCdn5gqcvCdn50qcu+8kFwxISpy77yQXDEt77yVKnLvvJBcMS7vvJUqcu+8kFwx" +
    "YSpy77yQXDHqmKUqcu+8kFwx77yO77yVKnLvvJBcMe+8lSpy77yQ77yVKnLvvJDvvJYqcu+8lCpy77yV" +
    "KnNpKnPEsSp0ZXN02aIqdGVzdNmi2aAqdGVzdNmjKnVBZS0qdWFlKnViZS0qdUJlKnVjZS0xw6kqdWNl" +
    "McOpLSp1Y2Uxw6kqdWPDqS0xZSp1Y8OpMWUtKnVjw6kxZSp3ZWlhMSp3ZWlhMip3ZWlzczEqd2Vpc3My" +
    "KndlaXoxKndlaXoyKndlacOfMSp3ZWnDnzIqeSBhMyp5IGE0KnknYTMqeSdhNCp5K2EzKnkrYTQqeS1h" +
    "Myp5LWE0KnlhMyp5YTQqej96IDA1MD96IDIxP3ohMjE/ejIwP3oyMj96YTIxP3rCqTIxP1sxKl8xKsKt" +
    "bjEyKsKtbjEzKsSwKg==";
string[] fileNames = Encoding.UTF8.GetString(Convert.FromBase64String(encodedFileNames))
    .Replace("*", ".txt?").Split(new[] { "?" }, StringSplitOptions.RemoveEmptyEntries)
    .Select(n => expand(n)).ToArray();


好方案!当我在大约10,000个文件的正常情况下对它进行基准测试时,它比Matthew的正则表达式示例更快,并且与StrCmpLogicalW()的性能大致相同.上面的代码中有一个小错误:"while(strA [jA] == zeroA)jA ++;" 和"while(strB [jB] == zeroB)jB ++;" 应该是"while(jA 有bug.指数超出范围

4> James McCorm..:

适用于linq orderby的纯C#解决方案:

http://zootfroot.blogspot.com/2009/09/natural-sort-compare-with-linq-orderby.html

public class NaturalSortComparer : IComparer, IDisposable
{
    private bool isAscending;

    public NaturalSortComparer(bool inAscendingOrder = true)
    {
        this.isAscending = inAscendingOrder;
    }

    #region IComparer Members

    public int Compare(string x, string y)
    {
        throw new NotImplementedException();
    }

    #endregion

    #region IComparer Members

    int IComparer.Compare(string x, string y)
    {
        if (x == y)
            return 0;

        string[] x1, y1;

        if (!table.TryGetValue(x, out x1))
        {
            x1 = Regex.Split(x.Replace(" ", ""), "([0-9]+)");
            table.Add(x, x1);
        }

        if (!table.TryGetValue(y, out y1))
        {
            y1 = Regex.Split(y.Replace(" ", ""), "([0-9]+)");
            table.Add(y, y1);
        }

        int returnVal;

        for (int i = 0; i < x1.Length && i < y1.Length; i++)
        {
            if (x1[i] != y1[i])
            {
                returnVal = PartCompare(x1[i], y1[i]);
                return isAscending ? returnVal : -returnVal;
            }
        }

        if (y1.Length > x1.Length)
        {
            returnVal = 1;
        }
        else if (x1.Length > y1.Length)
        { 
            returnVal = -1; 
        }
        else
        {
            returnVal = 0;
        }

        return isAscending ? returnVal : -returnVal;
    }

    private static int PartCompare(string left, string right)
    {
        int x, y;
        if (!int.TryParse(left, out x))
            return left.CompareTo(right);

        if (!int.TryParse(right, out y))
            return left.CompareTo(right);

        return x.CompareTo(y);
    }

    #endregion

    private Dictionary table = new Dictionary();

    public void Dispose()
    {
        table.Clear();
        table = null;
    }
}


该代码最终来自http://www.codeproject.com/KB/recipes/NaturalComparer.aspx(不是面向LINQ的).
该博客文章为IComparer提供了Justin Jones(http://www.codeproject.com/KB/string/NaturalSortComparer.aspx),而不是Pascal Ganaye.

5> mpen..:

我的解决方案

void Main()
{
    new[] {"a4","a3","a2","a10","b5","b4","b400","1","C1d","c1d2"}.OrderBy(x => x, new NaturalStringComparer()).Dump();
}

public class NaturalStringComparer : IComparer
{
    private static readonly Regex _re = new Regex(@"(?<=\D)(?=\d)|(?<=\d)(?=\D)", RegexOptions.Compiled);

    public int Compare(string x, string y)
    {
        x = x.ToLower();
        y = y.ToLower();
        if(string.Compare(x, 0, y, 0, Math.Min(x.Length, y.Length)) == 0)
        {
            if(x.Length == y.Length) return 0;
            return x.Length < y.Length ? -1 : 1;
        }
        var a = _re.Split(x);
        var b = _re.Split(y);
        int i = 0;
        while(true)
        {
            int r = PartCompare(a[i], b[i]);
            if(r != 0) return r;
            ++i;
        }
    }

    private static int PartCompare(string x, string y)
    {
        int a, b;
        if(int.TryParse(x, out a) && int.TryParse(y, out b))
            return a.CompareTo(b);
        return x.CompareTo(y);
    }
}

结果:

1
a2
a3
a4
a10
b4
b5
b400
C1d
c1d2



6> Michael Park..:

Matthews Horsley的答案是最快的方法,它不会改变行为,具体取决于您的程序运行的Windows版本.但是,通过创建一次正则表达式并使用RegexOptions.Compiled可以更快.我还添加了插入字符串比较器的选项,以便您可以根据需要忽略大小写,并提高可读性.

    public static IEnumerable OrderByNatural(this IEnumerable items, Func selector, StringComparer stringComparer = null)
    {
        var regex = new Regex(@"\d+", RegexOptions.Compiled);

        int maxDigits = items
                      .SelectMany(i => regex.Matches(selector(i)).Cast().Select(digitChunk => (int?)digitChunk.Value.Length))
                      .Max() ?? 0;

        return items.OrderBy(i => regex.Replace(selector(i), match => match.Value.PadLeft(maxDigits, '0')), stringComparer ?? StringComparer.CurrentCulture);
    }

使用方式

var sortedEmployees = employees.OrderByNatural(emp => emp.Name);

这需要450毫秒来排序100,000个字符串,而默认的.net字符串比较需要300毫秒 - 相当快!


这值得阅读上述内容 - [正则表达式中的编译和重用](https://msdn.microsoft.com/en-us/library/8zbs0h2f.aspx)

7> Jonathan Gil..:

你需要小心 - 我隐约回想起StrCmpLogicalW,或类似的东西,并不严格传递,我观察到.NET的排序方法有时会陷入无限循环,如果比较函数打破了这个规则.

传递性比较将始终报告


@mhenry:请参阅https://connect.microsoft.com/VisualStudio/feedback/details/236900/string-compare-not-always-transitive和http://www.codeproject.com/KB/cs/StringCompareBug.aspx

8> 小智..:

这是我的代码,用于对包含字母和数字字符的字符串进行排序.

首先,这个扩展方法:

public static IEnumerable AlphanumericSort(this IEnumerable me)
{
    return me.OrderBy(x => Regex.Replace(x, @"\d+", m => m.Value.PadLeft(50, '0')));
}

然后,只需在代码中的任何位置使用它,如下所示:

List test = new List() { "The 1st", "The 12th", "The 2nd" };
test = test.AlphanumericSort();

它是如何工作的?用零替换:

  Original  | Regex Replace |      The      |   Returned
    List    | Apply PadLeft |    Sorting    |     List
            |               |               |
 "The 1st"  |  "The 001st"  |  "The 001st"  |  "The 1st"
 "The 12th" |  "The 012th"  |  "The 002nd"  |  "The 2nd"
 "The 2nd"  |  "The 002nd"  |  "The 012th"  |  "The 12th"

使用倍数:

 Alphabetical Sorting | Alphanumeric Sorting
                      |
 "Page 21, Line 42"   | "Page 3, Line 7"
 "Page 21, Line 5"    | "Page 3, Line 32"
 "Page 3, Line 32"    | "Page 21, Line 5"
 "Page 3, Line 7"     | "Page 21, Line 42"

希望这会有所帮助.



9> Wilka..:

添加到Greg Beech的答案(因为我一直在寻找它),如果你想从Linq使用它,你可以使用OrderBy它需要一个IComparer.例如:

var items = new List();

// fill items

var sorted = items.OrderBy(item => item.Name, new NaturalStringComparer());

推荐阅读
依然-狠幸福
这个屌丝很懒,什么也没留下!
DevBox开发工具箱 | 专业的在线开发工具网站    京公网安备 11010802040832号  |  京ICP备19059560号-6
Copyright © 1998 - 2020 DevBox.CN. All Rights Reserved devBox.cn 开发工具箱 版权所有