
C# Sanitize File Name


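I have recently been moving a bunch of MP3s from various locations into a repository. I had been constructing the new file names using the ID3 tags (thanks, TagLib-Sharp!), and I noticed that I was getting a System.NotSupportedException:

"The given path's format is not supported."

This was generated by either File.Copy() or Directory.CreateDirectory().

It didn't take long to realize that my file names needed to be sanitized. So I did the obvious thing: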

public static string SanitizePath_(string path, char replaceChar)
{
    string dir = Path.GetDirectoryName(path);
    foreach (char c in Path.GetInvalidPathChars())
        dir = dir.Replace(c, replaceChar);

    string name = Path.GetFileName(path);
    foreach (char c in Path.GetInvalidFileNameChars())
        name = name.Replace(c, replaceChar);

    return dir + name;
}

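To my surprise, I continued to get exceptions. It turned out that ':' is not in the set returned by Path.GetInvalidPathChars(), because it is valid in a path root. I suppose that makes sense, but this has to be a pretty common problem. Does anyone have some short code that sanitizes a path? The most thorough I've come up with is this, but it feels like it is probably overkill.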

    // replaces invalid characters with replaceChar
    public static string SanitizePath(string path, char replaceChar)
    {
        // construct a list of characters that can't show up in filenames.
        // need to do this because ":" is not in InvalidPathChars
        if (_BadChars == null)
        {
            _BadChars = new List<char>(Path.GetInvalidFileNameChars());
            _BadChars.AddRange(Path.GetInvalidPathChars());
            _BadChars = Utility.GetUnique(_BadChars);
        }

        // remove root
        string root = Path.GetPathRoot(path);
        path = path.Remove(0, root.Length);

        // split on the directory separator character. Need to do this
        // because the separator is not valid in a filename.
        List<string> parts = new List<string>(path.Split(new char[]{Path.DirectorySeparatorChar}));

        // check each part to make sure it is valid.
        for (int i = 0; i < parts.Count; i++)
        {
            string part = parts[i];
            foreach (char c in _BadChars)
            {
                part = part.Replace(c, replaceChar);
            }
            parts[i] = part;
        }

        return root + Utility.Join(parts, Path.DirectorySeparatorChar.ToString());
    }

Any improvements that make this function faster and less baroque would be much appreciated.



1> Andre..:

To clean up a file name you could do this:

private static string MakeValidFileName( string name )
{
   string invalidChars = System.Text.RegularExpressions.Regex.Escape( new string( System.IO.Path.GetInvalidFileNameChars() ) );
   string invalidRegStr = string.Format( @"([{0}]*\.+$)|([{0}]+)", invalidChars );

   return System.Text.RegularExpressions.Regex.Replace( name, invalidRegStr, "_" );
}


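Nice approach. Don't forget, though, that reserved words will still bite you, and you will be left scratching your head. Source: [Wikipedia, Filename reserved words](http://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words)
Maybe, but this code certainly helped me when I ran into the same problem :)
Another potentially great SO user walks away... This function is great. Thanks Adrevdm...
A period is only an invalid character when it is at the end of the file name, so `GetInvalidFileNameChars` does not include it. Windows does not throw an exception for it, it just strips the trailing periods, but that can cause unexpected behavior if you expect them to be there. I modified the regex to handle this case, so that `.` is treated as one of the invalid characters if it is at the end of the string.
The question was about paths, not file names, and the invalid characters for those are different.
I hope nobody minds: I tweaked the regex slightly so that a run of adjacent invalid characters is replaced by a single underscore.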
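Since the question is about whole paths while the answer above handles a single file name, the two can be combined by cleaning each path segment separately and leaving the root alone. The following is only a minimal sketch of that idea, not code from the thread: the `PathSanitizer` class and its `SanitizePath` method are hypothetical names, and it assumes a runtime where `Path.GetPathRoot` does not throw on invalid characters (this is the case on .NET Core and later; .NET Framework validates the path and may throw `ArgumentException`).

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original thread.
public static class PathSanitizer
{
    // Replaces characters that are invalid in a file name within every
    // segment of the path, while leaving the root (e.g. "C:\") untouched.
    public static string SanitizePath(string path, char replaceChar = '_')
    {
        if (path == null) throw new ArgumentNullException(nameof(path));

        // The invalid set is platform dependent: many characters on Windows,
        // only '\0' and '/' on Unix-like systems.
        char[] invalid = Path.GetInvalidFileNameChars();

        // GetPathRoot can return null (e.g. for an empty path).
        string root = Path.GetPathRoot(path) ?? string.Empty;

        // Split on both separators first, because a separator is itself
        // invalid inside a file name and would otherwise be replaced.
        var parts = path.Substring(root.Length)
            .Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
            .Select(part => new string(part
                .Select(c => Array.IndexOf(invalid, c) >= 0 ? replaceChar : c)
                .ToArray()));

        return root + string.Join(Path.DirectorySeparatorChar.ToString(), parts);
    }
}
```

On Windows, a colon inside a segment (for example in an album title) is replaced, while the colon in a drive root such as `C:\` is preserved because the root is never scanned. Reserved device names and trailing periods are not handled here.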

2> DenNukem..:

A shorter solution:

var invalids = System.IO.Path.GetInvalidFileNameChars();
var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries) ).TrimEnd('.');



3> fiat..:

Based on Andre's excellent answer, but taking into account Spud's comment on reserved words, I made this version:

/// <summary>
/// Strip illegal chars and reserved words from a candidate filename (should not include the directory path)
/// </summary>
/// <remarks>
/// http://stackoverflow.com/questions/309485/c-sharp-sanitize-file-name
/// </remarks>
public static string CoerceValidFileName(string filename)
{
    var invalidChars = Regex.Escape(new string(Path.GetInvalidFileNameChars()));
    var invalidReStr = string.Format(@"[{0}]+", invalidChars);

    var reservedWords = new []
    {
        "CON", "PRN", "AUX", "CLOCK$", "NUL", "COM0", "COM1", "COM2", "COM3", "COM4",
        "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
        "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };

    var sanitisedNamePart = Regex.Replace(filename, invalidReStr, "_");
    foreach (var reservedWord in reservedWords)
    {
        var reservedWordPattern = string.Format("^{0}\\.", reservedWord);
        sanitisedNamePart = Regex.Replace(sanitisedNamePart, reservedWordPattern, "_reservedWord_.", RegexOptions.IgnoreCase);
    }

    return sanitisedNamePart;
}

These are my unit tests:

[Test]
public void CoerceValidFileName_SimpleValid()
{
    var filename = @"thisIsValid.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual(filename, result);
}

[Test]
public void CoerceValidFileName_SimpleInvalid()
{
    var filename = @"thisIsNotValid\3\\_3.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("thisIsNotValid_3__3.txt", result);
}

[Test]
public void CoerceValidFileName_InvalidExtension()
{
    var filename = @"thisIsNotValid.t\xt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("thisIsNotValid.t_xt", result);
}

[Test]
public void CoerceValidFileName_KeywordInvalid()
{
    var filename = "aUx.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("_reservedWord_.txt", result);
}

[Test]
public void CoerceValidFileName_KeywordValid()
{
    var filename = "auxillary.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("auxillary.txt", result);
}


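A minor suggestion, since the method looks like it is heading in this direction: add the `this` keyword and it becomes a handy extension method: `public static string CoerceValidFileName(this string filename)`
Small bug: this method does not change reserved words that have no file extension (e.g. COM1), which are not allowed either. The suggested fix is to change reservedWordPattern to `"^{0}(\\.|$)"` and the replacement string to `"_reservedWord_$1"`.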
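The fix suggested in the last comment can be sketched as a single regex that matches a reserved word followed by either a period or the end of the string. This is an illustrative sketch only: the `ReservedNames` class and `EscapeReservedName` method are hypothetical names, the word list is the one from the answer above, and the words are folded into one alternation instead of the answer's per-word loop.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ReservedNames
{
    // Group 1: the reserved word (same list as the answer above).
    // Group 2: the period that follows it, or empty at end of string.
    private static readonly Regex Reserved = new Regex(
        @"^(CON|PRN|AUX|CLOCK\$|NUL|COM[0-9]|LPT[0-9])(\.|$)",
        RegexOptions.IgnoreCase);

    // Replaces a leading reserved word with a placeholder, keeping the
    // period (and therefore the extension) when there is one.
    public static string EscapeReservedName(string filename)
    {
        return Reserved.Replace(filename, "_reservedWord_$2");
    }
}
```

With this pattern, `COM1` becomes `_reservedWord_` and `aUx.txt` becomes `_reservedWord_.txt`, while `auxillary.txt` is left unchanged because the reserved word is not followed by a period or the end of the string.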

4> data..:
string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars()));


Consider `String.Concat(dirty...)` instead of `Join(String.Empty...`