
How do I trim leading/trailing whitespace in a standard way?


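Is there a clean, preferably standard, way to trim leading and trailing whitespace from a string in C? I would roll my own, but I would think this is a common problem with an equally common solution.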



1> Adam Rosenfi..:

If you can modify the string:

// Note: This function returns a pointer to a substring of the original string.
// If the given string was allocated dynamically, the caller must not overwrite
// that pointer with the returned value, since the original pointer must be
// deallocated using the same allocator with which it was allocated.  The return
// value must NOT be deallocated using free() etc.
char *trimwhitespace(char *str)
{
  char *end;

  // Trim leading space
  while(isspace((unsigned char)*str)) str++;

  if(*str == 0)  // All spaces?
    return str;

  // Trim trailing space
  end = str + strlen(str) - 1;
  while(end > str && isspace((unsigned char)*end)) end--;

  // Write new null terminator character
  end[1] = '\0';

  return str;
}

If you can't modify the string, then you can use basically the same method:

// Stores the trimmed input string into the given output buffer, which must be
// large enough to store the result.  If it is too small, the output is
// truncated.
size_t trimwhitespace(char *out, size_t len, const char *str)
{
  if(len == 0)
    return 0;

  const char *end;
  size_t out_size;

  // Trim leading space
  while(isspace((unsigned char)*str)) str++;

  if(*str == 0)  // All spaces?
  {
    *out = 0;
    return 0;  // Nothing copied; consistent with the out_size returned below
  }

  // Trim trailing space
  end = str + strlen(str) - 1;
  while(end > str && isspace((unsigned char)*end)) end--;
  end++;

  // Set output size to minimum of trimmed string length and buffer size minus 1
  out_size = (end - str) < len-1 ? (end - str) : len-1;

  // Copy trimmed string and add null terminator
  memcpy(out, str, out_size);
  out[out_size] = 0;

  return out_size;
}
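
To illustrate the truncation behavior described in the comment at the top of this function, here is a compact sketch of the same approach. The name trim_into is hypothetical, and returning the number of characters stored (not counting the terminator) in every case is an assumption about the intended contract, not something the answer states.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical restatement of the buffer-based version above.
 * Assumed contract: returns the number of characters stored in out,
 * not counting the terminating '\0'. */
size_t trim_into(char *out, size_t len, const char *str)
{
    assert(out != NULL && str != NULL);
    if (len == 0)
        return 0;

    /* Skip leading whitespace. */
    while (isspace((unsigned char)*str)) str++;

    /* Walk back from the terminator over trailing whitespace. */
    const char *end = str + strlen(str);
    while (end > str && isspace((unsigned char)end[-1])) end--;

    /* Truncate to fit, leaving room for the terminator. */
    size_t n = (size_t)(end - str);
    if (n > len - 1)
        n = len - 1;

    memcpy(out, str, n);
    out[n] = '\0';
    return n;
}
```

With a 16-byte buffer, "  hello  " is stored as "hello"; with a 4-byte buffer the result is cut to "hel" so the terminator still fits.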


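@nvl: No. `str` is a local variable, and changing it does not change the original pointer that was passed in. Function calls in C always pass arguments by value, never by reference.
You should mention that when the string is malloc'ed, you must keep a copy of the original pointer in the first example, or you will never be able to free it.
@Raj: There is nothing inherently wrong with returning a different address from the one that was passed in. There is no requirement here that the returned value be a valid argument to `free()`. Quite the opposite: I designed this to avoid memory allocation, for efficiency. If the passed-in address was allocated dynamically, the caller is still responsible for freeing that memory, and the caller needs to make sure not to overwrite that pointer with the value returned here.
@nvl: No memory is allocated, so there is no memory to free.
Sorry, the first answer is not good unless you don't care about memory leaks. You now have two overlapping strings (the original, which has had its trailing spaces trimmed, and the new one). Only the original string can be freed, but once you free it, the second one points to freed memory.
You must cast the argument of `isspace` to `unsigned char`, otherwise you invoke undefined behavior.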
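
The comments above about malloc'ed strings can be illustrated with a minimal sketch. It repeats the in-place function from this answer and adds a hypothetical helper, trimmed_length_of_copy, whose only purpose is to show which pointer goes to free(). The helper's name and its behavior of returning 0 on allocation failure are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Same logic as the in-place trimwhitespace() shown in this answer. */
char *trimwhitespace(char *str)
{
    char *end;

    while (isspace((unsigned char)*str)) str++;
    if (*str == 0)
        return str;

    end = str + strlen(str) - 1;
    while (end > str && isspace((unsigned char)*end)) end--;
    end[1] = '\0';
    return str;
}

/* Hypothetical helper: trims a heap copy of src and reports the trimmed
 * length, keeping the pointer from malloc() separate from the trimmed view. */
size_t trimmed_length_of_copy(const char *src)
{
    assert(src != NULL);

    size_t size = strlen(src) + 1;
    char *buf = malloc(size);          /* the only pointer valid for free() */
    if (buf == NULL)
        return 0;
    memcpy(buf, src, size);

    char *view = trimwhitespace(buf);  /* points into buf; never free() it */
    size_t len = strlen(view);

    free(buf);                         /* view is dangling from here on */
    return len;
}
```

The key point is that buf and view are kept in separate variables: writing buf = trimwhitespace(buf) would lose the address that free() needs.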

2> indiv..:

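Here is one that shifts the string to the first position of your buffer. You might want this behavior so that, if you allocated the string dynamically, you can still free it using the same pointer that trim() returns: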

char *trim(char *str)
{
    size_t len = 0;
    char *frontp = str;
    char *endp = NULL;

    if( str == NULL ) { return NULL; }
    if( str[0] == '\0' ) { return str; }

    len = strlen(str);
    endp = str + len;

    /* Move the front and back pointers to address the first non-whitespace
     * characters from each end.
     */
    while( isspace((unsigned char) *frontp) ) { ++frontp; }
    if( endp != frontp )
    {
        while( isspace((unsigned char) *(--endp)) && endp != frontp ) {}
    }

    if( frontp != str && endp == frontp )
            *str = '\0';
    else if( str + len - 1 != endp )
            *(endp + 1) = '\0';

    /* Shift the string so that it starts at str so that if it's dynamically
     * allocated, we can still free it on the returned pointer.  Note the reuse
     * of endp to mean the front of the string buffer now.
     */
    endp = str;
    if( frontp != str )
    {
            while( *frontp ) { *endp++ = *frontp++; }
            *endp = '\0';
    }

    return str;
}

Test for correctness:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Paste function from above here. */

int main()
{
    /* The test prints the following:
    [nothing to trim] -> [nothing to trim]
    [    trim the front] -> [trim the front]
    [trim the back     ] -> [trim the back]
    [    trim front and back     ] -> [trim front and back]
    [ trim one char front and back ] -> [trim one char front and back]
    [ trim one char front] -> [trim one char front]
    [trim one char back ] -> [trim one char back]
    [                   ] -> []
    [ ] -> []
    [a] -> [a]
    [] -> []
    */

    char *sample_strings[] =
    {
            "nothing to trim",
            "    trim the front",
            "trim the back     ",
            "    trim front and back     ",
            " trim one char front and back ",
            " trim one char front",
            "trim one char back ",
            "                   ",
            " ",
            "a",
            "",
            NULL
    };
    char test_buffer[64];
    char comparison_buffer[64];
    size_t index, compare_pos;

    for( index = 0; sample_strings[index] != NULL; ++index )
    {
        // Fill buffer with known value to verify we do not write past the end of the string.
        memset( test_buffer, 0xCC, sizeof(test_buffer) );
        strcpy( test_buffer, sample_strings[index] );
        memcpy( comparison_buffer, test_buffer, sizeof(comparison_buffer));

        printf("[%s] -> [%s]\n", sample_strings[index],
                                 trim(test_buffer));

        for( compare_pos = strlen(comparison_buffer);
             compare_pos < sizeof(comparison_buffer);
             ++compare_pos )
        {
            if( test_buffer[compare_pos] != comparison_buffer[compare_pos] )
            {
                printf("Unexpected change to buffer @ index %zu: %02x (expected %02x)\n",
                    compare_pos, (unsigned char) test_buffer[compare_pos], (unsigned char) comparison_buffer[compare_pos]);
            }
        }
    }

    return 0;
}

The source file was trim.c. Compiled with 'cc trim.c -o trim'.



3> jkramer..:

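My solution. The string must be modifiable. Its advantage over some of the other solutions is that it moves the non-space part to the beginning, so you can keep using the old pointer in case you have to free it later.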

void trim(char * s) {
    char * p = s;
    int l = strlen(p);

    while(isspace(p[l - 1])) p[--l] = 0;
    while(* p && isspace(* p)) ++p, --l;

    memmove(s, p, l + 1);
}   

This version creates a copy of the string with strndup() instead of editing it in place. strndup() requires _GNU_SOURCE, so you may need to write your own strndup() with malloc() and strncpy().

char * trim(char * s) {
    int l = strlen(s);

    while(isspace(s[l - 1])) --l;
    while(* s && isspace(* s)) ++s, --l;

    return strndup(s, l);
}
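
For platforms without strndup(), a stand-in along the lines suggested above could look like the sketch below. The name portable_strndup is hypothetical, and it uses memcpy() rather than strncpy() because the length is already known at that point; otherwise it is meant to behave like strndup(), copying at most n characters and always terminating the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strndup(): returns a malloc'ed copy of at
 * most n characters of s, always '\0'-terminated, or NULL on failure.
 * The caller owns the result and must free() it. */
char *portable_strndup(const char *s, size_t n)
{
    assert(s != NULL);

    /* Length of s, capped at n, without reading past the terminator. */
    size_t len = 0;
    while (len < n && s[len] != '\0') len++;

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```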


`trim()` invokes UB if `s` is `""`, because the first `isspace()` call would be `isspace(p[-1])`, and `p[-1]` does not necessarily refer to a valid location.
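
One way to address the comment above is to check the length before indexing and to cast the isspace() argument to unsigned char. The following is a sketch of the in-place version with those two guards added; the name trim_guarded is hypothetical, and this is not the answer author's code.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical guarded variant of the in-place trim() above. */
void trim_guarded(char *s)
{
    assert(s != NULL);

    char *p = s;
    size_t l = strlen(p);

    /* l > 0 keeps p[l - 1] inside the string, even when s is "". */
    while (l > 0 && isspace((unsigned char)p[l - 1])) p[--l] = '\0';

    /* Skip leading whitespace, shrinking the remaining length. */
    while (*p && isspace((unsigned char)*p)) ++p, --l;

    /* Move the trimmed text, including its terminator, to the front. */
    memmove(s, p, l + 1);
}
```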

4> 小智..:

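Here is my C mini library for trimming left, right, both, and all, in place and to a separate buffer, and for trimming a specified set of characters (or whitespace by default).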

Contents of strlib.h:

#ifndef STRLIB_H_
#define STRLIB_H_ 1
enum strtrim_mode_t {
    STRLIB_MODE_ALL       = 0, 
    STRLIB_MODE_RIGHT     = 0x01, 
    STRLIB_MODE_LEFT      = 0x02, 
    STRLIB_MODE_BOTH      = 0x03
};

char *strcpytrim(char *d, // destination
                 char *s, // source
                 int mode,
                 char *delim
                 );

char *strtriml(char *d, char *s);
char *strtrimr(char *d, char *s);
char *strtrim(char *d, char *s); 
char *strkill(char *d, char *s);

char *triml(char *s);
char *trimr(char *s);
char *trim(char *s);
char *kill(char *s);
#endif

Contents of strlib.c:

#include "strlib.h"

char *strcpytrim(char *d, // destination
                 char *s, // source
                 int mode,
                 char *delim
                 ) {
    char *o = d; // save orig
    char *e = 0; // end space ptr.
    char dtab[256] = {0};
    if (!s || !d) return 0;

    if (!delim) delim = " \t\n\f";
    while (*delim) 
        dtab[(unsigned char)*delim++] = 1;  // cast: plain char may be negative

    while ( (*d = *s++) != 0 ) { 
        if (!dtab[0xFF & (unsigned int)*d]) { // Not a match char
            e = 0;       // Reset end pointer
        } else {
            if (!e) e = d;  // Found first match.

            if ( mode == STRLIB_MODE_ALL || ((mode != STRLIB_MODE_RIGHT) && (d == o)) ) 
                continue;
        }
        d++;
    }
    if (mode != STRLIB_MODE_LEFT && e) { // for everything but trim_left, delete trailing matches.
        *e = 0;
    }
    return o;
}

// perhaps these could be inlined in strlib.h
char *strtriml(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_LEFT, 0); }
char *strtrimr(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_RIGHT, 0); }
char *strtrim(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_BOTH, 0); }
char *strkill(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_ALL, 0); }

char *triml(char *s) { return strcpytrim(s, s, STRLIB_MODE_LEFT, 0); }
char *trimr(char *s) { return strcpytrim(s, s, STRLIB_MODE_RIGHT, 0); }
char *trim(char *s) { return strcpytrim(s, s, STRLIB_MODE_BOTH, 0); }
char *kill(char *s) { return strcpytrim(s, s, STRLIB_MODE_ALL, 0); }

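The one main routine does it all. It trims in place if src == dst; otherwise it works like the strcpy routines. It trims the set of characters specified in the string delim, or whitespace if delim is null. It trims left, right, both, and all (like tr). There is not much to it, and it iterates over the string only once. Some people might complain that the right trim starts on the left; however, no strlen is needed, and strlen starts on the left anyway. (One way or another, you have to get to the end of the string for right trims, so you may as well do the work as you go.) There may be arguments to be made about pipelining, cache sizes and so on; who knows. Since the solution works from left to right and iterates only once, it can also be extended to work on streams. Limitation: it does not work on Unicode strings.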
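
The single-pass idea described above (a 256-entry lookup table for the character set, plus a pointer that remembers where the current trailing run began) can be sketched on its own. This is a simplified illustration that handles only the trim-both-ends case; the name copy_trim_set and its exact signature are hypothetical and not part of strlib.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-pass trim of both ends. Copies src to dst while
 * dropping leading and trailing characters found in set (whitespace if
 * set is NULL). Works in place when dst == src. */
char *copy_trim_set(char *dst, const char *src, const char *set)
{
    assert(dst != NULL && src != NULL);

    unsigned char table[256] = {0};
    char *out = dst;
    char *run = NULL;  /* start in dst of the current trailing run, if any */

    if (set == NULL)
        set = " \t\n\f";
    for (; *set; set++)
        table[(unsigned char)*set] = 1;

    for (; *src; src++) {
        if (table[(unsigned char)*src]) {
            if (out == dst)
                continue;      /* still in the leading run: skip it */
            if (run == NULL)
                run = out;     /* a possible trailing run starts here */
        } else {
            run = NULL;        /* the run turned out to be interior */
        }
        *out++ = *src;
    }

    if (run != NULL)
        out = run;             /* cut off the trailing run */
    *out = '\0';
    return dst;
}
```

Because out never gets ahead of src, the copy is safe when both point to the same buffer, which is what makes the in-place use possible.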



5> Swiss..:

Here is my attempt at a simple, yet correct, in-place trim function.

void trim(char *str)
{
    int i;
    int begin = 0;
    int end = strlen(str) - 1;

    while (isspace((unsigned char) str[begin]))
        begin++;

    while ((end >= begin) && isspace((unsigned char) str[end]))
        end--;

    // Shift all characters back to the start of the string array.
    for (i = begin; i <= end; i++)
        str[i - begin] = str[i];

    str[i - begin] = '\0'; // Null terminate string.
}


Suggest changing to `while ((end >= begin) && isspace(str[end]))` to prevent UB when `str` is `""`. This prevents `str[-1]`.

6> chux - Reins..:

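Late to the trim party.

Features:
1. Trim the beginning quickly, as in a number of other answers.
2. After going to the end, trim the right with only 1 test per loop. (Like @jfm3, but works for an all-whitespace string.)
3. To avoid undefined behavior when char is a signed char, cast *s to unsigned char.

Character handling <ctype.h>: "In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined." C11 §7.4 1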

#include <ctype.h>   // isspace()
#include <string.h>  // memmove(), used by the shifting variant

// Return a pointer to the trimmed string
char *string_trim_inplace(char *s) {
  while (isspace((unsigned char) *s)) s++;
  if (*s) {
    char *p = s;
    while (*p) p++;
    while (isspace((unsigned char) *(--p)));
    p[1] = '\0';
  }

  // If desired, shift the trimmed string

  return s;
}

@chqrlie commented that the above does not shift the trimmed string. To do so....

// Return a pointer to the (shifted) trimmed string
char *string_trim_inplace(char *s) {
  char *original = s;
  size_t len = 0;

  while (isspace((unsigned char) *s)) {
    s++;
  } 
  if (*s) {
    char *p = s;
    while (*p) p++;
    while (isspace((unsigned char) *(--p)));
    p[1] = '\0';
    // len = (size_t) (p - s);   // older errant code
    len = (size_t) (p - s + 1);  // Thanks to @theriver
  }

  return (s == original) ? s : memmove(original, s, len + 1);
}


Yes, finally someone who knows about the ctype undefined behavior.
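
The rule quoted from the C standard in this answer can be wrapped once so that callers cannot forget the cast. The sketch below is only an illustration; char_is_space and leading_space_count are hypothetical names, not functions from any answer in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical wrapper: converts to unsigned char before calling
 * isspace(), so a negative plain char never reaches the ctype function. */
int char_is_space(char c)
{
    return isspace((unsigned char)c) != 0;
}

/* Hypothetical example use: counts the leading whitespace characters of s.
 * The loop stops at the terminator because '\0' is not whitespace. */
size_t leading_space_count(const char *s)
{
    assert(s != NULL);

    size_t n = 0;
    while (char_is_space(s[n])) n++;
    return n;
}
```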