How do I check whether a character is a letter in Java?

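How do I check whether a one-character string is a letter, including any letter with an accent?

I had to solve this myself recently, so after a recent VB6 question reminded me of it, I'll answer my own question.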



1> Michael Myer..:

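Character.isLetter() is much faster than string.matches(), because string.matches() compiles a new Pattern on every call. Even with a cached Pattern, I think isLetter() will still beat it.


Edit: I ran into this again and thought I'd try to come up with some actual numbers. Here is my attempt at a benchmark that checks all three methods (matches() with and without a cached Pattern, and Character.isLetter()). I also made sure both valid and invalid characters were checked, so as not to skew the results.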

import java.util.regex.*;

class TestLetter {
    private static final Pattern ONE_CHAR_PATTERN = Pattern.compile("\\p{L}");
    private static final int NUM_TESTS = 10000000;

    public static void main(String[] args) {
        long start = System.nanoTime();
        int counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testMatches(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of Pattern.matches() took " +
                (System.nanoTime()-start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");
        /*********************************/
        start = System.nanoTime();
        counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testCharacter(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of isLetter() took " +
                (System.nanoTime()-start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");
        /*********************************/
        start = System.nanoTime();
        counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testMatchesNoCache(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of String.matches() took " +
                (System.nanoTime()-start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");
    }

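    // Cached: reuses the Pattern compiled once in ONE_CHAR_PATTERN.
    private static boolean testMatches(final String c) {
        return ONE_CHAR_PATTERN.matcher(c).matches();
    }
    // Uncached: String.matches() compiles the regex again on every call.
    private static boolean testMatchesNoCache(final String c) {
        return c.matches("\\p{L}");
    }
    // No regex: looks at the first UTF-16 char of the string only.
    private static boolean testCharacter(final String c) {
        return Character.isLetter(c.charAt(0));
    }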
}

My output:

10000000 tests of Pattern.matches() took 4325146672 ns.
There were 4062500/10000000 valid characters
10000000 tests of isLetter() took 546031201 ns.
There were 4062500/10000000 valid characters
10000000 tests of String.matches() took 11900205444 ns.
There were 4062500/10000000 valid characters

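So isLetter() is almost 8 times faster, even compared with a cached Pattern. (And the uncached version is nearly 3 times slower than the cached one.)


You should use `c.codePointAt(0)` instead of `c.charAt(0)` in `testCharacter()`; otherwise characters outside the BMP will fail.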
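
To illustrate the comment above, here is a minimal sketch of a code-point-aware check. The class name CodePointLetterCheck and the helper isSingleLetter() are made up for this example and are not part of the benchmark; the sketch assumes "one character" means exactly one Unicode code point.

```java
final class CodePointLetterCheck {
    // Hypothetical helper: true only if s holds exactly one Unicode
    // code point and that code point is a letter.
    static boolean isSingleLetter(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        // A character outside the BMP occupies two chars (a surrogate
        // pair), so count code points rather than relying on length().
        if (s.codePointCount(0, s.length()) != 1) {
            return false;
        }
        return Character.isLetter(s.codePointAt(0));
    }

    public static void main(String[] args) {
        // U+1D49C MATHEMATICAL SCRIPT CAPITAL A, written as a surrogate pair.
        String scriptA = "\uD835\uDC9C";
        // charAt(0) sees only the high surrogate, which is not a letter.
        System.out.println(Character.isLetter(scriptA.charAt(0)));
        // codePointAt(0) sees the whole character.
        System.out.println(isSingleLetter(scriptA));
    }
}
```

For strings that only ever contain BMP characters, both variants should give the same answer; the difference only shows up for surrogate pairs.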

2> Peter Hilton..:

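Just checking whether a letter is in A-Z is not enough, because that range does not include letters with accents or letters from other alphabets.

I found out that you can use the regular expression class for "Unicode letter", or one of its case-sensitive variants: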
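
As a rough illustration of that point, the sketch below compares a plain range check with Character.isLetter(). The class AsciiVersusUnicode and the helper isAsciiLetter() are hypothetical names used only for this example.

```java
final class AsciiVersusUnicode {
    // Naive range check: accepts only the 52 unaccented Latin letters.
    static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        // Plain e, e with acute accent (U+00E9), Cyrillic Zhe (U+0416), digit 7.
        char[] samples = { 'e', '\u00E9', '\u0416', '7' };
        for (char c : samples) {
            System.out.println(Integer.toHexString(c)
                    + " range check: " + isAsciiLetter(c)
                    + ", isLetter: " + Character.isLetter(c));
        }
    }
}
```

The two checks agree on the plain letter and the digit, and disagree on the accented and Cyrillic letters.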

string.matches("\\p{L}"); // Unicode letter
string.matches("\\p{Lu}"); // Unicode upper-case letter

You can also do this with the Character class:

Character.isLetter(character);

But that is less convenient if you need to check more than one letter.
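
For the multi-letter case, one possible approach (a sketch, assuming Java 8 or later for String.codePoints()) is to test every code point, or to reuse a precompiled "\\p{L}+" Pattern. The class AllLettersCheck and both method names are invented for this example.

```java
import java.util.regex.Pattern;

final class AllLettersCheck {
    // Compiled once and reused, so the regex is not recompiled per call.
    private static final Pattern LETTERS = Pattern.compile("\\p{L}+");

    // Hypothetical helper: every code point in s must be a letter.
    static boolean allLetters(String s) {
        return s != null
                && !s.isEmpty()
                && s.codePoints().allMatch(Character::isLetter);
    }

    // Hypothetical regex equivalent: the whole string is one or more letters.
    static boolean allLettersRegex(String s) {
        return s != null && LETTERS.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "caf" followed by e with acute accent (U+00E9).
        System.out.println(allLetters("caf\u00E9"));
        System.out.println(allLettersRegex("caf\u00E9"));
        System.out.println(allLetters("abc1"));
    }
}
```

Both helpers reject the empty string here; whether an empty string should count as "all letters" is a design choice left to the caller.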
