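Software generally determines the character set and encoding of a text in one of three ways: checking the signature at the start of the file, prompting the user to choose, or guessing according to a set of rules.

The most standard approach is to check the first few bytes of the text, known as the byte order mark (BOM). The leading bytes map to a charset/encoding as follows:

Leading bytes    Charset/encoding
EF BB BF         UTF-8
FE FF            UTF-16/UCS-2, big endian
FF FE            UTF-16/UCS-2, little endian
FF FE 00 00      UTF-32/UCS-4, little endian
00 00 FE FF      UTF-32/UCS-4, big endian

Note that FE FF marks big-endian UTF-16 and FF FE marks little-endian UTF-16. Because the UTF-32 little-endian signature begins with the same two bytes as the UTF-16 little-endian one, the four-byte signatures have to be tested before the two-byte ones.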
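The signature check can be sketched in a few lines of Python. This is only a minimal illustration, not a full detector: the function name `detect_bom` and its return shape are assumptions made for this example, and the byte signatures come from the constants in Python's standard `codecs` module. The four-byte UTF-32 signatures are tested first, since FF FE 00 00 begins with the same two bytes as the UTF-16 little-endian signature.

```python
import codecs

# Signatures ordered longest first, so that the UTF-32 LE mark
# (FF FE 00 00) is not mistaken for the UTF-16 LE mark (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found.

    Hypothetical helper for illustration; it only inspects the leading bytes.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + "text".encode("utf-8")
    encoding, skip = detect_bom(sample)
    if encoding is not None:
        # Skip the signature bytes before decoding the payload.
        print(encoding, sample[skip:].decode(encoding))
```

When no signature is present the function returns `(None, 0)`, and the software has to fall back on the other two options: asking the user or guessing. One caveat: a UTF-16 little-endian text whose first character is U+0000 would begin with the same four bytes as the UTF-32 little-endian signature, so this check is a heuristic rather than a guarantee.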