Analysis of accidents caused by the Unicode signature (BOM)

You may be using include files here, which is common for headers and footers. When I opened the included file, I found that the "Include Unicode Signature (BOM)" option in the page properties was checked. That BOM was the cause of the accident.

Today I ran into another BOM accident while writing a JS script.
I had included an external JS file in the page, and it contained this statement: $.getJSON("/my/newmsg", function(data){alert(data);}); Other browsers popped up the content normally, but IE did not. I was stuck for nearly an hour. I suspected that the statement was written incorrectly, that the JSON data format was wrong, even that my luck had simply run out...
Later I suspected the encoding, checked it, and found the damn BOM option checked. As soon as I unchecked it, the clouds parted and everything worked.
Although I am lazy and rarely update my blog, I had to record this incident because it was so unexpected: even JS can have accidents caused by the BOM.
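
The fix described above was to save the file without the BOM. When the source of the text is outside your control, a defensive alternative is to drop a leading U+FEFF before parsing. The following is only a minimal sketch under that assumption; `stripBom` and `parseJsonIgnoringBom` are hypothetical helper names, not part of jQuery or of the original script.

```javascript
// Remove a single leading U+FEFF (the BOM once the text has been decoded), if present.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Parse JSON text that may or may not start with a BOM.
// U+FEFF is not JSON whitespace, so it is removed before JSON.parse sees the text.
function parseJsonIgnoringBom(text) {
  return JSON.parse(stripBom(text));
}

// Usage: a response body that was saved with a BOM in front of it.
const body = "\uFEFF" + '{"count": 3}';
console.log(parseJsonIgnoringBom(body).count);
```

Text without a BOM passes through `stripBom` unchanged, so the helper can be applied unconditionally.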

The Unicode specification defines the concept of a BOM.
BOM stands for Byte Order Mark. It cannot be seen in an ordinary text editor. It sits at the very start of the file, somewhat like a file header, and generally shows up only in a binary (hex) editor.
UCS defines a character called "ZERO WIDTH NO-BREAK SPACE", whose code point is U+FEFF. U+FFFE, by contrast, is not a valid character in UCS, so it should never appear in actual transmission. The UCS specification recommends transmitting "ZERO WIDTH NO-BREAK SPACE" before the byte stream itself. If the receiver then sees the bytes FE FF, the stream is big-endian; if it sees FF FE, the stream is little-endian. For this reason the character "ZERO WIDTH NO-BREAK SPACE" is also called the BOM.
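
To make the byte-order rule concrete, here is a small sketch that writes U+FEFF as one 16-bit unit in each byte order and then classifies a stream by its first two bytes. The function names `bomBytes` and `byteOrderOf` are made up for this illustration.

```javascript
// Write U+FEFF as a single 16-bit code unit in the requested byte order
// and return the two resulting bytes.
function bomBytes(littleEndian) {
  const view = new DataView(new ArrayBuffer(2));
  view.setUint16(0, 0xfeff, littleEndian);
  return [view.getUint8(0), view.getUint8(1)];
}

// Classify a 16-bit-unit byte stream by its first two bytes.
function byteOrderOf(bytes) {
  if (bytes[0] === 0xfe && bytes[1] === 0xff) return "big-endian";
  if (bytes[0] === 0xff && bytes[1] === 0xfe) return "little-endian";
  return "unknown"; // no BOM at the start of the stream
}

console.log(byteOrderOf(bomBytes(false))); // big-endian
console.log(byteOrderOf(bomBytes(true)));  // little-endian
```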
UTF-8 does not need a BOM to indicate byte order, but it can use one to indicate the encoding. The UTF-8 encoding of "ZERO WIDTH NO-BREAK SPACE" is EF BB BF, so a receiver that gets a byte stream starting with EF BB BF can take it to be UTF-8 encoded. Windows uses the BOM in this way to mark the encoding of text files.
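
The EF BB BF signature can be checked directly on raw bytes. The sketch below assumes the file contents are already available as a `Uint8Array`; `hasUtf8Bom` and `stripUtf8Bom` are illustrative names, not an existing API.

```javascript
// The three bytes that U+FEFF becomes when encoded as UTF-8.
const UTF8_BOM = [0xef, 0xbb, 0xbf];

// True if the byte array starts with EF BB BF.
function hasUtf8Bom(bytes) {
  return UTF8_BOM.every((b, i) => bytes[i] === b);
}

// Return a view of the bytes without the leading signature, if there is one.
function stripUtf8Bom(bytes) {
  return hasUtf8Bom(bytes) ? bytes.subarray(3) : bytes;
}

// Usage: encoding U+FEFF followed by "hi" as UTF-8 yields EF BB BF 68 69.
const encoded = new TextEncoder().encode("\uFEFFhi");
console.log(hasUtf8Bom(encoded));          // true
console.log(stripUtf8Bom(encoded).length); // 2
```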
