
Duplicates of this ticket: 8EB8F194

Status: open, reported by Mike Czepiel on 2006-09-13 (request)

Option to write/not write a BOM on UTF-8 files

Pretty sure TextMate is always saving a BOM (byte order mark) on UTF-8 encoded files.

Would be nice to have an option to not save the BOM, or to strip it from existing files. Have been running into issues with some server configurations.
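For illustration, stripping the mark outside the editor could look like the following. This is only a minimal sketch in Python, not anything TextMate provides; the function name `strip_utf8_bom` is made up for the example, and the only assumption is that the UTF-8 BOM is the byte sequence EF BB BF at the very start of the file.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    # codecs.BOM_UTF8 is the three-byte sequence b'\xef\xbb\xbf'.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    # No BOM at the start: leave the content untouched.
    return data
```

Reading a file in binary mode, passing its bytes through this function, and writing the result back would remove the mark without touching anything else in the file.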

Additional information on the matter is located here:
http://www.w3.org/International/questions/qa-utf8-bom

From my understanding, the BOM is optional in UTF-8 anyway, as the byte order is irrelevant, unlike in UTF-16 and UTF-32:


"In UTF-16 and UTF-32 encodings, unless there is some alternative indicator, the BOM is essential to ensure correct interpretation of the file's contents. Each character in the file is represented by 2 or 4 bytes of data and the order in which these bytes are stored in the file is significant; the BOM indicates this order.

In the UTF-8 encoding, the presence of the BOM is not essential because, unlike the UTF-16 or UTF-32 encodings, there is no alternative sequence of bytes in a character."

Note added by Allan Odgaard on 2006-09-13 23:43:53

TextMate will preserve the BOM already in the file, but it will never add one to UTF-8 files.

So you likely created the files in another editor, which placed a BOM in them.

As such, TM ought to strip the BOM from UTF-8 files automatically: it is not only optional, it is downright counterproductive for ASCII compatibility. The current behavior exists mainly because a) TM tries to preserve line endings, encoding, and thus the BOM, to cause the least amount of trouble, and b) some users actually rely on the BOM (they send web pages with a BOM instead of the proper Content-Type encoding and rely on browsers to pick up on it).
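The preserve-but-never-add behavior described above, together with the requested strip option, could be sketched roughly as follows. This is not TextMate's actual code; the names `decode_utf8`, `encode_utf8`, and the `strip_bom` flag are hypothetical and exist only to illustrate the policy.

```python
import codecs


def decode_utf8(data: bytes):
    """Decode UTF-8 bytes, returning (text, had_bom)."""
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        # Drop the mark so it does not end up in the text as U+FEFF.
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8"), had_bom


def encode_utf8(text: str, had_bom: bool, strip_bom: bool = False) -> bytes:
    """Encode text as UTF-8, re-adding a BOM only if the file had one."""
    body = text.encode("utf-8")
    # Never add a BOM to a file that did not have one; keep an existing
    # BOM unless the (hypothetical) strip option is turned on.
    if had_bom and not strip_bom:
        return codecs.BOM_UTF8 + body
    return body
```

With `strip_bom` left off, a file round-trips byte for byte, which matches the "cause the least amount of trouble" goal; turning it on gives the behavior requested in this ticket.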