/usr/web/sources/patch/applied/file-utf-parse/readme


change character classification from testing for unicode first to this
priority: ascii, utf8, binary, latin1.  use a private function to
recognize utf8.  these changes allow us to recognize utf sequences for
code points above 0xffff, up to 0x10ffff, as well as latin1.  dbcs
recognition is also possible; that code is deferred to a subsequent
patch.

the utf-8 range 0xa0-0xff is now called "latin", not "Extended Latin".
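
the priority above can be sketched roughly as follows.  this is not the
patched code: it is a minimal illustration in portable c, and the names
classify, utflen, isctl and the enum values are hypothetical.  it
assumes that "binary" means the buffer holds control bytes or bytes in
0x80-0x9f, and that latin1 is whatever remains once utf8 and binary
have been ruled out.

```c
#include <stddef.h>

/* hypothetical class names, listed in the readme's priority order */
enum { Ascii, Utf8, Binary, Latin1 };

/* assumption: tab, newline, form feed and return count as text */
static int
isctl(unsigned char c)
{
	if(c == '\t' || c == '\n' || c == '\f' || c == '\r')
		return 0;
	return c < 0x20 || c == 0x7f;
}

/*
 * length in bytes of the utf-8 sequence at p, or 0 if it is invalid.
 * accepts code points up to 0x10ffff; rejects overlong forms and
 * surrogates.
 */
static size_t
utflen(const unsigned char *p, size_t n)
{
	unsigned long c, min;
	size_t i, len;

	if(n == 0)
		return 0;
	if(p[0] < 0x80)
		return 1;
	if((p[0] & 0xe0) == 0xc0){
		len = 2; c = p[0] & 0x1f; min = 0x80;
	}else if((p[0] & 0xf0) == 0xe0){
		len = 3; c = p[0] & 0x0f; min = 0x800;
	}else if((p[0] & 0xf8) == 0xf0){
		len = 4; c = p[0] & 0x07; min = 0x10000;
	}else
		return 0;
	if(len > n)
		return 0;
	for(i = 1; i < len; i++){
		if((p[i] & 0xc0) != 0x80)
			return 0;
		c = (c << 6) | (p[i] & 0x3f);
	}
	if(c < min || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
		return 0;
	return len;
}

/* classify a buffer in the order ascii, utf8, binary, latin1 */
int
classify(const unsigned char *buf, size_t n)
{
	size_t i, len;
	int high, ctl, c1;

	high = ctl = c1 = 0;
	for(i = 0; i < n; i++){
		if(isctl(buf[i]))
			ctl = 1;
		else if(buf[i] >= 0x80){
			high = 1;
			if(buf[i] < 0xa0)
				c1 = 1;
		}
	}
	if(!high && !ctl)
		return Ascii;
	if(!ctl){
		for(i = 0; i < n; i += len)
			if((len = utflen(buf+i, n-i)) == 0)
				break;
		if(i == n)
			return Utf8;
	}
	if(ctl || c1)
		return Binary;
	return Latin1;	/* remaining high bytes are all 0xa0-0xff */
}
```

under these assumptions the four bytes f0 9f 98 80 (code point 0x1f600,
above 0xffff) classify as utf8, while "caf" followed by the single byte
e9 falls through to latin1 because e9 does not start a complete utf-8
sequence.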
