unicode

Version: 2001-05-11 (openSuse - 09/10/07)

Section: 7 (Miscellaneous)

NAME

Unicode - the Universal Character Set

DESCRIPTION

The international standard ISO 10646 defines the Universal Character Set (UCS). UCS contains all characters of all other character set standards. It also guarantees round-trip compatibility: conversion tables can be built such that no information is lost when a string is converted from any other encoding to UCS and back.

UCS contains the characters required to represent practically all known languages. This includes not only the Latin, Greek, Cyrillic, Hebrew, Arabic, Armenian, and Georgian scripts, but also the Han ideographs used in China, Japan, and Korea, as well as Hiragana, Katakana, Hangul, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Khmer, Bopomofo, Tibetan, Runic, Ethiopic, Canadian Syllabics, Cherokee, Mongolian, Ogham, Myanmar, Sinhala, Thaana, Yi, and others. For scripts not yet covered, research on how best to encode them for computer use is still going on, and they will be added eventually. Candidates include not only hieroglyphs and various historic Indo-European scripts, but also selected constructed scripts such as Tengwar, Cirth, and Klingon. In addition to these scripts, UCS covers a large number of graphical, typographical, mathematical, and scientific symbols, including those provided by TeX, PostScript, APL, MS-DOS, MS-Windows, Macintosh, OCR fonts, and many word processing and publishing systems.

The UCS standard (ISO 10646) describes a 31-bit character set architecture consisting of 128 24-bit groups. Each group is divided into 256 16-bit planes, and each plane is made up of 256 8-bit rows with 256 column positions, one for each character. Part 1 of the standard (ISO 10646-1) defines the first 65534 code positions (0x0000 to 0xfffd), which form the Basic Multilingual Plane (BMP), that is, plane 0 of group 0. Part 2 of the standard (ISO 10646-2) adds characters to group 0 outside the BMP, in supplementary planes in the range 0x10000 to 0x10ffff. There are no plans to add characters beyond 0x10ffff to the standard, so for the foreseeable future only a small fraction of group 0, out of the entire code space, will actually be used. The BMP contains all characters found in the other commonly used character sets. The supplementary planes added by ISO 10646-2 cover only more exotic characters needed for special scientific, dictionary printing, publishing industry, higher-level protocol, and enthusiast purposes.
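The group, plane, row, and column layout described above maps directly onto bit fields of a code value. The following is a minimal sketch in C; the names struct ucs_position, ucs_split, and ucs_in_bmp are hypothetical helpers introduced for illustration and are not part of any library.

```c
#include <stdint.h>

/* Hypothetical helper type: the four coordinates of a UCS code position. */
struct ucs_position {
    unsigned group;   /* bits 24..30, 0..127 */
    unsigned plane;   /* bits 16..23, 0..255 */
    unsigned row;     /* bits 8..15,  0..255 */
    unsigned column;  /* bits 0..7,   0..255 */
};

/* Split a 31-bit code value into group, plane, row, and column. */
struct ucs_position ucs_split(uint32_t code)
{
    struct ucs_position p;
    p.group  = (code >> 24) & 0x7f;
    p.plane  = (code >> 16) & 0xff;
    p.row    = (code >> 8) & 0xff;
    p.column = code & 0xff;
    return p;
}

/* Non-zero if the code lies in the BMP (plane 0 of group 0). */
int ucs_in_bmp(uint32_t code)
{
    struct ucs_position p = ucs_split(code);
    return p.group == 0 && p.plane == 0;
}
```

For instance, ucs_split(0x10ffff) yields group 0 and plane 0x10, the last supplementary plane mentioned above.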

The representation of each UCS character as a 2-byte word is called the UCS-2 form (BMP characters only), whereas UCS-4 represents each character as a 4-byte word. In addition, there are two encoding forms: UTF-8, for backward compatibility with software that processes ASCII, and UTF-16, which lets UCS-2 software handle non-BMP characters up to 0x10ffff in a backward-compatible way.
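To make the difference between these forms concrete, here is a hedged sketch of encoding a single code value as UTF-8 and as UTF-16. The function names utf8_encode and utf16_encode are hypothetical; the bit layouts and the surrogate range 0xd800 to 0xdfff come from the general definitions of UTF-8 and UTF-16, not from this page.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code value (0..0x10ffff, surrogates excluded) as UTF-8.
   Writes up to 4 bytes and returns the byte count, or 0 if the value
   is outside the supported range. */
size_t utf8_encode(uint32_t code, unsigned char out[4])
{
    if (code < 0x80) {                      /* ASCII: one unchanged byte */
        out[0] = (unsigned char)code;
        return 1;
    }
    if (code < 0x800) {
        out[0] = (unsigned char)(0xc0 | (code >> 6));
        out[1] = (unsigned char)(0x80 | (code & 0x3f));
        return 2;
    }
    if (code >= 0xd800 && code <= 0xdfff)   /* reserved for UTF-16 */
        return 0;
    if (code < 0x10000) {
        out[0] = (unsigned char)(0xe0 | (code >> 12));
        out[1] = (unsigned char)(0x80 | ((code >> 6) & 0x3f));
        out[2] = (unsigned char)(0x80 | (code & 0x3f));
        return 3;
    }
    if (code <= 0x10ffff) {
        out[0] = (unsigned char)(0xf0 | (code >> 18));
        out[1] = (unsigned char)(0x80 | ((code >> 12) & 0x3f));
        out[2] = (unsigned char)(0x80 | ((code >> 6) & 0x3f));
        out[3] = (unsigned char)(0x80 | (code & 0x3f));
        return 4;
    }
    return 0;
}

/* Encode one code value as UTF-16. Returns the number of 16-bit units
   written (1 for BMP characters, 2 for a surrogate pair), 0 on error. */
size_t utf16_encode(uint32_t code, uint16_t out[2])
{
    if (code >= 0xd800 && code <= 0xdfff)
        return 0;
    if (code < 0x10000) {                   /* same value as in UCS-2 */
        out[0] = (uint16_t)code;
        return 1;
    }
    if (code <= 0x10ffff) {
        code -= 0x10000;                    /* 20 bits remain */
        out[0] = (uint16_t)(0xd800 | (code >> 10));
        out[1] = (uint16_t)(0xdc00 | (code & 0x3ff));
        return 2;
    }
    return 0;
}
```

A BMP character occupies one 16-bit unit in UTF-16, exactly as in UCS-2; only characters above 0xffff need the two-unit form.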

The UCS characters 0x0000 to 0x007f are identical to those of the classic US-ASCII character set, and the characters in the range 0x0000 to 0x00ff are identical to those of ISO 8859-1 Latin-1.
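One practical consequence is that converting ISO 8859-1 text to UCS code values needs no lookup table. A small sketch, with hypothetical function names chosen for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Convert n ISO 8859-1 bytes to UCS code values. Latin-1 occupies
   0x0000..0x00ff of UCS unchanged, so each byte is zero-extended. */
void latin1_to_ucs(const unsigned char *in, size_t n, uint32_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i];
}

/* Non-zero if the code value is also a US-ASCII character. */
int ucs_is_ascii(uint32_t code)
{
    return code <= 0x7f;
}
```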

COMBINING CHARACTERS

Some code points in UCS are assigned to combining characters. These are similar to the non-spacing accent keys on a typewriter. A combining character just adds an accent to the previous character. The most important accented characters have codes of their own in UCS; the combining character mechanism, however, makes it possible to add accents and other diacritical marks to any character. Combining characters always follow the character they modify. For example, the German character A-umlaut ("Latin capital letter A with diaeresis") can be represented either by the precomposed UCS code 0x00c4, or as a normal "Latin capital letter A" followed by a "combining diaeresis": 0x0041 0x0308.

Combining characters are essential, for instance, for encoding the Thai script, for mathematical typesetting, and for users of the International Phonetic Alphabet.
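Because the same visible character can be stored as two different code sequences, a plain code-value comparison treats them as different strings. The sketch below illustrates this with the A-umlaut example; the identifiers are hypothetical, and the comparison deliberately performs no normalization.

```c
#include <wchar.h>

/* Precomposed form: LATIN CAPITAL LETTER A WITH DIAERESIS. */
const wchar_t a_umlaut_precomposed[] = { 0x00c4, 0 };

/* Decomposed form: LATIN CAPITAL LETTER A, then COMBINING DIAERESIS. */
const wchar_t a_umlaut_combining[] = { 0x0041, 0x0308, 0 };

/* Compare code values only; the arguments are NOT normalized first. */
int same_code_values(const wchar_t *a, const wchar_t *b)
{
    return wcscmp(a, b) == 0;
}
```

Here same_code_values(a_umlaut_precomposed, a_umlaut_combining) returns 0, although both sequences denote the same character. Software that compares or searches text therefore has to agree on one representation first.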

IMPLEMENTATION LEVELS

Not all systems are expected to support advanced mechanisms such as combining characters. ISO 10646-1 therefore specifies the following three implementation levels of UCS:
Level 1
Combining characters and Hangul Jamo (a variant encoding of the Korean script in which a Hangul syllable glyph is coded as a triplet or pair of vowel and consonant codes) are not supported.
Level 2
As level 1, but combining characters are allowed for scripts in which they are essential (for example Thai, Lao, Hebrew, Arabic, Devanagari, and Malayalam).
Level 3
All UCS characters are supported.

The Unicode 3.0 Standard published by the Unicode Consortium contains exactly the UCS Basic Multilingual Plane at implementation level 3, as described in ISO 10646-1:2000. Unicode 3.1 added the supplementary planes of ISO 10646-2. The Unicode standard and the technical reports published by the Unicode Consortium provide much additional information on the semantics and recommended usage of various characters. They also provide guidelines and algorithms for editing, sorting, comparing, normalizing, converting, and displaying Unicode strings.

UNICODE UNDER LINUX

Under GNU/Linux, the C type wchar_t is a signed 32-bit integer type. Its values are always interpreted by the C library as UCS code values (in all locales). The GNU C library signals this convention to applications by defining the constant __STDC_ISO_10646__, as specified in the ISO C 99 standard.
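An application can test this convention at compile time. A minimal sketch, assuming hypothetical helper names wchar_is_ucs and wchar_bits; only the macro __STDC_ISO_10646__ itself comes from the text above.

```c
#include <limits.h>
#include <wchar.h>

/* Returns 1 if the C library promises that wchar_t values are
   ISO 10646 (UCS) code values in every locale, 0 otherwise. */
int wchar_is_ucs(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Width of wchar_t in bits on this platform. */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}
```

On systems where wchar_is_ucs() returns 0, the meaning of wchar_t values may depend on the locale, and code that assumes UCS values is not portable there.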

With the ASCII-compatible UTF-8 multibyte encoding, UCS/Unicode can be used just like ASCII in input/output streams, terminal communication, plain text files, filenames, and environment variables. To signal the use of UTF-8 as the character encoding to all applications, a suitable locale has to be selected via environment variables (for example "LANG=en_GB.UTF-8").

The nl_langinfo(CODESET) function returns the name of the selected encoding. Library functions such as wctomb(3) and mbsrtowcs(3) convert between internal wchar_t characters and strings and the system character encoding, and wcwidth(3) reports how many positions (0-2) the cursor advances when a character is output.
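The following sketch shows how an application might query the selected encoding and convert a wide string to it. The helper names selected_codeset and wide_to_multibyte are hypothetical. The conversion uses wcsrtombs(3), the ISO C counterpart of mbsrtowcs(3) for the wide-to-multibyte direction; this is a choice made for the example rather than something the text above prescribes.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Adopt the locale named by the environment (LANG, LC_ALL, ...) and
   return the name of its character encoding, for example "UTF-8". */
const char *selected_codeset(void)
{
    setlocale(LC_ALL, "");     /* on failure the "C" locale stays active */
    return nl_langinfo(CODESET);
}

/* Convert a wide string to the locale's multibyte encoding.
   Returns the number of bytes written (excluding the terminator), or
   (size_t)-1 if a character cannot be represented or the result,
   including its terminator, does not fit into dstsize bytes. */
size_t wide_to_multibyte(const wchar_t *src, char *dst, size_t dstsize)
{
    mbstate_t state;
    size_t n;

    memset(&state, 0, sizeof state);
    n = wcsrtombs(dst, &src, dstsize, &state);
    if (n == (size_t)-1 || src != NULL)
        return (size_t)-1;     /* src != NULL: output buffer too small */
    return n;
}
```

Which non-ASCII characters convert successfully depends on the locale in effect: in a UTF-8 locale every UCS character has a multibyte form, while in the "C" locale the conversion of non-ASCII characters may fail.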

Under Linux, in general only the BMP at implementation level 1 should be used at the moment. Up to two combining characters per base character are supported for certain scripts (in particular Thai) by some UTF-8 terminal emulators and ISO 10646 fonts (level 2). In general, however, precomposed characters should be preferred where they are available (Unicode calls this Normalization Form C).
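To illustrate what preferring precomposed characters means in practice, the sketch below replaces a few known base-plus-mark pairs by their precomposed codes. It is a toy with a four-entry table and hypothetical names, not an implementation of Normalization Form C, which requires the full Unicode composition data and ordering rules.

```c
#include <stddef.h>
#include <wchar.h>

/* Tiny illustrative table: (base, combining mark) -> precomposed code.
   A real implementation would use the complete Unicode data. */
struct compose_entry { wchar_t base, mark, composed; };

static const struct compose_entry compose_table[] = {
    { 0x0041, 0x0308, 0x00c4 },  /* A + COMBINING DIAERESIS */
    { 0x004f, 0x0308, 0x00d6 },  /* O + COMBINING DIAERESIS */
    { 0x0055, 0x0308, 0x00dc },  /* U + COMBINING DIAERESIS */
    { 0x0065, 0x0301, 0x00e9 },  /* e + COMBINING ACUTE ACCENT */
};

/* Replace known base+mark pairs in s by precomposed characters,
   in place. Returns the new length of the string. */
size_t compose_pairs(wchar_t *s)
{
    const size_t entries = sizeof compose_table / sizeof compose_table[0];
    size_t in = 0, out = 0;

    while (s[in] != 0) {
        wchar_t c = s[in++];
        for (size_t k = 0; k < entries; k++) {
            if (c == compose_table[k].base && s[in] == compose_table[k].mark) {
                c = compose_table[k].composed;
                in++;            /* consume the combining mark */
                break;
            }
        }
        s[out++] = c;
    }
    s[out] = 0;
    return out;
}
```

Applied to the sequence 0x0041 0x0308, compose_pairs leaves the single code 0x00c4, which a level 1 terminal can display without any combining support.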

PRIVATE AREA

In the BMP, the range 0xe000 to 0xf8ff will never be assigned to any characters by the standard and is reserved for private usage. For the Linux community, this private area has been subdivided further. The range 0xe000 to 0xefff can be used individually by any end user. The range 0xf000 to 0xf8ff is the Linux Zone, which is shared by all Linux users. The registry of the characters assigned to the Linux Zone is currently maintained by H. Peter Anvin <Peter.Anvin@linux.org>.
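The subdivision above amounts to two range checks. A minimal sketch, with a hypothetical enum and function name introduced for illustration:

```c
#include <stdint.h>

enum private_zone {
    NOT_PRIVATE,     /* outside 0xe000..0xf8ff */
    END_USER_ZONE,   /* 0xe000..0xefff: free for individual end users */
    LINUX_ZONE       /* 0xf000..0xf8ff: shared by all Linux users */
};

/* Classify a code value according to the Linux private area split. */
enum private_zone classify_private(uint32_t code)
{
    if (code >= 0xe000 && code <= 0xefff)
        return END_USER_ZONE;
    if (code >= 0xf000 && code <= 0xf8ff)
        return LINUX_ZONE;
    return NOT_PRIVATE;
}
```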

LITERATURE

* Information technology --- Universal Multiple-Octet Coded Character Set (UCS) --- Part 1: Architecture and Basic Multilingual Plane. International Standard ISO/IEC 10646-1, International Organization for Standardization, Geneva, 2000.

This is the official specification of UCS. It is available as a PDF file on a CD-ROM that can be ordered from http://www.iso.ch/.

* The Unicode Standard, Version 3.0. The Unicode Consortium, Addison-Wesley, Reading, MA, 2000, ISBN 0-201-61633-5.

* S. Harbison, G. Steele. C: A Reference Manual. Fourth edition, Prentice Hall, Englewood Cliffs, 1995, ISBN 0-13-326224-3.

A very good reference book about the C programming language. The fourth edition covers the 1994 Amendment 1 to the ISO C 90 standard, which added a large number of new C library functions for handling wide and multibyte character encodings. It does not yet cover ISO C 99, which improved wide and multibyte character support even further.

* Unicode Technical Reports.
http://www.unicode.org/unicode/reports/

* Markus Kuhn: UTF-8 and Unicode FAQ for Unix/Linux.
http://www.cl.cam.ac.uk/~mgk25/unicode.html
Provides subscription information for the linux-utf8 mailing list, which is the best place to look for advice on using Unicode under Linux.

* Bruno Haible: Unicode HOWTO.
ftp://ftp.ilog.fr/pub/Users/haible/utf8/Unicode-HOWTO.html

BUGS

When this manual page was last revised, the GNU C library support for UTF-8 locales was complete and XFree86 support was in progress. Work on making applications (most notably editors) suitable for use in UTF-8 locales was also still in progress. General UCS support under Linux usually provides CJK double-width characters and sometimes simple overstriking combining characters. It usually does not include support for scripts with right-to-left writing direction or ligature substitution requirements, such as Hebrew, Arabic, or the Indic scripts. These scripts are currently supported only in GUI applications with sophisticated text rendering engines (HTML viewers, word processors).

SEE ALSO

setlocale(3), charsets(7), utf-8(7)