#2972 (2007-10-08 20:27 GMT)
* gsnedders has to remember that US-ASCII is 8-bit and Unicode 16-bit for his computing test tomorrow
<anne> heh
<anne> that sucks
<gsnedders> It's the sort of thing that normally makes me do badly.
<Hixie> "you want me to say that it's 8bit, but it's actually 7bit (assuming you are actually referring to ANSI_X3.4-1968)"
<gsnedders> Hixie: :)
<gsnedders> Hixie: I took it up with the teacher once…
<gsnedders> Hixie: Without a copy of ANSI_X3.4-1968 it's hard to prove, though
<gsnedders> Hixie: Been tempted to point him at unicode.org, though
<Hixie> and "You want me to say 16bit, but Unicode isn't an encoding format, so it doesn't actually have a size. It has codepoints from 0x00 to 0x10FFFF, and has encodings that use 7 bit components (UTF-7), 8 bit components (UTF-8), 16 bit components (UTF-16), and 32 bit components (UTF-32); however in none of those encodings is a single character necessarily represented by a single codepoint and therefore even in those encodings it is hard to describe an actual size."
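[A minimal Python sketch of the two claims above, assuming only the standard library codecs; U+1F600 is just a convenient example codepoint above the BMP.]

    # US-ASCII (ANSI_X3.4-1968) defines 128 codepoints, 0x00-0x7F, so every
    # value fits in 7 bits even though it usually travels in an 8-bit byte.
    assert max(bytes(range(128)).decode("ascii").encode("ascii")) == 0x7F
    try:
        b"\x80".decode("ascii")              # the high bit is outside US-ASCII
    except UnicodeDecodeError as err:
        print("not US-ASCII:", err.reason)   # "ordinal not in range(128)"

    # Unicode itself has no single size: the same codepoint occupies a
    # different number of code units in each encoding form.
    ch = "\U0001F600"                        # U+1F600, above the Basic Multilingual Plane
    for enc, unit in (("utf-8", 1), ("utf-16-le", 2), ("utf-32-le", 4)):
        data = ch.encode(enc)
        print(enc, len(data) // unit, "code units,", len(data), "bytes")
    # utf-8 -> 4 code units; utf-16-le -> 2 (a surrogate pair); utf-32-le -> 1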
<gsnedders> :)
<Hixie> enjoy your test though
<anne> even for UTF-32?
* gsnedders wonders what would happen if he actually wrote that
<gsnedders> Hixie: there is only one way in UTF-8: non-shortest forms are illegal byte sequences
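[A minimal sketch of the non-shortest-form point, assuming Python's UTF-8 codec; 0xC0 0xAF is the classic overlong encoding of U+002F, which a conforming decoder must reject.]

    try:
        b"\xc0\xaf".decode("utf-8")       # overlong (non-shortest form) encoding of "/"
    except UnicodeDecodeError as err:
        print("rejected:", err.reason)    # "invalid start byte"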
<Hixie> anne: combining codepoints
<gsnedders> Hixie: ah. those.
<Hixie> even if you apply a radical normalisation form like NFKC, you still can't guarantee that one character has one byte
<Hixie> or one "codepoint"
<Hixie> rather
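[A minimal sketch of the combining-codepoint point, assuming Python's unicodedata module; "q" + COMBINING TILDE is chosen because Unicode has no precomposed form for it, while "e" + COMBINING ACUTE does compose.]

    import unicodedata

    s = unicodedata.normalize("NFKC", "q\u0303")   # "q" + COMBINING TILDE
    print(len(s), [hex(ord(c)) for c in s])        # still 2 codepoints: 0x71, 0x303

    t = unicodedata.normalize("NFKC", "e\u0301")   # "e" + COMBINING ACUTE ACCENT
    print(len(t), hex(ord(t)))                     # 1 codepoint: 0xe9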
<gsnedders> I'm tempted to write something like that, but the answers get sent to the exam board…
<Hixie> i'm not advising you either way :-)
<gsnedders> Hixie: on grounds that you don't want me to be technically wrong, and you don't want me to fail? :)