- character set is a mapping where every character is assigned a number (its code point)
- similar to a numeral system in that symbols stand for numbers, but a character set assigns each character one fixed number instead of deriving a value from digit positions
- encoding scheme defines how a number (code point) is encoded into a sequence of bytes (built from one or more code units)
- character encoding is the combination of character set and encoding scheme (i.e. the mapping of characters to a sequence of bytes, and vice versa)
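
The two steps above can be sketched in Python, where `ord` exposes the character set step (character to number) and `str.encode` the encoding scheme step (number to bytes). The helper name `describe` and the choice of UTF-8 and UTF-16 as sample schemes are assumptions made for illustration only.

```python
def describe(char: str) -> dict:
    """Return the code point of one character and its bytes under two schemes."""
    return {
        # character set step: character -> number (code point)
        "code_point": ord(char),
        # encoding scheme step: number -> sequence of bytes
        "utf8": char.encode("utf-8"),
        "utf16_be": char.encode("utf-16-be"),
    }


info = describe("\u00e9")  # the character é
print(info)

# "vice versa": decoding the bytes gives the character back
round_trip = info["utf8"].decode("utf-8")
print(round_trip == "\u00e9")
```

The same code point produces different byte sequences depending on the scheme, which is why a character encoding is only fully specified by the pair of character set and encoding scheme.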
Character Sets and/or Encoding Schemes
| Common Standards | Character Set | Encoding Scheme | Description |
|---|---|---|---|
| | ✔ | ✗ | mapping of characters to numbers |
| | ✔ | ✔ | mapping of characters to a single byte (a superset of ASCII) |
| | ✔ | ✗ | mapping of characters to numbers (a superset of ASCII) |
| | ✗ | ✔ | mapping of numbers to one or more bytes |
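
As a rough illustration of the "single byte" versus "one or more bytes" distinction in the table, the sketch below counts how many bytes a few characters need under three codecs. The codec names (`ascii`, `latin-1`, `utf-8`) and the helper `byte_length` are assumptions picked for the example; they are not taken from the table.

```python
from typing import Optional

CODECS = ("ascii", "latin-1", "utf-8")  # assumed sample codecs
SAMPLES = ["A", "\u00e9", "\u20ac"]     # A, é, €


def byte_length(char: str, codec: str) -> Optional[int]:
    """Number of bytes the codec needs for char, or None if it cannot encode it."""
    try:
        return len(char.encode(codec))
    except UnicodeEncodeError:
        return None


# code point -> {codec: byte count}
lengths = {
    f"U+{ord(ch):04X}": {codec: byte_length(ch, codec) for codec in CODECS}
    for ch in SAMPLES
}

for code_point, per_codec in lengths.items():
    print(code_point, per_codec)
```

A single-byte encoding either fits a character into one byte or cannot represent it at all, while a variable-length scheme such as UTF-8 grows the byte count with the code point.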
Other Encoding Schemes
| Other Encoding Schemes | Description |
|---|---|
| HTML Encoding | HTML encoding represents characters that have a special meaning in markup (such as the angle brackets, the ampersand and quotes) as character entities, so that they can be used safely as text within an HTML document |
| URL Encoding | URLs can only contain printable ASCII characters (ASCII codes between decimal 32 and 126, i.e. hex 0x20 – 0x7E). However, some characters within this range have special meanings within the URL or within the HTTP protocol. URL encoding is needed when a character has a special meaning in the URL or lies outside the printable range. To URL encode a character, each of its bytes is written as a % followed by the byte's two-digit hex value (e.g. a space becomes `%20`) |
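
Both schemes are available in the Python standard library, so a minimal sketch can use `html.escape` and `urllib.parse.quote`. The sample strings and the helper `percent_encode` are made up for illustration; the helper encodes every byte, whereas `quote` leaves unreserved characters such as letters untouched.

```python
import html
from urllib.parse import quote, unquote

# HTML encoding: markup characters become entities
escaped = html.escape('<a href="x">Tom & Jerry</a>')
print(escaped)

# URL encoding: reserved and non-ASCII characters become %XX sequences.
# safe="" asks quote() to encode "/" as well.
encoded = quote("caf\u00e9 & cr\u00e8me", safe="")
print(encoded)
print(unquote(encoded))


def percent_encode(text: str) -> str:
    """Hypothetical helper: prefix the hex value of every UTF-8 byte with %."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))


print(percent_encode(" "))       # a space is byte 0x20
print(percent_encode("\u00e9"))  # é is two bytes in UTF-8
```

Non-ASCII characters are first turned into bytes (UTF-8 by default in `quote`) and each byte is then percent-encoded, which is why one character can expand into several `%XX` groups.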