------------------------------------------------
 "Introduction to Endianness"
------------------------------------------------
 C/O :: arp of DynamicHell Development Team
------------------------------------------------
 http://dynamichell.org | irc.dynamichell.org
------------------------------------------------

Endianness is an issue that every low-level programmer will come across
at some point. It describes the ordering of data and, more specifically
in computing, the arrangement of data in memory. There are several types
of endianness: big-endian, little-endian and middle-endian. The term can
refer to bit ordering or, more commonly, to byte ordering.

Thankfully not all computers are 80x86 based. There are various other
architectures (PowerPC, SPARC, Motorola 68000, MIPS and ARM, to name a
few), and they all have their own ideas on how to manipulate and store
data (if they didn't, there would be little point in having multiple
architectures). Each has its advantages and disadvantages, though that
topic is beyond the scope of this introduction.

Programming problems generally arise from the interaction of two
separate computers (of differing architectures) that do not use a common
byte ordering.

Big-endian
==========

Let us take a look at how a big-endian system stores its data in memory.
Given the 32-bit hexadecimal integer value 0x1917F4C0, a big-endian
system will store the data as follows:

    Low addr |___________________| High addr
             | 19 | 17 | F4 | C0 |
             |-------------------|

As you can see, the most significant byte (0x19) is stored first, at the
lowest address, followed by the remaining three bytes in decreasing
order of significance. Let us take a closer look at the data:

        19         17         F4         C0
     0001 1001  0001 0111  1111 0100  1100 0000

The important thing to notice here is that byte ordering and bit
ordering are separate matters: architectures that use big-endian byte
ordering do not necessarily use big-endian bit ordering (some may,
though it is very uncommon).

Little-endian
=============

Now for little-endian byte ordering.
For simplicity we will use the same 32-bit integer, 0x1917F4C0.
Little-endian machines will store the data as follows:

    Low addr |___________________| High addr
             | C0 | F4 | 17 | 19 |
             |-------------------|

Let us look closely at the data:

        C0         F4         17         19
     1100 0000  1111 0100  0001 0111  0001 1001

Again, rather confusingly, although the bytes are stored starting from
the least significant byte, the bits within each byte do not follow the
pattern the byte ordering takes: every byte holds exactly the same bit
pattern as it did in the big-endian layout. It is important to
distinguish between byte-ordering endianness and bit-ordering
endianness. The latter issue comes up much less often, though it is
worthy of note. The diagram writes each byte's bits most significant
first, which visually resembles big-endian bit ordering, but that is
only the usual way of writing binary; actual big-endian bit ordering is
rarely the case.

Middle-endian
=============

Middle-endian machines are much less common than big-endian and
little-endian machines. However, they do exist, and their behaviour
differs between architectures. Middle-endian machines are characterised
by a mixture of little-endian and big-endian byte ordering depending on
the data size, so the programmer must take much more care. The classic
example is the PDP-11, which stores a 32-bit value as two 16-bit words,
most significant word first, with each word itself little-endian:
0x1917F4C0 would be laid out as 17 19 C0 F4.

Problem and solution
====================

As stated earlier, one problem arises when two computers of differing
architectures and byte-ordering systems try to communicate with each
other. This is only a problem with data larger than an octet (provided
both machines share a common bit endianness). One example is struct
sockaddr_in, used to describe a socket address, whose port member
(sin_port) must be in network byte order, which is big-endian. On
little-endian machines the value must therefore be converted before the
port is assigned. For example:

    struct sockaddr_in client;

    client.sin_port = 23;          /* Wrong on little-endian machines! */
    client.sin_port = htons(23);   /* The solution...
                                      htons() puts the port in network
                                      byte order. */

Luckily there are other functions which can aid the programmer in
similar situations:

/*
    #include <arpa/inet.h>
        (POSIX; some older systems declare these functions in
        <netinet/in.h> instead)

    uint32_t htonl(uint32_t hostlong);
        Converts a 32-bit unsigned integer from host byte order to
        (big-endian) network byte order.

    uint16_t htons(uint16_t hostshort);
        Converts a 16-bit unsigned integer from host byte order to
        (big-endian) network byte order.

    uint32_t ntohl(uint32_t netlong);
        Converts a 32-bit unsigned integer from (big-endian) network
        byte order to host byte order.

    uint16_t ntohs(uint16_t netshort);
        Converts a 16-bit unsigned integer from (big-endian) network
        byte order to host byte order.

    Host byte order is whatever the local machine uses. On a
    little-endian host these functions swap the bytes; on a big-endian
    host they return their argument unchanged.
*/

Conclusion
==========

Endianness can refer to byte ordering as well as bit ordering, though
byte ordering is much more common and is generally what is meant by the
term endianness. Little-endian bit ordering is very common amongst
architectures, unlike big-endian bit ordering, which is rare. Endianness
must always be considered when working on projects that need to be
portable and that communicate with other machines of potentially
differing endianness.

Copyright (c) 2006. Alastair Poole.

Verbatim copying and distribution of this entire article are permitted
worldwide, without royalty, in any medium, provided this notice, and the
copyright notice, are preserved.
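
Appendix: a worked example
==========================

The byte layouts shown in the big-endian and little-endian sections can
be reproduced on any host by building the bytes with shifts instead of
relying on how the machine stores an integer. The following is only a
minimal sketch: the helper names (store_be32, store_le32, load_be32,
load_le32, host_is_little_endian) are hypothetical and do not come from
any library, and the host check assumes the machine is either big-endian
or little-endian, not middle-endian.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write v into out[0..3], most significant byte
   first (big-endian, which is also network byte order). */
void store_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: write v into out[0..3], least significant byte
   first (little-endian). */
void store_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Hypothetical helper: rebuild a value from four big-endian bytes. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Hypothetical helper: rebuild a value from four little-endian bytes. */
uint32_t load_le32(const unsigned char in[4])
{
    return ((uint32_t)in[3] << 24) | ((uint32_t)in[2] << 16) |
           ((uint32_t)in[1] << 8)  |  (uint32_t)in[0];
}

/* Returns 1 if the byte at the lowest address of a 32-bit integer is
   its least significant byte, 0 otherwise. Assumes the host is either
   big-endian or little-endian. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* look at the lowest-addressed byte */
    return first == 1;
}
```

Calling store_be32(0x1917F4C0, buf) fills buf with 19 17 F4 C0, and
store_le32(0x1917F4C0, buf) fills it with C0 F4 17 19, matching the two
diagrams. Because the shifts operate on the value rather than on its
representation in memory, the result is the same whatever the host's
own byte order is.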