------------------------------------------------
      "Binary and Hexadecimal in Practice"
------------------------------------------------
   C/O :: arp of DynamicHell Development Team
------------------------------------------------
  http://dynamichell.org | irc.dynamichell.org
------------------------------------------------

Introduction
============

As a computer user you will have come upon various types and sizes of
data, be it bytes, megabytes, gigabytes or perhaps even terabytes. But
what does this data really consist of? Why is it this way? When can it
be used practically? Hopefully by the end of this tutorial these
questions will have been answered and you will have a better
understanding of the inner workings of a computer, especially how it
processes and stores data. We'll explore the decimal system, the binary
and hexadecimal representations of numbers, and work through some
practical applications as well as some arithmetic.

The Decimal System
==================

It must be understood that counting in computing conventionally starts
from zero. This will become much clearer later, though it's important
to understand this from an early stage.

It's highly likely that you are very familiar with this means of
expressing numbers. (If you're not then I recommend you close this
tutorial and find yourself a basic book on mathematics.) Numbers are
expressed using ten digits, 0 1 2 3 4 5 6 7 8 9, and each position in a
number is worth a power of ten. This is why the system is often
referred to as base 10. Here follow various examples of this method in
practice.

   10 = (1x10) + 0
   22 = (2x10) + 2
  120 = (1x10x10) + (2x10) + 0
  393 = (3x10x10) + (9x10) + 3

Obviously this is very useful and simple for people with ten fingers
and toes. However, for a machine which is composed almost wholly of
digital switches that can be either on (1) or off (0), things become a
little tricky.
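The positional breakdown used in the decimal examples can be sketched in
a few lines of Python. This is only an illustrative sketch: the function
names decimal_expansion and format_expansion are made up for this
example and do not come from any library.

```python
def decimal_expansion(n):
    """Return (digit, place value) pairs for a non-negative integer."""
    digits = [int(ch) for ch in str(n)]
    terms = []
    for position, digit in enumerate(digits):
        # The leftmost digit has the highest power of ten.
        power = len(digits) - 1 - position
        terms.append((digit, 10 ** power))
    return terms


def format_expansion(n):
    """Render the pairs in the same style as the examples in the text."""
    return " + ".join(f"({d}x{p})" for d, p in decimal_expansion(n))


print(393, "=", format_expansion(393))  # 393 = (3x100) + (9x10) + (3x1)
```

Summing digit times place value over all the pairs gives back the
original number, which is all that a base 10 representation really
claims.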

The Binary System
=================

This is where the binary system comes into play. Because there are only
two possible values for a digit, 1 (on) and 0 (off), the system is
often referred to as base 2. Each binary digit is commonly called a
bit. The following table demonstrates how binary can be used to
represent a number, in this case using 8-bit binary numbers.

  Decimal   Binary
  -------   ---------
      0     0000 0000
      1     0000 0001
      2     0000 0010
      3     0000 0011
      4     0000 0100
      5     0000 0101
      6     0000 0110
      7     0000 0111
      8     0000 1000
      9     0000 1001
     10     0000 1010
    177     1011 0001
    254     1111 1110
    255     1111 1111

Reading from right to left, each bit position is worth twice the value
of the position before it. For example:

  -------------------------------------------------
  | 128 |  64 |  32 |  16 |   8 |   4 |   2 |   1 |
  -------------------------------------------------
     0     0     0     0     1     1     0     1

  Equals in binary  : 0000 1101
  Which equals      : 0+0+0+0+8+4+0+1
  Equals in decimal : 13

Another example:

  -------------------------------------------------
  | 128 |  64 |  32 |  16 |   8 |   4 |   2 |   1 |
  -------------------------------------------------
     0     0     0     1     0     1     0     1

  Equals in binary  : 0001 0101
  Which equals      : 0+0+0+16+0+4+0+1
  Equals in decimal : 21

So far we have assumed that the number represented is always positive,
so all 8 bits can be used to hold its value. With an 8-bit unsigned
integer we can represent numbers ranging from 0 to 255, which is
(2^8)-1. You can calculate the range available with a given number of
bits by the following rule: the range is from 0 to (2^N)-1, N being the
number of bits that are available.

However, you are not always going to be working with a positive result.
We also need to be able to represent a negative value. This is where
the sign bit comes into play. The sign bit is a bit used solely for
representing whether a number is positive or negative.
With a signed integer (one in which the value can be either negative or
positive) the leftmost bit (in our example the 8th bit from the right)
is used to declare whether the integer is positive or negative.
Therefore, with an 8-bit signed integer, only 7 bits can be used to
represent our numerical value.

Doesn't that mean that the maximum positive integer for an unsigned and
a signed integer is different? Yes it does, though the number of
possible bit combinations is the same. With 8-bit unsigned and signed
integers we still have 256 combinations available, though the numbers
that can be represented differ between the two types. An unsigned
integer (without a sign bit) is only capable of holding a non-negative
value, as all bits are used to indicate the size of the number, whereas
a signed integer is capable of holding both positive and negative
values.

To further highlight this point, especially the difference between
signed and unsigned maximum sizes, we can examine the constants
INT_MAX, INT_MIN and UINT_MAX on an x86 Linux system, where an int is
32 bits wide.

  INT_MAX  =  2147483647
  INT_MIN  = -2147483648
  UINT_MAX =  4294967295

There is no such thing as UINT_MIN as it is always guaranteed to be
zero. Take note of the smaller upper limit for the signed integer (a
result of the sign bit). INT_MIN lies one further from zero than
INT_MAX because x86 stores negative integers in two's complement form,
which is covered under binary subtraction later, rather than with a
plain sign bit.

It must also be noted that signed and unsigned integers vary in size.
You might be working with 32-bit, 64-bit or any other integer size. The
rule that the leftmost bit is the sign bit applies to all signed
integers. For example, a 16-bit signed integer's 16th bit from the
right would be the sign bit.

For a signed integer we indicate that the number is negative by setting
the leftmost bit to 1, and positive by setting the leftmost bit to 0.
Here are a few examples:

  -------------------------------------------------
  | sign |  64 |  32 |  16 |   8 |   4 |   2 |  1 |
  -------------------------------------------------

  Decimal   Binary
  -------   ---------
      0     0000 0000
      1     0000 0001
      2     0000 0010
     -4     1000 0100
     -8     1000 1000
    -10     1000 1010

Notice how the leftmost bit is used to declare whether a number is
positive or negative with signed integers, whereas with an unsigned
integer that bit would be used to extend our positive range. This
sign-plus-value layout is known as sign-and-magnitude. It is the
easiest scheme to read, though most processors store negative numbers
in two's complement form instead.

This is all well and good when working with relatively small numbers in
binary, but it can hardly be practical to display and work with large
binary numbers. Can it?

The Hexadecimal System
======================

With a hexadecimal digit we can represent a number using 16 available
digits, which is unsurprisingly why it is often called base 16. These
digits range from 0 to F, as the following table displays.

  Hexadecimal   Decimal   Binary
  -----------   -------   ------
       0            0      0000
       1            1      0001
       2            2      0010
       3            3      0011
       4            4      0100
       5            5      0101
       6            6      0110
       7            7      0111
       8            8      1000
       9            9      1001
       A           10      1010
       B           11      1011
       C           12      1100
       D           13      1101
       E           14      1110
       F           15      1111

As both one hexadecimal digit and four binary digits can represent 16
values, ranging from 0 to 15, the hexadecimal system becomes useful
when we want to represent large binary numbers. To make hexadecimal
numbers easier to distinguish they are normally preceded by 0x. For
example, the 8-bit value 252 is written as 0xFC. Here follow a few
examples of 32-bit unsigned integers and their hexadecimal
counterparts.

Example One
-----------

  Binary      : 0010 0111 0001 0000 1010 1111 1010 1110
  Broken Down :    2    7    1    0    A    F    A    E
  Hexadecimal : 0x2710AFAE

Example Two
-----------

  Binary      : 1101 1110 1010 1101 1011 1110 1110 1111
  Broken Down :    D    E    A    D    B    E    E    F
  Hexadecimal : 0xDEADBEEF

Hopefully from the initial section of this tutorial you have become
reasonably confident when expressing numbers in either binary or
hexadecimal formats.
Nevertheless, there is still more to learn, so before you proceed with
the rest of this document I highly recommend that you make sure you are
comfortable with the means of expressing numbers described previously.

Binary Arithmetic
=================

Binary addition
---------------

The addition of positive binary numbers is pretty straightforward. Much
like with base 10 addition, numbers are carried from right to left. A
few examples follow:

    0101 1011       91
  + 0010 0101     + 37
  -----------     ----
    1000 0000      128

    0010 0101       37
  + 0011 0110     + 54
  -----------     ----
    0101 1011       91

    0111 0101      117
  + 0010 1010     + 42
  -----------     ----
    1001 1111      159

Binary subtraction
------------------

Subtraction in binary is a little more complicated. You must first
specify a fixed bit width. In the following examples we will use an
8-bit width. The value being subtracted is first converted into its
two's complement form and is then added to the other number. Take the
following example:

    0101 0101       85
  - 0010 1010     - 42

From this we expect the answer of 43. However, the means of achieving
this result may seem a little unusual. 0010 1010 is first converted
into its two's complement form by inverting every bit and adding one.
So, it becomes 1101 0101 plus one, which is 1101 0110. We must then add
the two's complement form of the number we are subtracting (in this
case 42) to the integer we are taking away from (in this case 85).

     1101 0110    Two's complement form of 42.
   + 0101 0101    +85.
   -----------
   1|0010 1011    Note the additional bit.

But this is a 9-bit integer. Remember the fixed bit width? It was set
to 8, therefore we disregard this ninth bit, which leaves us with
0010 1011.

     1101 0110    Two's complement form of 42.
   + 0101 0101    +85.
   -----------
     0010 1011    Equals 1+2+8+32 = 43.

It worked! Although it seems unusual, when using this method binary
integers of a fixed bit width can be subtracted successfully.
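The invert, add one, add and discard-the-carry steps can be sketched in
Python as follows. This is a minimal sketch assuming an 8-bit width;
WIDTH, MASK and the function names are hypothetical names chosen for
illustration only.

```python
WIDTH = 8                  # fixed bit width, as used in the text
MASK = (1 << WIDTH) - 1    # 0xFF: keeps only the low 8 bits


def twos_complement(value):
    """Invert every bit of an 8-bit value, then add one."""
    return ((value ^ MASK) + 1) & MASK


def subtract(a, b):
    """Compute a - b by adding the two's complement of b.

    Masking with MASK discards the ninth (carry) bit.
    """
    return (a + twos_complement(b)) & MASK


def bits(value):
    """Format an 8-bit value as two groups of four bits."""
    s = format(value & MASK, "08b")
    return s[:4] + " " + s[4:]


print(bits(twos_complement(42)))  # 1101 0110
print(bits(subtract(85, 42)))     # 0010 1011
print(subtract(85, 42))           # 43
```

Masking with 0xFF plays the role of throwing away the ninth bit. If the
number being subtracted is the larger of the two, the 8-bit result is
itself the two's complement encoding of a negative number rather than a
plain positive value.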

Binary in Practice
==================

Now that you have a good understanding of the binary and hexadecimal
representation of numbers you may ask yourself, "Where am I likely to
need such an understanding?" The answer is simple and sweet: on any
network.

With IPv4 each IP address is represented by a 32-bit number. For
simplicity the IP address is broken down into four 8-bit numbers. As
each of these is an unsigned integer, each 8-bit number has a possible
range from 0 to 255 (or from 0 to [2^N]-1), as discussed earlier. An
example follows:

  IP address        : 192.168.0.1
  Binary equivalent : 11000000.10101000.00000000.00000001

But why is the binary important when the initial string of numbers is
so much easier? Well, as a network administrator you may be required to
designate a certain number of IP addresses for use on your network.
Binary representations become useful as they allow us to waste fewer
addresses as well as aiding our understanding of subnetting and masks.

Rather than use up all 256 addresses of a class C network we can
specify a range in the following way: 192.168.0.24/30, where the IP
designates the start IP and the number after the slash specifies the
prefix (the number of bits that are fixed). It must be noted that the
last (32 - prefix) bits of the starting IP address must all be zero,
leaving them free to vary. Examples follow.

Example One
-----------

  IP address with prefix : 192.168.0.24/30
  Binary equivalent      : 11000000.10101000.00000000.00011000

So in this case 30 bits are fixed, indicated below by 1s, and 2 bits
are variable, indicated below by 0s.

  Available bits         : 11111111.11111111.11111111.11111100

With the last two bits we have four possible combinations: 00, 01, 10
and 11. Therefore the range is from 192.168.0.24 to 192.168.0.27.

Example Two
-----------

  IP address with prefix : 192.168.0.128/25
  Binary equivalent      : 11000000.10101000.00000000.10000000

In this case 25 bits are fixed, again indicated by 1s below, with 7
bits which are variable, indicated below by 0s.
  Available bits         : 11111111.11111111.11111111.10000000

With the last 7 bits we have 2^7 = 128 possible combinations. I won't
list them all but I will state that the range is from 192.168.0.128 to
192.168.0.255. The following table shows this more clearly. The Decimal
and Binary columns give the last octet of the corresponding netmask.

  Prefix   Decimal   Binary      IPs Available
  ------   -------   ---------   -------------
   /24        0      0000 0000        256
   /25      128      1000 0000        128
   /26      192      1100 0000         64
   /27      224      1110 0000         32
   /28      240      1111 0000         16
   /29      248      1111 1000          8
   /30      252      1111 1100          4
   /31      254      1111 1110          2
   /32      255      1111 1111          1

Now to netmasks, another method of specifying network size that is
often used instead of the IP/prefix technique. Normally you should have
a good idea of the number of IP addresses that you need to allocate.
Let us assume that you need to allocate 32 IP addresses for your
network. In this situation the last part of your netmask would be
256-32 = 224, so your netmask would be 255.255.255.224 (equivalent to a
/27 prefix), meaning there are 32 available IP addresses on your
network.

If on this network you have the IP address 192.168.0.1, the netmask
tells you that there are 27 fixed bits with 5 that are variable. As the
netmask specifies a block of 32 IPs, you know that the IP range on your
network is from 192.168.0.0 to 192.168.0.31. If, however, you had the
IP address 192.168.0.78 with the same netmask 255.255.255.224, you can
ascertain that the IP range is from 192.168.0.64 to 192.168.0.95. IPs
are allocated in blocks, in this case of 32, and each block starts on a
multiple of the block size. This obviously varies when using a
different netmask.

Hopefully these examples of setting up network IP ranges show you the
potential uses of this basic understanding of binary in the world of
computing.

Appendix : Data Sizes
=====================

A byte refers to the smallest unit that you can address. This does not
necessarily mean 8 bits. In C, a byte comprises exactly CHAR_BIT bits,
a constant defined in <limits.h>. CHAR_BIT must be greater than or
equal to 8. It is, therefore, not fixed at 8 bits.
For example, a DSP chip might have 32-bit bytes and 32-bit integers;
that would be 1 byte per integer. From this you can see that you can
never just assume the size of data types across varying architectures,
other than the char type (which is guaranteed to be 1 byte).

So if the number of bits per byte can vary, doesn't that also mean that
things like file sizes will be affected? Yes, as 1 MB of 16-bit bytes
holds exactly as many bits as 2 MB of 8-bit bytes. Thus, a file size
quoted in bytes only has a fixed meaning once the size of a byte is
known; what a strange world we live in!

Copyright (c) 2005. Alastair Poole.

Verbatim copying and distribution of this entire article are permitted
worldwide, without royalty, in any medium, provided this notice, and
the copyright notice, are preserved.