Want to join the rest of us on easy difficulty? Learn binary. This installment should be exciting for you. Why? Because learning this one simple class of theory will open the entire world of programming to you. You will be equipped to understand everything the minute you start to understand binary. This is the foundation upon which you will build your kingdom. Come with me. Let me show you a whole new world.
Data is as Data does:
First, everything is numbers. This is rule #1 of computing. Everything is numbers. What do we mean by this? To understand, we need to figure out how memory actually works.
Let's imagine that you run a warehouse. This warehouse stores boxes. Each of these boxes is placed on a shelf. Now being the clever little entrepreneur you are, you will want to be able to get the right box off of the shelf at the right time. So you'll have to have some way of figuring out what happens to be where. We could name each slot on the shelf, but that's going to be hard to remember.
Okay, great. We've got four slots, but let's say we have a thousand. Figuring out the exact location on the shelf is going to require knowing the order of the shelves, so we'll need some kind of a chart to figure out exactly where "Bob" is. So obviously names aren't going to do the trick. What about numbers? Numbers are easy. We can intuitively know that 3 is less than 5, but greater than 1, so as long as we number them in order it'll be trivial to figure out where we have to go to get a specific box. It also solves the mapping/labeling problem.
So we start to assign boxes via a number. When a customer stores something on the shelf, you hand them a number. When they come back and want a look in their box, you just ask for their number, retrieve the box, and all is good with the world.
This is more or less exactly how memory works. Each memory location is essentially a space on your shelf. Each "box" stores information that was placed there by an operation, and when a customer comes asking for that box again, you just shuttle on down the line to the number they've asked for, grab the box, and hand it to them.
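Here's a rough sketch of the warehouse in Python, assuming a tiny 8-slot shelf. The names memory, address, and retrieved are made up for illustration. Real memory is managed by the hardware and operating system, not by a Python object, but the idea of "hand me a number, I hand you a box" is the same.

```python
# A toy "warehouse": 8 boxes, each holding one number, all starting at 0.
memory = bytearray(8)

# A customer stores something and is handed the slot number (the address).
address = 3
memory[address] = 42

# Later they come back with their number and we fetch the box.
retrieved = memory[address]
print(address, retrieved)  # prints: 3 42
```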
Because everything in the computer is numbers, it doesn't matter what the numbers are. They are completely arbitrary. Computers have hardware and software that take these numbers and transform them into useful output from their inputs. It's all code. All of it. It's code. Data is meaningless until you give it meaning. It's just arbitrary numbers in a soup somewhere in the guts of the machine. You, the programmer, are the one with the power to give this number soup meaning. Oh and the user. But... You know, fuck that guy. Damn end users.
What's in the box?
You might be wondering what's in these boxes? The answer is again: Numbers!
Memory is a collection of switches. These switches are called bits.
A bit has two potential positions, much like a lightswitch:
On and off are the two potential states of a bit. As with everything in computers, we're going to assign these two states values using numbers:
Off = 0
On = 1
This gives us a total range of 2 because the switch has two total positions.
Let's add a second bit to create a single number. We now have four potential positions:
00
01
10
11
Now, as always with computers, each of these positions has a unique number identifying it:
00 = 0
01 = 1
10 = 2
11 = 3
Let's add a third bit so we can examine some patterns.
000 = 0
001 = 1
010 = 2
011 = 3
100 = 4
101 = 5
110 = 6
111 = 7
The astute among you may have started to notice a pattern with the way that these bits align to their numeric (decimal) value. First, the total range of any unique number of bits is 2ⁿ, where n is the number of bits. This is because computers run off of base 2 numerics. You, the human, use base 10 numerics. Most cultures through human history use base 10 because, well, we've got 10 fingers barring accidents with fireworks or unfortunate kitchen mishaps, so base 10 seems to make sense. You count like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... 100 ... 1000 ... 10000
Binary works exactly the same way that your numeric system works, but it's much more limited. In your pathetic Arabic numerals, you have ten unique positions per significant figure. 0-9. Each digit is like a single binary bit but rather than being in two unique positions, you have 10. Your base 10 numeric system works exactly the same way:
000-009 = 0-9
010-099 = 10-99
100-999 = 100-999
Humans count in base 10, so the total range of a value is 10ⁿ, where n is the number of significant figures. The maximum value that can be represented with n significant figures is 10ⁿ - 1.
The same thing is true for binary, which is base 2. The total range of a value is 2ⁿ, and the maximum value represented is 2ⁿ - 1.
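If you'd rather let the machine do the arithmetic, here's a minimal sketch of those two formulas. The function names total_range and max_value are just ones I picked for this example.

```python
def total_range(bits):
    """How many distinct values fit in the given number of bits."""
    return 2 ** bits


def max_value(bits):
    """Largest unsigned value: one less than the range, because we start at 0."""
    return 2 ** bits - 1


for bits in (1, 2, 3, 8):
    print(bits, total_range(bits), max_value(bits))
# prints:
# 1 2 1
# 2 4 3
# 3 8 7
# 8 256 255
```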
So let's break out one of these binary numbers and really show the pattern a few of you might have caught already:
+---+---+---+
| 4 | 2 | 1 | VALUE
+---+---+---+
| 2 | 1 | 0 | BIT
+---+---+---+
The above little chart shows the pattern. The top row is the value of each bit. The bottom row is the sequence number of each bit. Both are numbered right to left.
The value is equal to 2ˣ where x is the sequence.
2⁰ = 1
2¹ = 2
2² = 4
When we plug a bit group into this chart, we can calculate the values:
+---+---+---+
| 4 | 2 | 1 |
+---+---+---+
| 1 | 0 | 1 |
+---+---+---+
To calculate the above bit group, let's turn the chart on its side:
+---+---+
| 4 | 1 |
+---+---+
| 2 | 0 |
+---+---+
| 1 | 1 |
+---+---+
Now let's set up the operations:
4 * 1 = X
2 * 0 = X
1 * 1 = X
+_________
        X
This is the basic form that we take to calculate the decimal value of a binary number. In this case, we do the math:
4 * 1 = 4
2 * 0 = 0
1 * 1 = 1
+_________
        5
Therefore the decimal value of 101 is 5.
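The chart method above can be sketched in a few lines of Python. This is only an illustration of the multiply-and-add process; binary_to_decimal is a name I made up, and in real code you'd just call the built-in int("101", 2).

```python
def binary_to_decimal(bits):
    """Convert a string such as "101" (most significant bit first) to a number."""
    total = 0
    # Walk the bits right to left so that position 0 is the rightmost bit.
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position  # bit value times 2 to the position
    return total


print(binary_to_decimal("101"))  # prints: 5
```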
Let's go the other way. Let's convert decimal into binary:
where n is the number to convert to binary:
let x be 2 to the power of round down (log 2 n)
while x is greater than or equal to 1
    if n is greater than or equal to x
        subtract x from n
        print 1
    else
        print 0
    divide x by 2 and round down
This is a basic flow chart of how a human would visualize the process of manually converting decimals into binary. You don't have to do this very often, so it's not that important to know, but it can be useful.
Let's follow this process for the number 5:
x is now 4 (log 2 5 is 2.32, rounded down is 2, 2 to the power of 2 is 4)
4 is greater than or equal to 1
    5 is greater than or equal to 4
        n is now 1
        print 1
    x is now 2
2 is greater than or equal to 1
    1 is not greater than or equal to 2
        print 0
    x is now 1
1 is greater than or equal to 1
    1 is greater than or equal to 1
        n is now 0
        print 1
    x is now 0
0 is not greater than or equal to 1, so we stop
output: 101
It's a long process, especially with larger numbers, but it's pretty simple.
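For the curious, here's one way the process above might look as actual code. It's a sketch, not gospel: decimal_to_binary is a name I invented, it only handles whole numbers of zero or more, and it uses Python's bit_length() to get "log 2 n rounded down" without any floating point rounding surprises. The zero check is my own addition, since log 2 of 0 is undefined.

```python
def decimal_to_binary(n):
    """Convert a non-negative whole number to a string of binary digits."""
    if n == 0:
        return "0"  # log2(0) is undefined, so zero is handled separately

    # n.bit_length() - 1 is the same as rounding log2(n) down.
    x = 2 ** (n.bit_length() - 1)
    digits = []
    while x >= 1:
        if n >= x:
            n -= x
            digits.append("1")
        else:
            digits.append("0")
        x //= 2  # divide x by 2 and round down
    return "".join(digits)


print(decimal_to_binary(5))  # prints: 101
```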
That's the gist of how binary numbers get their values.
This tutorial bytes:
So let's move back to talking about memory. Remember how I said everything is numbers? Well memory is just a big ol' hunk of bits on a circuit board in your computer. There's a lot of them. I mean, like, a mindboggling number of them. We're talking billions.
Of course, referencing each bit by number is pretty silly, because programmers almost never work with individual bits. There are too many and there aren't all that many uses for a number with a range of 2. So grouping bits together into individual units is damn useful.
The smallest standard grouping of bits is called a nybble. A nybble contains 4 bits, therefore a total range of 16 and a maximum value of 15. Nybbles aren't used in modern hardware terribly often, so a different grouping is what you are going to be seeing the most of. These are called bytes, and there are 8 bits to a byte.
This gives us a total range of 256 and a maximum value of 255. If you've spent any time working with pixel art or with games, the numbers 256 and 255 come up everywhere! That's because bytes are the fundamental unit of memory that all computers work with. While it's true that computers manipulate billions of bits, they do it in blocks of 8 at a time, or one byte at a time.
So if everything is numbers, how is Ter talking to me through text on the screen? Text isn't numbers, right?
Wrong. I'm not talking to you through text on a screen. I'm talking to you through points of colored light on the screen. Your screen is made up of tiny, tiny little LED light bulbs. Each pixel is three bulbs together: red, green, and blue. These make colors by varying in intensity individually. Each pixel on your screen has three channels, one feeding each of these bulbs. The range of the intensity is 0 to 1, or off to full brightness. Each channel is fed an electronic value in the form of a single byte. That means 256 different steps between off (0) and fully bright (255).
That means that colors are really expressed in computers as a group of three bytes. We often like to express them via rgb values: rgb(0,255,0). Colors are numbers. Mind blown yet? Keep reading. Shit's about to get cray.
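To make that concrete, here's a small sketch that turns 0-to-1 intensities into the three bytes of a color. The helper names channel_byte and rgb are hypothetical, and the straight linear mapping is an assumption for illustration; real displays pile gamma curves and other tricks on top.

```python
def channel_byte(intensity):
    """Map an intensity from 0.0 (off) to 1.0 (full brightness) onto 0-255."""
    return round(intensity * 255)


def rgb(red, green, blue):
    """Pack three intensities into three bytes, one per channel."""
    return bytes((channel_byte(red), channel_byte(green), channel_byte(blue)))


pure_green = rgb(0.0, 1.0, 0.0)
print(list(pure_green))  # prints: [0, 255, 0]
print(len(pure_green))   # prints: 3
```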
So this text, right? Yeah, numbers too.
This is a font. Specifically, it's the ANSI character set. It has 16 rows of 16 characters. The first character is zero and the last character is 255, read left-right, top-bottom. A string of text is just literally a bunch of binary numbers jammed together in sequence and then translated by the computer into visual symbols.
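You can see the "text is numbers" thing for yourself. This sketch uses the ASCII encoding, which I'm assuming here because it's the simplest one to demonstrate with; the variable names are made up.

```python
message = "Hi!"

# Every character is stored as a number.
numbers = list(message.encode("ascii"))
print(numbers)  # prints: [72, 105, 33]

# And the same numbers translate straight back into the symbols.
print(bytes(numbers).decode("ascii"))  # prints: Hi!
```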
Old school fonts used to be 8 pixels by 8 pixels. Why? Because fonts could only be one color. This meant that you could represent each pixel as a single bit horizontally (on or off). That means that each character could have a total memory footprint of 8 bytes, and the total font could be 2048 bytes of memory. Not too shabby!
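Here's a sketch of what one of those 8 byte characters might look like. The GLYPH_A bitmap is something I drew by hand for this example, not data lifted from any real font, and render_glyph is a made-up helper. Each byte is one row, each bit is one pixel.

```python
# A hand-drawn 8x8 letter "A": one byte per row, one bit per pixel.
GLYPH_A = [0x18, 0x3C, 0x66, 0x66, 0x7E, 0x66, 0x66, 0x00]


def render_glyph(rows):
    """Draw each row's 8 bits as '#' for on and '.' for off."""
    lines = []
    for row in rows:
        bits = format(row, "08b")  # e.g. 0x18 becomes "00011000"
        lines.append(bits.replace("1", "#").replace("0", "."))
    return "\n".join(lines)


print(render_glyph(GLYPH_A))
print(256 * len(GLYPH_A))  # prints: 2048 (256 characters at 8 bytes each)
```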
Modern fonts no longer work like this. They are actually very, very complex and use vectors to build a complex raster graphic, but the memory encoding for strings of text is pretty similar to what it used to be. Either way, you aren't looking at text on the screen. You are looking at light emitted by diodes modulated by numbers representing rasterized numeric vectors representing letters encoded by numbers in memory. It's all numbers!
I will call him 0x0FAA156C and he will be my 0x0FAA156C and he will be mine.
Binary, right? Just when you thought you were getting the hang of things, computer scientists just gotta throw a curveball at you. Well, bad news, you need to learn hexadecimal too. Hexadecimal is a base 16 numeric system. Much like base 10, where you have 0-9 representing digits, and a maximum value of 10ⁿ - 1 for n figures, base 16 has 16 positions per figure (0-F), and a maximum value of 16ⁿ - 1 for n figures. That sounds completely ridiculous and confusing, right? Wrong again. It's so much easier for a human to look at than binary.
Binary is very, very space-hungry to visually display on a computer screen.
253 is much easier for a human to recognize as a numeric value than 11111101. But 253 has some problems compared to the binary representation. It's still pretty space hungry when you are representing bytes. Humans are capable of recognizing thousands of symbols if they work with them frequently. The English language has 26 unique symbols to represent different sounds. We work with those all the time. 26, however, is a bit of a bugbear of a number for computers because it's not a power of 2. So our options for condensing around that range of symbols people are already used to are limited:
1 - 256 unique characters. Total gibberish.
2 - 128 unique characters. Still gibberish.
3,5-7,9-15,17-31,33-63,65-127 - does not factor.
4 - slight improvement from binary; worse than decimal
8 - 32 unique characters? Doable, but just a bit too much
16 - 16 unique characters? 'at'll do.
32 - 8 unique characters? Might as well use decimal.
64 - 4 unique characters? Not much better than binary.
128 - 2 unique characters? Congratulations, you just found binary.
Sure, you could go the Chinese route and have thousands of unique symbols, but we saw how well that went for humanity last time.
So Hexadecimal has 16 unique symbols per digit. You'll recognize them immediately:
0 = 0
1 = 1
2 = 2
3 = 3
4 = 4
5 = 5
6 = 6
7 = 7
8 = 8
9 = 9
A = 10
B = 11
C = 12
D = 13
E = 14
F = 15
You can apply the exact same concepts we learned for dealing with binary to hexadecimal.
+----+----+
| 16 |  1 |
+----+----+
|  6 |  C |
+----+----+
16 * 6 = 96
 1 * C = 12
+____________
        108
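Same trick as binary, just with 16 in place of 2. Here's a sketch; hex_to_decimal and HEX_DIGITS are names I made up, and Python's built-in int("6C", 16) does the same job. For 6C the sum works out to 6 * 16 + 12 * 1 = 108.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(text):
    """Convert a hexadecimal string such as "6C" to a number."""
    total = 0
    # Rightmost digit is worth 16 to the 0, the next is 16 to the 1, and so on.
    for position, digit in enumerate(reversed(text.upper())):
        total += HEX_DIGITS.index(digit) * 16 ** position
    return total


print(hex_to_decimal("6C"))  # prints: 108
print(hex_to_decimal("FF"))  # prints: 255
```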
Hexadecimal is generally used for viewing and editing large chunks of memory, large binary files manually, and also for representing colors:
#FF00FF = rgb(255,0,255)
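A color code like that is just three bytes written as two hex digits each. A minimal sketch, with hex_to_rgb being a hypothetical helper name and the input assumed to be in the six digit #RRGGBB form:

```python
def hex_to_rgb(color):
    """Split "#RRGGBB" into its red, green, and blue byte values."""
    color = color.lstrip("#")
    # Two hex digits per channel: characters 0-1, 2-3, and 4-5.
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb("#FF00FF"))  # prints: (255, 0, 255)
```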
Hexadecimal is also used for memory locations. Remember our warehouse shelves and how we numbered them? Well your numbers would look less like 0, 1, 2, 3, 4, 255 and more like: 0x00000000, 0x00000001, 0x00000002, 0x00000003, 0x00000004, 0x000000FF.
These are memory offsets. Now, memory locations are measured in bytes, not bits. This is important for the next part of this tutorial.
Some values can occupy more than one byte of memory. Strings typically occupy one byte of memory per character plus one or two additional bytes. The extra bytes are usually either a terminator marking the end of the string, or a numeric length of the string stored in the first few bytes.
Some values that occupy more than one byte are numbers! Numbers come in a couple of different shapes and sizes. First, there are things called integers. Then there are things called floating point numbers.
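Here's a sketch of those two string layouts side by side. This is a generic illustration with made-up variable names, assuming ASCII text and a one byte length; it says nothing about how any particular engine actually stores its strings.

```python
text = "Hi"
encoded = text.encode("ascii")

# Layout 1: the characters followed by a zero byte marking the end.
null_terminated = encoded + b"\x00"

# Layout 2: a length byte up front, then the characters.
length_prefixed = bytes([len(encoded)]) + encoded

print(list(null_terminated))  # prints: [72, 105, 0]
print(list(length_prefixed))  # prints: [2, 72, 105]
```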
1, 2, 3, 4, 5, etc. are all integers. Integers correspond to all of the binary math that we learned previously, but there are also negative integers. When an integer value supports negative and positive integers, it's called a signed value. When an integer value does not support negative integers, it's called an unsigned value.
Signed values actually change something that we learned previously. Instead of 8 bits representing the number, 7 bits represent the number portion, and 1 bit represents whether the number is positive or negative. With a plain sign bit like this, a single byte has a range between -127 and 127 rather than 0 to 255, and zero ends up with two encodings (positive and negative). Most hardware instead uses a scheme called two's complement, which spends that spare pattern on one extra negative number and gives a range of -128 to 127. Either way, there are still 256 unique bit patterns because the number of bits didn't change. The upper range, however, did change because one fewer bit is used for the numeric component. In BYOND all number values accessible to the developer are signed. Internally, the engine uses many unsigned numeric values, but you can't manipulate these directly from a developer perspective.
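A quick sketch of how the same 8 bits read differently depending on whether you treat them as signed. I'm assuming two's complement here, which is the scheme most current hardware uses for signed integers (it runs from -128 to 127 and has only one zero). The function name as_signed_byte is made up.

```python
def as_signed_byte(value):
    """Reinterpret an unsigned byte (0-255) as a two's complement signed byte."""
    # If the top bit is on (value is 128 or more), the number is negative.
    return value - 256 if value >= 128 else value


print(as_signed_byte(127))  # prints: 127
print(as_signed_byte(128))  # prints: -128
print(as_signed_byte(255))  # prints: -1
```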
Integer values can also get bigger by having more bytes smacked on to them. Standard lengths are:
char/byte: 1 byte (range: 256, a little bit)
short: 2 bytes (range: 65,536, a good bit)
int: 4 bytes (range: 4,294,967,296, a lot)
long: 8 bytes (range: 18,446,744,073,709,551,616, a metric butt-ton. Seriously. That's 18 quintillion! You can't even fathom that.)
This means that the boxes on our imaginary warehouse's shelves don't store whole things. They store parts of things. You need to know how big of a thing you are pulling off the shelves so you don't look at only part of the thing on the shelf at one time. This is the basis of computing. Bits to Bytes to Primitives to Objects and beyond.
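Here's a sketch of one value spread across two boxes. I'm assuming a 2 byte value stored low byte first ("little endian"), which is a common convention but not the only one; the variable names are mine.

```python
value = 1000  # too big for one byte (maximum 255)

# Spread the value across 2 bytes, low byte first.
raw = value.to_bytes(2, "little")
print(list(raw))  # prints: [232, 3]

# Grabbing only one box gives you part of the thing, not the thing.
print(raw[0])  # prints: 232

# Knowing the size lets you put it back together: 232 + 3 * 256.
print(int.from_bytes(raw, "little"))  # prints: 1000

# The ranges for 1, 2, 4, and 8 bytes.
for size in (1, 2, 4, 8):
    print(size, 2 ** (8 * size))
```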
DM does not support pure integer values, but they are the basis of many of the things happening under the hood in BYOND.
Unicorns all the way down:
0.1, 1000.2, 4.3, etc. are not integers. They are floating point numbers. Floating point numbers work... Well, let's put it this way:
"Oh, but Ter, it's all science, not magic! I don't believe you."
No. It's god damn magic.
But here's the gist of what the magicians who cooked up floating point numbers claim that the values represent under the hood.
Floating point numbers aren't one number, they actually represent two numbers:
Single precision floating points are 32 bits in length, or 4 bytes. 24 of these bits are used for S, or the significand: 1 sign bit plus 23 stored bits. 8 of these bits are used for E, or the exponent.
This pattern represents scientific notation. 1.5*10⁶ means 1,500,000, or 1.5 with the decimal point moved 6 places to the right. 1.5*10⁻⁶ means 0.0000015, or 1.5 with the decimal point moved 6 places to the left. Floats do the same thing with powers of 2 instead of powers of 10.
This doesn't sound like magic to you? Sounds entirely reasonable? Well, it's not:
- The exponent bit sequence is unsigned, but it supports negative numbers. Magic.
- The significand's most significant digit is omitted and assumed to be 1, except for subnormal numbers which are marked by an all-0 exponent. Magic.
- There are separate positive and negative zero values. Magic.
- There are special positive and negative infinity values, where the exponent is all 1 bits and the significand is all 0 bits. Magic.
- There are special NaN values where the exponent is all 1-bits and the significand is not all 0-bits. Magic.
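If you want to peek behind the curtain, this sketch pulls a 32 bit float apart into its pieces. It assumes the standard IEEE 754 single precision layout: 1 sign bit, 8 exponent bits, and 23 stored significand bits (the 24th is the implied leading 1). The function name float_fields is made up.

```python
import struct


def float_fields(value):
    """Return (sign, exponent, fraction) for value stored as a 32 bit float."""
    # Pack as a 4 byte float, then reread the same bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # top bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF       # low 23 bits
    return sign, exponent, fraction


print(float_fields(1.0))           # prints: (0, 127, 0)
print(float_fields(-0.0))          # prints: (1, 0, 0)
print(float_fields(float("inf")))  # prints: (0, 255, 0)
```

That bias of 127 is how an unsigned exponent manages to represent negative powers: the real exponent is the stored number minus 127.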
The reason I'm mentioning floating point numbers though is because they are inaccurate. Floats are not exact values. They are approximations of numeric values. You can't add arbitrarily large or small numbers to them and expect exact results back from them. There is a boundary where numbers become increasingly inaccurate and that boundary is at the very large and very small. All numbers the developer has access to in DM are floating points, which means that DM is basically built on Magic. When you start working with really big or really small numbers, you will need to remember that they are inaccurate and search for coping mechanisms. I've already said too much on the subject. Magic.
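Here's a sketch of that inaccuracy in action. Python's own floats are 64 bit, so the made-up helper to_float32 squeezes a value through a 32 bit float and back to show what single precision would actually keep.

```python
import struct


def to_float32(value):
    """Round a number to the nearest value a 32 bit float can hold."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Above 16,777,216 (2 to the 24th), 32 bit floats can no longer count by 1.
print(to_float32(16777216.0))  # prints: 16777216.0
print(to_float32(16777217.0))  # prints: 16777216.0

# Small fractions get approximated too.
print(to_float32(0.1) == 0.1)  # prints: False
```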
No more beating around the bush:
Binary mathematics are what we set out to cover today, but to understand them, we really needed to understand a whole lot more about computing and memory. Now that we have all of that behind us, we can actually look at some binary operations.
The binary operations that we care about are called OR, AND, XOR, BNOT, RSHIFT, and LSHIFT.
OR is an operation that compares the bits at the same position in two values and turns that position on in the result if it is on in either value.
That sounds complicated. Let's visualize:
+---+---+---+
| 1 | 0 | 1 |
+---+---+---+
      OR
+---+---+---+
| 0 | 0 | 1 |
+---+---+---+
      =
+---+---+---+
| X | X | X |
+---+---+---+
As we can see in the above equation, we have the value 5|1.
Let's evaluate to see how OR works:
1 | 0 = X*4
0 | 0 = X*2
1 | 1 = X*1
+_________
        X
1 | 0 is equal to 1 because one of the two bits is on.
0 | 1 is equal to 1 because one of the two bits is on.
0 | 0 is equal to 0 because neither of the two bits is on.
1 | 1 is equal to 1 because at least one of the two bits is on.
1 | 0 = 1*4
0 | 0 = 0*2
1 | 1 = 1*1
+______________
              5
That means that 5 or 1 is equal to 5.
Let's do another:
0 | 1 = 1*4
0 | 1 = 1*2
1 | 0 = 1*1
+______________
              7
As you can see, this pattern will more or less merge any "on" bits into one number and leave any bits where both operand bits are off alone. OR is commonly used to turn on binary flags.
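Here's what "turning on flags" looks like in practice. The flag names are hypothetical, invented for this example; the only thing that matters is that each one owns exactly one bit.

```python
# Hypothetical flags, one bit each.
CAN_WALK = 1  # 001
CAN_SWIM = 2  # 010
CAN_FLY = 4   # 100

# Merge two flags into one number.
flags = CAN_WALK | CAN_FLY
print(flags)  # prints: 5

# Turning on a flag that is already on changes nothing.
print(flags | CAN_WALK)  # prints: 5

# The second example from above.
print(1 | 6)  # prints: 7
```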
XOR is like OR, but it's exclusive. XOR will only return true on a bit if exactly one of the two bits is on:
1 ^ 0 is equal to 1 because one of the two bits is on.
0 ^ 1 is equal to 1 because one of the two bits is on.
0 ^ 0 is equal to 0 because neither of the two bits is on.
1 ^ 1 is equal to 0 because both of the two bits are on.
Let's try those equations we did for OR:
1 ^ 0 = 1*4
0 ^ 0 = 0*2
1 ^ 1 = 0*1
+______________
              4
5^1 = 4
0 ^ 1 = 1*4
0 ^ 1 = 1*2
1 ^ 0 = 1*1
+______________
              7
1^6 = 7
XOR can be thought of as a toggle pattern, turning on or off the target bitflags.
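A sketch of the toggle idea, with toggle being a made-up helper name. Apply the same mask twice and you land back where you started.

```python
def toggle(flags, mask):
    """Flip every bit in flags that is on in mask; leave the rest alone."""
    return flags ^ mask


print(toggle(5, 1))  # prints: 4 (bit 0 was on, now off)
print(toggle(4, 1))  # prints: 5 (bit 0 was off, now on)
print(toggle(1, 6))  # prints: 7
```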
AND will return true on a bit only if both bits are on.
1 & 0 is equal to 0 because only one of the two bits is on.
0 & 1 is equal to 0 because only one of the two bits is on.
0 & 0 is equal to 0 because neither of the two bits is on.
1 & 1 is equal to 1 because both of the two bits are on.
1 & 0 = 0*4
0 & 0 = 0*2
1 & 1 = 1*1
+______________
              1
5&1 = 1
0 & 1 = 0*4
0 & 1 = 0*2
1 & 0 = 0*1
+______________
              0
1&6 = 0
You can think of AND as a pattern that allows you to turn off any bits that aren't targeted.
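The other everyday job for AND is asking "is this flag on?". A minimal sketch, where has_flag is a hypothetical helper:

```python
def has_flag(flags, mask):
    """True when every bit in mask is also on in flags."""
    return (flags & mask) == mask


print(5 & 1)           # prints: 1
print(1 & 6)           # prints: 0
print(has_flag(5, 1))  # prints: True
print(has_flag(5, 2))  # prints: False
```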
BNOT takes only one argument. It will invert all the bits in a number. On bits turn off and off bits turn on.
~ 1 = 0*4
~ 0 = 1*2
~ 1 = 0*1
+_________
        2
~5 = 2
~ 0 = 1*4
~ 0 = 1*2
~ 1 = 0*1
+_________
        6
~1 = 6
NOTE: The above equations are only accurate because we are only considering 3 bits for the sake of simplicity. Because numbers in BYOND have 16 accessible bits, the actual inversions will be significantly different. ~5 = 65530 and ~1 = 65534. For our purposes, learning, showing all 16 bits would take up a lot of space, so we're only using three because let's face it, if you've read this far into this wall of text, you deserve a damn medal what with the attention span of millennials these days. (GET OFF MY LAWN!)
BNOT is a simple inversion. It is most useful in combination with AND to turn off targeted bits.
4&~2 = 4
5&~1 = 4
6&~4 = 2
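Here's a sketch of the AND plus BNOT combo. One catch: Python integers aren't limited to 16 bits, so ~5 on its own comes out as -6. To mimic the 16 bit behavior described above, the made-up helper bnot16 masks the result down to 16 bits.

```python
MASK_16 = 0xFFFF  # sixteen 1 bits


def bnot16(value):
    """Invert the bits of value, keeping only the low 16 bits."""
    return ~value & MASK_16


def clear_bits(flags, mask):
    """Turn off every bit in flags that is on in mask."""
    return flags & bnot16(mask)


print(bnot16(5))         # prints: 65530
print(bnot16(1))         # prints: 65534
print(clear_bits(4, 2))  # prints: 4
print(clear_bits(5, 1))  # prints: 4
print(clear_bits(6, 4))  # prints: 2
```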
Shifting is the process of moving bits left or right. The LSHIFT (<<) and RSHIFT (>>) operators can be thought of as physically moving bits.
Let's picture the process:
5:
    +---+---+---+
    | 1 | 0 | 1 |
    +---+---+---+
5>>1
    +---+---+---+
    |   | 1 | 0 | 1
    +---+---+---+
5>>2
    +---+---+---+
    |   |   | 1 | 0   1
    +---+---+---+
5<<1
    +---+---+---+
  1 | 0 | 1 |   |
    +---+---+---+
5<<2
    +---+---+---+
1 0 | 1 |   |   |
    +---+---+---+
Alright, so what do we end up with? First we need to talk about what happens when bits are shifted beyond the boundaries of the grouping. Numbers in BYOND have 16 accessible bits for binary operations even though they are 32 bit floats. Any shift operations that go below bit 0 or above bit 15 simply drop those bits. What about those blank spaces? The bits default to off.
So let's take a second look at this:
5:
 16   8   4   2   1
+---+---+---+---+---+
| 0 | 0 | 1 | 0 | 1 |
+---+---+---+---+---+
5>>1 = 2
+---+---+---+---+---+
| 0 | 0 | 0 | 1 | 0 |
+---+---+---+---+---+
5>>2 = 1
+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 1 |
+---+---+---+---+---+
5<<1 = 10
+---+---+---+---+---+
| 0 | 1 | 0 | 1 | 0 |
+---+---+---+---+---+
5<<2 = 20
+---+---+---+---+---+
| 1 | 0 | 1 | 0 | 0 |
+---+---+---+---+---+
You might notice that shifting effectively doubles or halves the number for every place shifted, dropping the remainder. The equivalent arithmetic operation to approximate a shift would be:
LSHIFT: ⌊x*2ⁿ⌋ (floor of x times two to the power of n, where n is the right hand operand of the shift operation)
RSHIFT: ⌊x/2ⁿ⌋ (floor of x divided by two to the power of n, where n is the right hand operand of the shift operation)
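Here's a sketch that checks the shift-versus-arithmetic claim. Python integers grow without limit, so the made-up helper lshift masks its result to 16 bits to imitate bits falling off the top; the names WIDTH, MASK, lshift, and rshift are all mine.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1  # sixteen 1 bits


def lshift(x, n):
    """Shift left, dropping any bits pushed past the 16 bit boundary."""
    return (x << n) & MASK


def rshift(x, n):
    """Shift right; bits pushed below bit 0 are dropped."""
    return x >> n


print(lshift(5, 1), lshift(5, 2))  # prints: 10 20
print(rshift(5, 1), rshift(5, 2))  # prints: 2 1
print(rshift(5, 2) == 5 // 2 ** 2)  # prints: True
print(lshift(0x8000, 1))  # prints: 0 (the only on bit fell off the top)
```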
Bit shifting is something that you can go a whole career without having to use if you program in DM, but bit shifting is incredibly useful if you find yourself migrating to C++ or a similar language in the future and need to work with raw memory or deal with binary file storage or network serialization.
I will be posting an addendum to the bottom of this in a few hours, so stay tuned for common uses for binary in DM.