It's not Sunday anymore, but I wrote most of this on Sunday before falling asleep after a lovely pulled pork shoulder dinner, so it still counts.

If you've gotten anywhere with programming, you'll have noticed that some people use some logic patterns that seem a bit "clustered", and you might think that they are being efficiency freaks and writing "hacky" code. That's not exactly the case. When you understand the logic patterns behind many of the operators available to you, you can begin to build habits that will help you avoid redundancies in your code and help you understand some of these "clustered messes" and "hacks" that get much derided around these parts.

Let's introduce you to DM's operators:

Order  Group            Operators
 -2    expansion        expansion
 -1    lexical          access
  0    structural (P)   ( )
  1    navigation       [] () . / :
  2    single operand   ~ ! - ++ --
  3    arithmetic (E)   **
  4    arithmetic (MD)  * / %
  5    arithmetic (AS)  + -
  6    relational       < <= > >=
  7    binary           << >>
  8    relational       == != <>
  9    binary           & ^ |
 10    boolean *        &&
 11    boolean *        ||
 12    ternary          ? :
 13    assignment       = += -= *= /= &= |= ^= <<= >>=


*: The && and || operators are special conditional boolean operations, sometimes called logical operators. See below for a discussion of short-circuiting.

You may be familiar with PEMDAS or BODMAS (depending on which side of the Atlantic you were born on). This is a standard mnemonic device that is drilled into the heads of every algebra student. It determines the order in which you perform operations in an expression. Programming inherits and expands these rules quite a bit.

Every operator is sorted at compile-time using this priority list. The negative phases only matter if you care about how the compiler and interpreter work to resolve values. Everything else applies to what you actually type and determines the order in which operations occur. The sequence from 0 to 13 can generally be assumed to be the order that operations will occur in, and all operators that share a priority will generally be resolved left to right. There are some exceptions.
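For instance, priority is why multiplication binds before addition without any parentheses:

world << 2+3*4   //outputs: "14" -- * (priority 4) resolves before + (priority 5)
world << (2+3)*4 //outputs: "20" -- parentheses force the addition to resolve first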

Operands: Values used in conjunction with operators are called operands. These are the values that the operator uses to perform its function. For instance, the operands of b*4 are "b" and 4.

Associativity: Associativity determines the order in which operators of the same priority are processed. 4**2**3 is equal to (4**2)**3 because ** has left associativity, meaning the operator is processed from left to right. Multiplication and addition, on the other hand, are fully associative: it doesn't matter what order they are processed in because the result is the same either way. In many languages, assignment operators are right associative. In DM, though, they are non-associative. What does this mean for you? Well, in many languages you can perform the operation a = b = 4*3, or a = b += 4. In DM, however, you cannot perform these tasks and must instead break the operations up into multiple lines.

Pre/post/infix: Some operators come before the operand. Some come after. Some operators come between two operands. The negation operator (-) is a prefix operator. It modifies the operand that comes after it, flipping the sign bit of a numeric value. The ++ and -- operators are actually not a single operator each, but two identical-looking operators. This causes a lot of confusion because people believe these operators to be the same operator acting differently just because they look the same. This is not true. These symbols denote a different operation based on their position relative to an operand.

a++ or a-- is a different operation completely from ++a or --a. We'll get into this more in depth below.

Output values: Some operators take one or more operands and create an output value. 4*2, for instance, takes two operands and converts them into a single output. In this case, 8. This output is then used for any subsequent operations in the expression. So 4*2+1 would be a three-step process:

4*2+1
8+1
9


Making incorrect assumptions:

When people first learn about operators, they tend to associate their operations with specific uses. For instance, most people think of the assignment class of operators as used for changing variables (stored memory values). This is actually true! The assignment class of operators has a singular use: operating on memory values.

But then people make bad assumptions based on this same logic: since assignment operators are only used for operating on memory values, boolean and relational operators must only belong inside of a conditional statement! This is not true! Almost every operator is equally valid in any non-constant expression! So if you thought that the conditional OR and conditional AND operators are only useful inside of an if statement, while statement, for statement, etc., prepare to learn some cool junk!

if(value1&&value2)


This is a common use of an && operator. It's being used to test if both value1 and value2 are true. You almost never see && and || operators outside of an if statement. But let's take a look at a more complex usage.

if(value1&&value2&&value3&&value4)
	return 1
else
	return 0


Is roughly the same as:

return value1&&value2&&value3&&value4


There is a difference you should be aware of before using this pattern everywhere. In the case that the values are not equal to 1 or 0, you should be aware that && doesn't output 1 or 0: it outputs the left operand if that operand is false, and the rightmost value otherwise.

This means that:

if(a&&b)
	return 1
else
	return 0


will return 1 or 0

and:

return a&&b


will return b, or whatever false value a held (0 or null)
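A quick demonstration of that difference:

var/a = null
var/b = "hello"
world << (a&&b) //a is false, so the expression yields a (null) -- not 0
a = 5
world << (a&&b) //outputs: "hello" -- a is true, so the expression yields b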

&& and || often function as boolean logic operators. This is why any pattern that suggests you check whether a boolean is equal or not equal to 0 or 1 is wrong: boolean logic operators do not always return 0 or 1. They generally do, but not always.

Remember that 0 and null are always false, and any other value is always true. As such, numerics, strings, objects, and every other datatype are all operable by boolean logic. They are not boolean, but they will be treated as boolean for the purpose of boolean operations.

This means that the correct way to evaluate a boolean function is never if(function()==0) or if(function()==1). You cannot and should not depend on a function returning 0 or 1 even if the reference tells you that it only returns 0 or 1. It's both wasteful and unsafe. Instead, depend on the return value being treated as boolean, which means the correct ways to check it are if(!function()) and if(function()).
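To illustrate, here's a hypothetical proc whose return value works perfectly well as boolean logic without ever being the number 1:

mob
	var/mob/target //hypothetical var holding the current target, or null

	proc/GetTarget()
		return target //returns an object reference or null -- never literally 1

	proc/Attack()
		if(!GetTarget()) return //correct: the reference is treated as boolean
		//if(GetTarget()==1) would never pass, because a mob reference is not the number 1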

The same but not really:

The ++ and -- operators are actually two operators each, not one operator each. These operators are differentiated by the order they come in compared to their operand. These operators are called the increment and decrement operators. They increase or decrease a value by 1 and generate an output.

The basic process:

a++ or a-- will output first, and increment/decrement second. This is post-increment or post-decrement.

var/a = 4 //the value of a is 4
world << a++ //outputs: "4"
//the value of a is now 5
world << a-- //outputs: "5"
//the value of a is now 4


var/a = 4 //the value of a is 4
world << ++a //outputs: "5"
//the value of a is now 5
world << --a //outputs: "4"
//the value of a is now 4


Pre and post increment/decrement are used very often in loops that access lists, but that doesn't mean that's their only usage.

A few iteration variations:

//naive while loop iteration
var/count = 1
while(count<=l.len)
	world << l[count]
	count += 1

//standard while loop iteration
var/count = 1
while(count<=l.len)
	world << l[count++]

//standard in-condition while loop iteration
var/count = 0
while(++count<=l.len)
	world << l[count]

//standard for loop iteration (DM lists are 1-based, so count starts at 1)
for(var/count=1;count<=l.len;count++)
	world << l[count]

//optimized C for loop iteration (optimized by C compilers, but not by DM's)
for(var/count=1;count<=l.len;++count)
	world << l[count]

//optimized DM for..in..to iteration
for(var/count in 1 to l.len)
	world << l[count]


Another common practice, when you need to check whether a function argument has been supplied and, if not, fall back to a default value, is to do something like this:

mob/proc
	Example(Dir=0,Speed=0)
		if(!Dir) Dir = dir
		if(!Speed) Speed = step_size
		Refire(Dir,Speed)


This is where the || operator comes in. You don't need to navigate troublesome branches to use it, and it doesn't just belong in if statements.

mob/proc
	Example(Dir=0,Speed=0)
		Refire(Dir||dir,Speed||step_size)


If this is confusing, it's probably because you don't know about something that && and || operators do inherently. It's called short-circuiting.

Short-Circuiting:

Short-circuiting is a concept particular to the && and || operators. Both check the boolean condition of two operands, but the right operand is only evaluated when it is needed: && abandons the right operand if the left one is false, and || abandons the right operand if the left one is true.

We can use this behavior to our advantage if we understand the rules. This isn't about optimization to make your existing code faster. If you already have a project that doesn't take advantage of these concepts, it's usually not worth going back to change working code to use them. But if you understand these concepts, you will inherently write better code in the future. This is about building good habits through improved understanding, not fixing problems after the fact.

if(value1)
	return Refire()
else
	return 0


We can use short-circuiting to remove the need for a logical branch here:

return value1&&Refire()


The order that we insert operations into boolean expressions is important because more CPU intensive operations should generally be on the right, and less CPU intensive operations should generally be on the left. The above example will prevent the function call from happening if value1 is false. This is often a very useful optimization for preventing function calls where an if statement would otherwise be used. Again, this isn't worth rewriting code for. It's a very, very small optimization, but it is worth understanding so you can adjust your habits in the future to take advantage of conditional logic.
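For example, putting the cheap variable test on the left means the expensive proc on the right only ever runs when it's actually needed (awake and IsVisible() here are hypothetical):

mob/proc/TryAttack(mob/M)
	//IsVisible() is a hypothetical, relatively expensive line-of-sight check
	if(awake && M.IsVisible(src)) //awake is tested first; if it's false, IsVisible() never runs
		Attack(M)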

The ternary conditional operator:

The ternary operator is an awful lot like an if statement, but it's inline. It will return one of two values based on a condition. An if statement involves a lot of work that a ternary conditional operator doesn't have to do. Some of this involves scope traversal, which can be costly in the sense of overall performance or may not be possible within the context of the instruction you are writing.

CONDITION ? TRUE : FALSE


The format is simple. The operand to the left of the ? operator is used as a condition, and the operand to the right is the output used if the condition is true. This is followed by the ":" separator, to the right of which is the false output, used when the condition is false.
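A quick example of the ternary operator standing in for an if/else:

var/hp = 12
world << (hp > 0 ? "alive" : "dead") //outputs: "alive"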

Understanding what these operators do at a machine level will help you understand how to form better programming habits. They will make you more literate at reading code to boot, which is always a good thing.

Remember that faster code isn't always better code. Sometimes faster code is ugly, unmaintainable, or the speed difference isn't worth the work difference for the programmer. Sometimes worrying about micro-optimization can necessitate the introduction of patterns that are overall less efficient or just plain more cumbersome.

Having an understanding of what computers actually do will help you to see when you are using optimal patterns in suboptimal places, and will also help you avoid the pitfalls of bad or redundant logic. Programming is often about balancing many factors and making sacrifices in one area over another. There is rarely a perfect solution. Good solutions often require the introduction and elimination of corner cases and requiring the end user to jump through certain hoops. The art of compromise is important, so there is generally not a "best" pattern.

Every pattern does a specific task. Getting your head out of the box of just doing what you've seen and injecting true understanding of what operators actually do will help you understand a wider array of patterns. Understanding a wider array of patterns means being able to make better choices every step of the way.
Expansion:

When the compiler runs, it has to transform your code into something that it can process. The gist of this is that it combines all of the code files in #include order and then takes any #defined macro directives and subs out the labels for their values.

When you use preprocessor macros, what you are doing is telling the compiler that a certain label should be replaced with a series of instructions or values. Because this happens at compile-time, definitions can be much faster than variable accesses.

Definitions also have an interesting property of being both global and state-based.

#define TILE_WIDTH 32
#define TILE_HEIGHT 32


The above will define a new macro label "TILE_WIDTH". Anywhere that you type TILE_WIDTH will be interpreted by the compiler as 32.

mob
bound_width = TILE_WIDTH*2
bound_height = TILE_HEIGHT*2


The above example would use this preprocessor macro to set the bounds of a mob to be two tiles on each side.

After the expansion phase has done its work, it will look like this to the compiler:

mob
bound_width = 32*2
bound_height = 32*2


When the compiler reaches an operator that has two constant values, it will attempt to optimize the expression. So it will be interpreted as:

mob
bound_width = 64
bound_height = 64


But why would we bend over backwards to use a TILE_WIDTH definition in the first place? The answer is ease of adjustment: you can change large swaths of your source code by changing a value or two.

You can't use non-constant values when creating prototypes, so you can't do something like this:

mob
bound_width = world.icon_size*2
bound_height = world.icon_size*2


The above won't compile because you can't use non-constant values. So the only way to quickly adjust huge chunks of your code by changing a setting or two is to use a preprocessor macro.

Now, there are some questions as to when this would be necessary. Let's say you are making a library that implements a very general solution to a problem. In order for your library to be plug and play, it's good to establish parameters that may vary between uses.

But what if you are just writing a game? Nobody else is going to touch it, so you KNOW that this will never change, so why not use constants?

You don't know that. One day in the future, you might hire an artist that makes a bunch of 16x16 art that better suits your game's style. Now you have all this awesome art, but you've got to wade through thousands of lines of code and change all these little settings. And once you think you've got them all, someone triggers an obscure function buried somewhere in your code and your entire game slips into pixel movement mode because you forgot to change one line. Sounds like fun right?

It's not.
Lexical:

The lexical phase happens when the compiler reaches a specific line and has to process the expression into instructions. Your source code gets transformed into a series of bytes that the runtime interpreter knows how to run.

So what to you may look like:

var/delay = move_delay+stun_delay*is_stunned


Will compile down into something like:

access <src,stun_delay> (push)
access <src,is_stunned> (push)
mul (pop) (pop) (push)
access <src,move_delay> (push)
add (pop) (pop) (push)
assign <local,1>


During compilation, the engine figures out what labels are and translates them into bytecode. Operators are automatically subbed into the correct bytecode. This means that some things that you type will only affect the lexing phase of dealing with expressions.
Structural:

The structural phase isn't an official phase; it happens during the lexical phase. It refers to the process of building an expression. Expressions can often compile down to the same thing regardless of the way that you type them. The structural phase will interpret the operators and structural tokens to create a series of instructions.

var/c = a+4+b*3


access <local,b> (push)    {b}
push 3                     {3,b}
mul (pop) (pop) (push)     {3b}
access <local,a> (push)    {a,3b}
push 4                     {4,a,3b}
add (pop) (pop) (push)     {a+4,3b}
add (pop) (pop) (push)     {a+4+3b}
assign <local,c> (pop)     {}


The values in the curly brackets show the current state of the stack. The stack is fast local memory used by the interpreter. Access reaches into the heap, which is a slower version of memory.

Every time you push, you add a value to the top of the stack. Every time you pop, you take a value off the top of the stack.

Let's show how parentheses change this a little bit.

var/c = a+(4+b)*3


push 4                     {4}
access <local,b> (push)    {b,4}
add (pop) (pop) (push)     {b+4}
push 3                     {3,b+4}
mul (pop) (pop) (push)     {3b+12}
access <local,a> (push)    {a,3b+12}
add (pop) (pop) (push)     {3b+12+a}
assign <local,c> (pop)     {}


You see that the value changes in response to the placement of parentheses? This is operation ordering. The parentheses only change the generation of the bytecode pattern's structure. They don't actually equate to an operation.

var/c = (a+4)+b*3


access <local,a> (push)    {a}
push 4                     {4,a}
add (pop) (pop) (push)     {a+4}
access <local,b> (push)    {b,a+4}
push 3                     {3,b,a+4}
mul (pop) (pop) (push)     {3b,a+4}
add (pop) (pop) (push)     {3b+a+4}
assign <local,c> (pop)     {}


With this variant, the final result did not change, but the order of the instructions did. Using redundant parentheses doesn't really matter in DM because they go away by runtime. Expressions simply boil down to a set of instructions. Even if the expression is different, all that matters is that the values come out correctly.

Also, there is some DM optimization that happens during this phase. If an operation happens that has multiple constant values being operated on in a single operation, it will be condensed. So:

var/a = 4+7-5*6


push 5                 {5}
push 6                 {6,5}
mul (pop) (pop) (push) {30}
push 4                 {4,30}
push 7                 {7,4,30}
add (pop) (pop) (push) {11,30}
sub (pop) (pop) (push) {-19}
assign <local,a>       {}


That's what you'd expect, right? What you actually get is closer to this:

push -19              {-19}
assign <local,a>      {}


The compiler optimizes some things for you, so the second example is more or less what you actually wind up with thanks to the optimization of constant operations.
Navigation:

Navigation operators deal mostly with accessing memory structures and navigating the prototype tree. Some of these resolve at compile-time and some at runtime; this is the last group of operators that can possibly resolve during compile-time.

[] (index) operator:

The list index operator accesses an element inside of a list. This operator is left associative. The left-hand label must be resolved at compile-time, but the index is resolved at runtime.

var/list/l = list()
l[10] = 1 //list assignment
world << l[10] //list access


() (invocation) operator:

The invocation operator calls a proc or a function. This operator is left associative. The left-hand label must be resolved at compile-time, but the invocation happens at runtime.

Move(get_step(src,EAST),0) //invoke Move with specified args.


..() //invoke the supercall with default args.


. (explicit access) operator:

The explicit access operator is used to navigate the properties of an instance. This must resolve to a defined label at compile time, but the access occurs at runtime.

src.Move() //invoke a function belonging to src


world << src.name //output the name property belonging to src


: (implicit access) operator:

The implicit access operator is used to navigate the properties of an instance. Unlike explicit access, this does not need to resolve to a defined label on the object at compile-time, but at least one label of the implied name must have been defined under some object. This access occurs at runtime and is just as fast as explicit access. It is, however, potentially unsafe, and you should take caution not to feed it garbage values, to avoid runtime errors.

var/sometype = src
sometype:Move() //invoke a function belonging to the object stored in sometype


var/sometype = src
world << sometype:name //output the name property of the object stored in sometype



Path operators:

All Path operators resolve at compile-time.

/ (path deepening) operator:

The path deepening operator allows you to specify the prototype path of an object based on its polymorphic hierarchy.

/obj/item/herp/derp


The above is an absolute path that references derp, which derives from herp, which derives from item, which derives from obj.

. (upward path search) operator:

This path operator allows you to search upward in the tree to specify a path. It's not really all that useful.

: (downward path search) operator:

This path operator allows you to search downward in the tree to specify a path. It's really fantastically not useful.
Single Operand:

~ (Binary not/BNOT) operator:

BNOT performs a binary inversion of a bitsequence. DM only grants you access to the lower 16 bits of a number, so effectively every number has an integer range between 0 and 65,535 if you start working with binary.

This will invert a binary number:

~4 becomes 65531
~65531 becomes 4

! (Logical not/LNOT) operator:

LNOT performs a logical inversion of a value. This is a boolean operation. This will return 0 or 1 based on the operand.

If the operand is 0 or null, LNOT returns 1.

If the operand is not 0 or null, LNOT returns 0.

This means that using ! on any value is valid. It does not exclusively belong in boolean sequences.
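A few quick LNOT results:

world << !0      //outputs: "1"
world << !null   //outputs: "1"
world << !"text" //outputs: "0"
world << !5      //outputs: "0"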

- (Negation) operator:

- will negate any numeric value. This is similar to multiplying by negative 1.

++x (pre increment) operator:

++x will increment (add 1) a value and output the final value. This modifies the value in memory.

x++ (post increment) operator:

x++ will output the value and increment it. This modifies the value in memory.

--x (pre decrement) operator:

--x will decrement (subtract 1) a value and output the final value. This modifies the value in memory.

x-- (post decrement) operator:

x-- will output the value and decrement it. This modifies the value in memory.
Arithmetic:

** (exponent) operator:

The ** operator takes the left operand and raises it to the power of the right operand.

* (multiplication) operator:

The * operator will take the left operand and multiply it by the right operand. Multiplication is commutative, and therefore which operand is on the left or right doesn't matter.

/ (division) operator:

The / operator will take the left operand and divide it by the right operand. Division by zero will throw a runtime error.

% (modulus) operator:

The % operator will take the left operand and return the remainder of division by the right operand. Modulus by zero will throw a runtime error.
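Modulus is handy for wrapping a counter back around, for instance cycling a 1-based list index:

var/frame = 7
world << frame%4     //outputs: "3"
world << (frame%4)+1 //outputs: "4" -- wraps any counter into the range 1 through 4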

+ (addition) operator:

The + operator will take the left operand and add it to the right operand. Addition is commutative, and therefore the order of the operands does not matter.

- (subtraction) operator:

The - operator will take the left operand and reduce it by the right operand.

+ (string concatenation) operator:

The + operator can be used to combine two strings into one. This operator is not commutative. This does not change the value in the left operand. It merely creates and outputs a new string.

+ (list addition) operator:

The + operator can be used to add values to a list provided the list is the left operand. If the right operand is a list as well, the lists will be combined. Because list order is determined by which list comes first, the list addition operator is not commutative. This does not change the list referenced by the left operand. It merely creates a new list containing whatever elements were combined.

- (list subtraction) operator:

The - operator can be used to remove values from a list provided the list is the left operand. If the right operand is a list as well, the elements in the right operand list are removed from the left. This does not change the list referenced by the left operand. It merely creates and outputs a new list containing whatever elements remain.
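For instance:

var/list/a = list("x","y")
var/list/b = a + list("z") //b is list("x","y","z"); a is unchanged
var/list/c = b - list("y") //c is list("x","z"); b is unchanged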
Relational:

< (lesser) operator:

This outputs 1 or 0 based on whether the left operand is less than the right operand. This also works on strings and returns whether the left operand's ASCII character sequence comes before the right's.

> (greater) operator:

The inverse of the lesser operator. It will return 1 if the left operand is greater than the right, or 0 otherwise.

<= (lesser or equal) operator:

This outputs 1 or 0 based on whether the left operand is less than the right operand or equal to it. This also works on strings.

>= (greater or equal) operator:

This outputs 1 or 0 based on whether the left operand is greater or equal to the right.

== (equality) operator:

The equality operator determines whether the left and right operands are the same as one another. In BYOND, all identical numbers and all identical strings are shared memory values. This operator does not compare values, but rather the memory locations themselves.

For lists and objects this has some consequences. If the lists or objects have all the same values, it doesn't matter to the equality operator. They will never be equal unless they are the SAME object stored in the same memory location.

Output value is 1 for equality and 0 for inequality.
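This is easiest to see with two lists that hold the same contents:

var/list/a = list(1,2,3)
var/list/b = list(1,2,3)
world << (a == b)         //outputs: "0" -- two different lists, even though the contents match
world << (a == a)         //outputs: "1" -- the same list compared with itself
world << ("cat" == "cat") //outputs: "1" -- identical strings share one memory value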

!= (logical inequality) operator:

The logical inequality operator is the opposite of the equality operator. It returns 1 for an inequality and 0 for an equality.

<> (greater or lesser) operator:

This operator works a lot like the inequality (!=) operator, except nobody ever uses it.
Binary:

& (AND) operator:

The binary AND operator returns a number that is the result of ANDing the left and right operands together.

Binary numbers are made up of 16 accessible bits in BYOND. ANDing looks at each bit. If both numbers share the same bit both in the on position, the resulting number will have that bit in the on position. Otherwise, that bit will be in the off position.

127&129 = 1

because:
0000000001111111 (127)
&                  &
0000000010000001 (129)
=                  =
0000000000000001  (1)


Look at the numbers top-to-bottom. Every 0 or 1 is a bit. Both of them have to be on in the same row for the result to have that bit switched on.

The AND operator is good for checking which bits are on, and in combination with the BNOT operator, for turning bits off.
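A typical bit check, using a hypothetical flag define:

#define FLAG_HIDDEN 2 //bit 1

var/status = 3 //bits 0 and 1 are on
if(status & FLAG_HIDDEN)
	world << "hidden" //this runs, because bit 1 is on in both operands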

| (OR) operator

The OR operator is like the AND operator in that it deals with binary, but instead of both bits having to be on, one or both can be on.

127|129 = 255

because:
0000000001111111 (127)
|                  |
0000000010000001 (129)
=                  =
0000000011111111 (255)


The OR operator is good for turning bits on without having to check whether it is already on.
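Setting a flag with OR, again with a hypothetical define:

#define FLAG_HIDDEN 2

var/status = 0
status |= FLAG_HIDDEN //bit 1 is now on
status |= FLAG_HIDDEN //still on; ORing a second time changes nothing
world << status //outputs: "2"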

^ (exclusive OR/XOR) operator

The XOR operator is like the OR operator, but instead of one or both bits being on, the OR is exclusive, so only one of the two bits can be on for a true bit to be allowed.

127^129 = 254

because:
0000000001111111 (127)
^                  ^
0000000010000001 (129)
=                  =
0000000011111110 (254)


The XOR operator is useful for toggling bits.
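Toggling with XOR flips the bit each time:

#define FLAG_HIDDEN 2

var/status = 0
status ^= FLAG_HIDDEN
world << status //outputs: "2" -- the bit turned on
status ^= FLAG_HIDDEN
world << status //outputs: "0" -- and off again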

~ (Binary NOT) operator:

The BNOT inverts a single binary number. We covered this earlier, but I want to show the bit pattern here.

~127 = 65408

because:

~0000000001111111 (127)
=1111111110000000 (65408)


The BNOT is useful for turning bits off in combination with the AND modifier:

value &= ~(FLAG_1|FLAG_2|FLAG_4) //turns off flags 1, 2, and 4 whether they are on already or not. Any other flags will stay at their current state.
Jesus, this turned into a jumbled mess...
In response to Ter13
Ter13 wrote:
Jesus, this turned into a jumbled mess...

Welcome to my life...

It's been an interesting read so far though. Haven't got through all of it yet.
Honestly, most of it you won't need clarification on. For the sake of completeness I covered all of the operators, but the only parts worth reading for anyone with a grasp of basic programming is the first post, and the expansion, lexical, and structural phase posts.

Honestly, the next snippet sunday is probably going to come from a PM conversation I had with Zagros on binary. I think binary needs to be covered more in depth here.
Don't mention the 'to' and 'in' operators; or do you consider them more keywords?
In response to Super Saiyan X
Super Saiyan X wrote:
Don't mention the 'to' and 'in' operators; or do you consider them more keywords?

There was no reasoning for that at the time. I forgot about them. But I feel like explaining them involves explaining for, switch, input, and obscure variable limiting patterns now that you mention it, which are out of the scope of this already unwieldy SS.
In response to Ter13
Ter13 wrote:
Super Saiyan X wrote:
Don't mention the 'to' and 'in' operators; or do you consider them more keywords?

There was no reasoning for that at the time. I forgot about them. But I feel like explaining them involves explaining for, switch, input, and obscure variable limiting patterns now that you mention it, which are out of the scope of this already unwieldy SS.

SS #12 then...
SS #12 then...

That one's likely to be an in-depth explanation of binary numerics and understanding that "data" is only what you think it is.
In response to Ter13
Ter13 wrote:
SS #12 then...

That one's likely to be an in-depth explanation of binary numerics and understanding that "data" is only what you think it is.

SS #13?
Will put it on the docket.

If you have any interesting material on in/to/as keywords, I'd be much interested in your take.

I don't much like covering input(), but map editor instance limitation is a cool topic that only a handful of people know about.