I thought this might be handy for people who are doing bit manipulations. Let's say you have to swap two bits in a bitflag.

// Do not use this #define right after an if() on the same line

// n is the number to swap
// b0 and b1 are the bit numbers (e.g., bit 0, bit 1, etc.)
// b0 < b1 is implied
#define SWAP_BITS(n, b0, b1) \
_swap_bits_mask = (n ^ (n >> ((b1)-(b0)))) & (1 << (b0));\
_swap_bits_mask |= _swap_bits_mask << ((b1)-(b0));\
n ^= _swap_bits_mask

Here's how it works.

Let's say we have a number 0xF0 and we want to swap bits 2 and 5. Bit 5 is on, and bit 2 is off, so swapping them would result in 0xD4.

5 2


If both bits are off or both are on, we wouldn't want to change the number at all, but in this case they have different values. So the first step is to XOR bits 2 and 5 together. This is done by shifting one copy of the number so that the bits line up over bit 2.

00011110  // n >> 3 shifts bit 5 into 2's position
11110000 // n

00000100 // XOR result

Now whatever bit is in bit 2, we want to duplicate in bit 5.

00000100  // mask
00100000 // mask << 3 shifts bit 2 back to bit 5

00100100 // result of OR

The mask can now be XORed with the original number to get a result.

11110000  // n
00100100 // mask

11010100 // result of XOR

But if the original number had the same value for both bits, this would work out differently. The mask would end up as 0, and the XOR would do nothing.

  11000001  // n = 193
^ 00011000 // n >> 3
00000000 // mask (result of XOR)
| 00000000 // mask << 3
00000000 // new mask (result of OR)
^ 11000001 // n
11000001 // result of XOR

I found this, by the way, while working on some feature creep. I had a bitflag that groups items by fours, and under certain circumstances I wanted to swap all the bits with mask 0x1111 with mask 0x4444, and likewise for 0x2222 and 0x8888 under other circumstances. Because the shift is the same across the board, this was doable for multiple bits in one fell swoop.

// swap bits at 0x1111 masks with their 0x4444 counterparts
mask = (n ^ (n >> 2)) & 0x1111
mask |= mask << 2
n ^= mask

Not a bad little trick. What use will it be to you? I don't know, but it was worth bringing up. You could use it to swap two bitflags in a dir, like for instance if you want to change from EAST to WEST and vice-versa, while leaving any diagonal NORTH or SOUTH component alone.

There is a similar trick that might be useful when working with directional bitflags. Say you've done some bit twiddling on a dir, and you want to make sure any opposing dirs like north and south, east and west, are canceled out.

// bits 0 and 1, and bits 2 and 3, should cancel each other out
// if NORTH and SOUTH are both set, the mask will have NORTH set
// likewise EAST and WEST will set EAST
mask = (dir ^ (dir >> 1)) & NORTHEAST
// copy NORTH to SOUTH, and EAST to WEST
mask |= mask << 1
dir &= mask

That example looks for the bitflags of two opposing cardinal directions to be different--that is, the dir uses one of those bitflags but not its opposite. If the bitflags are different, they're valid, and we want to keep the one that's in use; if they're both on, they're invalid, and we want to turn both off; if they're both off, we want to leave them off.

The mask will end up containing the bitflags for any cardinal direction, and its opposite, where only one of them is in use by dir. In other words, if the dir faces north or south, even on a diagonal, the mask will contain both NORTH and SOUTH; but if the dir has neither of them, or for some reason has both, the mask will leave those two bitflags blank. That way we can AND with the dir to remove them, if need be.
Not sure when I'd use bit swapping like that, but I'm sure I'd do it in a less efficient way than the above examples. Hopefully nobody would notice, because they'll be using super machines for my games! :D
Not sure when I'd use bit swapping like that, but I'm sure I'd do it in a less efficient way than the above examples. Hopefully nobody would notice, because they'll be using super machines for my games! :D

There's a use-case for them in autotiling state reduction algorithms.

In autotiling algorithms, you generally track connectivity of nodes by creating a value where each neighbor is a single bit.

 7 0 4  0 1 2
 3 X 2  7 X 3
 6 1 5  6 5 4

Each neighbor is assigned its own bit, where connectivity is confirmed.

47-state autotiles are a standard. 8 neighbors means 2^8 unique states, or 256 states. That's a full complexity autotile. 47-state reduction is done by only allowing a diagonal link if both component directions of that diagonal is also linked. So we need to go through and turn off each corner with an algorithm.

One way to do this is to create two versions of the same flag sequence, but shift one left, and shift the other right.

Then you can AND them together with a a flag sequence, set all cardinal bits to true in this mask, then AND them to the original sequence to retrieve the reduced 47-state value.

Otherwise, you have to do the reduction manually, which is ugly, and how I handle it:

#define at256_reduce47(at) ((at)&15) + (((at)&21)==21 ? 16 : 0) + (((at)&38)==38 ? 32 : 0) + (((at)&74)==74 ? 64 : 0) + (((at)&137)==137 ? 128 : 0)

Honestly, the 47 state reduction is not the bottleneck of my autotiling algorithm, so I don't bother with the bit twiddling required to optimize it further. It forces me to use a different bit layout, so I can't use the built-in directions for my algorithm, and that hurts ease of use and standards adoption.

The bottleneck for my algorithm is neighbor-checking and appearance churn. I've profiled it at >800K state updates a second at runtime, and >4M state updates a second at initialization on a 3.6Ghz processor in a moderate appearance complexity game.

Bit twiddling is nice, but it's hard to wrap your head around. It can come in useful sometimes, and when you are really caring about binary operations you are generally trying to write tightly packed code that's gonna run millions of times a second.