ID:2287972
 
Applies to:Dream Daemon
Status: Open

Issue hasn't been assigned a status value.
Right now I'm pretty sure red/black trees are being used for the purposes of storing appearances. I know for a fact that it is some sort of tree structure because lummox said it was a tree structure.

The problem with trees is that they get more expensive as you add more items to it. This results in the awkward situation where a concept works fine on a small project with small amount of appearances, but gets significantly slower on a large project with a large amount of appearances.

The other problem with trees (specifically red/black ones) is that adding or deleting to them results rebalancing operatons

So what do you get when you put all that computing power into maintaining a properly balanced tree, which gets written to often? You get to have your list sorted for you. Unless I'm mistaken, BYOND does not need it's appearance list sorted.

Hashmaps should be used instead. If the hashing function is implemented properly, you can get the same performance whether there's 1,000,000 objects or just one. The only thing it costs is some extra memory.

(actually you should just get rid of appearances entirely, or put them on a separate thread, or open source BYOND but somehow I don't think that's gonna happen)
Red-black trees are already pretty efficient. Each insertion or deletion is designed to to fairly minimal rotations while rebalancing.

Hashtables might be more efficient in some ways, but less in others.

First, the idea that hashtable performance is great at high scales is not necessarily true. A typical hashtable handles collisions by creating linked lists from each bucket. If a table gets big enough, it either has to be re-hashed on the fly so there are more buckets, or the linked lists can get terribly long and defeat the purpose.

Second, a hashing function for appearances might well underperform the comparisons being done currently. The comparison function for appearances looks at the data in order of its likelihood of being different. When adding a new appearance to the tree, the number of comparisons may well be very low; when looking up a repeat appearance of course it'll go through the full set, and in those cases a pre-computed hash would work better. It's not clear to me how a hash function would shake out compared to the current method.

All that said, hashtables might well potentially improve performance. But it'd be tricky to get the implementation right.
My concern is that there seems to be a very large overhead when introducing a new appearance that wasn't on the tree before. Creating as few as 20-30 eats up about a quarter of a tick of cpu time on my machine. Perhaps you could outline exactly what happens when a new appearance is created?
In response to Monster860
Monster860 wrote:
My concern is that there seems to be a very large overhead when introducing a new appearance that wasn't on the tree before. Creating as few as 20-30 eats up about a quarter of a tick of cpu time on my machine. Perhaps you could outline exactly what happens when a new appearance is created?

Even with world.fps set ridiculously high, I don't see how creating 20-30 appearances could cause that kind of performance drop. I suspect there's something else in your code to account for that.

As for how appearances are created, I already went over that in an Under the Hood post in the Tutorials & Snippets forum.

There is one aspect of appearance adding, and similarly constructed trees, that I think could be improved. Currently there's a find operation, and if nothing is found there's an addition, which does a find itself. A combined find-or-add would be useful.
A combined find-or-add would mean you only navigate the tree once as opposed to twice.

As for how appearances are created, I really wanted to know if there was any networking overhead or if the constructor is expensive

If it helps, the way I was creating the appearances, I was adding them one by one to a new list, and then setting the overlays of another thing to that list. Perhaps it is the process of creating overlays that is expensive.

Anyways, perhaps I'll run that branch again and see if results change.
Alright I've tested it and I've found that it still sometimes takes 4% and sometimes 25%... and it's independent of the number of overlays.
If you're using tgstation be aware we have a wrapper around overlays to ensure they're only rebuilt once per tick