ID: 2427135

Applies to: DM Language
Status: Open

Current state:


When del is called on an object, the VM iterates over everything that could be holding a reference to that object and sets any references it finds to null. This means that the more objects there are in the world, and the more properties those objects have, the longer it takes to forcibly delete an object. This is a known problem, as described in http://www.byond.com/docs/ref/info.html#/proc/del.


Aside from attaching a reverse engineering tool like IDA to dreamdaemon (no thanks), or iterating the entire world (also no thanks), there is no way to inspect the references being held to an object. This makes it very difficult to find and fix circular or hanging references. This is the biggest performance issue in some of the SS13 codebases, and one that we cannot build userland tools to solve.


Ideal state:


DM has a references(obj) proc.
  args:
    obj: an object
  returns:
    if obj is an object:
      a list(list(
        the object referencing the target,
        the string property name where the target is referenced
      ))
    if obj is a primitive type or null:
      an empty list


So that's a proc that returns a list of tuples showing everything that's holding the given object and how it's being held.


We can use this proc to:


  • Proactively trace hard deletions while load testing the world locally. This will let us easily find the worst culprits of things holding references to objects after we need them to go away.
  • Defensively find hard deletion issues in "production" by tracing hard deletions all the time at a low sample rate, like 1/1000. This will let us catch new issues that arise with hard deletions and maintain good performance.
  • Make more objects safe to pool. We use object pools in the game to reduce the overhead of allocating small spammy data objects, like signals between machines. We can use this new function at development time to find all the references to these objects and ensure they're cleaned up appropriately before we enable object pooling for them.
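
As a rough sketch of the second use case, sampled tracing might look something like this. Note that references() here is only a placeholder stub for the proposed built-in (it doesn't exist today, and the stub just returns an empty list), and traced_del and TRACE_SAMPLE are made-up names for illustration.

```dm
// HYPOTHETICAL: trace roughly 1 in TRACE_SAMPLE hard deletions.
#define TRACE_SAMPLE 1000

// HYPOTHETICAL stub standing in for the proposed built-in so this compiles.
// The real proc would return list(list(holder, varname), ...).
/proc/references(datum/target)
	return list()

// Hard-delete target, occasionally logging who was still holding it.
/proc/traced_del(datum/target)
	if(!target)
		return
	if(rand(1, TRACE_SAMPLE) == 1)
		var/list/refs = references(target)
		for(var/i in 1 to refs.len)
			var/list/entry = refs[i]
			var/datum/holder = entry[1]	// the object referencing the target
			var/varname = entry[2]		// the property name on the holder
			world.log << "hard del of [target.type]: held by [holder.type].[varname]"
	del(target)
```

Setting TRACE_SAMPLE to 1 would trace every hard deletion, which is what we'd want for the local load-testing case.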

Why:


In the Goonstation SS13 codebase, we have approximately 16,000 object types. Many of the other SS13 codebases are in similar situations.


The vast number of object types, the large number of contributors, and the lack of strongly enforced static types make it difficult to ensure that new and modified code properly cleans up references to unneeded objects.


Given the large codebase and large number of objects alive in the world during the game, hard deletions take a very long time. The game has tons of great content that we're happy with, so simply killing code and reducing the amount of content is not a viable option (although we've done a fair amount of code size reduction over the years).

Things we've tried:


  • Delete Queueing: Our delete queue gives objects a chance to be garbage collected. It removes each object from the queue, takes a text ref to it, sleeps, and then checks whether the object was GCed by trying to `locate` it. This lets us avoid forcibly deleting quite a lot of objects, but there are still tons of objects with hanging references that don't get GCed.
  • Code cleanup: Because the codebase is large, and the type system is very loose (doesn't allow static analysis like 'find all things that could hold this thing') this is a manual process requiring extremely high effort and providing generally low impact. I could spend a week trying to clean these things up by reading the code and finding all the references to a thing manually, but it wouldn't be very effective work, because the development environment gives me no assistance in finding the biggest of these deletion problems. I could write a thing to iterate every property on every object in the world, but that would be incredibly slow to run in userland code.
  • Time profiling: We did some profiling to see what objects are slow to forcibly delete, and that gave us a few clues, but it's painful to do this. Being able to enumerate references directly would make solving these problems vastly simpler.
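
For reference, the "iterate every property on every object" approach mentioned above could be sketched roughly like this. The proc names find_holders and scan_holder are made up for illustration. It also assumes that for(var/datum/D) with no "in" clause reaches datums outside world contents, which is how I understand it to behave.

```dm
// Append list(holder, varname) to found for each var on holder that
// points at target, either directly or as an item in a list var.
// (For associative lists, "in" only checks the keys.)
/proc/scan_holder(datum/holder, datum/target, list/found)
	for(var/varname in holder.vars)
		var/value = holder.vars[varname]
		if(value == target || (istype(value, /list) && (target in value)))
			// found += list(...) would flatten the pair, so append by index
			found[++found.len] = list(holder, varname)

// Userland stand-in for the proposed references(obj).
// Returns list(list(holder, varname), ...), or an empty list for non-datums.
/proc/find_holders(datum/target)
	var/list/found = list()
	if(!istype(target))
		return found
	// Atoms in the world.
	for(var/atom/A in world)
		scan_holder(A, target, found)
	// ASSUMPTION: with no "in" clause this loop reaches non-atom datums.
	// Atoms are skipped here so nothing gets reported twice.
	for(var/datum/D)
		if(istype(D, /atom))
			continue
		scan_holder(D, target, found)
	return found
```

Even this simplified version touches every var on every object, and it still misses references held in global vars, proc-local vars, and nested lists, which is why it isn't practical outside of local testing.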

This feature would improve the BYOND developer experience for us so much! Thanks for considering it.



+1
This would be pretty useful to have, we had to write a cyclic ref checker that can only be run locally because of how long it takes to complete.
Please consider this, you're our only hope!

Unfortunately I don't know if this is feasible.

The garbage collection code takes a function pointer so it can either eliminate a ref or count it, but that function is not given any kind of context as to where the ref actually is. That would involve passing a lot more information and therefore slowing down the GC. (The only alternative I can think of is a global struct that would take the place of an iterator variable, but store other info as well on where it is in the process.)

Additionally that code skips over some references that might get eliminated early in del(), such as contents, visual contents, and some minor temporary vars.

The global state struct idea is new to me though so it's worth ruminating on whether that would change the equation here.

Thanks very much for taking a look!

I don't have the code to look at, but I've done enough of this kind of instrumentation to think that the global state struct idea seems sound. Setting some global state that hints at what object is being considered seems reasonable, given that the GC is single-threaded and is probably not reentrant in this codepath.

I would imagine the iteration code for del can be reused for enumerating references, and all you'd need to do is implement a different function for collecting holder references from the global state rather than eliminating holder references from the prop bags on the heap.