ID: 2207689

BYOND Version: 510
Operating System: Windows Server 2012 R2
Web Browser: Chrome 55.0.2883.87
Applies to: Dream Daemon
Status: Open (issue hasn't been assigned a status value)

Descriptive Problem Summary:
Occasionally, for several hours at a time, /vg/station (and possibly other SS13 codebases) will experience completely inexplicable runtime errors involving any list named "turfs". This has happened even while our repo was inaccessible, meaning there could not even theoretically have been a change to our code to cause it, and it resolved itself before the repo became accessible again.
This is extremely inconsistent, generally happening for several consecutive rounds once every couple months. The exact runtimes are different each time, but there are two common threads:
1. "Out of resources!" is the most common (but not only) type of runtime that occurs during these periods. It never occurs otherwise.
2. The runtimes always have to do with either reading from or writing to a list called "turfs". It is not always the SAME list called "turfs", though.
Generally, 2-3 distinct runtimes occur each time this happens.

It is important to note that we have a globally-scoped list named "turfs", which contains references to all turfs in the world. (Exactly 1.5 million on our current maps.) However, this also happens with lists named "turfs" that are locally-scoped to individual procs, often before they even contain anything.
I'm not 100% sure about this, but I don't believe this happened when our maps were smaller. It's possible that the main turfs list encounters trouble simply due to being preposterously large. That still shouldn't affect other lists, though.
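For reference, the setup looks roughly like this; populate_turfs and the loop order are illustrative stand-ins, not our exact code:

    var/global/list/turfs = list()

    // illustrative sketch only; the real initialization differs in detail
    /proc/populate_turfs()
        turfs = new/list(world.maxx * world.maxy * world.maxz)    // one slot per turf
        var/i = 1
        for(var/z = 1 to world.maxz)
            for(var/y = 1 to world.maxy)
                for(var/x = 1 to world.maxx)
                    turfs[i++] = locate(x, y, z)    // store a reference to every turf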

At the moment, on our test server, a runtime is happening in the generation of the main turfs list.
    turfs = new/list(maxx*maxy*maxz)
    world.log << "DEBUG: TURFS LIST LENGTH [turfs.len]"
The second line throws a runtime with "cannot read null.len". There is nothing between these two lines of code, and they have the same indentation. They appear in the code exactly as shown (except with more indentation). Yes, turfs is declared. (var/global/list/turfs = list())
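If new/list() can fail under memory pressure and leave the var null (which would explain "cannot read null.len"), a guard like this would at least log it. Just a sketch, not something we actually run:

    turfs = new/list(maxx*maxy*maxz)
    if(!turfs)    // guard in case the allocation fails outright
        world.log << "DEBUG: turfs allocation failed ([maxx*maxy*maxz] slots)"
    else
        world.log << "DEBUG: TURFS LIST LENGTH [turfs.len]"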

Sorry about the rest of the bug report not being particularly helpful. We just don't have the kind of information to answer all of these questions.

Numbered Steps to Reproduce Problem:
I wish I knew.

Code Snippet (if applicable) to Reproduce Problem:
It would take years of testing to find a single simple snippet that could reproduce this, more than likely.

Expected Results:
Don't get runtimes that shouldn't be possible.

Actual Results:
Get runtimes that shouldn't be possible.

Does the problem occur:
Every time? Or how often?
Rarely, but for several server restarts in a row when it does.

In other games?
Unknown.

In other user accounts?
It's serverside.

On other computers?
Unknown.

When does the problem NOT occur?
Most of the time.

Did the problem NOT occur in any earlier versions? If so, what was the last version that worked? (Visit http://www.byond.com/download/build to download old versions for testing.)
It's too inconsistent to know for sure.

Workarounds:
Wait a few hours to a day for it to fix itself.

On your test server, what is the size of that turfs list? In this case, what is maxx*maxy*maxz?

The name of the list is obviously irrelevant unless it somehow caused the compiler to treat the list differently; but since turfs is not a reserved word or a keyword used anywhere by default, that's not possible. "Out of resources", on the other hand, comes up when list.Copy()--and it's specific to that exact routine--can't allocate memory.
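To illustrate where that allocation happens (the variable name here is made up):

    // Copy() allocates a whole second list up front; on a 1.5-million-entry
    // list that's another ~6 MB in a single request, which is exactly the
    // kind of allocation that fails first when memory is tight
    var/list/snapshot = turfs.Copy()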

My suspicion is that your server is already running high on memory, and something is pushing it over the edge. That would also explain the erratic nature of your runtimes; they happen inconsistently because you hit the limit inconsistently and in different ways. The turfs list, however, is large, and its allocation is probably big enough to trigger the problem more reliably than a smaller list would. That's why you're seeing problems with that list most of the time.

I assume the main turfs list is somehow getting broken, and for some reason that's either causing problems with other lists by the same name, or it's breaking scope precedence and getting used in place of others. But, if I'm not mistaken, every time this has happened, there has been at least one runtime related to the main turfs list, and at least one runtime related to another list named "turfs".

I believe the test server runs the same map as the main one at the moment, because our map changing script had some problems when the default map was changed, or something along those lines. That makes the map 500*500*6.

Considering I've just been told that the server's (or at least the test server's) memory limit is 2GB, we may well have run out of memory, but that still shouldn't cause lists that are unrelated except by name to also have problems.

In response to Exxion
Exxion wrote:
I assume the main turfs list is somehow getting broken, and for some reason that's either causing problems with other lists by the same name, or it's breaking scope precedence and getting used in place of others. But, if I'm not mistaken, every time this has happened, there has been at least one runtime related to the main turfs list, and at least one runtime related to another list named "turfs".

Scope is set at compile-time. There's no possible way for it to "break" at runtime; there can only be scope confusion at compile-time.

Again, it's not surprising that the turfs list is causing most of the runtimes, because it's a very large list intended to hold a lot of data. If you're making this list too big--or worse, if you're making several lists like this--there's a possibility it's pushing your memory woes over the limit.

I believe the test server runs the same map as the main one at the moment, because our map changing script had some problems when the default map was changed, or something along those lines. That makes the map 500*500*6.

That's 1.5 million items at 4 bytes each, for a grand total of 6 million bytes plus the small number of bytes needed for the list struct itself. A little under 6 MB is not a bank-breaker, unless you're already close to the limit.
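Spelled out, with the 4 bytes per entry that figure assumes:

    var/slots = 500 * 500 * 6    // 1,500,000 entries
    var/bytes = slots * 4        // 6,000,000 bytes, a little under 6 MB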

Considering I've just been told that the server's (or at least test server's) memory limit is 2GB, we may well have run out of memory, but that still shouldn't cause lists unrelated other than by name to also have problems.

Running out of memory will cause problems across the board. Basically all your symptoms are completely consistent with that.

I would suggest that, before getting to this point, you run a report on how DD is using memory. (In Windows Dream Daemon, you can get this via a menu command. I forget how it's done in Linux.) It would be helpful to know for instance if you have images or lists using up a lot more memory than would normally be expected, and at that point you can start finding ways to cut those things back. I know some builds like tg have massive requirements, but even so you want to keep them well back from the limit wherever possible.

Yeah, I was going to ask the host to check the memory usage while the test server was restarting, but I couldn't get a hold of him.
He said at one point it was 1.2GB, but that was while the server was just sort of uselessly sitting idle because it failed to initialize.

I seem to recall this having happened with a list that was nowhere near that long but was also named "turfs", though I can't find it at the moment. The problem with logging everything is that finding anything is really hard.

I take it that a for-in loop uses list.Copy(), given that lines starting such a loop have thrown "Out of resources" before.
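If that's right, the difference would look something like this; whether the indexed form actually avoids the internal copy is just my guess, and update_air() is a stand-in for whatever per-turf work the loop does:

    // for-in iteration snapshots the list internally via the copy routine,
    // so a 1.5-million-entry list briefly needs room for a second copy
    for(var/turf/T in turfs)
        T.update_air()    // hypothetical per-turf work

    // indexing by position reads the list in place instead, at the cost of
    // misbehaving if the list is resized mid-loop
    for(var/i = 1 to turfs.len)
        var/turf/T = turfs[i]
        T.update_air()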

Bear in mind the copy routine is called internally for various things, like looping through a list.

I've noticed that every time one of my newfangled projects has issues with memory, there is some point where the memory seems to swell in size, then shrink, and if that swelling would take it past the 32-bit memory limit (2.1GB per process), you get issues.

So basically you have to aim to keep the peak memory usage below 1.05GB, because you can expect some internal rebuild operation to just about double it.

Do you resize the map? Because a large map being resized would cause a swell like that.

Going by that assumption, I just figured it was re-allocations of the list of lists or one of the other big lists.

In these test cases where I've seen this, there isn't a map.