ID:2306367
 
BYOND Version:511
Operating System:Windows Server 2012 R2 64bit
Web Browser:Chrome 62.0.3202.52
Applies to:Dream Daemon
Status: Open

Our servers have been crashing regularly for the past few months. I hadn't reported any of it because I assumed it was related to some changes we made to how the server is run, but that doesn't seem to be the case. So I'm going to post every module/offset combo for each exception type, for one of our two servers, for just this month.


Exception 0xc0000005:
byondcore.dll 0x000f6d46
byondcore.dll 0x00167d6d x4
byondcore.dll 0x000f40bf
byondcore.dll 0x0014f258 x9
byondcore.dll 0x000e61d9 x8
byondcore.dll 0x00167d80



Exception 0xc00000fd:
byondcore.dll 0x000c98e6 x2
ntdll.dll 0x000431f2
ntdll.dll 0x000431f3


So that is 28 crashes in 17 days on just one server.
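
The total above can be re-derived from the list itself. Here is a minimal Python sketch, assuming the `module offset xN` line format used in this post, where a missing `xN` counts as a single crash. The function name `tally_crashes` is made up for illustration and is not part of any BYOND tooling.

```python
import re
from collections import Counter

# The module/offset list exactly as posted above.
REPORT = """\
Exception 0xc0000005:
byondcore.dll 0x000f6d46
byondcore.dll 0x00167d6d x4
byondcore.dll 0x000f40bf
byondcore.dll 0x0014f258 x9
byondcore.dll 0x000e61d9 x8
byondcore.dll 0x00167d80

Exception 0xc00000fd:
byondcore.dll 0x000c98e6 x2
ntdll.dll 0x000431f2
ntdll.dll 0x000431f3
"""

HEADER = re.compile(r"^Exception (0x[0-9a-fA-F]+):$")
ENTRY = re.compile(r"^(\S+) (0x[0-9a-fA-F]+)(?: x(\d+))?$")


def tally_crashes(text):
    """Return a Counter mapping exception code -> number of crashes."""
    totals = Counter()
    current = None  # exception code of the section being read
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        entry = ENTRY.match(line)
        if entry and current is not None:
            # No "xN" suffix means the offset was seen once.
            totals[current] += int(entry.group(3) or 1)
    return totals


totals = tally_crashes(REPORT)
for code, count in totals.items():
    print(code, count)
print("total", sum(totals.values()))
```

This gives 24 crashes for 0xc0000005 (access violation) and 4 for 0xc00000fd (stack overflow), which matches the 28 quoted above.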

I originally thought they might be related to changing something from shell() to a dll call()(), but the timing turned out not to match up. I also thought it might be a memory usage issue, but we now log memory usage and it wasn't high enough when the crashes happened.


Also, it crashed again while I was typing this. 0xc00000fd byondcore@0x000c98e6
I think we're still having some stack overflow crashes but we're not getting anything that is helping us trace them.
Are all those in 511.1385?

The stack overflow (c00000fd) is most likely an issue with the game code although obviously I want to avoid that ever happening at all, so it'd be nice if I had some trace info going way back to be able to tell what was going on.

The others I'd have to trace individually to figure out where they were crashing; a common cause would not be unheard of. I believe one SS13 variant has been having issues with the number of unique cells passing 64K and some lingering issue with that causing situations like turfs without appearances. It's been nigh impossible to track down.

I would suggest trying out 512, which is getting a lot more stable after the earliest issues. A couple of known issues were fixed there that could have a bearing on your server.
Yes, they are all 1385.

"I believe one SS13 variant has been having issues with the number of unique cells passing 64K"

Could you go over what that means? I could likely make something to check this, but I'd need more details on what BYOND considers a unique cell.
Been having similar crashes, specifically with the memory accesses, on Aurora's branch of SS13.

Same profile as here: very regular crashes, with the same offsets. 00167d6d keeps popping up quite regularly.

They started happening after we updated our code, specifically lighting and visual effects, to make use of 511 features like virtual appearances and GPU-accelerated lighting processing.

Initial report here: http://www.byond.com/forum/?post=2275657
(Haven't had time to compile a list of offsets like MSO did, though.)
I searched all logs for the past 45 days and was unable to find a single "Maximum recursion level reached" runtime.

If it were a game-code-level stack overflow, it would have to be bad enough that it always triggers a crash and never triggers a runtime.

So I'm more inclined to believe that they are actually BYOND-side stack overflows.
Scratch that. It seems our world/Error() handler is eating "Maximum recursion level reached" runtimes for some reason.

Please stand by
Related bug ID:2306577 world/Error issues with stack overflow runtimes.
Ok, so we've confirmed a few stack overflows now that we worked around ID:2306577

https://tgstation13.org/parsed-logs/recursion.txt

We'll fix these and I'll let you know how that impacts crash rates.

Might still be worth a look to see why these are crashing sometimes if that is the case.
We fixed the stack overflows and crashes still happen.

Another interesting tidbit I've found is that an oddly high number of them happen shortly after world/Reboot. The 0x000e61d9 offset is one example.
Happened on 512:

Faulting application name: dreamdaemon.exe, version: 5.0.512.1393, time stamp: 0x59f3789e
Faulting module name: byondcore.dll, version: 5.0.512.1393, time stamp: 0x59f3782e
Exception code: 0xc0000005
Fault offset: 0x001765bd
Faulting process id: 0xae0
Faulting application start time: 0x01d3516883cf2952
Faulting application path: C:\tgstation-server-3\BYOND\bin\dreamdaemon.exe
Faulting module path: C:\tgstation-server-3\BYOND\bin\byondcore.dll
Report Id: 74900d22-be4b-11e7-80d6-00155d7f830a
Faulting package full name:
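
To compare an event log entry like this one with the module/offset lists earlier in the thread, it helps to reduce it to the same short form. This is a rough Python sketch, assuming the `Key: value` layout of the Windows event shown above. The name `summarize_event` is hypothetical, and only the four lines that matter are copied into the sample.

```python
import re

# The relevant lines of the event log entry, copied from the post above.
EVENT = """\
Faulting application name: dreamdaemon.exe, version: 5.0.512.1393, time stamp: 0x59f3789e
Faulting module name: byondcore.dll, version: 5.0.512.1393, time stamp: 0x59f3782e
Exception code: 0xc0000005
Fault offset: 0x001765bd
"""


def summarize_event(text):
    """Reduce an event log entry to 'code module offset (version X)'."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain more colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    module_field = fields["Faulting module name"]
    module = module_field.split(",")[0]
    version_match = re.search(r"version: ([\d.]+)", module_field)
    version = version_match.group(1) if version_match else "unknown"
    return "{} {} {} (version {})".format(
        fields["Exception code"], module, fields["Fault offset"], version
    )


summary = summarize_event(EVENT)
print(summary)
```

Keeping the module version in the summary matters here because offsets from 511.1385 and 512.1393 cannot be compared directly.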