ID:1983240
 
BYOND Version:509.1314
Operating System:Windows 7 Ultimate 64-bit
Web Browser:Chrome 46.0.2490.86
Applies to:Dream Daemon
Status: Open

Descriptive Problem Summary:

Over at Eternia we've been having some pretty nasty problems with the Linux DreamDaemon. After a random amount of uptime, the process hangs (but doesn't crash), disconnects all clients, and refuses any further connections. This is pretty characteristic of an infinite loop somewhere in the code, but we're confident the problem isn't in our code. For one, there's no runtime error being generated in our logs.

When it hangs up, it ignores SIGTERMs and forces us to SIGKILL the process to restart the game.

Here are the past few days' worth of logs. Every new startup entry means the server had hung and we were forced to restart it.

Fri Nov 13 00:34:22 2015
Dream Daemon FAILED to open port 1213!
BUG: File not found: /root/Eternia/logs/chat/2015/11/13/xdgamer7.html (current directory is /root/Eternia)
BUG: Unable to read icon 3003
BUG: Unable to read icon 3003
BUG: Failed to decode message 159,13
BUG: Finished erasure with refcount=1 (ref=5:59) DM (keymapping.dm:167)
BUG: Bad ref (5:59) in DecRefCount(DM keymapping.dm:167)
BUG: Corrupt block header
BUG: Fmem block size at 1108 is -1858849779/38529 with type 5.
BUG: Error reading file memory structureFile offset: 1108
Real position: 1113/38529
BUG: Ccorrupt or invalid savefile 'saves/bindings/redthesaiyan_bindings.sav'
BUG: Attempting auto-recovery of '/root/Eternia/saves/bindings/redthesaiyan_bindings.sav'
BUG: Backed up old savefile as '/root/Eternia/saves/bindings/redthesaiyan_bindings_bad_000.sav' and exported text to '/root/Eternia/saves/bindings/redthesaiyan_bindings_bad_000.txt'. REMOVE OR RENAME THESE FILES. If too many of these files build up, auto-recovery will be disabled.
BUG: Sequence number D023 expected but 43 received
Sat Nov 14 15:14:36 2015
World opened on network port 1213.
Welcome BYOND! (5.0 Beta Version 509.1312)

Or this code can be embedded:
<iframe src="http://www.byond.com/play/embed/74.91.112.158:1213" width=640 height=480></iframe>

The BYOND hub reports that port 1213 is reachable.
BUG: Sequence number 225A expected but 42 received
BUG: Unexpected hub certificate (65535)
BUG: Unexpected hub certificate (65535)
BUG: Sequence number F4AB expected but 2 received
Sun Nov 15 17:11:13 2015
Dream Daemon FAILED to open port 1213!
Sun Nov 15 17:12:58 2015
World opened on network port 1213.
Welcome BYOND! (5.0 Beta Version 509.1312)

Or this code can be embedded:
<iframe src="http://www.byond.com/play/embed/74.91.112.158:1213" width=640 height=480></iframe>

The BYOND hub reports that port 1213 is reachable.
BUG: Sequence number EC9D expected but 2 received
What happens if you attach gdb to the process? You should be able to stop it in the middle and find out where it currently is in its stack. If you can get me a stack trace at one of the frozen points, that'd be super helpful.
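For reference, a minimal sketch of grabbing that trace non-interactively. It assumes gdb is installed on the host and that you have permission to attach to the process; the function name dump_backtrace and the default output filename are only illustrative.

```shell
#!/bin/bash
# Sketch: dump a backtrace of every thread in a running (possibly hung) process.
# Usage: dump_backtrace <pid> [output-file]
dump_backtrace() {
    local pid="$1"
    local out="${2:-backtrace-$pid.txt}"
    # -p attaches to the PID; -batch runs the -ex commands and then exits,
    # which detaches from the process again.
    gdb -p "$pid" -batch \
        -ex "info sharedlibrary" \
        -ex "thread apply all bt" > "$out" 2>&1
    # Report where the trace was written.
    echo "$out"
}
```

Something like dump_backtrace "$(pidof DreamDaemon)" should then leave the trace in a text file that can be pasted into the thread. The info sharedlibrary output also records where each shared library is loaded, which helps when turning raw addresses into offsets.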
...does the fact your BYOND savefile directory is in /root mean you're running BYOND as the root user?

Because that sounds like asking for trouble to me.
One, GinjaNinja32, that's irrelevant. Hardening is not a security requirement, just a best practice among enterprise systems. Running something as root isn't gonna make something bad happen as long as a security hole doesn't exist.

What is required of security stops at ensuring there are no security holes. It does not go further.

And if there is a security hole, playing this game of "don't run as root and you'll be fine" is dangerous. If it gets exploited, you've still allowed an unauthenticated attacker to gain access to the system. Regardless of whether you are running the program as root or superlimitedchrootaccounthere, you can no longer trust the system; a full format and reinstall is required to clean the infection. Saying 'don't run as root' lures people into a false sense of security.

Ya, not having it run as root means it can't

So don't go bringing up security stuff in random bug reports, it's off topic and rude.




Moving on! I've gotten this bug on Linux on an earlier version. It went away the moment I stopped using nohup DreamDaemon ... and started using nohup startserver.sh. My thinking is that having a real shell stay involved, something nohup won't do on its own, might be related.

Before, I ran 'nohup DreamDaemon Byond\ OAUTH.dmb 31337 -invisible -webclient -trusted', and now I run 'nohup byondoauth.sh'.

Script for reference:

#!/bin/bash
# Restart DreamDaemon whenever it exits, waiting 60 seconds between runs.
while true; do
    DreamDaemon Byond\ OAUTH.dmb 31337 -invisible -webclient -trusted
    sleep 60
done
Here's a stack trace of a frozen point:

In response to Doohl
Okay, that should be fairly helpful. It'd be nice if you could find out where libbyond.so was loaded in memory so I could get a proper offset, but I might be able to work with this.

Which exact version was that server running? Still 509.1314?
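A rough sketch of one way to get that load address on Linux, assuming the standard /proc/<pid>/maps layout (address range in the first column, file path in the last). The helper names lib_base and lib_offset are made up for illustration, and the optional third argument only exists so the parser can be pointed at a saved copy of the maps file.

```shell
#!/bin/bash
# Print the start address (hex, no 0x prefix) of the first mapping whose
# path contains the given library name.
# Usage: lib_base <pid> <library-name> [maps-file]
lib_base() {
    local pid="$1" lib="$2"
    local maps="${3:-/proc/$pid/maps}"
    awk -v lib="$lib" \
        'index($NF, lib) { split($1, range, "-"); print range[1]; exit }' "$maps"
}

# Subtract the load address from an absolute address taken from a backtrace.
# Usage: lib_offset <address> <base>   (hex, with or without a 0x prefix)
lib_offset() {
    local addr="${1#0x}" base="${2#0x}"
    printf '0x%x\n' $(( 16#$addr - 16#$base ))
}
```

With the server still hung, lib_base "$(pidof DreamDaemon)" libbyond.so gives the base, and lib_offset with an address from the stack trace gives the offset inside the library.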
Nay, 1312.
Oh, good to know. Thanks. I'll check against 1312 and see what I can find.
I was able to narrow this down quite a lot, so that I can see where the infinite loop is happening; what I don't understand is why. Something is putting this in a weird state.

What I need now is to know what string was sent to DMTextPrinter::PrintText() that precipitated this. In that way I can reproduce the problem (I'm certain this will always happen when fed that same text), and get it fixed.

The next step is to get this to hang again, attach gdb, and go back to the frame where DMTextPrinter::PrintText() is. You should be able to examine the arguments to the function, one of which is a this pointer, and one of which is a string. (I don't know the exact gdb commands you'd need to do this, but they're pretty Googlable.) Once you can get the string, I'll need you to send me the exact string--preferably in a <dm> block in a private message--and at that point I can get this properly resolved.

Sorry to put the debugging on you here, but because this stems from a string generated in your game under play conditions, it's the only way I can think of that I can solve this.
I wasn't able to extract the function's parameters with the proper gdb commands. The only relevant information it displays is that the function accepts a C-string type.
When I run info args with the DMTextPrinter::PrintText() frame selected, it just says "No symbol table info available".
In the stack frame, the argument would be at ESP plus or minus a small multiple of 4 (if not at ESP itself). You won't need symbols for that. Try examining ESP, ESP+4, etc. and also negative offsets. A string should jump out.
In response to Lummox JR
I've tried this but I'm either doing something wrong, or it's just not working.
You've tried things like "print ESP" in that stack frame? And what about "print (char*)ESP"?
Here's some info on how to print out an address as a string: http://stackoverflow.com/questions/12758217/printing-string-pointed-to-from-register-in-gdb

I would suggest the x/s $esp option, and x/s $esp+4, etc.
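One thing that may trip this up: the stack slots hold pointers, so x/s $esp reads the stack bytes themselves as text, whereas a char* argument needs one dereference first. Below is a small sketch that prints gdb commands to paste into an interactive session. The function name, the frame number, and the offset range are all assumptions for illustration, and any slot that does not hold a valid pointer will just report a memory access error.

```shell
#!/bin/bash
# Print gdb commands that select a frame and then treat each 4-byte stack
# slot from ESP-max to ESP+max as a char*, showing the string it points to.
# Usage: gdb_stack_string_cmds <frame-number> [max-offset-in-bytes]
gdb_stack_string_cmds() {
    local frame="$1" max="${2:-16}" off
    echo "frame $frame"
    for ((off = -max; off <= max; off += 4)); do
        # \$esp is escaped so gdb, not the shell, expands the register name.
        echo "x/s *(char **)(\$esp + $off)"
    done
}
```

For example, gdb_stack_string_cmds 3 8 prints a frame 3 line followed by one x/s line per offset from -8 to +8; substitute whichever frame number the backtrace shows for DMTextPrinter::PrintText().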

It might help if we can debug this together live at some point.
Could go the easymode route and just have a private build for exclusive debugging of that method
In response to Somepotato
Changing up the build specs on Linux is not a trivial thing for me to deal with.
You're running out of time for me to get this fix into 509. If you can get the string that's doing this, I know I can fix it.
We're just waiting for DD to hang up again.
Here's a few strings I extracted from $esp (+0, +-4, +-8)
