ID:136491
 
About once a day, I have to restart the Dragon Warrior Online server to clean up some problems that occur during the game. Sometimes when I do this, some savefiles get corrupted. SilkWizard said that he used a slightly modified version of Deadron's library for handling savefiles. Here are the basic steps I take when trying to restart the server (remember, I am not the game's coder, just the guy who runs the server):

  1. Run 'ps -aux | grep DreamDaemon' and get the process ID.
  2. Do a 'kill -15 ' and wait for the server to halt completely.
  3. If that does not work (I haven't had to do this in a really long time), do a 'kill -9 ' to forcefully kill the server. (Yes, that would definitely kill some savefiles if they were being written to at the time, but this step hasn't been necessary in several months now.)
  4. Finally, restart the server as normal with 'DreamDaemon dwonline.dmb 2047'.

I generally wait about 5-10 seconds between steps 3 and 4.

When I run step 2, everyone is automatically logged out of the game and everything is saved.
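
For reference, the manual routine above could be scripted roughly as follows. This is only a sketch: the helper names (`find_pids`, `stop_gracefully`, `restart_server`), the 30-second timeout, and the assumption that only one DreamDaemon process runs on the machine are illustrative, not part of the actual setup.

```shell
#!/bin/sh
# Sketch of the restart routine; helper names and timings are assumptions.

# Print the PID of every process whose command name is exactly "$1".
find_pids() {
    ps -e -o pid= -o comm= | awk -v name="$1" '$2 == name { print $1 }'
}

# Send SIGTERM to PID "$1" and wait up to "$2" seconds (default 30) for it
# to exit. Only if it is still alive after that, fall back to SIGKILL.
stop_gracefully() {
    target=$1
    remaining=${2:-30}
    kill -TERM "$target" 2>/dev/null || return 0    # no such process
    while [ "$remaining" -gt 0 ]; do
        sleep 1
        kill -0 "$target" 2>/dev/null || return 0   # exited on its own
        remaining=$((remaining - 1))
    done
    echo "PID $target ignored SIGTERM, sending SIGKILL" >&2
    kill -KILL "$target" 2>/dev/null
    return 0
}

# Stop every running DreamDaemon, pause briefly, then start the world again.
# Assumes every DreamDaemon process on this machine belongs to this game.
restart_server() {
    dmb=$1
    port=$2
    for old_pid in $(find_pids DreamDaemon); do
        stop_gracefully "$old_pid" 30
    done
    sleep 5
    DreamDaemon "$dmb" "$port" &
}
```

With these definitions loaded, `restart_server dwonline.dmb 2047` would carry out steps 1 through 4. Because the script waits for the process to disappear before restarting, the new server never starts while the old one might still be writing savefiles.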

If a savefile gets corrupted, I generally get a [ckeyname]_lost.txt file that contains what looks like a failed attempt to recover the data. Here are the contents of the text file from our latest corrupted savefile:


//Orphan savefile buffer '' found on Fri Sep 20 22:37:09 2002
= null
//End of exported data.
//Orphan savefile buffer '' found on Fri Sep 20 22:37:09 2002
= null
//End of exported data.
//Orphan savefile buffer '' found on Fri Sep 20 22:37:09 2002
= null
//End of exported data.
//Orphan savefile buffer '' found on Fri Sep 20 22:37:09 2002
= null
//End of exported data.



Any ideas on what might be causing this? I am using BYOND for Linux version 330 on a RedHat 7.2 system.

Here's something to try while I'm looking into the rest of the problem: you can tell DreamDaemon to restart (and reload the .dmb) without killing it. Just send it a SIGUSR1 (kill -10). Maybe that doesn't fit in with the type of maintenance that you need to do, though, so just try it if it makes sense. It should behave the same as SIGTERM, except that users are automatically reconnected when the game reboots.
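
A minimal sketch of sending that signal from a script follows. The function name `reboot_world` is made up for illustration. It uses the signal name rather than the number because SIGUSR1 is 10 on x86 Linux but has a different number on some other platforms.

```shell
#!/bin/sh
# reboot_world is a hypothetical helper name, not a BYOND command.
# Ask the DreamDaemon process with PID "$1" to reboot its world.
reboot_world() {
    kill -USR1 "$1"
}
```

It would be called with the process ID found through ps, for example `reboot_world 1234`, where 1234 is a placeholder PID.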

I will also compile the latest version for Linux today.

--Dan
In response to Dan
Well....ain't that neat.... Thanks Dan. I guess I need to read the man page on 'kill' a little closer.
In response to CableMonkey

CableMonkey wrote:
Well....ain't that neat.... Thanks Dan. I guess I need to read the man page on 'kill' a little closer.

The behavior of SIGUSR1 is totally up to the application in question. It's not like other signals with a standard meaning. I always thought it might be useful to be able to send a signal to DreamDaemon telling it to reboot, but as far as I know, nobody has ever made use of it yet!

--Dan
In response to Dan
Perhaps it would be a good idea to let the game itself handle signals? I'd like that functionality, as I am currently putting together a remote Linux build environment for DM, using CVS and other custom build scripts. Basically, I'd like to remotely store, compile, and update my game source automatically, and having access to that kind of functionality would be a godsend...just a suggestion.
In response to Devhead
as Dan mentions, it already understands the -10 signal (SIGUSR1, not SIGKILL), which reboots the running process.

you could implement an 'auto compile, and reboot world' in a shell script or bash script (or whatever shell your host provides you).

basically your script would do something like this:
- compile the program
- grab the process id of the current running program of the same name
- do the reboot of that process id (as Dan mentions above)

so if your script is called 'do_it_again_sam', and it takes one argument, you could simply type:

do_it_again_sam mygame

which would then recompile and reboot the running world with the newly compiled code. you could also put in some checks so that if the game is *not* running, it gets started up instead.
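
a rough sketch of what such a script might look like is below. it rests on several assumptions: that DreamMaker and DreamDaemon are on the PATH, that DreamMaker exits with a nonzero status when compilation fails, that the world builds from mygame.dme to mygame.dmb, and that 2047 is a suitable default port. the helper name `pick_world_pid` is made up for illustration.

```shell
#!/bin/sh
# do_it_again_sam: recompile a world, then reboot it or start it.
# All names and defaults here are illustrative assumptions.

# Read "pid command-line" lines on stdin and print the PID of the first
# DreamDaemon process whose command line mentions the .dmb given as "$1".
pick_world_pid() {
    awk -v dmb="$1" '$2 ~ /(^|\/)DreamDaemon$/ && index($0, dmb) { print $1; exit }'
}

do_it_again_sam() {
    name=$1
    port=${2:-2047}    # hypothetical default port

    # Step 1: compile. Assumes DreamMaker returns nonzero on failure.
    if ! DreamMaker "$name.dme"; then
        echo "compile failed, leaving the running world alone" >&2
        return 1
    fi

    # Step 2: look for a running copy of this world.
    world_pid=$(ps -e -o pid= -o args= | pick_world_pid "$name.dmb")

    # Step 3: reboot it with SIGUSR1, or start it if it is not running.
    if [ -n "$world_pid" ]; then
        kill -USR1 "$world_pid"
    else
        DreamDaemon "$name.dmb" "$port" &
    fi
}
```

matching on the .dmb name as well as on DreamDaemon matters on a shared host, where several worlds may be running under the same program name.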

In response to digitalmouse
digitalmouse wrote:
as Dan mentions, it already understands the -10 signal (SIGUSR1, not SIGKILL), which reboots the running process.

Augh. This little bit of necromancy highlights what I've been thinking was a flaw in the system. And it basically is, since there's no way for Dream Daemon to ignore the reboot message.

When hosting Incursion on polaris I sometimes noticed that it would reboot without warning, which sucked because for Incursion to run in Dream Daemon it must be loaded from a bootstrap in DS or DMCGI; upon reboot it freezes because it knows it's not supposed to run if it doesn't know who the host is.

These problems were caused by something (or someone) overzealously rebooting all server processes during maintenance, a mistake which I believe has still not been corrected. Because of the way Incursion must load, which is also a necessity only due to the design flaw that Dream Daemon does not know the host, it cannot be rebooted. (Besides, a reboot will fry any game in process anyway.)

Lummox JR
In response to Lummox JR
I've been able to run MLAAS in Polaris for 4-7 days at a time without any reboots. I'm not sure it's the server doing it.
In response to Lummox JR
Lummox JR writes:
(Besides, a reboot will fry any game in process anyway.)


ah, so maybe something that traps the -10 reboot hook, allows you to save the state of the game (if applicable), *then* reboot the world, is needed?
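
to show the general idea only: the sketch below is an ordinary shell process, not DreamDaemon, and nothing in this thread says DreamDaemon exposes such a hook. `save_state`, `restart_world`, and the state file path are hypothetical placeholders.

```shell
#!/bin/sh
# Generic "save first, then reboot" signal handler in a shell process.
# This only illustrates the pattern; it is not a DreamDaemon feature.
STATE_FILE=${STATE_FILE:-/tmp/world_state.txt}   # hypothetical path

# Placeholder: write whatever state needs to survive the reboot.
save_state() {
    echo "saved at $(date)" > "$STATE_FILE"
}

# Placeholder: stands in for the actual reboot step.
restart_world() {
    echo "rebooting"
}

# Handler: the save always happens before the reboot.
on_usr1() {
    save_state
    restart_world
}

# Run the handler when this shell receives SIGUSR1.
trap on_usr1 USR1
```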
In response to digitalmouse
Yes, I realize this. However, whenever the RSC file needs to be rebuilt, the reboot will zap any custom icons that were stored in the cache. If the server is just killed and then restarted, on the other hand, the cache is fine. I like the fact that the reboot flushes savefiles and things of that nature, but I'd like a 'clean shutdown' functionality, which I could implement myself if given the option of handling the signal on my own.