ID:2004620
 
BYOND Version:509
Operating System:Server:Windows Server 2012 rc2 AND Windows 7 ultimate 64bit
Web Browser:Chrome 47.0.2526.80
Applies to:Dream Daemon
Status: Open

Descriptive Problem Summary:
There are about 300 handles open right now on /tg/station's primary server to threads or processes that no longer exist. On Windows, this can keep the OS from properly cleaning things up. On Linux, it can lead to "zombie processes" that never get cleaned up at all, including any handles those processes left open.

We use shell() a lot, so I can only guess that might be why.

Testing shows the same thing happens on Windows with DS launched via the join button.

It also randomly (but rarely) leaves a handle open to the folder the dmb is in, even between rounds. It then opens a new handle, and the old folder can't be deleted until DD is closed. (This might be related to the above.)

I suspect shell() is indeed the problem. Historically shell() issues have been impossible to reproduce reliably.
This one seems reliable, at least on my system: it's not closing some handles to the processes or threads.

It seems odd, though, because I'm pretty sure shell() is mostly just a call to system(), so I don't know what handles would be involved.

Edit:
Ohh, unless it runs asynchronously by opening a process. shell() does sleep rather than block, so it would have to.

How are you launching the process in shell()?
The process is launched by execl() in Linux. So this would make sense if shell() doesn't notice the process has completed, which is actually a known issue I've never been able to reproduce either. This is our code for checking on whether the child process has completed:

int status;
if(pid != waitpid(pid,&status,WNOHANG)) return false;

If the call succeeds, the status code is interpreted here:

if(WIFEXITED(status)) {
    if(ret_code) {
        *ret_code = WEXITSTATUS(status);
        if(*ret_code == 127) *ret_code = -1; //special null return code
    }
    return true;
}
if(WIFSIGNALED(status)) {
    if(ret_code) *ret_code = -1;
    return true;
}
return false;

A true return value indicates the command is done. Maybe you'll see some nuance in these calls that I'm missing.
Well, I noticed this on Windows. I don't actually know if it's an issue on Linux. I assumed it might be, but it makes sense that the two platforms would handle it in separate ways.
I forgot to mention Linux also has a handler for SIGCHLD.

while((pid = waitpid(-1,&status,WNOHANG)) > 0) {
    if(GetExitCode(status,&exit_code)) {
        signal_srv->CommandFinished(pid,exit_code);
    }
}

GetExitCode() is that 2nd block of code in the previous post, where it interprets the status value.
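
One interaction between these two paths that may be worth checking is sketched below. It is a guess about how they could combine, not a diagnosis of the actual code. Once a waitpid(-1, ...) loop has reaped a child, a later waitpid(pid, ...) on that same pid returns -1 with errno set to ECHILD, so a poll written as "pid != waitpid(...)" would keep reporting "not finished". The names spawn_shell, command_finished, and reaped_elsewhere_demo are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "sh -c cmd" in a child and return its pid (-1 on error). */
pid_t spawn_shell(const char *cmd) {
    pid_t pid = fork();
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* only reached if execl() failed */
    }
    return pid;
}

/* Non-blocking poll in the same shape as the check quoted above, except
   that ECHILD (the child was already reaped elsewhere) counts as finished. */
bool command_finished(pid_t pid, int *ret_code) {
    int status;
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0) return false;              /* child is still running */
    if (r == -1) {
        if (errno != ECHILD) return false; /* unexpected error: try again later */
        if (ret_code) *ret_code = -1;      /* exit status was consumed elsewhere */
        return true;
    }
    if (WIFEXITED(status)) {
        if (ret_code) {
            *ret_code = WEXITSTATUS(status);
            if (*ret_code == 127) *ret_code = -1;
        }
        return true;
    }
    if (WIFSIGNALED(status)) {
        if (ret_code) *ret_code = -1;
        return true;
    }
    return false;                          /* stopped or continued, not finished */
}

/* Returns 1 if polling by pid fails with ECHILD after a waitpid(-1) reap.
   The blocking waitpid(-1) here stands in for the SIGCHLD handler's loop. */
int reaped_elsewhere_demo(void) {
    int status;
    pid_t r;
    pid_t pid = spawn_shell("exit 3");
    if (pid < 0) return -1;
    do {
        r = waitpid(-1, &status, 0);
    } while (r != pid && (r > 0 || errno == EINTR));
    errno = 0;
    r = waitpid(pid, &status, WNOHANG);
    return (r == -1 && errno == ECHILD) ? 1 : 0;
}
```

If CommandFinished() already records the result for the polling side, the ECHILD branch is just a safety net. If it does not, that branch is where a command could appear to run forever.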

This is the Windows version of the code that checks on a finished command:
HANDLE p = OpenProcess(PROCESS_QUERY_INFORMATION,0,pid);
if(!p) {
    *ret_code = -1;
    return true;
}
DWORD code;
GetExitCodeProcess(p,&code);
CloseHandle(p);
if(code==STILL_ACTIVE) {
    return false;
}
else {
    *ret_code = (s4c)code;
    return true;
}
Well, that Windows code is failproof, hmm...

What about when you launch the process? Does it open any handles to get the pid or other info that might be staying open?
In response to MrStonedOne
That may be the problem. Looks like in the call to CreateProcess(), the process handle that's returned is used, but the thread handle is not used and is not closed. I also see that process termination doesn't close the process handle, which is probably a bug.
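
A minimal sketch of the handle lifetime described above, assuming the launcher keeps one record per running command. The Command struct and the command_start / command_finished names are hypothetical, and this is not the actual Dream Daemon code. The unused thread handle is closed right after CreateProcess(), and the process handle is closed once, when the command is seen to have finished.

```c
#ifdef _WIN32
#include <windows.h>
#include <stdbool.h>

/* Hypothetical record for one running command. */
typedef struct {
    HANDLE process; /* kept open until the command has finished */
} Command;

/* Launch cmdline. CreateProcessA may modify the buffer, so it must be writable. */
bool command_start(Command *cmd, char *cmdline) {
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    ZeroMemory(&si, sizeof si);
    si.cb = sizeof si;
    ZeroMemory(&pi, sizeof pi);
    if (!CreateProcessA(NULL, cmdline, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return false;
    CloseHandle(pi.hThread); /* never used, so release it immediately */
    cmd->process = pi.hProcess;
    return true;
}

/* Non-blocking poll. Returns true once the command is done and closes
   the process handle at that point, so nothing is left open afterwards. */
bool command_finished(Command *cmd, int *ret_code) {
    DWORD code = (DWORD)-1;
    DWORD w = WaitForSingleObject(cmd->process, 0);
    if (w == WAIT_TIMEOUT) return false; /* still running */
    if (w == WAIT_OBJECT_0 && !GetExitCodeProcess(cmd->process, &code))
        code = (DWORD)-1;                /* could not read the exit code */
    CloseHandle(cmd->process);
    cmd->process = NULL;
    if (ret_code) *ret_code = (int)code;
    return true;
}
#endif
```

Waiting on the stored handle instead of comparing the exit code against STILL_ACTIVE also avoids misreading a process that really exits with code 259, and it removes the need to reopen the process by pid.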
bump, still an issue.