It'd be great if this were resolved somehow. I can't actually keep hosting the game from my own computer, haha.
In response to Kaiochao
Kaiochao wrote:
It'd be great if this were resolved somehow. I can't actually keep hosting the game from my own computer, haha.

Agreed. I have a lot of customers complaining about the issue and putting up with it. As it's really out of my hands, I can only suggest a few things, such as downgrading BYOND on the server.
I don't really see what would be different in Linux vs. in Windows; it seems like the crash/freeze would be likely to happen equally in both. At the moment I'm still stumped.
I'm not sure why, but Eternia's server has gone from crashing daily to not crashing at all for a week or so.
In response to Writing A New One
Mine hasn't frozen since June 25... It was freezing pretty much daily before then. I haven't changed any configuration/version. Very odd.
Yeah, my project is the same... I believe the version has stayed the same as well (ATHK could confirm this; I'm not entirely sure).

But that seems illogical and kind of... impossible? If nothing has changed with the versioning, then crashing should persist.
I started the server up on Linux ~4 hours ago and it froze some time between then and now. It's definitely still happening.

I guess I'm gonna have to keep hosting it on Windows until something happens.
In response to Kaiochao
If you attach gdb, what do you get? I'm curious if you'll have anything different that might point to a possible cause. Also, do you get those refcount 5:xxx errors before this occurs, or do those not happen in your particular project? That could be a clue, though I'm still not sure of a possible cause.

If you do get the refcount 5:xxx errors, I'd like to know what skin procs you're calling, and where. In particular you should avoid calling skin procs in client/New(); the skin is not initialized at that point.
In response to Lummox JR
I dunno what a gdb is. I haven't gotten any of the usual old refcount errors, but these look new:
BUG: Bad ref (2:80252) in IncRefCount
BUG: Bad ref (2:80252) in IncRefCount
BUG: Bad ref (2:80252) in IncRefCount(DM Mouse Control.dm:83)
BUG: Bad ref (2:80252) in IncRefCount(DM Mouse Control.dm:83)
BUG: Bad ref (2:80252) in IncRefCount(DM Mouse Control.dm:84)
BUG: Bad ref (2:80252) in DecRefCount(DM Mouse Control.dm:84)
BUG: Bad ref (2:80252) in DecRefCount(DM Mouse Control.dm:84)
BUG: Bad ref (2:80252) in DecRefCount
BUG: Bad ref (2:80252) in IncRefCount
BUG: Bad ref (2:80252) in DecRefCount(DM Mouse Control.dm:87)
BUG: Bad ref (2:80252) in IncRefCount(DM Mouse Control.dm:88)
BUG: Bad ref (2:80252) in DecRefCount(DM Mouse Control.dm:88)
BUG: Bad ref (2:80252) in DecRefCount
BUG: Bad ref (2:80252) in DecRefCount

[client]
    MouseEntered(over_object)
        mouse.over = over_object // 83
        ..()

    MouseExited()
        mouse.over = null // 87
        ..()

I spot about 9 chunks of them throughout today's log. There are no errors at the bottom of the log, though, where the server would've been frozen at 100% CPU.
You may need to install gdb with whichever package manager you're using.

Then just get the pid of your frozen DreamDaemon instance and:
gdb
attach PID
backtrace
detach
quit
In response to Murrawhip
Murrawhip wrote:
Mine hasn't frozen since June 25... It was freezing pretty much daily before then. I haven't changed any configuration/version. Very odd.

^Scratch that. Froze just now.
Have any of you upgraded to the new version of BYOND (499.1197)?

Does this fix the issue?
Alrighty, I'm hosting Hazordhu II on 499.1193, on CentOS 6.4 amd64.

Just caught it in a tight loop, as these guys describe. With it hooked up to gdb, the backtrace is:

#0  __strcmp_sse4_2 () at ../sysdeps/i386/i686/multiarch/strcmp-sse4.S:202
#1 0x0046951c in ProtoStrCompSigned(ProtoStr*, ProtoStr*) ()
from /srv/byond/499.1193/lib/libbyond.so
#2 0x00378c28 in ?? () from /srv/byond/499.1193/lib/libbyond.so
#3 0x003dd951 in ?? () from /srv/byond/499.1193/lib/libbyond.so
#4 0x003f08dd in ExecProc_Attach(Value, unsigned char, unsigned short, unsigned char, Value, Value*, unsigned long, void (*)(Value, void*), void*) ()
from /srv/byond/499.1193/lib/libbyond.so
#5 0x003f09e2 in CallByNameWithCallback_Attach(Value, unsigned char, unsigned long, Value, Value*, unsigned long, void (*)(Value, void*), void*) ()
from /srv/byond/499.1193/lib/libbyond.so
#6 0x003f1d4d in ?? () from /srv/byond/499.1193/lib/libbyond.so
#7 0x003d48a5 in ?? () from /srv/byond/499.1193/lib/libbyond.so
#8 0x003f08dd in ExecProc_Attach(Value, unsigned char, unsigned short, unsigned char, Value, Value*, unsigned long, void (*)(Value, void*), void*) ()
from /srv/byond/499.1193/lib/libbyond.so
#9 0x003f09e2 in CallByNameWithCallback_Attach(Value, unsigned char, unsigned long, Value, Value*, unsigned long, void (*)(Value, void*), void*) ()
from /srv/byond/499.1193/lib/libbyond.so
#10 0x003f1d4d in ?? () from /srv/byond/499.1193/lib/libbyond.so
#11 0x003dc5a1 in ?? () from /srv/byond/499.1193/lib/libbyond.so
#12 0x003f08dd in ExecProc_Attach(Value, unsigned char, unsigned short, unsigned char, Value, Value*, unsigned long, void (*)(Value, void*), void*) ()


The instruction read-out for the current frame is:

207             jbe     L(more16byteseq)
208 #endif
209
210 add $16, %edx
211 jle L(loop)
212 L(crosspage):
213 movzbl (%edi,%edx), %eax
214 movzbl (%esi,%edx), %ebx
215 subl %ebx, %eax
216 jne L(ret)


(Obviously, I have no debug symbols for BYOND)

I'll leave it hooked up and hanging for now, in case you want to just SSH in and have a look, Lummox/Tom. Similarly, I can quite happily install a build with debug symbols and basically walk the entire thing through with you/myself and diagnose what's up.
Stephen001 changed status to 'Verified'
I would like to add that I am also experiencing this issue (a freeze with an apparent infinite loop inside 0x0046951c in ProtoStrCompSigned(ProtoStr*, ProtoStr*) ()).

Here's the data gathered from SIGUSR2:
Caught SIGUSR2, printing diagnostics:

Server port: 2506
Server visibility: invisible
Server reachable by players: yes

Fri Jul 5 03:02:12 2013
proc name: Stat (/mob/Stat)
source file: mob.dm,701
usr: Someone (/mob/living/silicon/robot)
src: Someone (/mob/living/silicon/robot)
call stack:
Someone (/mob/living/silicon/robot): Stat()
Someone (/mob/living/silicon/robot): Stat()

DreamDaemon [0x8048000, 0x0], [0x8048000, 0x804a8ce]
libc.so.6 [0x813000, 0x0], 0x117ee9
[0x121000, 0x121600], [0x121000, 0x121600]
libc.so.6 [0x813000, 0x0], 0x117ee9
libbyond.so 0x33f5d0, 0x33f5ec
libbyond.so [0x122000, 0x0], 0x24db67
libbyond.so [0x122000, 0x0], 0x2b25cd
libbyond.so 0x2c5360, 0x2c546b
libbyond.so [0x122000, 0x0], 0x2ce972
libbyond.so [0x122000, 0x0], 0x2b1132
libbyond.so 0x2c5360, 0x2c546b
libbyond.so 0x2c54e0, 0x2c5593
libbyond.so 0x2c7730, 0x2c77fc
libbyond.so [0x122000, 0x0], 0x2cc49e
libbyond.so 0x2c54e0, 0x2c568d
libbyond.so 0x2c7730, 0x2c77fc
libbyond.so [0x122000, 0x0], 0x272204
libbyond.so 0x284020, 0x284506
libbyond.so [0x122000, 0x0], 0x28b65a
libbyond.so 0x35a990, 0x35ab07
libbyond.so 0x32ba80, 0x32bcea
DreamDaemon [0x8048000, 0x0], [0x8048000, 0x804a3ee]
libc.so.6 0x19000, 0x190f3 (__libc_start_main)




server mem usage:
Prototypes:
obj: 849164 (6216)
mob: 851372 (138)
proc: 8102304 (14260)
str: 4764794 (86181)
appearance: 7262282 (14866)
id array: 8141556 (27491)
map: 1268608 (240,240,6)
objects:
mobs: 218976 (147)
objs: 12617164 (49362)
datums: 5364016 (52448)
lists: 16033348 (286412)



The proc running when the freeze happens is somewhat random. This time it was mob.dm:701, which is stat(null,"CPU:\t[world.cpu]") in our case. But they all have one thing in common: it always happens on a line with a string operation.
Oh, also, we are currently running:

4.0 Public Version 499.1197
Can I get a debug build of 499.1197? I went through some debugging with Lummox recently, and we found that during a string append, a counter in a loop suddenly got a huge value, which was causing the lock-up. The problem was that walking the execution with gdb without debug symbols (especially with the StringEditor class members on the heap) proved rather error-prone, and eventually I broke the stack frame.
Incidentally, I upgraded to 499.1197, compiled Chatters on that version, hosted on it, and managed to get hangs pretty quickly (within 15 minutes or so, with 10 clients?).

#0  0x00478346 in StringEditor::Insert(char const*, int) () from /srv/byond/499.1197/lib/libbyond.so
#1 0x0043be24 in TelnetTextPrinter::AlignText() () from /srv/byond/499.1197/lib/libbyond.so
#2 0x0043c95c in TelnetTextPrinter::Puts(char const*) () from /srv/byond/499.1197/lib/libbyond.so
#3 0x00480f9b in HtmlParser::FlushWord() () from /srv/byond/499.1197/lib/libbyond.so
#4 0x0048248a in HtmlParser::Parse(char*, char const*) () from /srv/byond/499.1197/lib/libbyond.so
#5 0x004898c6 in DMTextPrinter::PrintText(char const*) () from /srv/byond/499.1197/lib/libbyond.so
#6 0x004049ec in TelnetLink::PrintText(char const*) () from /srv/byond/499.1197/lib/libbyond.so
#7 0x00405d55 in TelnetLink::WriteMsg(NetMsg*) () from /srv/byond/499.1197/lib/libbyond.so
#8 0x003b0a82 in ?? () from /srv/byond/499.1197/lib/libbyond.so
#9 0x003b1712 in SendMobMsg(unsigned long) () from /srv/byond/499.1197/lib/libbyond.so
#10 0x0037077f in BroadcastMsg(Value) () from /srv/byond/499.1197/lib/libbyond.so
#11 0x003d7a92 in ?? () from /srv/byond/499.1197/lib/libbyond.so
#12 0x003f9133 in ?? () from /srv/byond/499.1197/lib/libbyond.so
#13 0x003dfb72 in ?? () from /srv/byond/499.1197/lib/libbyond.so
#14 0x003f646b in ExecProc_Attach(Value, unsigned char, unsigned long, unsigned char, Value, Value*, unsigned long, void (*)(Value, void*), void*)
() from /srv/byond/499.1197/lib/libbyond.so
#15 0x003f6593 in CallByNameWithCallback_Attach(Value, unsigned char, unsigned long, Value, Value*, unsigned long, void (*)(Value, void*), void*)
() from /srv/byond/499.1197/lib/libbyond.so
#16 0x003f7a52 in ?? () from /srv/byond/499.1197/lib/libbyond.so
#17 0x003e222c in ?? () from /srv/byond/499.1197/lib/libbyond.so
#18 0x003f646b in ExecProc_Attach(Value, unsigned char, unsigned long, unsigned char, Value, Value*, unsigned long, void (*)(Value, void*), void*)
() from /srv/byond/499.1197/lib/libbyond.so
#19 0x003f6593 in CallByNameWithCallback_Attach(Value, unsigned char, unsigned long, Value, Value*, unsigned long, void (*)(Value, void*), void*)


Full source code here: https://github.com/Stephen001/Chatters

The one improvement we saw that could be made in this specific case was to use memmove() in Insert() instead of the loop. Obviously, if the buffer in the StringEditor was knackered (I broke the stack frame trying to look at it, impressively), then that wouldn't fix the bug.
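To illustrate the memmove() idea, here is a minimal sketch in C of inserting into a heap string buffer with a single bounds-checked memmove() rather than a byte-by-byte loop. StrBuf and strbuf_insert are hypothetical names made up for this example, and the signature differs from StringEditor::Insert(char const*, int); BYOND's real internals aren't visible from the backtrace, so this is not its implementation. The point is only that the shift length is computed once and validated up front, so a bad offset is rejected instead of driving a loop counter.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical editable string buffer (an assumption, not BYOND's StringEditor). */
typedef struct {
    char  *data; /* NUL-terminated contents, or NULL when empty */
    size_t len;  /* length, excluding the terminator */
    size_t cap;  /* allocated bytes, including room for the terminator */
} StrBuf;

/* Insert n bytes of src at offset pos.
 * Returns 0 on success, -1 on a bad offset, size overflow, or allocation failure. */
static int strbuf_insert(StrBuf *b, size_t pos, const char *src, size_t n)
{
    if (pos > b->len) return -1;               /* offset outside the string */
    if (n > SIZE_MAX - b->len - 1) return -1;  /* len + n + 1 would overflow */

    size_t need = b->len + n + 1;
    if (need > b->cap) {
        /* Grow with some headroom, without letting the doubling overflow. */
        size_t cap = (need < SIZE_MAX / 2) ? need * 2 : need;
        char *p = (char *)realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = cap;
    }

    /* Shift the tail right by n in one call; memmove is safe for overlapping ranges. */
    memmove(b->data + pos + n, b->data + pos, b->len - pos);
    memcpy(b->data + pos, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

Starting from a zero-initialized StrBuf, strbuf_insert(&b, 0, "hello", 5) followed by strbuf_insert(&b, 5, " world", 6) builds the string up, and an offset past the end returns -1. As noted, none of this helps if the buffer or its length fields are already corrupted before the call.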
It's probably worth noting that due to my side-by-side installs and scripts, I can basically test any combination of compiler and runtime you want me to.
You should try stepping through calls/returns and see where it's actually hanging.

e: Oh wait, you already did that. Not sure how you're managing to break the frame; it shouldn't be too bad to step.
It's probably a minor buffer overflow breaking a value on the stack, or something more trivial.