ID:1591825
 
BYOND Version:506
Operating System:Linux
Web Browser:Chrome 35.0.1916.114
Applies to:Dream Maker
Status: Open

Issue hasn't been assigned a status value.
Descriptive Problem Summary: CPU usage increases over time, throttling at over 200% within 12 hours. 50 people online, after the server had been left over night the CPU will shift between 140% - 250%: http://puu.sh/9gagy/b6c9ff6e80.png

Fresh off a reboot, 10 minutes in, we are at 20-30%: http://puu.sh/9gaMT/042833c34e.png

Numbered Steps to Reproduce Problem:

Host Eternia and keep track of CPU usage - you'll notice a gradual increase until the game is almost unplayable, typically within 12 hours but often sooner.



Does the problem occur:
Every time? Or how often?
All of the time.

In other games?
Unsure.

In other user accounts?
Yes.

On other computers?
Yes.


When does the problem NOT occur?
Never.

Did the problem NOT occur in any earlier versions? If so, what was the last version that worked? (Visit http://www.byond.com/download/build to download old versions for testing.)
Cannot recall a version that didn't have this problem. We've always rebooted daily due to CPU increase.
And...are you sure this is a BYOND bug vs a programming issue?
Yes, the profiler doesn't seem to indicate this kind of increase in CPU usage.

Server has been up for 20 minutes. 20-30% CPU. This is a log of 20 minutes usage: http://pastebin.com/ehNMsnVJ

Will edit this post tomorrow with a pastebin of the profiler after 20 minutes.
From what I can tell the major performance users all seem to have one thing in common, which is accessing and writing log files in one way or another.

I'm guessing that appending to a larger file is going to hit the CPU pretty hard, especially if it's happening at the frequency that Eternia has to log things.

We're working on switching the bulk of the logging to SQLite systems, which I'm hoping has a nice streamlined way of writing to larger files than just directly opening and appending to a text file. It's going to take some time to have it to a point where we can make the switch though -- logging is the most important feature in Eternia and I can't speed through this change.

We'll see how well that helps things out, as most of what we see in the profiler using any abnormal amounts of usage end up being very basic procs that you wouldn't expect much usage out of, Lummox and Tom have even helped us optimize things to a pretty high level, not to mention some of the resource-saving techniques we've picked up from people like Ter13.

Fingers are crossed that my hunch that file I/O being the culprit for the majority of things is correct. Also fingers crossed that switching to SQLite is worth the work.
Why would the logging increase CPU usage over time though, and be okay again after a reboot? While it's certainly an optimization, I think there's a larger, unrelated problem in play here.
Since you're switching to SQLite to try and get past the issue, there isn't much point in throwing in my piece of advice, but either way..

How do you handle log files? is it one huge file with everything in it, or is it one new file each day?

You could always have tried (if you didn't do so already) to alleviate the load by creating a new logfile by the hour, and just store it in a way that it is easily overlooked.
Fingers are crossed that my hunch that file I/O being the culprit for the majority of things is correct. Also fingers crossed that switching to SQLite is worth the work.

I can confirm that there's something stemming from file I/O. I've been working on some tools that work with DMM files recently.

I actually use Eternia's map as a worst-case-scenario test for my DMM tools, because you guys have a lot going on in your map.

I've done a lot of backtracing of string reference ids, and been keeping an eye on memory usage. I even wrote my setup in such a way that I wasn't abandoning huge numbers of string references, but rather was abandoning only one or two at a time with a single user, giving the fastest-possible outcome for working with the string table.

Memory usage is stable, but I've noticed that generating my output DMMs after batch processing results in each successive Z layer of the map taking exponentially longer than the last layer, despite the fact that there's no reason that should happen looking at my code (I'm very, very good at writing efficient code).

I think there might be an underlying issue going on with file I/O as well, but until I can test corner cases that give me a clearer insight to what's going on, I can't say that with any certainty.
I'm assuming Eternia's source code is off limits, yes?
What Nadrew has previously told me about Eternia's logging is that it keeps track of a list of items over time. The string table may simply be growing very huge or there could be other problems.

Most interestingly, a number of these procs have very high real time values compared to self CPU time and self+child CPU time. That makes file I/O a strong possibility. The less of this that needs to be done on demand, the better.
In response to Laser50
Laser50 wrote:
I'm assuming Eternia's source code is off limits, yes?

Yes.
And what results come from using things like iotop? (linux) Or any other things in general that show you the harddrive usage?
In response to Lummox JR
Lummox JR wrote:
What Nadrew has previously told me about Eternia's logging is that it keeps track of a list of items over time. The string table may simply be growing very huge or there could be other problems.


I actually got rid of the lists in favor of straight file reading and writing, it helped memory usage a bit, but not CPU usage. Our memory usage is fairly consistent and doesn't really rise too much over time.
In response to Nadrew
Nadrew wrote:
Lummox JR wrote:
What Nadrew has previously told me about Eternia's logging is that it keeps track of a list of items over time. The string table may simply be growing very huge or there could be other problems.


I actually got rid of the lists in favor of straight file reading and writing, it helped memory usage a bit, but not CPU usage. Our memory usage is fairly consistent and doesn't really rise too much over time.


And disk use?
iotop is reporting between 3kb and 125kb every few seconds from the DreamDaemon process.
Well.. That doesn't seem like a lot.. You could try to keep it running for a few hours and see what the end result looks like.
Soo, after running iotop for approximately... I guess 8 hours and a bit now.

Here's my results;
http://puu.sh/9gWny/8474667c39.png
(SS13)

I guess that's quite a lot, though. And I have no idea what it has been writing to or from, either.

What are yours looking like, if you have any?
Almost 1GB written in 8 hours?

How often does the game save?
We're at 112MB after about 8 hours, so it's not a memory problem over here.
112MB of RAM, roughly 600MB of disk IO.
In response to Writing A New One
Writing A New One wrote:
Server has been up for 20 minutes. 20-30% CPU. This is a log of 20 minutes usage: http://pastebin.com/ehNMsnVJ

Will edit this post tomorrow with a pastebin of the profiler after 20 minutes.

20 minutes: http://pastebin.com/xd09KJLw, CPU is rarely below 200% with 50 people. After rebooting, it's back to normal levels (20-30%).
Page: 1 2 3 4