ID:2272686
 
BYOND Version:511
Operating System:Linux
Web Browser:
Applies to:Dream Daemon
Status: Open

Issue hasn't been assigned a status value.
Descriptive Problem Summary:

On the latest few releases of 511, we (Baystation) have been experiencing extremely common savefile corruption. This appears to be happening on write by DreamDaemon, because the resulting files cannot be read whatsoever even with far simpler code (just opening the file and exporting to text).

Numbered Steps to Reproduce Problem:

Have not yet found exact trigger; this appears to be internal to DreamDaemon savefile handling and regards internal state.

Expected Results:

Savefiles to be a reliable storage mechanism and not corrupt themselves.

Actual Results:

Savefiles are virtually unusable for a server of this size.

Here's a copy of the actual messages coming out of DreamDaemon:

Sometimes we'll get messages like this, where it just refuses to read:

[2017-07-22 18:12:54] BUG: Fmem block size at 0 is 559903/4018 with type 0.
[2017-07-22 18:12:54] BUG: Error reading file memory structureFile offset: 0
[2017-07-22 18:12:54] Real position: 5/4018
[2017-07-22 18:12:54] BUG: Ccorrupt or invalid savefile 'data/player_saves/j/jacoblee/preferences.sav'

We also occasionally get, which might be related, things like:

[2017-07-22 06:52:37] BUG: Cannot write to 'data/player_saves/f/freak223/preferences.sav': file is in use

Other time the file is corrupt where it triggers "attempted recovery," which always results in a useless/empty file.

[2017-07-22 04:59:28] BUG: Corrupt block header
[2017-07-22 04:59:28] BUG: Fmem block size at 316 is 1279475200/240557 with type 85.
[2017-07-22 04:59:28] BUG: Error reading file memory structureFile offset: 316
[2017-07-22 04:59:28] Real position: 321/240557
[2017-07-22 04:59:28] BUG: Orphan savefile buffer skin_green exported to /PATH/data/player_saves/m/mark9013100/preferences_lost.txt
[2017-07-22 04:59:28] BUG: Orphan savefile buffer version exported to /PATH/data/player_saves/m/mark9013100/preferences_lost.txt
[2017-07-22 04:59:28] BUG: Ccorrupt or invalid savefile 'data/player_saves/m/mark9013100/preferences.sav'
[2017-07-22 04:59:28] BUG: Attempting auto-recovery of '/PATH/data/player_saves/m/mark9013100/preferences.sav'
[2017-07-22 04:59:28] BUG: Backed up old savefile as '/PATH/data/player_saves/m/mark9013100/preferences_bad_000.sav' and exported text to '/PATH/data/player_saves/m/mark9013100/preferences_bad_000.txt'. REMOVE OR RENAME THESE FILES. If too many of these files build up, auto-recovery will be disabled.

[2017-07-22 01:27:15] BUG: Corrupt block header
[2017-07-22 01:27:15] BUG: Fmem block size at 334 is 34538297/11440 with type 236.
[2017-07-22 01:27:15] BUG: Error reading file memory structureFile offset: 334
[2017-07-22 01:27:15] Real position: 339/11440
[2017-07-22 01:27:15] BUG: Ccorrupt or invalid savefile 'data/player_saves/k/kingofstarryskies/preferences.sav'
[2017-07-22 01:27:15] BUG: Attempting auto-recovery of '/PATH/data/player_saves/k/kingofstarryskies/preferences.sav'
[2017-07-22 01:27:15] BUG: Backed up old savefile as '/PATH/data/player_saves/k/kingofstarryskies/ preferences_bad_000.sav' and exported text to '/PATH/data/player_saves/k/kingofstarryskies/ preferences_bad_000.txt'. REMOVE OR RENAME THESE FILES. If too many of these files build up, auto-recovery will be disabled.

[2017-07-21 16:22:43] BUG: Corrupt block header
[2017-07-21 16:22:43] BUG: Fmem block size at 2031 is -963980824/10084 with type 97.
[2017-07-21 16:22:43] BUG: Error reading file memory structureFile offset: 2031
[2017-07-21 16:22:43] Real position: 2036/10084
[2017-07-21 16:22:43] BUG: Orphan savefile buffer species exported to /PATH/data/player_saves/m/mrbarman/preferences_lost.txt
[2017-07-21 16:22:43] BUG: Ccorrupt or invalid savefile 'data/player_saves/m/mrbarman/preferences.sav'
[2017-07-21 16:22:43] BUG: Attempting auto-recovery of '/PATH/data/player_saves/m/mrbarman/preferences.sav'
[2017-07-21 16:22:43] BUG: Backed up old savefile as '/PATH/data/player_saves/m/mrbarman/preferences_bad_000.sav' and exported text to '/PATH/data/player_saves/m/mrbarman/preferences_bad_000.txt'. REMOVE OR RENAME THESE FILES. If too many of these files build up, auto-recovery will be disabled.

We've also had outright crashes in our load proc:

[2017-07-22 04:41:02] BUG: Crashing due to an illegal operation!
[2017-07-22 04:41:02]
[2017-07-22 04:41:02] Backtrace for BYOND 511.1385 on Linux:
[2017-07-22 04:41:02] Generated at Sat Jul 22 04:41:02 2017
[2017-07-22 04:41:02]
[2017-07-22 04:41:02] DreamDaemon [0x8048000, 0x0], [0x8048000, 0x804bb24]
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x1d58aa
[2017-07-22 04:41:02] [0xf77da000, 0xf77da410], [0xf77da000, 0xf77da410]
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x1d58aa
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x1e1507
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x1bb6eb
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x1ba5aa
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x1bb431
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x25de57
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x26a84e
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273083
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273ced
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x251c2c
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x26a84e
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273083
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273ba0
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x251c2c
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x26a84e
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273083
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273ced
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x251c2c
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x26a84e
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273083
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273ced
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x251c2c
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x26a84e
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273083
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x27441b
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x270ade
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x27678d
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x254e1f
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x26a84e
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x2766ca
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x254e1f
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x26a84e
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x273083
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x27441b
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x207f08
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x224872
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x2252ff
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x286e91
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x301128
[2017-07-22 04:41:02] libbyond.so [0xf7283000, 0x0], 0x301d05
[2017-07-22 04:41:02] libbyond.so 0x303970, 0x303cfb
[2017-07-22 04:41:02] libbyond.so 0x303e30, 0x30403d
[2017-07-22 04:41:02] DreamDaemon [0x8048000, 0x0], [0x8048000, 0x804ae34]
[2017-07-22 04:41:02] libc.so.6 0x198b0, 0x199a3 (__libc_start_main)
[2017-07-22 04:41:02] DreamDaemon [0x8048000, 0x0], [0x8048000, 0x804a731]
[2017-07-22 04:41:02]
[2017-07-22 04:41:02] Recent proc calls:
[2017-07-22 04:41:02] /datum/category_item/player_setup_item/general/body/ load_character
[2017-07-22 04:41:02] /datum/category_item/player_setup_item/general/language/ load_character
[2017-07-22 04:41:02] /datum/category_item/player_setup_item/general/basic/ load_character
[2017-07-22 04:41:02] /datum/category_group/player_setup_category/proc/ load_character
[2017-07-22 04:41:02] /datum/category_collection/player_setup_collection/proc/ load_character
[2017-07-22 04:41:02] /datum/map/proc/private_use_legacy_saves
[2017-07-22 04:41:02] /datum/map/proc/character_load_path
[2017-07-22 04:41:02] /proc/sanitize_integer
[2017-07-22 04:41:02] /datum/preferences/proc/load_character
[2017-07-22 04:41:02] /datum/preferences/Topic
[2017-07-22 04:41:02] /client/Topic
[2017-07-22 04:41:02] /client/Topic
[2017-07-22 04:41:02] /mob/proc/is_client_active
[2017-07-22 04:41:02] /client/proc/is_stealthed
[2017-07-22 04:41:02] /mob/Stat
[2017-07-22 04:41:02] /mob/Stat

Unknown if related, but we also occasionally get messages like:
[2017-07-22 00:41:11] BUG: Removing suspect free block index
[2017-07-22 00:41:11] BUG: Removing suspect free block index

I wonder if the savefiles are trying to handle too much. The limit increase from 64K to 16 million (or more) for most objects didn't really touch the savefile system, which is a little more complicated.

What exactly are you saving in these savefiles? Are you saving whole sections of the map? Characters? Setup info? Datums? I'm curious how big the files are in terms of data and especially diversity of data.
We're primarily storing characters and some associated preferences. I'll provide an example file (exported); this one has not corrupted, but has most of the data we are storing:

http://xales-public.s3.amazonaws.com/out.txt

Edit: I'll also work on getting one that has corrupted, as close to when it broke as I can.