ID:2026658

 BYOND Version: 509
 Operating System: Windows 10 Pro 64-bit
 Web Browser: Firefox 41.0
 Applies to: Dream Daemon
 Status: Open (Issue hasn't been assigned a status value.)
Descriptive Problem Summary:
Performance degrades on lists with a large number of items, whether adding to, removing from, or simply looping through them.
Numbered Steps to Reproduce Problem:
Make a large list and profile a loop through it; then split that list into chunks and profile loops over the chunks.
Code Snippet (if applicable) to Reproduce Problem:
```
world
    loop_checks = 0

var/global/list/one = list()
var/global/list/two = list()
var/global/list/three = list()
var/global/list/four = list()
var/global/list/five = list()
var/global/list/six = list()
var/global/list/seven = list()
var/global/list/eight = list()
var/global/list/nine = list()
var/global/list/ten = list()
var/global/list/eleven = list()
var/global/list/listone = list()
var/global/list/listtwo = list()
var/global/list/listthree = list()
var/global/list/listfour = list()
var/global/list/listfive = list()
var/global/list/listsix = list()
var/global/list/listseven = list()
var/global/list/listeight = list()
var/global/list/listnine = list()
var/global/list/listten = list()
var/global/list/listeleven = list()

verb/test_list()
    var/start = world.timeofday
    //world << "starting at [start]"
    var/n = 1
    for(var/i = 1 to 2600000)
        listone += i
        if(n != 10 && !(i % 260000))
            //world << "Hit milestone [i] in [((world.timeofday - start)/10)] second\s"
            n++
        switch(n)
            if(1)
                listtwo += i
            if(2)
                listthree += i
            if(3)
                listfour += i
            if(4)
                listfive += i
            if(5)
                listsix += i
            if(6)
                listseven += i
            if(7)
                listeight += i
            if(8)
                listnine += i
            if(9)
                listten += i
            if(10)
                listeleven += i
    var/startone = world.timeofday
    //world << "starting processing of list one at [startone]; total of [((world.timeofday - start)/10)] second\s have passed"
    process_list_one()
    //world << "finished list one in [((world.timeofday - startone)/10)] second\s; total of [((world.timeofday - start)/10)] second\s have passed"
    startone = world.timeofday
    //world << "starting processing of list two at [startone]; total of [((world.timeofday - start)/10)] second\s have passed"
    process_list_two()
    //world << "finished list two in [((world.timeofday - startone)/10)] second\s; total of [((world.timeofday - start)/10)] second\s have passed"

/proc/process_list_one()
    for(var/ii in listone)
        one += ii

/proc/process_list_two()
    var/starttime = world.timeofday
    var/i
    var/substart = world.timeofday
    for(i in listtwo)
        two += i
    //world << "finished sub list one in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listthree)
        three += i
    //world << "finished sub list two in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listfour)
        four += i
    //world << "finished sub list three in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listfive)
        five += i
    //world << "finished sub list four in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listsix)
        six += i
    //world << "finished sub list five in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listseven)
        seven += i
    //world << "finished sub list six in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listeight)
        eight += i
    //world << "finished sub list seven in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listnine)
        nine += i
    //world << "finished sub list eight in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listten)
        ten += i
    //world << "finished sub list nine in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
    substart = world.timeofday
    for(i in listeleven)
        eleven += i
    //world << "finished sub list ten in [((world.timeofday - substart)/10)] second\s; total of [((world.timeofday - starttime)/10)] second\s have passed"
```

Expected Results:
Similar performance whether the items are in one large list or several smaller lists with the same total item count.
Actual Results:
```
                       Profile results (total time)
Proc Name                  Self CPU    Total CPU    Real Time        Calls
----------------------    ---------    ---------    ---------    ---------
/mob/verb/test_list          20.202       39.803       39.803            1
/proc/process_list_one       17.912       17.912       17.913            1
/proc/process_list_two        1.689        1.689        1.690            1
```

Does the problem occur:
Every time, with large lists.

When does the problem NOT occur?
When the list is split up into smaller chunks

Workarounds:
Make many smaller lists instead of one large one.

The reason I bothered testing this is that we want to increase our map size to 500x500x6, but looping through that many turfs causes infinite-loop errors, resulting in poor performance and eventually a non-working proc.

Edit: processing only the list chunks in the example (i.e. commenting out the references to listone and process_list_one()), I get this:
```
                       Profile results (total time)
Proc Name                  Self CPU    Total CPU    Real Time        Calls
----------------------    ---------    ---------    ---------    ---------
/mob/verb/test_list            2.164        3.713        3.713            1
/proc/process_list_two         1.549        1.549        1.549            1
```

 Jan 29 2016, 12:14 pm I'm honestly not sure what I'm looking at here. The example is ultra-complicated and it's hard to tell the goal from the code. What are the sizes of all the lists involved? Even that's not clear from the code. The process routines are very apples-and-oranges in terms of what they're doing, so it's hard to see how any inferences can be drawn from this.
 Jan 29 2016, 12:16 pm I just saw you edited this to add a removal test, but I can tell you that test is really messed up. You're looping through list items in forward order, and Remove() and -= both operate in reverse order, so you'll have worst-case performance that way and it's O(N^2). If you're removing a bunch of items from a list while looping through it, it's best to do it by index when feasible; it's even better, if list order is not an issue, to swap those items to the end of the list and remove them all in one shot by decreasing the list length.
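For illustration, a minimal sketch of the swap-and-truncate removal described above (the proc name and the should_remove() predicate are hypothetical stand-ins, and this assumes list order doesn't matter):
```
// Remove all unwanted items in a single O(N) pass: swap each one into the
// tail of the list, then truncate once at the end, instead of O(N^2)
// worth of Remove()/-= calls.
/proc/remove_matching(list/L)
    var/last = L.len
    var/i = 1
    while(i <= last)
        if(should_remove(L[i]))  // hypothetical predicate
            L.Swap(i, last)      // push the unwanted item past the live region
            last--
        else
            i++
    L.len = last                 // one truncation drops all swapped-out items
```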
 Jan 29 2016, 12:17 pm (Edited on Jan 29 2016, 7:33 pm) Profile of list destroy:
```
                       Profile results (total time)
Proc Name                  Self CPU    Total CPU    Real Time        Calls
----------------------    ---------    ---------    ---------    ---------
/mob/verb/test_list            2.158      331.502      333.978            1
/proc/destroy_list_two       327.728      327.753      328.800            1
/proc/process_list_two         1.569        1.587        2.557            1
/proc/announce                 0.047        0.047        2.456           22
```
Lummox JR wrote: I just saw you edited this to add a removal test, but I can tell you that test is really messed up. You're looping through list items in forward order, and Remove() and -= both operate in reverse order, so you'll have worst-case performance that way and it's O(N^2). If you're removing a bunch of items from a list while looping through it, it's best to do it by index when feasible; it's even better, if list order is not an issue, to swap those items to the end of the list and remove them all in one shot by decreasing the list length.

Ah, alright.
 Jan 29 2016, 12:28 pm In response to Lummox JR Lummox JR wrote: I'm honestly not sure what I'm looking at here. The example is ultra-complicated and it's hard to tell the goal from the code. What are the sizes of all the lists involved? Even that's not clear from the code. The process routines are very apples-and-oranges in terms of what they're doing, so it's hard to see how any inferences can be drawn from this.

listone is 2.6e+006 items long; listtwo through listeleven are 260,000 items each, for a total of 2.6e+006.
 Jan 29 2016, 12:31 pm I need to see that code vastly simplified, because right now it's really not clear at all what it's doing. I'm really not even sure this is appropriate for a bug report, as performance issues may not be bugs at all.
 Jan 29 2016, 12:55 pm Other than coming up with better variable names, I don't know how I can simplify it, as it's pretty simple already. I can comment the whole thing for clarification if that would help: http://ss13.moe/uploads/2016-01-29_14-59-32.txt Profile for this code: http://ss13.moe/uploads/2016-01-29_14-59-46.txt
 Jan 29 2016, 2:30 pm You're appending to a growing list each time, and 'one' grows much faster than the rest. Clear the lists at the top of your verb (or don't append to a list as part of your loop) and the problem disappears.
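A sketch of that suggestion against the report's snippet (the loop shape is just one way to do it; the source listone..listeleven lists may need the same treatment between runs):
```
verb/test_list()
    // Reset every destination list up front so a previous run's contents
    // can't inflate this run's append times; setting len = 0 truncates in place.
    var/list/targets = list(one, two, three, four, five, six, seven, eight, nine, ten, eleven)
    for(var/j = 1 to targets.len)
        var/list/L = targets[j]
        L.len = 0
    // ...rest of the verb unchanged
```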
 Jan 29 2016, 2:33 pm In response to GinjaNinja32 GinjaNinja32 wrote: You're appending to a growing list each time, and 'one' grows much faster than the rest. Clear the lists at the top of your verb (or don't append to a list as part of your loop) and the problem disappears.

The point of this is to show that working with one large list is very different, performance-wise, from working with several smaller lists, even at the same total item count.
 Jan 29 2016, 2:35 pm My guess is that lists are linked lists, i.e. insertion at the end is slow. That'd mean inserting N items at the end of a long list costs significantly more than inserting one item at the end of N short ones (it's approximately O(m*n**2) vs O(m*n), where n is the number of items and lists, and m is the length of the shorter lists). Edit: Additionally, if you do as I suggest, the *split* lists perform slightly worse. Double edit: I'm breaking my own arguments here. Triple edit: Profile results when clearing lists between runs: http://nyx.gn32.uk/i/2016-01-29_22:39:19.png
 Jan 29 2016, 4:55 pm In response to GinjaNinja32 GinjaNinja32 wrote: My guess is lists are linked lists, ie insertion at the end is slow. That'd mean inserting N items at the end of a long list costs significantly more than inserting one item at the end of N short ones (it's approximately O(m*n**2) vs O(m*n), where n is the number of items and lists, and m is the length of the shorter lists) edit: Additionally, if you do as I suggest, the *split* lists perform slightly worse. double edit: I'm breaking my own arguments here. triple edit: Profile results when clearing lists between runs: http://nyx.gn32.uk/i/2016-01-29_22:39:19.png

Care to post the code?
 Jan 29 2016, 4:57 pm http://gn32.uk/f/lists.dme Edit: Not sure whether I misread my own code or what, but this is clearing the initial, not the final, lists. Unsure what that means for this bug report.
 Jan 29 2016, 7:26 pm (Edited on Jan 30 2016, 10:51 am) Lummox, let me clear this up for you: first, ignore the deletion part; it's a red herring. Ignore the part about adding to the list. The key part is /proc/process_list_one and /proc/process_list_two. He is showing that if you stuff 2,600,000 items into a list and loop through it reading only, it takes 17.9 seconds, but if you split the same total across 10 lists of 260,000 items each and loop through all 10, it takes 1.7 seconds TOTAL. Really those numbers should be near identical; in fact, looping through the 10 lists of 260,000 items each should cost slightly more than looping through 1 list of 2,600,000 items. Meaning there is an overhead to looping through a list that scales with its size. In table format:

Total time to loop through 2,600,000 items in 1 list of 2,600,000 items: 17.912
Total time to loop through 2,600,000 items across 10 lists of 260,000 items each: 1.689
 Jan 29 2016, 7:34 pm Yeah, what MSO said, though it does show that it gets increasingly more expensive to add something to a larger list, as seen here: http://ss13.moe/uploads/2016-01-29_14-59-32.txt
 Jan 30 2016, 11:25 am (Edited on Jan 30 2016, 11:47 am) I'm trying to rule adding out, as that has to expand the list, and some downgrade in performance for doing that is expected. Reading isn't affected by that. I confirmed on my machine; I used my slower laptop to make the difference easier to see:

Time to loop through 260,000 items in a list and increment a proc var for each item: 300 seconds
Time to loop through 26,000 items in a list and increment a proc var: 3 seconds
Total time to loop through 10 lists with 26,000 items each and increment a proc var: 30 seconds
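A reduced read-only test along those lines might look like this (hypothetical proc names; world.timeofday counts tenths of a second):
```
// Time a pure read loop over an already-built list, so append and
// reallocation costs can't contaminate the measurement.
/proc/time_read(list/L)
    var/count = 0
    var/start = world.timeofday
    for(var/item in L)
        count++              // minimal per-item work, as in MSO's test
    world << "[L.len] items: [(world.timeofday - start) / 10] second\s"

verb/read_test()
    var/list/big = new
    for(var/i = 1 to 260000)
        big += i
    time_read(big)
```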
 Jan 30 2016, 11:31 am Note: I have the page file disabled, so any possibility that the large list is getting paged to virtual ram can be dismissed.
 Jan 30 2016, 12:12 pm In response to MrStonedOne MrStonedOne wrote: Time it takes to loop thru 260,000 items in a list and increment a proc var number for each item: 300 seconds Time it takes to loop thru 26,000 items in a list and increment a proc var number: 3 seconds Total Time it takes to loop thru 10 lists with 26,000 items in each list and increment a proc var number: 30: seconds. Do you have cleaner example code that shows this?
 Jan 30 2016, 12:15 pm No, I used his code. I'll see about cleaning it up a bit, but the way it works is necessary; doing a list of lists gives the same performance as one large list, for some reason.
 Jan 30 2016, 1:56 pm Are DM lists implemented as vectors/arrays or linked-lists under the hood? I've seen conflicting reports over the years (or maybe it's been changed over time). If vectors, then the profile results for adding new items would make some sense, as a single large DM list would have to reallocate and copy over significantly more data when it reached capacity. I'm not sure what to make of the accessing rates MSO posted, though. If linked lists, does for(... in list) keep track of the last node it accessed, or only the last index*? If the latter, does that mean it has to re-traverse the linked list from the head each iteration to find the node corresponding to the current index? *it has always surprised me how well it handles altering the list while iterating through it
 Jan 30 2016, 2:12 pm "*it has always surprised me how well it handles altering the list while iterating through it" I'm pretty sure it creates a copy of the list in memory to iterate through, so modifications to the actual list don't affect it.
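Following up on the vector-vs-linked-list question above, one hypothetical way to probe it from DM code is to compare for-in iteration with indexed access on the same list, since per-index re-traversal of a linked list would make the indexed loop quadratic while for-in stays linear:
```
// Compare iteration strategies on one list; a large gap between the two
// timings would suggest indexed access re-traverses from the head each time.
/proc/compare_iteration(list/L)
    var/count = 0
    var/start = world.timeofday
    for(var/x in L)
        count++
    world << "for-in:  [(world.timeofday - start) / 10] second\s"
    count = 0
    start = world.timeofday
    for(var/i = 1 to L.len)
        if(L[i])
            count++
    world << "indexed: [(world.timeofday - start) / 10] second\s"
```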