In response to DarkCampainger
User-created lists in DM are arrays. Linked lists are only used for special kinds of lists, like contents. The logistics of copying data and re-allocating frequently don't really account for the difference in performance that Pomf is seeing, IMO, especially as lists attempt to adjust for that by allocating more space as the list gets bigger.

A for(item in list) loop always copies the list before iterating.
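A minimal sketch of what that implies (illustration only, not from the original post): since the loop walks a snapshot, shrinking the real list mid-iteration doesn't change which items get visited.

/proc/copy_demo()
    var/list/L = list("a", "b", "c")
    for(var/item in L)    // iterates over a copy of L
        L -= item         // shrink the real list mid-loop...
        world.log << item // ...the loop should still visit "a", "b", "c"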
In response to Lummox JR
Ter13 wrote:
I'm pretty sure that it creates a copy of the list in memory to iterate through, so modifications to the actual list don't actually affect it.

Lummox JR wrote:
A for(item in list) loop always copies the list before iterating.

Learn somethin' new every day. Thanks for the info!

Lummox JR wrote:
User-created lists in DM are arrays. Linked lists are only used for special kinds of lists, like contents. The logistics of copying data and re-allocating frequently don't really account for the difference in performance that Pomf is seeing, IMO, especially as lists attempt to adjust for that by allocating more space as the list gets bigger.

Hmmm, it especially wouldn't explain his read-only access performance numbers. I look forward to reading the resolution text of this one!
His read-only performance numbers aren't actually read-only performance numbers.

Taking a closer look at the code, the var he made to hold a number (so that he could do for (var/thing in list) number += thing) is actually another list, so it's reading from one list and appending the value to another list.

What Pomf meant to do, I think, was to make the second set of vars (var/one thru var/eleven) plain number vars and add each value from the list to them. That ensures BYOND actually reads from the list; otherwise the compiler would see that nothing is being done with the value and not even compile the list read.
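Something like the following is presumably what was intended (a sketch with illustrative names, not Pomf's actual code):

/proc/sum_list(list/L)
    var/total = 0       // a plain number var, not another list
    for(var/thing in L)
        total += thing  // forces a real read of each element
    return total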
The compiler's not that smart, I don't think. A loop like 'for(var/i = 1 to 16000000) continue' still takes quite a while.
oh it is.

for (var/n in 1 to list.len)
    var/thing = list[n]
    donothing(thing)
proc/donothing(thing)
    return


That will never actually access the item in the list. It will show up as faster than a for(var/thing in list) loop, but it wouldn't actually be faster if the list access really happened.
In response to MrStonedOne
MrStonedOne wrote:
oh it is.

> for (var/n in 1 to list.len)
>     var/thing = list[n]
>     donothing(thing)
> proc/donothing(thing)
>     return
>


That will never actually access the item in the list. It will show up as faster than a for(var/thing in list) loop, but it wouldn't actually be faster if the list access really happened.

Why does the following not print very similar numbers, then?

/world/loop_checks = 0

/world/New()
    var/list/L = list()
    for(var/i = 1 to 100) L += i
    var/{s;e}

    s = world.timeofday
    for(var/i = 1 to 100000)
        a(L)
    e = world.timeofday
    world.log << "[e-s]" // ~32-33 on my machine

    s = world.timeofday
    for(var/i = 1 to 100000)
        b(L)
    e = world.timeofday
    world.log << "[e-s]" // ~25-26 on my machine

/proc/a(list/L)
    for(var/n in 1 to L.len)
        var/thing = L[n]
        donothing(thing)

proc/donothing(thing)
    return

/proc/b(list/L)
    for(var/n in 1 to L.len)
        var/thing = n
        donothing(thing)

I'm pretty sure that process_list_one is only taking longer because it's appending a huuuuuuge list, and as we know, appending is O(n). Try replacing all the 'one = list()' stuff at the top with 'one = 0' so that the test performs adds instead of appends.

I tried to simplify the test myself, and here's what I came up with.
#define DEBUG
world/loop_checks = 0

var/list
    list_1e7
    list_10_1e6

client/verb
    prep()
        list_1e7 = new/list(1e7)
        list_10_1e6 = new/list(10, 1e6)
        src << "Done"
    prep2() // Alternate method of building the list, if you don't trust prep(). Can take up to 6 minutes.
        list_1e7 = list()
        for(var/i = 1 to 1e7)
            list_1e7 += i
        list_10_1e6 = list()
        for(var/i = 1 to 10)
            list_10_1e6 += list(list())
            for(var/j = 1 to 1e6)
                list_10_1e6[i] += j
        src << "Done"

    for_1e7()
        ASSERT(list_1e7)
        for(var/i = 1 to 1e7)
            . = list_1e7[i] // Writing to . so we don't get accused of everything being optimized out.
        src << "Done"
    for_10_1e6()
        ASSERT(list_10_1e6)
        for(var/i = 1 to 10)
            var/list/list_1e6 = list_10_1e6[i]
            for(var/j = 1 to 1e6)
                . = list_1e6[j]
        src << "Done"

    foreach_1e7()
        ASSERT(list_1e7)
        for(var/i in list_1e7)
            . = i
        src << "Done"
    foreach_10_1e6()
        ASSERT(list_10_1e6)
        for(var/list_1e6 in list_10_1e6)
            for(var/i in list_1e6)
                . = i
        src << "Done"


                            Profile results (total time)
Proc Name                    Self CPU  Total CPU  Real Time  Calls
---------------------------  --------  ---------  ---------  -----
/client/verb/for_10_1e6         4.923      4.923      4.925      4
/client/verb/for_1e7            4.868      4.868      4.866      4
/client/verb/foreach_10_1e6     3.908      3.908      3.909      4
/client/verb/foreach_1e7        3.720      3.720      3.719      4
/client/verb/prep               0.045      0.045      0.044      1

In response to Yota
Yota wrote:
I'm pretty sure that process_list_one is only taking longer because it's appending a huuuuuuge list, and as we know, appending is O(n).

No it isn't. Appending is O(1).
In response to Lummox JR
Lummox JR wrote:
Yota wrote:
I'm pretty sure that process_list_one is only taking longer because it's appending a huuuuuuge list, and as we know, appending is O(n).

No it isn't. Appending is O(1).

I get mixed results. On one hand, I see stuff like this:
client/verb
    append_1e4()
        var/list/L = list()
        for(var/i = 1 to 1e4)
            L += null
    append_1e5()
        var/list/L = list()
        for(var/i = 1 to 1e5)
            L += null
    append_1e6()
        var/list/L = list()
        for(var/i = 1 to 1e6)
            L += null
    append_1e7()
        var/list/L = list()
        for(var/i = 1 to 1e7)
            L += null

                        Profile results (total time)
Proc Name                Self CPU  Total CPU  Real Time  Calls
-----------------------  --------  ---------  ---------  -----
/client/verb/append_1e4     0.001      0.001      0.001      1
/client/verb/append_1e5     0.034      0.034      0.034      1
/client/verb/append_1e6     2.738      2.738      2.737      1
/client/verb/append_1e7   274.681    274.681    274.681      1


But then I see stuff like this:
client/verb
    append_test_1()
        var/list/L = list()
        for(var/i = 1 to 1e6)
            L += null
    append_test_2()
        var/list/Ls = list()
        var/list/L = list()
        for(var/i = 1 to 1e6)
            L += null
            if (L.len > 100000)
                Ls += L
                L = list()

                          Profile results (total time)
Proc Name                   Self CPU  Total CPU  Real Time  Calls
--------------------------  --------  ---------  ---------  -----
/client/verb/append_test_1     2.672      2.672      2.673      1
/client/verb/append_test_2     2.523      2.523      2.522      1


So I don't know what to believe. Maybe you could shed some light?
As I explained in this thread already, lists are arrays. List appending is therefore O(1).
If space has been allocated ahead of time, yes. However, once it runs out of capacity, it needs to reallocate the entire array to a new location, which is O(n). The question is, how frequently does it do this?
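As a rough model only (an assumption about growth policies in general, not a description of BYOND's actual internals), you can count how many reallocations a given policy would trigger while appending n items:

/proc/count_reallocs(n, growth_factor)
    var/capacity = 4
    var/reallocs = 0
    for(var/items = 1 to n)
        if(items > capacity)
            capacity = max(items, round(capacity * growth_factor))  // grow geometrically
            reallocs++
    return reallocs

With geometric growth, count_reallocs(1000000, 1.25) comes out to a few dozen reallocations; a policy that only adds a small constant amount of slack each time would instead reallocate hundreds of thousands of times.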
What happens if you make a list of a specific predefined length and add elements to it?

Assuming these possible facts:
- It is a memory allocation problem
- BYOND automatically allocates the whole memory of a list with a predefined length when it is initialized

Then the performance issues of adding something to the list would be negligible in that case, correct?
Then the performance issues of adding something to the list would be negligible in that case, correct?

Adding something to the list wouldn't be done if it's a predefined list. A list with a set size pre-allocates null references; adding something to a list always increases the size of the list, so it doesn't avoid the problem.
Yes, well, that's essentially what I'm getting at: 'getting around it by assigning values to a specific index in a predefined list of a specific size' as a workaround. Though you're right, it isn't a very good method of testing the actual issue itself.

I have to wonder if adding to a list through list[list.len++] = entry would make a performance difference in any of these metrics. Perhaps I'll give it a test.
list[list.len++]

list[++list.len] = entry would be the correct syntax for what you are trying to achieve.

Remember the difference between pre and post increments.
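For reference, the pre-increment append idiom grows the list first and then fills the new slot (a minimal sketch):

/proc/append_via_len(list/L, entry)
    L[++L.len] = entry  // len grows by one (new slot is null), then entry fills that slot

With the post-increment form, L[L.len++] = entry, the old length is used as the index, so the current last element gets overwritten and a null is left at the new end.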
Yep
One workaround for this, other than manually pre-filling buffer space (something we can't always do, because other code that uses the list can't take the performance hit of checking for nulls or looping thru a ton of empty items), is giving us an interface for editing the internal prefill buffer length.

Most self-growing array systems, from vectors to C#'s List<>, allow for this in some way.

That being said, the key reason this issue report was made was to improve the startup time of SS13 worlds.

A good chunk of that is objects adding themselves to various lists.

A common best practice to reduce overhead from for(thing in world) loops is to make lists that hold all objects of a given type, so we can just loop thru that list (and skip type checking on it, since we know it only holds that type).
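A hedged sketch of that pattern (the type and variable names here are made up for illustration):

var/list/machines = list()  // every /obj/machine registers itself here

/obj/machine/New()
    ..()
    machines += src

/obj/machine/Del()
    machines -= src
    ..()

/proc/process_machines()
    for(var/obj/machine/M in machines)  // loop over the known-type list instead of scanning the world
        M.dir = turn(M.dir, 90)         // placeholder per-object work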

What I think is going on here is that BYOND is being stingy about allocating buffer/prefill space for lists.

Lummox, what is the logic/pseudocode of list prefilling? I bet it's capping at a really low number.
Hrm, I thought I was allocating more slack than I was, but it turns out not so much. While a lot of allocations increase the requested size based on the length of the list, user lists don't do that. It appears that currently the only slack being added is to raise the new length to a multiple of 4.

I'll change that metric so that when list size is increasing, it's increased to max(requested_len, round(buffer_size * 1.25)), which is then rounded up to a multiple of 4. A single fast-growing list should avoid a lot more allocations this way.
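Expressed as a quick sketch in DM (the real change lives in the engine, so this is just the arithmetic being described):

/proc/new_buffer_size(requested_len, buffer_size)
    var/size = max(requested_len, round(buffer_size * 1.25))
    return size + ((4 - size % 4) % 4)  // round up to the next multiple of 4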
It's worth noting that the following is much faster than appending to a blank list, as it avoids all the reallocations.
var/list/L = new/list(1e6)
for(var/i = 1 to 1e6)
    L[i] = null

So I made some attempts at creating a list with a large internal buffer size to begin with, to make the first set of additions O(1) as it would ideally be. But it turns out that the following code is just as slow as if we were using a fresh blank list.
var/list/L = new/list(1e6)
L.len = 0
for(var/i = 1 to 1e6)
    L += null

I tried with Cut() as well. Is the list reallocating when we reduce the size as well?
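For reference, the Cut() variant was presumably along these lines (a guess, since that exact code wasn't posted):

client/verb/cut_test()
    var/list/L = new/list(1e6)
    L.Cut()               // empty the list, hoping the allocated buffer is kept
    for(var/i = 1 to 1e6)
        L += null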