ID:1342405
 
/*

Written By: FIREking

*/


//text2list
//takes a text argument and converts it into a list
//each line in the text separated by a line break becomes a unique list entry

proc/text2list(t)
    . = list() //return value will be a list
    var/pos //we store the position of the next line break

    if(copytext(t, -1) != "\n") t += "\n" //if the last character isn't a line break, add one

    while(t) //while there's still text left to process
        pos = findtext(t, "\n") //find next line break
        var/val = copytext(t, 1, pos) //copy text from beginning to line break
        if(val) //if text copied isn't empty
            . += val //add it to the list
        t = copytext(t, pos + 1) //set text equal to the remaining text excluding the value we added to the list and skipping the line break
How does it compare to Forum_account's split()?
proc
    // split the string into a list of substrings by splitting it
    // using the specified delimiter.
    split(txt, d)
        #ifdef DEBUG
        ASSERT(istext(txt))
        ASSERT(istext(d))
        ASSERT(d)
        #endif

        var/pos = findtext(txt, d)
        var/start = 1
        var/dlen = length(d)

        . = list()

        while(pos > 0)
            . += copytext(txt, start, pos)
            start = pos + dlen
            pos = findtext(txt, d, start)

        . += copytext(txt, start)
It seems like every developer wants to reinvent the wheel with their own string handling system, but I really can't blame them. Text is a very important and fundamental resource in pretty much any program, so it's often vital that functions handling it be as efficient as possible.

If we are going to compare code, then you might as well also compare against these:

dd_text2list() and dd_file2list() from Deadron's TextHandling library:
proc
    /*
    dd_text2list(text, separator)
    Split the text into a list, where separator is the delimiter between items.
    Returns the list. This is not case-sensitive.

    If the myText string is "a = b = c", and you call dd_text2list(myText, " = "), you get a list back with these items:
        a
        b
        c

    Example:

        // Get a list containing the names in this string.
        var/mytext = "George; Bernard; Shaw"
        var/separator = "; "
        var/list/names = dd_text2list(mytext, separator)
    */

    dd_text2list(text, separator)
        var/textlength = lentext(text)
        var/separatorlength = lentext(separator)
        var/list/textList = new /list()
        var/searchPosition = 1
        var/findPosition = 1
        var/buggyText
        while (1) // Loop forever.
            findPosition = findtext(text, separator, searchPosition, 0)
            buggyText = copytext(text, searchPosition, findPosition) // Everything from searchPosition to findPosition goes into a list element.
            textList += "[buggyText]" // Working around weird problem where "text" != "text" after this copytext().

            searchPosition = findPosition + separatorlength // Skip over separator.
            if (findPosition == 0) // Didn't find anything at end of string so stop here.
                return textList
            else
                if (searchPosition > textlength) // Found separator at very end of string.
                    textList += "" // So add empty element.
                    return textList

    /*
    dd_file2list(file_path, separator = "\n")
    Splits the text from the specified file into a list.
    file_path is the path to the file.
    separator is an optional delimiter between items in the file;
    it defaults to "\n", which makes each line of the file an item in the list.

    Example:

        // Read in the list of possible NPC names.
        var/list/names = dd_file2list("NPCs.txt")
    */

    dd_file2list(file_path, separator = "\n")
        var/file
        if (isfile(file_path))
            file = file_path
        else
            file = file(file_path)
        return dd_text2list(file2text(file), separator)


kText.text2list() and kText.file2list() from Keeth's kText library:
proc
    /**
    * Converts a string into a list, using a delimiter to determine where to separate entries.
    * @param string The string to be listified.
    * @param delimiter The string that will serve as a separator between entries in the list.
    * @return the list form of the string.
    * IE: text2list("this is a test", " ") would return list("this", "is", "a", "test").
    */

    text2list(string, delimiter=" ")
        var/list/listified = new, last=1
        for(var/find=findtext(string, delimiter); find; find=findtext(string, delimiter, find+length(delimiter)))
            listified += copytext(string, last, find)
            last=find+length(delimiter)

        listified += copytext(string, last)

        return listified

    /**
    * Reads the contents of a file and separates it into a list, using the specified delimiter to
    * separate entries.
    * @param file The file object, or string referring to the file.
    * @param delimiter The delimiter used to separate the text.
    * @return Returns the listified version of the file.
    * null if a bad file was given.
    */

    file2list(file, delimiter="\n")
        if(!isfile(file) || !fexists(file))
            return

        var/fileText = file2text(file)
        var/list/listified = new, last=1
        for(var/find=findtext(fileText, delimiter); find; find=findtext(fileText, delimiter, find+length(delimiter)))
            listified += copytext(fileText, last, find)
            last=find+length(delimiter)

        listified += copytext(fileText, last)

        return listified


It may be more useful to read text from a file; that way you are not compiling strings directly into your code and possibly nearing the string limit. Of course, when comparing string handling efficiency, you wouldn't benchmark directly against a proc that also does the file reading.
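A minimal sketch of the file-based approach, assuming a BYOND version that ships the built-in splittext() (510 or later, as far as I know). read_lines and the file name are made up for illustration and are not part of any library quoted in this thread:

```dm
// Hypothetical helper: read a text file and return its lines as a list.
// Assumes splittext() is available as a built-in.
proc/read_lines(path)
    if(!fexists(path)) // missing file: return an empty list instead of failing
        return list()
    var/txt = file2text(path)
    if(!txt) // empty file: nothing to split
        return list()
    return splittext(txt, "\n")

mob/verb/show_names()
    // "NPCs.txt" is an assumed file name with one entry per line
    for(var/entry in read_lines("NPCs.txt"))
        world << entry
```

Nothing here filters out blank lines, so add a check if the file may contain them.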

It would be nice to come up with a proc for handling this with the absolute highest efficiency possible. I can imagine Lummox JR could come up with some really scary implementation involving ASCII and bit masks, that would just leave these procs in the dust. Although, I can't be too sure that there would be any real speed improvement from something like that.

Anyway, it will probably take some really extreme stress tests to truly compare the efficiency of something this "small".
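One rough way to set up such a stress test is sketched below. time_calls and example_split are hypothetical names; example_split just mirrors the loop of the split() proc quoted earlier so the snippet compiles on its own. world.timeofday counts in tenths of a second and wraps at midnight, so this only resolves coarse differences, and going through call() adds the same small overhead to every candidate.

```dm
// Stand-in for whichever implementation is being measured.
proc/example_split(txt, d)
    . = list()
    var/start = 1
    var/pos = findtext(txt, d)
    while(pos)
        . += copytext(txt, start, pos)
        start = pos + length(d)
        pos = findtext(txt, d, start)
    . += copytext(txt, start)

// Hypothetical harness: call the given global proc `count` times
// and return the elapsed wall-clock time in seconds.
proc/time_calls(procpath, txt, d, count = 90000)
    var/start = world.timeofday
    for(var/i in 1 to count)
        call(procpath)(txt, d)
    return (world.timeofday - start) / 10

mob/verb/run_timing()
    var/txt = "hello\nhello\nhello\nhello"
    var/elapsed = time_calls(/proc/example_split, txt, "\n")
    world << "example_split: [elapsed]s for 90000 calls"
```

Swapping /proc/example_split for another proc path with the same two arguments lets the same harness time each candidate under identical input.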
Perhaps there should just be a native proc for this, if so many people are doing it?
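For what it's worth, later BYOND releases did add one: splittext() (version 510 onward, if I recall correctly). A minimal sketch, assuming such a version; the verb name is made up for illustration:

```dm
mob/verb/native_split()
    // splittext(Text, Delimiter) returns a list of the pieces between delimiters
    var/list/parts = splittext("George; Bernard; Shaw", "; ")
    for(var/part in parts)
        world << part
```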
F_A's implementation wins in all cases against all of these examples, by quite a bit in some examples.
Could you clarify? Do you mean in performance?
In response to Jittai
Jittai wrote:
Could you clarify? Do you mean in performance?

Sure, give me two minutes to rewrite my test environment.
In response to Jittai
Huh, well, seemingly I stand corrected: FK's implementation is a LOT faster than them all, but I suspect this is because he looks directly for line breaks while the others are more flexible. Regardless:



var/string = "hello/nhello/nhello/nhello"
mob/verb
    FK_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = text2list(string)
        world<<"done"
    FA_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = split(string,"/n")
        world<<"done"
    DD_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = dd_text2list(string,"/n")
        world<<"done"
    K_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = text2list_k(string,"/n")
        world<<"done"
test again with \n instead of /n
In response to FIREking
FIREking wrote:
test again with \n instead of /n

waaaaaaaaaaw im retaaaaaaarded



So yup my first test was accurate.
oh yay lol, mine is terrible
Not really, when you consider that 90,000 calls is taking half a second, any speed gains are non-existent. You'd be better off just using any existing library and focusing on something that actually matters.
In response to MisterPerson
MisterPerson wrote:
Not really, when you consider that 90,000 calls is taking half a second, any speed gains are non-existent. You'd be better off just using any existing library and focusing on something that actually matters.

Well to be fair I have a pretty beefy rig, I doubt most BYOND users would get the same performance.

FIREking wrote:
oh yay lol, mine is terrible

Yours has a completely unneeded copytext() in it; remove that and you'll be getting much better results.
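A sketch of what that could look like, assuming the copytext() in question is the one that re-copies the remaining text on every pass. text2list_indexed is a made-up name; instead of shrinking the string it keeps a start index and hands it to findtext():

```dm
// Hypothetical variant of text2list(): same result (non-empty lines only),
// but the input text is never re-copied; only a start index moves forward.
proc/text2list_indexed(t)
    . = list()
    var/start = 1
    var/len = length(t)
    while(start <= len)
        var/pos = findtext(t, "\n", start) // next line break at or after start
        if(!pos) pos = len + 1 // no more line breaks: treat end of text as one
        if(pos > start) // skip empty lines, like the original if(val) check
            . += copytext(t, start, pos)
        start = pos + 1 // continue after the line break
```

This also removes the need to append a trailing "\n" up front, since the end of the text is treated as the final break.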
For what it's worth, I am using mine in a utility, so performance isn't a huge must; the utility just takes input and gives a one-time output.
The speed difference between them doesn't really matter much, as it's very unlikely to have an effect on your game's performance. It's just a matter of whose is more convenient, and Forum_account's and Deadron's versions happen to be the ones that are. Gosh, you DM programmers optimize every little chance you get.
In response to Magnum2k
Magnum2k wrote:
The speed difference between them doesn't really matter much, as it's very unlikely to have an effect on your game's performance. It's just a matter of whose is more convenient, and Forum_account's and Deadron's versions happen to be the ones that are. Gosh, you DM programmers optimize every little chance you get.

Optimize the small things, and the big things become much easier to manage.

Take care of the pennies and the pounds will take care of themselves.
Anyone notice the fact that all but FIREking's allow for specifying a separator/delimiter?
It's not like we're busy working on games or anything. Why not use the time finding the best way to do something for when we are making games? Instead of using bad code and running into performance issues in the future, we have a plan of how to go about preventing those problems.

Eventually, you will run into performance problems, because it's BYOND. You can say that BYOND is fast, and it is, up to a point. It's all smooth until your game's speed gets cut in half or more. It's not like C++ or Java where you can not worry about optimizing anything until later.
I have since changed mine a bit, and it now supports comments. Its purpose is to tell a utility which directories to look at when finding files to process.

Example input text:
//HUMANS//

character/human/male
character/human/female


Code:
/*
Written By: FIREking
*/


//text2list
//takes a text argument and converts it into a list
//each line in the text separated by a line break becomes a unique list entry

proc/text2list(t)

    if(!t || !istext(t)) CRASH("Invalid argument passed to text2list\ntext2list([t])")

    . = list() //return value will be a list
    var/pos //we store the position of the next line break

    if(copytext(t, -1) != "\n") t += "\n" //if the last character isn't a line break, add one

    while(t) //while there's still text left to process
        pos = findtext(t, "\n") //find next line break
        var/val = copytext(t, 1, pos) //copy text from beginning to line break
        //if text copied isn't empty
        //skips lines that start with // which allows for comments
        if(val && copytext(val, 1, 3) != "//")
            . += val //add it to the list
        t = copytext(t, pos + 1) //set text equal to the rest excluding the value we added to the list and skipping the line break


If I were going to use this in a game server, I would most definitely optimize it more or just use the fastest one (split).
In response to Kaiochao
Kaiochao wrote:
Eventually, you will run into performance problems, because it's BYOND. You can say that BYOND is fast, and it is, up to a point. It's all smooth until your game's speed gets cut in half or more. It's not like C++ or Java where you can not worry about optimizing anything until later.

And how exactly is a text2list() procedure going to make your game faster? Actually, there are very few cases where you would even need it.
