Quick Snippet: text2list()

BYOND Forums


ID:1342405

Aug 2 2013, 10:20 am

FIREking

/*

    Written By: FIREking

*/

//text2list
//takes a text argument and converts it into a list
//each line in the text separated by a line break becomes a unique list entry

proc/text2list(t)
    . = list() //return value will be a list
    var/pos //we store the position of the next line break

    if(copytext(t, -1) != "\n") t += "\n" //if the last character isn't a line break, add one

    while(t) //while there's still text left to process
        pos = findtext(t, "\n") //find next line break
        var/val = copytext(t, 1, pos) //copy text from beginning to line break
        if(val) //if text copied isn't empty
            . += val //add it to the list
        t = copytext(t, pos + 1) //set text equal to the remaining text excluding the value we added to the list and skipping the line break

Aug 2 2013, 10:47 am

Kaiochao

How does it compare to Forum_account's split()?

proc
    // split the string into a list of substrings by splitting it
    // using the specified delimiter.
    split(txt, d)
        #ifdef DEBUG
        ASSERT(istext(txt))
        ASSERT(istext(d))
        ASSERT(d)
        #endif

        var/pos = findtext(txt, d)
        var/start = 1
        var/dlen = length(d)

        . = list()

        while(pos > 0)
            . += copytext(txt, start, pos)
            start = pos + dlen
            pos = findtext(txt, d, start)

        . += copytext(txt, start)
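
Apart from speed, the two procs differ in behaviour: as written, text2list() skips empty lines (the if(val) check), while split() appends every piece it cuts, including empty ones. Below is a minimal sketch for seeing this, assuming both procs from the posts above are compiled into the same project. The verb name compare_empty_lines and the sample text are made up for illustration.

```dm
// Hypothetical test verb; assumes text2list() and split() from the posts above
// are both defined in the project.
mob/verb/compare_empty_lines()
    var/sample = "a\n\nb" // two lines of text with one empty line between them

    var/list/from_t2l = text2list(sample)
    var/list/from_split = split(sample, "\n")

    // text2list() drops the empty line; split() keeps it as an empty entry
    src << "text2list: [from_t2l.len] entries"
    src << "split: [from_split.len] entries"
```

Tracing the two procs by hand, the first line should report 2 entries and the second 3, so the procs are only interchangeable when the input has no empty lines.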

Aug 3 2013, 1:44 am

Multiverse7

It seems like every developer wants to reinvent the wheel with their own string handling system, but I really can't blame them. Text is a very important and fundamental resource in pretty much any program, so it's often vital that functions handling it be as efficient as possible.

If we are going to compare code, then you might as well also compare against these:

dd_text2list() and dd_file2list() from Deadron's TextHandling library:

proc
/*
dd_text2list(text, separator)
    Split the text into a list, where separator is the delimiter between items.
    Returns the list. This is not case-sensitive.

    If the myText string is "a = b = c", and you call dd_text2list(myText, " = "), you get a list back with these items:
        a
        b
        c

    Example:

    // Get a list containing the names in this string.
    var/mytext = "George; Bernard; Shaw"
    var/separator = "; "
    var/list/names = dd_text2list(mytext, separator)
*/
    dd_text2list(text, separator)
        var/textlength      = lentext(text)
        var/separatorlength = lentext(separator)
        var/list/textList   = new /list()
        var/searchPosition  = 1
        var/findPosition    = 1
        var/buggyText
        while (1)                                                            // Loop forever.
            findPosition = findtext(text, separator, searchPosition, 0)
            buggyText = copytext(text, searchPosition, findPosition)        // Everything from searchPosition to findPosition goes into a list element.
            textList += "[buggyText]"                                       // Working around weird problem where "text" != "text" after this copytext().

            searchPosition = findPosition + separatorlength                 // Skip over separator.
            if (findPosition == 0)                                           // Didn't find anything at end of string so stop here.
                return textList
            else
                if (searchPosition > textlength)                            // Found separator at very end of string.
                    textList += ""                                          // So add empty element.
                    return textList

/*
dd_file2list(file_path, separator = "\n")
    Splits the text from the specified file into a list.
    file_path is the path to the file.
    separator is an optional delimiter between items in the file;
    it defaults to "\n", which makes each line of the file an item in the list.

    Example:

    // Read in the list of possible NPC names.
    var/list/names = dd_file2list("NPCs.txt")
*/
    dd_file2list(file_path, separator = "\n")
        var/file
        if (isfile(file_path))
            file = file_path
        else
            file = file(file_path)
        return dd_text2list(file2text(file), separator)

kText.text2list() and kText.file2list() from Keeth's kText library:

proc
    /**
     * Converts a string into a list, using a delimiter to determine where to separate entries.
     * @param string    The string to be listified.
     * @param delimiter The string that will serve as a separator between entries in the list.
     * @return the list form of the string.
     * IE: text2list("this is a test", " ") would return list("this", "is", "a", "test").
     */
    text2list(string, delimiter=" ")
        var/list/listified = new, last=1
        for(var/find=findtext(string, delimiter); find; find=findtext(string, delimiter, find+length(delimiter)))
            listified += copytext(string, last, find)
            last=find+length(delimiter)

        listified += copytext(string, last)

        return listified

    /**
     * Reads the contents of a file and separates it into a list, using the specified delimiter to
     * separate entries.
     * @param   file        The file object, or string referring to the file.
     * @param   delimiter   The delimiter used to separate the text.
     * @return  Returns the listified version of the file.
     * null if a bad file was given.
     */
    file2list(file, delimiter="\n")
        if(!isfile(file) || !fexists(file))
            return

        var/fileText = file2text(file)
        var/list/listified = new, last=1
        for(var/find=findtext(fileText, delimiter); find; find=findtext(fileText, delimiter, find+length(delimiter)))
            listified += copytext(fileText, last, find)
            last=find+length(delimiter)

        listified += copytext(fileText, last)

        return listified

It may be more useful to read text from a file; that way you are not compiling strings directly into your code and possibly nearing the string limit. Of course, when comparing string handling efficiency, you wouldn't check directly against a proc that also does the file reading.

It would be nice to come up with a proc for handling this with the absolute highest efficiency possible. I can imagine Lummox JR could come up with some really scary implementation involving ASCII and bit masks, that would just leave these procs in the dust. Although, I can't be too sure that there would be any real speed improvement from something like that.

Anyway, it will probably take some really extreme stress tests to truly compare the efficiency of something this "small".
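
A rough sketch of what such a stress test could look like, timing one proc over many calls with world.timeofday. The helper name time_split, the verb name stress_split, the sample string and the iteration count are all assumptions for illustration, and split() is assumed to be the proc posted earlier in the thread.

```dm
// Hypothetical timing helper; assumes split() from earlier in the thread is defined.
proc/time_split(txt, d, iterations = 90000)
    var/started = world.timeofday // tenths of a second since midnight GMT
    for(var/i in 1 to iterations)
        split(txt, d) // result is discarded; only the call cost matters here
    return world.timeofday - started // elapsed time in tenths of a second

mob/verb/stress_split()
    var/sample = "hello\nhello\nhello\nhello"
    var/elapsed = time_split(sample, "\n")
    src << "split: [elapsed] tenths of a second"
```

world.timeofday only resolves to a tenth of a second and wraps at midnight, so this is only useful for large iteration counts. BYOND's built-in profiler is likely the better tool for finer comparisons.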

Aug 3 2013, 6:31 am
Jittai	Perhaps there should just be a native proc for this? If there's so many people doing it.

Aug 3 2013, 6:32 am
Rushnut	F_A's implementation wins in all cases against all of these examples, by quite a bit in some examples.

Aug 3 2013, 6:33 am
Jittai	Could you clarify? If in performance?

Aug 3 2013, 6:34 am

In response to Jittai

Rushnut

Jittai wrote:

Could you clarify? If in performance?

Sure, give me two minutes to rewrite my test environment.

Aug 3 2013, 6:42 am

In response to Jittai

Rushnut

Huh, well, seemingly I stand corrected: FK's implementation is a LOT faster than them all, but I suspect this is because it looks for line breaks directly while the others are more flexible. Regardless:

var/string = "hello/nhello/nhello/nhello"
mob/verb
    FK_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = text2list(string)
        world<<"done"
    FA_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = split(string,"/n")
        world<<"done"
    DD_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = dd_text2list(string,"/n")
        world<<"done"
    K_t2l()
        for(var/i = 90001, i, i--)
            var/list/strings = text2list_k(string,"/n")
        world<<"done"

Aug 3 2013, 6:46 am
FIREking	test again with \n instead of /n

Aug 3 2013, 6:47 am

In response to FIREking

Rushnut

FIREking wrote:

test again with \n instead of /n

Wow, my mistake. So yup, my first test was accurate.

Aug 3 2013, 6:49 am
FIREking	oh yay lol, mine is terrible

Aug 3 2013, 6:53 am
MisterPerson	Not really, when you consider that 90,000 calls is taking half a second, any speed gains are non-existent. You'd be better off just using any existing library and focusing on something that actually matters.

Aug 3 2013, 6:55 am

In response to MisterPerson

Rushnut

MisterPerson wrote:

Not really, when you consider that 90,000 calls is taking half a second, any speed gains are non-existent. You'd be better off just using any existing library and focusing on something that actually matters.

Well to be fair I have a pretty beefy rig, I doubt most BYOND users would get the same performance.

FIREking wrote:

oh yay lol, mine is terrible

Yours has a completely unneeded copytext() in it; remove that and you'll be getting much better results.
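
The post does not say which copytext() is meant. One plausible reading is the call that re-copies the remaining text on every pass of the loop. Below is a minimal sketch of that idea, tracking a start index the way split() does instead of shrinking the string. The name text2list_indexed is hypothetical, and this is an illustration rather than FIREking's code.

```dm
// Hypothetical variant of text2list(); the proc name is an assumption.
// Walks the text with a start index instead of re-copying the remainder.
proc/text2list_indexed(t)
    . = list() // return value will be a list
    var/start = 1
    var/pos = findtext(t, "\n", start) // position of the next line break, 0 if none

    while(pos)
        if(pos > start) // skip empty lines, as the original does
            . += copytext(t, start, pos)
        start = pos + 1 // continue just past the line break
        pos = findtext(t, "\n", start)

    // whatever follows the last line break (empty if the text ended with one)
    var/tail = copytext(t, start)
    if(tail)
        . += tail
```

This also makes the trailing "\n" check at the top of the original unnecessary, since the text after the last line break is handled once after the loop.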

Aug 3 2013, 6:57 am
FIREking	For what it's worth, I am using mine in a utility, so performance isn't a huge must: the utility just takes input and gives a one-time output.

Aug 3 2013, 6:57 am
Magnum2k	The speed difference between them doesn't really matter much, as it's very unlikely to have an effect on your game's performance. It's just a matter of whose is more convenient, and Forum_account's and Deadron's versions happen to be the ones that are. Gosh, you DM programmers optimize every little chance you get.

Aug 3 2013, 6:59 am

In response to Magnum2k

Rushnut

Magnum2k wrote:

The speed difference between them doesn't really matter much, as it's very unlikely to have an effect on your game's performance. It's just a matter of whose is more convenient, and Forum_account's and Deadron's versions happen to be the ones that are. Gosh, you DM programmers optimize every little chance you get.

Optimize the small things, and the big things become much easier to manage.

Take care of the pennies and the pounds will take care of themselves.

Aug 3 2013, 7:00 am
Kaiochao	Anyone notice the fact that all but FIREking's allow for specifying a separator/delimiter?

Aug 3 2013, 7:03 am

Kaiochao

It's not like we're busy working on games or anything. Why not use the time finding the best way to do something for when we are making games? Instead of using bad code and running into performance issues in the future, we have a plan of how to go about preventing those problems.

Eventually, you will run into performance problems, because it's BYOND. You can say that BYOND is fast, and it is, up to a point. It's all smooth until your game's speed gets cut in half or more. It's not like C++ or Java where you can not worry about optimizing anything until later.

Aug 3 2013, 7:03 am

FIREking

I have since changed mine a bit, and it now supports comments. Its purpose is to tell a utility which directories to look at when finding files to process.

Example input text:

//HUMANS//

character/human/male
character/human/female

Code:

/*
    Written By: FIREking
*/

//text2list
//takes a text argument and converts it into a list
//each line in the text file separated by a line break becomes a unique list entry

proc/text2list(t)

    if(!t || !istext(t)) CRASH("Invalid argument passed to text2list\ntext2list([t])")

    . = list() //return value will be a list
    var/pos //we store the position of the next line break

    if(copytext(t, -1) != "\n") t += "\n" //if the last character isn't a line break, add one

    while(t) //while there's still text left to process
        pos = findtext(t, "\n") //find next line break
        var/val = copytext(t, 1, pos) //copy text from beginning to line break
        if(val && copytext(val, 1, 3) != "//") //if text copied isn't empty and isn't a comment
            //skips lines that start with // which allows for comments
            . += val //add it to the list
        t = copytext(t, pos + 1) //set text equal to the rest excluding the value we added to the list and skipping the line break

If I were going to use this in a game server, I would most definitely optimize it more or just use the fastest one (split).

Aug 3 2013, 7:04 am

In response to Kaiochao

Magnum2k

Kaiochao wrote:

Eventually, you will run into performance problems, because it's BYOND. You can say that BYOND is fast, and it is, up to a point. It's all smooth until your game's speed gets cut in half or more. It's not like C++ or Java where you can not worry about optimizing anything until later.

And how exactly is a text2list() procedure going to make your game faster? Actually, there are very few cases where you would even need it.
