ID:195089
 
//Title: Tokeniser with arbitrary, single-character divisions between tokens
//Credit to: Jp
//Contributed by: Jp

/*
Breaks a string up into substrings around arbitrary
single-character dividers. 'text' is the string you're
tokenising, and 'tokenbreaker' is a list of
single-character strings that the tokens should be
split up around. Dividers aren't included in the
tokens, and empty tokens (from consecutive dividers)
are skipped.
*/
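
// A quick illustrative example (this verb is mine, not part of the
// original demo; it just shows the expected result of one call):
mob/verb/tokenise_example()
    // "one two,three" split on spaces and commas gives three tokens.
    var/list/words = Tokenise("one two,three", list(" ", ","))
    for(var/w in words) src << w    // outputs: one / two / three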


proc
    Tokenise(text, list/tokenbreaker)
        var
            list/tokens = list()
            temp
            currtoken = ""

        // Walk the string one character at a time.
        for(var/a = 1, a <= lentext(text), a++)
            temp = copytext(text, a, a + 1)
            if(!(temp in tokenbreaker))
                currtoken += temp    // not a divider: keep building the current token
            else
                if(currtoken) tokens += currtoken    // divider: store the finished token, if any
                currtoken = ""
        if(currtoken) tokens += currtoken    // don't forget the trailing token
        return tokens

    Tokenize(t, list/tb)
        return Tokenise(t, tb) //Silly a-meri-cans


//Testing code:
mob/verb/test(t as message)
    var/list/o = Tokenise(t, list(" ", ",", ".", ascii2text(10), ascii2text(13)))
    src << "Tokenise:"
    for(var/k in o) src << k
You know, I've been looking at this, and I'm realising that this won't work with token breakers that are longer than one character. I don't think it's serious, but it's something you might want to consider. =)
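
For what it's worth, here's a rough sketch of how a multi-character variant could look. TokeniseMulti is purely illustrative and not part of the demo above; it compares substrings with copytext() instead of testing single characters with 'in':

proc/TokeniseMulti(text, list/tokenbreaker)
    var/list/tokens = list()
    var/start = 1    // first character of the token currently being built
    var/pos = 1      // current scan position
    while(pos <= lentext(text))
        var/hit = 0
        for(var/sep in tokenbreaker)
            // does this divider appear starting exactly at pos?
            if(copytext(text, pos, pos + lentext(sep)) == sep)
                if(pos > start) tokens += copytext(text, start, pos)
                pos += lentext(sep)    // jump past the divider
                start = pos
                hit = 1
                break
        if(!hit) pos++    // ordinary character; keep scanning
    if(pos > start) tokens += copytext(text, start, pos)    // trailing token
    return tokens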


I'd also suggest adding an alias proc for "tokenize" for our American buddies. =)

proc/Tokenize(text, list/tokenbreaker)
    return Tokenise(text, tokenbreaker)