Version 2

ID:510182

Mar 14 2012, 4:56 am

Forum_account

Version 2 (posted 03-14-2012)

Fixed a bug with the split() proc.
Changed how the replace proc works to increase its speed by 50%.
Added the Replace() and Split() procs, which are case-sensitive versions of replace() and split().
Added the prefix() and ending() procs, which determine whether one string starts or ends with another string. Prefix() and Ending() are the case-sensitive versions.
Did some benchmarking to compare performance against the Deadron.TextHandling library. The benchmarking code is included as demo\benchmark.dm.

Here are the results of the comparison:

Replace:
Forum_account.Text      50,000 calls in 5.700 seconds: 114 microseconds per call
Deadron.TextHandling    50,000 calls in 9.520 seconds: 190 microseconds per call
(dd_replacetext)

Split:
Forum_account.Text      50,000 calls in 3.150 seconds: 63 microseconds per call
Deadron.TextHandling    50,000 calls in 4.176 seconds: 83 microseconds per call
(dd_text2list)

Concat:
Forum_account.Text      50,000 calls in 1.894 seconds: 38 microseconds per call
Deadron.TextHandling    50,000 calls in 5.537 seconds: 110 microseconds per call
(dd_list2text)

The prefix() and ending() procs are almost identical to the dd_hasprefix and dd_hassuffix procs. Their performance is about the same, and because those procs are so simple there isn't much room for a difference. The only difference in behavior comes from a bug in the dd_hassuffix proc:

    dd_hassuffix(text, suffix)
        var/start = length(text) - length(suffix)
        if (start) return findtext(text, suffix, start)

The last line should call findtext(text, suffix, start + 1). Otherwise this will happen:

    if(dd_hassuffix("fails", "fail"))
        world << "oops!"

The value of start will be length("fails") - length("fail"), which is 1. String indexes start at 1, so this checks whether the string "fail" is found anywhere in "fails" (and it is). Instead it should start checking at index 2, so that it checks whether "fail" is found in "ails" (and it's not). The library even has an automatic built-in test. Oops indeed!
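For illustration, here is a minimal sketch of a suffix check with the start + 1 fix applied. The name fixed_hassuffix is made up for this example and is not a proc from either library. It also treats a start of 0 (both strings the same length) as a valid case, which goes slightly beyond the one-line fix described above, and it makes no attempt to handle an empty suffix.

```dm
// Hypothetical corrected check; fixed_hassuffix is not a proc from either library.
proc/fixed_hassuffix(text, suffix)
    var/start = length(text) - length(suffix)
    // A suffix longer than the text can never match.
    if(start < 0) return 0
    // Begin at start + 1 so only the last length(suffix) characters are searched.
    // findtext() is case-insensitive, like the proc quoted above.
    return findtext(text, suffix, start + 1) ? 1 : 0
```

With this version, fixed_hassuffix("fails", "fail") searches only "ails" and should return 0, while fixed_hassuffix("fails", "ails") should return 1.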

Mar 14 2012, 8:52 am

Lummox JR

Interesting looking library. A few thoughts:

1) The naming convention would benefit from some kind of library-specific prefix, in my opinion. The proc names you use are somewhat generic.

2) In int(), instead of using an associative list and copytext(), you could use text2ascii() and a couple of if statements, which might be just as good. The only thing you give up is the lookup from the associative list, but I doubt that's actually a problem; I think those instructions should execute pretty quickly, and you could possibly even avoid the trouble of converting to uppercase. This would also let you handle bases up to 36. In addition, I would suggest that the debug mode also ASSERT() that the base is an integer.

3) Based on my experience with BYOND's internals, the concat() method seems really ingenious. I'm not sure though that the cases above 10 items are really of great benefit, since the vast majority of concats will use 10 or fewer and big replacement operations will tend to be infrequent. I suspect that past 10, you'd gain a lot of simplicity and lose little in speed just by recursing into halves. If not 10, then 20 or 40 maybe. This at any rate would be a better way to handle the 321+ cases, which currently use tail recursion.
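To make the "recursing into halves" idea in point 3 concrete, here is a rough sketch. It is not the library's concat(): the name join_halves, the optional start/end arguments, and the lack of a separator are all assumptions made for this example, and the base case is cut down to two items instead of ten to keep it short.

```dm
// Hypothetical sketch; join_halves is not a proc from the library.
// Joins items[start..end] into one string by splitting the range in half.
proc/join_halves(list/items, start = 1, end = 0)
    // end = 0 means "up to the last item" (only used by the outermost call).
    if(!end) end = items.len
    var/count = end - start + 1
    if(count <= 0) return ""
    // Small ranges are joined directly with embedded expressions.
    if(count == 1) return "[items[start]]"
    if(count == 2) return "[items[start]][items[start + 1]]"
    // Larger ranges are split in the middle and each half is joined separately.
    var/mid = start + round(count / 2) - 1
    return "[join_halves(items, start, mid)][join_halves(items, mid + 1, end)]"
```

The recursion depth grows with the logarithm of the list length, whereas peeling a fixed-size chunk off the front and recursing on the rest grows linearly.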

Mar 14 2012, 9:31 am

In response to Lummox JR

Forum_account

Lummox JR wrote:

1) The naming convention would benefit from some kind of library-specific prefix, in my opinion. The proc names you use are somewhat generic.

These names are fairly generic, but I don't think people are likely to create global procs of the same name. I can imagine people using the same proc names for mob procs, but not necessarily for global procs. Adding a prefix would avoid conflicts, but it would make every name a little longer, uglier, and less intuitive in all situations just to help in the few situations where the names cause problems, so I'd rather not.

If you have a global proc of your own with the same name, you can comment out the library's proc or change its name directly. If you have a member proc with the same name, you can refer to them as global.join(), global.split(), etc.

I was also hoping that providing multiple names for each proc would give you enough options. The library defines merge(), join(), and concat() to all do the same thing. If your game uses the name "merge" for something (e.g., merging armies together), you can use join() or concat() to refer to the text proc.

2)

I'm not too concerned about performance with this one, so I'll probably make these changes to support additional bases whether it improves performance or not. If it ends up being slow, I can make a fast version specifically for base 16. I can't imagine many people will need support for fast conversions from base 7 strings.

I suppose it also makes sense to have a proc that computes the inverse - turning integers into strings of a specified base.
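As a rough sketch of what the text2ascii() approach and its inverse might look like: parse_int and int_to_text are placeholder names, not the library's procs, and the handling of invalid input (returning null) and of negative numbers (not supported) is an assumption made for this example.

```dm
// Hypothetical helpers; neither name comes from the library.

// Converts a string of digits in the given base (2 to 36) to a number.
// Returns null if a character is not a valid digit for that base.
proc/parse_int(text, base = 10)
    ASSERT(base == round(base) && base >= 2 && base <= 36)
    var/value = 0
    for(var/i = 1 to length(text))
        var/c = text2ascii(text, i)
        var/digit = -1
        if(c >= 48 && c <= 57) digit = c - 48        // "0" to "9"
        else if(c >= 65 && c <= 90) digit = c - 55   // "A" to "Z" map to 10..35
        else if(c >= 97 && c <= 122) digit = c - 87  // "a" to "z" map to 10..35
        if(digit < 0 || digit >= base) return null
        value = value * base + digit
    return value

// Converts a non-negative integer to a string of digits in the given base.
proc/int_to_text(n, base = 10)
    ASSERT(base == round(base) && base >= 2 && base <= 36)
    ASSERT(n == round(n) && n >= 0)
    if(n == 0) return "0"
    var/result = ""
    while(n > 0)
        var/d = n % base
        // Digits 0..9 become "0".."9"; digits 10..35 become "A".."Z".
        result = ascii2text(d < 10 ? d + 48 : d + 55) + result
        n = round(n / base)
    return result
```

Because both letter ranges are checked directly, no uppercase conversion is needed. For example, parse_int("ff", 16) should give 255 and int_to_text(255, 16) should give "FF".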

3)

I have another version of the proc that uses recursion for all cases above 10 (just like how 321+ works). The problem is that the Total CPU numbers aren't making sense when I profile it. I have a verb that calls the new version of concat (called concat2) 10,000 times. The Total CPU time of the verb is less than the CPU time of the concat2 proc (by a significant amount too, about 0.6 seconds).

Mar 14 2012, 9:44 am (Edited on Mar 15 2012, 3:54 am)

In response to Forum_account

Forum_account

                       Profile results (total time)
Proc Name                  Self CPU    Total CPU    Real Time        Calls
----------------------    ---------    ---------    ---------    ---------
/mob/verb/test_concat1        0.420       20.291       20.291            3
/proc/concat                 19.871       19.884       19.929        30000
/mob/verb/test_concat2        0.555       12.115       12.116            3
/proc/concat2                11.560       21.135       21.249       120000

I guess it doesn't count recursion properly. If concat2 recursively calls itself, it looks like the time spent in the recursive call is counted double. The time inside the recursive call is counted and it's counted again as time that the parent call spends waiting for it to return. That's the best I can figure, but based on the time that the test_concat2 verb takes, concat2 looks to be better.
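A quick check of the numbers in the table is consistent with that reading. This is only arithmetic on the figures posted above, not a statement about how the profiler works internally: subtracting each verb's Self CPU from its Total CPU gives exactly the Self CPU of the proc it calls, while concat2's own Total CPU is larger than that by the amount attributed to its recursive calls.

```latex
\begin{align*}
\text{test\_concat1:}\quad 20.291 - 0.420 &= 19.871 \quad (\text{Self CPU of concat})\\
\text{test\_concat2:}\quad 12.115 - 0.555 &= 11.560 \quad (\text{Self CPU of concat2})\\
\text{concat2:}\quad 21.135 - 11.560 &= 9.575 \quad (\text{extra time attributed to recursive calls})
\end{align*}
```

So the figures that line up with the verbs' totals are 19.871 seconds for concat and 11.560 seconds for concat2.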

Mar 15 2012, 3:56 am

Forum_account

I posted an update which includes the change you suggested for int(), some changes to concat(), and some new procs. The timing of concat() when I profile it doesn't seem to be correct, but the new version appears to run faster.