ID:2702762
 
Problem description:
Hey guys, can you help me?
I want to use Russian words in the list of phrases to replace. The code uses regex, and it looks like it doesn't support Cyrillic (it doesn't react to Russian words). How can I fix it?

Code:
/datum/dna/gene/disability/speech/chav
	name = "Chav"
	desc = "Forces the language center of the subject's brain to construct sentences in a more rudimentary manner."
	activation_message = "Ye feel like a rite prat like, innit?"
	deactivation_message = "You no longer feel like being rude and sassy."
	mutation = CHAV
	//List of swappable words. Normal word first, chav word second.
	var/static/list/chavlinks = list(
		"yikes" = "blimey",
		"твой" = "твое"
	)

/datum/dna/gene/disability/speech/chav/New()
	..()
	block = GLOB.chavblock

/datum/dna/gene/disability/speech/chav/OnSay(mob/M, message)
	var/static/regex/R = regex("\\b([chavlinks.Join("|")])\\b", "g")
	message = R.Replace(message, /datum/dna/gene/disability/speech/chav/proc/replace_speech)
	return message

/datum/dna/gene/disability/speech/chav/proc/replace_speech(matched)
	return chavlinks[matched]


I also tried this, but the compiler throws "unexpected character (ascii 143)" and similar errors.

/datum/dna/gene/disability/speech/chav/OnSay(mob/M, message)
	var/static/regex/R = regex("\\b([chavlinks.Join("|")])\\b[ЁёА-я]", "g")
	message = R.Replace_char(message, /datum/dna/gene/disability/speech/chav/proc/replace_speech)
	return message

/datum/dna/gene/disability/speech/chav/proc/replace_speech(matched)
	return chavlinks[matched]
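
A likely cause of that compiler error: inside a DM string, an unescaped [ starts an embedded expression, so [ЁёА-я] is parsed as code rather than as a regex character class. Below is a minimal sketch of the escaped form. The verb name is made up for illustration, and whether the Cyrillic range matches as intended depends on the BYOND version's Unicode handling, so treat this as something to test, not a confirmed fix.

```dm
// Hypothetical test verb, not part of the Paradise code.
/mob/verb/test_brackets()
	// \[ and \] produce literal brackets in the string.
	// Left unescaped, DM would try to evaluate the bracket contents as code.
	var/regex/R = regex("\[ЁёА-я\]+", "g")
	// Find() returns the match position, or 0 when nothing matches.
	usr << "Match position: [R.Find("твой")]"
```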


Sources:
https://github.com/ParadiseSS13/Paradise
(code\game\dna\genes\goon_disabilities.dm)
Cyrillic should be supported just fine in regular expressions. I think the problem you're having is that you're actually searching for HTML entities rather than the characters themselves.

If the user's input has already been sent through html_encode(), then there shouldn't be an issue here.
In response to Lummox JR
I tried this, but it doesn't seem to change anything, and I'm not sure how to use it properly, since this is the first time I've run into it.

/datum/dna/gene/disability/speech/chav/OnSay(mob/M, message)
	var/static/regex/R = regex("\\b([chavlinks.Join("|")])\\b", "g")
	message = R.Replace(message, /datum/dna/gene/disability/speech/chav/proc/replace_speech)
	return html_encode(message)

/datum/dna/gene/disability/speech/chav/proc/replace_speech(matched)
	return chavlinks[matched]
Here's the thing: the real text you're trying to replace doesn't look like &#1025; or such. Those are just the html_encode() versions. The real characters should be caught just fine. You're searching for the html_encode() text instead of the raw text, and that's why it's failing on you.
In response to Lummox JR
Okay, maybe we just misunderstood each other. I don't actually write any numeric codes like "& #1025;" in the source; the dm tag converted the Russian letters to numbers for some reason :/

In the code they look like normal words:
https://i.imgur.com/9qb8Kvj.png

And in any case, it doesn't react to Cyrillic at all, but it works just fine for Latin.
Oh, I see what you mean now.

Is there any chance you can build a test case for this with a verb that has the text and then tries to do the regex replacement? If you can get me a test that would be really helpful. This could potentially be a bug.
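
A test case of the kind requested could look roughly like the sketch below. It is a stand-alone project with one verb that runs the same replacement on Latin and Cyrillic input, with and without the \b word boundaries. The names test_links, test_swap and test_regex are made up for illustration and do not come from the Paradise code; the sketch assumes a BYOND version whose compiler accepts Cyrillic text in string literals.

```dm
// Hypothetical word list mirroring chavlinks: normal word first, replacement second.
var/list/test_links = list("yikes" = "blimey", "твой" = "твое")

// Replacement callback: receives the matched text and returns its substitute.
/proc/test_swap(matched)
	return test_links[matched]

/mob/verb/test_regex()
	// Joining an associative list joins its keys.
	var/words = test_links.Join("|")
	var/regex/bounded = regex("\\b([words])\\b", "g")
	var/regex/unbounded = regex("([words])", "g")

	var/latin = bounded.Replace("yikes mate", /proc/test_swap)
	var/cyr_bounded = bounded.Replace("это твой дом", /proc/test_swap)
	var/cyr_unbounded = unbounded.Replace("это твой дом", /proc/test_swap)

	usr << "Latin, with boundaries: [latin]"
	usr << "Cyrillic, with boundaries: [cyr_bounded]"
	usr << "Cyrillic, without boundaries: [cyr_unbounded]"
```

If only the line without boundaries shows the swapped word, the problem is narrowed down to how \b treats Cyrillic letters rather than to the word list or the callback.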
In response to Lummox JR
Sorry, I'm completely new to DM and BYOND, but I can provide any information and links I have.

In particular, I took the code above from here and replaced some English words with Russian ones.
https://github.com/ss220-space/Paradise/blob/master220/code/game/dna/genes/goon_disabilities.dm#L81

(Space Station 13 mod)
I really can't proceed without a simple test case.