I have a string variable, response, which contains text as well as categories that have already been coded (categories like "CatPlease", "CatThanks", "ExcuseMe", "Apology", "Mit", etc.).
I would like to erase everything in response except for these previously coded categories.
For example, I would like response to change from:
"I Mit understand CatPlease read it again CatThanks"
to:
"Mit CatPlease CatThanks"
This seems like a simple problem, but I can't get my regex code to work perfectly.
The code below attempts to store the categories in a variable, cat_only. It only works if the category appears at the beginning of response. The local macro cats contains all of the words I would like to preserve in response:
local cats = "(CatPlease|CatThanks|ExcuseMe|Apology|Mit|IThink|DK|Confused|Offers|CatYG)?"
gen cat_only = strltrim(strtrim(ustrregexs(1)+" "+ustrregexs(2)+" "+ustrregexs(3))) if ustrregexm(response, "`cats'.+?`cats'.+?`cats'")
If I add characters to the beginning of the search pattern in ustrregexm, however, nothing is stored in cat_only:
gen cat_only = strltrim(strtrim(ustrregexs(1)+" "+ustrregexs(2)+" "+ustrregexs(3))) if ustrregexm(response, ".+?`cats'.+?`cats'.+?`cats'")
Is there a way to fix my code to make it work, or should I approach the problem differently?
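To make the problem easy to reproduce, here is a minimal sketch of a test dataset. The first observation is the example string from above; the other two observations and the str60 width are hypothetical, chosen only to cover a category at the start of the string and a string with no categories at all.

```stata
* Minimal test data for the question (rows 2 and 3 are hypothetical examples)
clear
input str60 response
"I Mit understand CatPlease read it again CatThanks"
"CatPlease help me"
"no categories here"
end

* Show the raw strings before any regex is applied
list response
```

The desired result for the first observation is "Mit CatPlease CatThanks"; by the same rule, the second would become "CatPlease" and the third would be empty.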