regex - REGEXP_REPLACE in PostgreSQL not substring

Question

asked Mar 6, 2021 in Technique by 深蓝 (71.8m points)

In PostgreSQL I would like to replace only whole words, not substrings.

I noticed that replace and translate substitute text even when it occurs inside a longer word.

So I tried regexp_replace with the following:

SELECT REGEXP_REPLACE (UPPER('BIG CATDOG'), '(^|[^a-z0-9])' || UPPER('CAT') || '($|[^a-z0-9])', '\1' || UPPER('GATO') || '\2','g')

In the sample above, CAT should not be replaced, because it is not a whole word but a substring that is part of a longer word.

How can I avoid the replacement?

The output should be BIG CATDOG because no substitution was possible.

Thanks

asked by Juan Perez, translated from Stack Overflow

1 Answer

深蓝 · Answer 1 · 2021-03-06T04:22:51+0000

The replacement happens because the group after the search term only checks for [^a-z0-9]. That negated class excludes lowercase letters and digits only, so the uppercase D in CATDOG matches it and is treated as a word separator.
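
As a quick check of that explanation, the query below tests the negated class against single characters and then runs the pattern from the question. It is only a sketch: the UPPER(...) calls are written out as literals for brevity, the column aliases are made up for illustration, and it assumes standard_conforming_strings is on (the default in current PostgreSQL versions) so that '\1' reaches the regex engine as a backreference.

```sql
-- Column aliases are illustrative only.
SELECT 'D' ~ '[^a-z0-9]' AS upper_d_is_separator,  -- true: D is outside a-z0-9
       'd' ~ '[^a-z0-9]' AS lower_d_is_separator,  -- false: d is excluded by the class
       REGEXP_REPLACE('BIG CATDOG',
                      '(^|[^a-z0-9])CAT($|[^a-z0-9])',
                      '\1GATO\2',
                      'g') AS result;              -- BIG GATODOG
```

The space before CAT is captured as \1 and the D after it as \2, so both are written back around GATO.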

You can resolve this by either adding A-Z to your character class:

SELECT REGEXP_REPLACE (UPPER('BIG CATDOG'), '(^|[^a-zA-Z0-9])' || UPPER('CAT') || '($|[^a-zA-Z0-9])', '\1' || UPPER('GATO') || '\2','g')

Or by adding the i (case-insensitive) flag to the regexp_replace call:

SELECT REGEXP_REPLACE (UPPER('BIG CATDOG'), '(^|[^a-z0-9])' || UPPER('CAT') || '($|[^a-z0-9])', '\1' || UPPER('GATO') || '\2','gi')

In either case you will get the desired BIG CATDOG output.

However, a better solution is to use the word boundary constraints \m (beginning of word) and \M (end of word):

SELECT REGEXP_REPLACE (UPPER('BIG CATDOG'), '\m' || UPPER('CAT') || '\M', UPPER('GATO'),'g')
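
To see how the two approaches differ, here is a small side-by-side sketch over a few sample inputs. The sample strings, the table alias t(s), and the column aliases are assumptions added for illustration, and the query again assumes standard_conforming_strings is on so the backslashes are passed through unchanged.

```sql
-- Compare the delimiter-group pattern with the \m ... \M word boundaries.
SELECT s AS input,
       REGEXP_REPLACE(s, '(^|[^a-zA-Z0-9])CAT($|[^a-zA-Z0-9])', '\1GATO\2', 'g') AS delimiter_groups,
       REGEXP_REPLACE(s, '\mCAT\M', 'GATO', 'g') AS word_boundaries
FROM (VALUES ('BIG CATDOG'), ('BIG CAT'), ('CAT CAT')) AS t(s);

-- input      | delimiter_groups | word_boundaries
-- BIG CATDOG | BIG CATDOG       | BIG CATDOG
-- BIG CAT    | BIG GATO         | BIG GATO
-- CAT CAT    | GATO CAT         | GATO GATO
```

The last row suggests one reason the boundary constraints are preferable: \m and \M are zero-width, whereas the delimiter groups consume the separator, so the space after the first CAT is no longer available to introduce the second one. The replacement string also gets simpler, since there are no captured separators to put back.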