I have to handle this scenario in Java:
I'm getting a request in XML form from a client with declared encoding=utf-8. Unfortunately it may contain not utf-8 characters and there is a requirement to remove these characters from the xml on my side (legacy).
Let's consider an example where this invalid XML contains £ (pound).
1) I get xml as java String with £ in it (I don't have access to interface right now, but I probably get xml as a java String). Can I use replaceAll(£, "") to get rid of this character? Any potential issues?
2) I get xml as an array of bytes - how to handle this operation safely in that case?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…