You could use the String constructor with the charset parameter:
try
{
    final String s = new String(nodevalue.getBytes(), "UTF-8");
}
catch (UnsupportedEncodingException e)
{
    Log.e("utf8", "conversion", e);
}
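To see why the charset argument matters, here is a small self-contained sketch. The class and method names are made up for illustration, and it assumes `java.nio.charset.StandardCharsets` is available (Java 7+, Android API 19+). It decodes the same two bytes with two charsets and shows the usual repair for text that was wrongly decoded as ISO-8859-1:

```java
import java.nio.charset.StandardCharsets;

public class CharsetDecodeDemo {
    // Decodes raw bytes as UTF-8. The Charset overload throws no checked exception.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decodes the same bytes as ISO-8859-1, which is how garbled text like "Ã©" arises.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Repairs a string that was decoded as ISO-8859-1 although the bytes were UTF-8.
    // This only works when the wrong charset really was ISO-8859-1 (an assumption).
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 encoding of U+00E9 (e with acute)
        System.out.println(decodeUtf8(raw).length());   // prints 1
        System.out.println(decodeLatin1(raw).length()); // prints 2
        System.out.println(repair(decodeLatin1(raw)).equals("\u00E9")); // prints true
    }
}
```

Note that calling `getBytes()` without an argument uses the platform default charset, so naming the charset explicitly on both sides makes the round trip predictable.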
Also, since you get the data from an XML document, which I assume is encoded in UTF-8, the problem is probably in how it is parsed.
You should pass an InputStream or InputSource to the parser instead of using an XMLReader implementation, because the stream carries the encoding with it. So if you're getting this data from an HTTP response, you could use both an InputStream and an InputSource:
try
{
    HttpEntity entity = response.getEntity();
    final InputStream in = entity.getContent();
    final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    final XmlHandler handler = new XmlHandler();
    Reader reader = new InputStreamReader(in, "UTF-8");
    InputSource is = new InputSource(reader);
    is.setEncoding("UTF-8");
    parser.parse(is, handler);
    //TODO: get the data from your handler
}
catch (final Exception e)
{
    Log.e("ParseError", "Error parsing xml", e);
}
or just the InputStream:
try
{
    HttpEntity entity = response.getEntity();
    final InputStream in = entity.getContent();
    final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    final XmlHandler handler = new XmlHandler();
    parser.parse(in, handler);
    //TODO: get the data from your handler
}
catch (final Exception e)
{
    Log.e("ParseError", "Error parsing xml", e);
}
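If you want to try the stream-based variant without a server, a rough standalone sketch could look like this. `TextHandler` is a hypothetical stand-in for your own `XmlHandler`, and a `ByteArrayInputStream` plays the role of the HTTP response body:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStreamDemo {
    // Hypothetical handler: simply collects all character data in the document.
    static class TextHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Parses the raw byte stream; the parser picks up the encoding from the XML declaration.
    static String parseText(InputStream in) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TextHandler handler = new TextHandler();
        parser.parse(in, handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Caf\u00E9</name>";
        // Stand-in for entity.getContent(): the document as UTF-8 bytes.
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(parseText(in).equals("Caf\u00E9")); // prints true
    }
}
```

Handing the parser bytes rather than an already decoded string lets it honor the encoding declared in the document itself.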
Update 1
Here is a sample of a complete request and response handling:
try
{
    final DefaultHttpClient client = new DefaultHttpClient();
    final HttpPost httppost = new HttpPost("http://example.location.com/myxml");
    final HttpResponse response = client.execute(httppost);
    final HttpEntity entity = response.getEntity();
    final InputStream in = entity.getContent();
    final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    final XmlHandler handler = new XmlHandler();
    parser.parse(in, handler);
    //TODO: get the data from your handler
}
catch (final Exception e)
{
    Log.e("ParseError", "Error parsing xml", e);
}
Update 2
As the problem is not the encoding but the source XML being escaped to HTML entities, the best solution (besides correcting the PHP so that it does not escape the response) is to use the very handy static StringEscapeUtils class from the Apache Commons Lang library.
After importing the library, put the following in your XML handler's characters method:
@Override
public void characters(final char[] ch, final int start, final int length)
    throws SAXException
{
    // This variable will hold the correct unescaped value
    final String elementValue = StringEscapeUtils.unescapeHtml(
        new String(ch, start, length).trim());
    [...]
}
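One caveat with unescaping inside the characters method: a SAX parser is allowed to deliver the text of a single element in several chunks, so an entity such as `&eacute;` could end up split across two calls. A possible way around this, sketched below under assumptions, is to buffer the text and unescape it once in endElement. The class names are made up, and `unescapeFew` is only a tiny hypothetical stand-in for `StringEscapeUtils.unescapeHtml` so that the example runs without the library:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class BufferedUnescapeDemo {
    // Hypothetical stand-in for StringEscapeUtils.unescapeHtml; handles two entities only.
    static String unescapeFew(String s) {
        return s.replace("&eacute;", "\u00E9").replace("&amp;", "&");
    }

    static class ValueHandler extends DefaultHandler {
        final List<String> values = new ArrayList<>();
        private final StringBuilder buffer = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            buffer.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            buffer.append(ch, start, length); // may be called several times per text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String text = buffer.toString().trim();
            if (!text.isEmpty()) {
                values.add(unescapeFew(text)); // unescape the complete text exactly once
            }
            buffer.setLength(0);
        }
    }

    static List<String> parseValues(String xml) throws Exception {
        ValueHandler handler = new ValueHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        // The server double-escaped the entity, so the parser yields the literal "&eacute;".
        List<String> values = parseValues("<items><name>Caf&amp;eacute;</name></items>");
        System.out.println(values.get(0).equals("Caf\u00E9")); // prints true
    }
}
```

In real code you would call `StringEscapeUtils.unescapeHtml` where the sketch calls `unescapeFew`.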
Update 3
In your last code, the problem is with the initialization of the nodevalue variable. It should be:
String nodevalue = StringEscapeUtils.unescapeHtml(
new String(ch, start, length).trim());