Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - How to set request encoding in Tomcat?

I have a problem in my Java webapp.

Here is the code in index.jsp:

<%@page contentType="text/html" pageEncoding="UTF-8" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">

<% request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8");
%>

<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>JSP Page</title>
    </head>
    <body>
        <h1>Hello World!</h1>

        <form action="index.jsp" method="get">
            <input type="text" name="q"/>
        </form>

        Res: <%= request.getParameter("q") %>
    </body>
</html>

When I capture a request with Wireshark, I see that my browser sends this request line and header:

GET /kjd/index.jsp?q=%C3%A9 HTTP/1.1

...
Accept-Charset: UTF-8,*

And the Tomcat server returns this:

Content-Type: text/html;charset=UTF-8

But if I send "é" (%C3%A9 in UTF-8) in my form, "??" is displayed instead.

What I understand is that the browser sends an "é" encoded in UTF-8 (the %C3%A9).

But the server interprets this as ISO-8859-1. So %C3 is decoded as ? and %A9 as ?, and the server then sends back the response encoded in UTF-8.

In the code, the requests should be decoded with UTF-8:

request.setCharacterEncoding("UTF-8");

But, if I send this url:

http://localhost:8080/kjd/index.jsp?q=%E9

the "%E9" is decoded as ISO-8859-1 and an "é" is displayed.

Why isn't this working? Why are requests decoded as ISO-8859-1?

I've tried it on Tomcat 6 and 7, and on Windows and Ubuntu.


1 Answer


The request.setCharacterEncoding("UTF-8"); call only sets the encoding of the request body (which is used by POST requests), not the encoding of the request URI (which is used by GET requests).
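To see why the charset used for the query string matters, here is a minimal sketch in plain Java. It is not Tomcat's own parsing code; it only uses java.net.URLDecoder to percent-decode the two values from the question with each charset. The class name QueryDecodingDemo and its helper methods are made up for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryDecodingDemo {

    // Percent-decodes a query-string value using the given charset name.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Renders each char as U+XXXX so the output does not depend on the console encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // UTF-8 bytes read as ISO-8859-1: two wrong characters instead of one "é".
        System.out.println(codePoints(decode("%C3%A9", "ISO-8859-1"))); // U+00C3 U+00A9
        // The same bytes read as UTF-8: the intended "é".
        System.out.println(codePoints(decode("%C3%A9", "UTF-8")));      // U+00E9
        // A single ISO-8859-1 byte read as ISO-8859-1: also "é".
        System.out.println(codePoints(decode("%E9", "ISO-8859-1")));    // U+00E9
    }
}
```

The first line of output shows the two-character result that the question describes, and the last line shows why the hand-typed %E9 URL appears to work.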

You need to set the URIEncoding attribute to UTF-8 in the <Connector> element of Tomcat's /conf/server.xml to get Tomcat to parse the request URI (and the query string) as UTF-8. In the Tomcat versions in question (6 and 7), this attribute indeed defaults to ISO-8859-1. See also the Tomcat HTTP Connector documentation.

<Connector ... URIEncoding="UTF-8">

or, to ensure that the URI is parsed using the same encoding as the body [1]:

<Connector ... useBodyEncodingForURI="true">


[1] From Tomcat's documentation (emphasis mine):

This setting is present for compatibility with Tomcat 4.1.x, where the encoding specified in the contentType, or explicitly set using Request.setCharacterEncoding method was also used for the parameters from the URL. The default value is false.


Please get rid of those scriptlets in your JSP. The request.setCharacterEncoding("UTF-8"); call happens at the wrong moment: once you properly use a servlet to process the request, the call in the JSP would come too late, because the parameters have already been read by then. A filter is a better place for it. The response.setCharacterEncoding("UTF-8"); part is already done implicitly by pageEncoding="UTF-8" at the top of the JSP.

I also strongly recommend replacing the old-fashioned <%= request.getParameter("q") %> scriptlet expression with EL ${param.q}, or with the JSTL XML-escaping function ${fn:escapeXml(param.q)} to prevent XSS attacks.
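For readers wondering what XML escaping does to the parameter, here is a rough sketch in plain Java. It is not the JSTL implementation: the class XmlEscapeDemo and its escapeXml method are hypothetical, the numeric entity forms chosen for the quote characters are an assumption, and returning an empty string for null is a choice made for this sketch.

```java
public class XmlEscapeDemo {

    // Replaces the five characters that have special meaning in HTML/XML markup.
    static String escapeXml(String input) {
        if (input == null) {
            return ""; // assumption for this sketch: treat a missing parameter as empty
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&#034;"); break;
                case '\'': out.append("&#039;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A value such as this would run as script if echoed into the page unescaped.
        System.out.println(escapeXml("<script>alert('x')</script>"));
        // prints: &lt;script&gt;alert(&#039;x&#039;)&lt;/script&gt;
    }
}
```

After escaping, the browser renders the value as literal text instead of interpreting it as markup, which is the point of using fn:escapeXml on user-controlled input.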

