Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
238 views
in Technique[技术] by (71.8m points)

java - Catch-all servlet filter that should capture ALL HTML input content for manipulation, works only intermittently

I need a servlet filter that will capture all input, then mangle that input, inserting a special token in every form. Imagine that the filter is tied to all requests (E.g. url-pattern=*). I have the code for capture of content, but it doesn't seem like the RequestWrapper is robust enough to capture all input. Some input returns zero bytes and then I can't "stream" that content back to the user. For example, we are still using Struts 1.3.10 and any Struts code does not "capture" properly, we get zero byte content. I believe it is because of how Struts handles forwards. If there is a forward involved in the request, I wonder if the capture code below will work. Here is all the code, do you have an approach that will capture any type of content that is meant for streaming to the user.

<filter>
   <filter-name>Filter</filter-name>
   <filter-class>mybrokenCaptureHtml.TokenFilter</filter-class>
</filter> 
<filter-mapping>
    <filter-name>Filter</filter-name>
    <url-pattern>/*</url-pattern>
 </filter-mapping>

package mybrokenCaptureHtml;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

public class TokenFilter implements Filter {    
    @Override
    public void destroy() {
    }

    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain chain) throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) servletRequest;               
        HttpServletResponse response = (HttpServletResponse) servletResponse;
        try {                                                                                       
            final MyResponseWrapper responseWrapper = new MyResponseWrapper((HttpServletResponse) response);
            chain.doFilter(request, responseWrapper);                       

            // **HERE DEPENDING ON THE SERVLET OR APPLICATION CODE (STRUTS, WICKET), the response returns an empty string //
            // Especiall struts, is there something in their forwards that would cause an error?
            final byte [] bytes = responseWrapper.toByteArray(); 
                    // For some applications that hit this filter
                    // ZERO BYTE DATA is returned, this is bad, but SOME
                    // CODE, the data is captured.
            final String origHtml = new String(bytes);

            final String newHtml = origHtml.replaceAll("(?i)</(\s)*form(\s)*>", "<input type="hidden" name="zval" value="fromSiteZ123"/></form>");          
            response.getOutputStream().write(newHtml.getBytes());

        } catch(final Exception e) {            
            e.printStackTrace();
        }
        return;
    }

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {        
    }

    static class MyResponseWrapper extends HttpServletResponseWrapper {    
        private final MyPrintWriter pw = new MyPrintWriter();               
        public byte [] toByteArray() {            
            return pw.toByteArray();        
        }
        public MyResponseWrapper(HttpServletResponse response) {
            super(response);       
        }

        @Override
        public PrintWriter getWriter() {
            return pw.getWriter();
        }
        @Override
        public ServletOutputStream getOutputStream() {
            return pw.getStream();
        }       
        private static class MyPrintWriter {
            private ByteArrayOutputStream baos = new ByteArrayOutputStream();
            private PrintWriter pw = new PrintWriter(baos);
            private ServletOutputStream sos = new MyServletStream(baos);
            public PrintWriter getWriter() {
                return pw;
            }
            public ServletOutputStream getStream() {
                return sos;
            }
            byte[] toByteArray() {
                return baos.toByteArray();
            }
        }    
        private static class MyServletStream extends ServletOutputStream {
            ByteArrayOutputStream baos;
            MyServletStream(final ByteArrayOutputStream baos) {
                this.baos = baos;
            }
            @Override
            public void write(final int param) throws IOException {
                baos.write(param);
            }
        }
    }

}

This is what an example Struts app may look like, for some applications (not Struts), we may capture the content. But for apps like the one below, zero bytes are returned for the HTML content but there should be content.

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<%@ taglib uri="/WEB-INF/struts-html.tld" prefix="html" %>
<%@ taglib uri="/WEB-INF/struts-bean.tld" prefix="bean" %>
<%@ taglib uri="/WEB-INF/c.tld" prefix="c" %>
<%@ taglib uri="/WEB-INF/struts-logic.tld" prefix="logic" %>
<%@ taglib uri="/WEB-INF/struts-tiles.tld" prefix="tiles" %>
<%@ taglib uri="/WEB-INF/struts-nested.tld" prefix="nested"%>
<html:html>
<head>
<title><bean:message key="myApp.customization.title" /></title>
<LINK rel="stylesheet" type="text/css" href="../theme/styles.css">
</head>
<body>
<html:form styleId="customizemyAppForm" method="post" action="/customizemyApp.do?step=submit">
<html:submit onclick="javascript:finish(this.form);" styleClass="input_small">&nbsp;&nbsp;<bean:message key="myApp.customization.submit" />&nbsp;</html:submit> 
<input type="button" styleClass="input_small" width="80" style="WIDTH:80px" name="<bean:message key="myApp.customization.cancel" />" value="<bean:message key="myApp.customization.cancel" />" onclick="javascript:cancel();">

</html:form>
</body>
</html:html>

I suspect that the MyResponseWrapper and MyPrintWriter are not robust enough to capture all types of content.


Example servlet that would work(a):

response.getOutputStream().write(str.getBytes());

Example servlet that would not work(b):

response.getWriter().println("<html>data</html>");

Example a would get a capture, example b will not.

Here is an improved wrapper class, most of the applications will work but NOW some of the struts applications, only SOME of the response is sent to the browser.

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

public class ByteArrayResponseWrapper extends HttpServletResponseWrapper {
    private PrintWriter output = null;
    private ServletOutputStream outStream = null;
    private static final String NL = System.getProperty("line.separator");

    public ByteArrayResponseWrapper(final HttpServletResponse response) {
        super(response);
    }

    public String getDocument() {        
        InputStream in = null;

        try {            
            in = this.getInputStream();            
            if (in != null) {             
                return getDocument(in);
            }           
        } catch(final Exception ee) {
            // ee.print;StackTrace();
        } finally {                  
            if (in != null) {
                try {
                    in.close();
                } catch (IOException e) {
                    //e.prin;tStackTrace();
                }
            }
        }
        return "";    
    }

    protected String getDocument(final InputStream in) {
        final StringBuffer buf = new StringBuffer();
        BufferedReader br = null;
        try {
            String line = "";
            br = new BufferedReader(new InputStreamReader(getInputStream(), this.getCharacterEncoding()));            
            while ((line = br.readLine()) != null) {
                buf.append(line).append(NL);                
            }
        } catch(final IOException e) {
            //e.print;StackTrace();
        } finally {
            try {
                if (br != null) {
                    br.close();
                }
            } catch (IOException ex) {             
            }
        }
        return buf.toString();
    }

    @Override
    public PrintWriter getWriter() throws IOException {
        if (output == null) {
            output = new PrintWriter(new OutputStreamWriter(getOutputStream(), this.getCharacterEncoding()));
        }
        return output;
    }

    @Override
    public ServletOutputStream getOutputStream() throws IOException {
        if (outStream == null) {
            outStream = new BufferingServletOutputStream();
        }
        return outStream;
    }

    public InputStream getInputStream() throws IOException {
        final BufferingServletOutputStream out = (BufferingServletOutputStream) getOutputStream();        
        return new ByteArrayInputStream(out.getBuffer().toByteArray());
    }

    /**
     * Implementation of ServletOutputStream that handles the in-memory
     * buffering of the response content
     */
    public static class BufferingServletOutputStream extends ServletOutputStream {
        ByteArrayOutputStream out = null;

        public BufferingServletOutputStream() {
            this.out = new ByteArrayOutputStream();
        }

        public ByteArrayOutputStream getBuffer() {
            return out;
        }

        public void write(int b) throws IOException {
            out.write(b);
        }

        public void write(byte[] b) throws IOException {
            out.write(b);
        }

        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
        }
        @Override
        public void close() throws IOException {
            out.close();
            super.close();
        }
        @Override
        public void flush() throws IOException {
            out.flush();
            super.flush();
        }
    }
}

I found a possible solution, in the getInputStream method, it looks like if I call 'close' on all of the objects, e.g outStream.flush() and outStream.close() and then out.flush() and out.close()...it looks like the final bytes get written properly. it isn't intuitive but it looks like it works.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Your initial approach failed because PrintWriter wraps the given ByteArrayOutputStream with a BufferedWriter which has an internal character buffer of 8192 characters, and you never flush() the buffer before getting the bytes from the ByteArrayOutputStream. In other words, when less than ~8KB of data is written to the getWriter() of the response, the wrapped ByteArrayOutputStream actually never get filled; namely everything is still in that internal character buffer, waiting to be flushed.

A fix would be to perform a flush() call before toByteArray() in your MyPrintWriter:

byte[] toByteArray() {
    pw.flush();
    return baos.toByteArray();
}

This way the internal character buffer will be flushed (i.e. it will actually write everything to the wrapped stream). This also totally explains why it works when you write to getOutputStream(), this step namely doesn't use the PrintWriter and nothing gets buffered in some internal buffer.


Unrelated to the concrete problem: this approach has some severe problems. It isn't respecting the response character encoding during construction of PrintWriter (you should actually wrap the ByteArrayOutputStream in an OutputStreamWriter instead which can take a character encoding) and relying on the platform default, in other words, any written Unicode characters may end up in Mojibake this way and thus this approach isn't ready for World Domination.

Also, this approach makes it possible to call both getWriter() and getOutputStream() on the same response, while that's considered an illegal state (precisely to avoid this kind of buffering and encoding trouble).


Update as per the comment, here's a full rewrite of the response wrapper, showing the right way, hopefully in a more self-explaining way than the code you've so far:

public class CapturingResponseWrapper extends HttpServletResponseWrapper {

    private final ByteArrayOutputStream capture;
    private ServletOutputStream output;
    private PrintWriter writer;

    public CapturingResponseWrapper(HttpServletResponse response) {
        super(response);
        capture = new ByteArrayOutputStream(response.getBufferSize());
    }

    @Override
    public ServletOutputStream getOutputStream() {
        if (writer != null) {
            throw new IllegalStateException("getWriter() has already been called on this response.");
        }

        if (output == null) {
            output = new ServletOutputStream() {
                @Override
                public void write(int b) throws IOException {
                    capture.write(b);
                }
                @Override
                public void flush() throws IOException {
                    capture.flush();
                }
                @Override
                public void close() throws IOException {
                    capture.close();
                }
            };
        }

        return output;
    }

    @Override
    public PrintWriter getWriter() throws IOException {
        if (output != null) {
            throw new IllegalStateException("getOutputStream() has already been called on this response.");
        }

        if (writer == null) {
            writer = new PrintWriter(new OutputStreamWriter(capture, getCharacterEncoding()));
        }

        return writer;
    }

    @Override
    public void flushBuffer() throws IOException {
        super.flushBuffer();

        if (writer != null) {
            writer.flush();
        }
        else if (output != null) {
            output.flush();
        }
    }

    public byte[] getCaptureAsBytes() throws IOException {
        if (writer != null) {
            writer.close();
        }
        else if (output != null) {
            output.close();
        }

        return capture.toByteArray();
    }

    public String getCaptureAsString() throws IOException {
        return new String(getCaptureAsBytes(), getCharacterEncoding());
    }

}

Here's how you're supposed to use it:

@Override
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
    CapturingResponseWrapper capturingResponseWrapper = new CapturingResponseWrapper((HttpServletResponse) response);
    chain.doFilter(request, capturingResponseWrapper);
    String content = capturingResponseWrapper.getCaptureAsString(); // This uses response character encoding.
    String replacedContent = content.replaceAll("(?i)</form(\s)*>", "<input type="hidden" name="zval" value="fromSiteZ123"/></form>");
    response.getWriter().write(replacedContent); // Don't ever use String#getBytes() without specifying character encoding!
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...