Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
110 views
in Technique[技术] by (71.8m points)

java - Make HttpURLConnection load web pages with images

Currently I'm using HttpURLConnection for load remote web page and present to my clients (using InputStream to HttpResponse's outputStream transfer), it loads html correctly but skips images, how to fix it?

Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You need to manipulate the HTML that way so that all resource URLs on the intranet domain are proxied as well. E.g. all of the following resource references in HTML

<base href="http://intranet.com/" />
<script src="http://intranet.com/script.js"></script>
<link href="http://intranet.com/style.css" />
<img src="http://intranet.com/image.png" />
<a href="http://intranet.com/page.html">link</a>

should be changed in the HTML that way so that they go through your proxy servlet instead, e.g.

<base href="http://example.com/proxy/" />
<script src="http://example.com/proxy/script.js"></script>
<link href="http://example.com/proxy/style.css" />
<img src="http://example.com/proxy/image.png" />
<a href="http://example.com/proxy/page.html">link</a>

A HTML parser like Jsoup is extremely helpful in this. You can do as follows in your proxy servlet which is, I assume, mapped on an URL pattern of /proxy/*.

String intranetURL = "http://intranet.com";
String internetURL = "http://example.com/proxy";

if (request.getRequestURI().endsWith(".html")) { // A HTML page is requested.
    Document document = Jsoup.connect(intranetURL + request.getPathInfo()).get();

    for (Element element : document.select("[href]")) {
        element.attr("href", element.absUrl("href").replaceFirst(intranetURL, internetURL));
    }

    for (Element element : document.select("[src]")) {
        element.attr("src", element.absUrl("src").replaceFirst(intranetURL, internetURL));
    }

    response.setContentType("text/html;charset=UTF-8");
    response.setCharacterEncoding("UTF-8");
    resposne.getWriter().write(document.html());
}
else { // Other resources like images, etc.
    URLConnection connection = new URL(intranetURL + request.getPathInfo()).openConnection();

    for (Map.Entry<String, List<String>> header : connection.getHeaderFields().entrySet()) {
        for (String value : header.getValue()) {
            response.addHeader(header.getKey(), value);
        }
    }

    InputStream input = connection.getInputStream();
    OutputStream output = response.getOutputStream();
    // Now just copy input to output.
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...