Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
414 views
in Technique[技术] by (71.8m points)

java - How to use the browser's (chrome/firefox) HTML/CSS/JS rendering engine to produce PDF?

There are nice projects that generate pdf from html/css/js files

  1. http://wkhtmltopdf.org/ (open source)
  2. https://code.google.com/p/flying-saucer/ (open source)
  3. http://cssbox.sourceforge.net/ (not necessarily straight pdf generation)
  4. http://phantomjs.org/ (open source allows for pdf output)
  5. http://www.princexml.com/ (comercial but hands down the best one out there)
  6. https://thepdfapi.com/ a chrome modification to spit pdf from html from

I want to programatically control chrome or firefox browser (because they both are cross platform) to make them load a web page, run the scripts and style the page and generate a pdf file for printing.

But how do I start by controlling the browser in an automated way so that I can do something like

render-to-pdf file-to-render.html out.pdf

I can easily make this job manually by browsing the page and then printing it to pdf and I get an accurate, 100% spec compliant rendered html/css/js page on a pdf file. Even the url headers can be omitted in the pdf through configuration options in the browser. But again, how do I start in trying to automate this process?

I want to automate in the server side, the opening of the browser, navigating to a page, and generating the pdf using the browser rendered page.

I have done a lot of research I just don't know how to make the right question. I want to programatically control the browser, maybe like selenium does but to the point where I export a webpage as PDF (hence using the rendering capabilities of the browser to produce good pdfs)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I'm not an expert but PhamtomJS seems to be the right tool for the job. I'm not sure though about what headless browser it uses underneath (I guess it is chrome/chromium)

var page = require('webpage').create();
page.open('http://github.com/', function() {
     var s = page.evaluate(function() {
         var body = document.body,
             html = document.documentElement;

        var height = Math.max( body.scrollHeight, body.offsetHeight, 
            html.clientHeight, html.scrollHeight, html.offsetHeight );
        var width = Math.max( body.scrollWidth, body.offsetWidth, 
            html.clientWidth, html.scrollWidth, html.offsetWidth );
        return {width: width, height: height}
    });

    console.log(JSON.stringify(s));

    // so it fit ins a single page
    page.paperSize = {
        width: "1980px",
        height: s.height + "px",
        margin: {
            top: '50px',
            left: '20px'
        }
    };

    page.render('github.pdf');
    phantom.exit();
});

Hope it helps.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...