java - How to get byte array from iText PDFReader

Question

Welcome To Ask or Share your Answers For Others

java - How to get byte array from iText PDFReader

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

java - How to get byte array from iText PDFReader

How to get byte array from Itext PDFReader.

float width = 8.5f * 72;
float height = 11f * 72;
float tolerance = 1f;

PdfReader reader = new PdfReader("source.pdf");

for (int i = 1; i <= reader.getNumberOfPages(); i++)
{
    Rectangle cropBox = reader.getCropBox(i);
    float widthToAdd = width - cropBox.getWidth();
    float heightToAdd = height - cropBox.getHeight();
    if (Math.abs(widthToAdd) > tolerance || Math.abs(heightToAdd) > tolerance)
    {
        float[] newBoxValues = new float[] { 
            cropBox.getLeft() - widthToAdd / 2,
            cropBox.getBottom() - heightToAdd / 2,
            cropBox.getRight() + widthToAdd / 2,
            cropBox.getTop() + heightToAdd / 2
        };
        PdfArray newBox = new PdfArray(newBoxValues);

        PdfDictionary pageDict = reader.getPageN(i);
        pageDict.put(PdfName.CROPBOX, newBox);
        pageDict.put(PdfName.MEDIABOX, newBox);
    }
}

From above code I need to get byte array from reader object. How?

1) Not working, getting empty byteArray.

OutputStream out = new ByteArrayOutputStream();
PdfStamper stamper = new PdfStamper(reader, out);
stamper.close();

byte byteArray[] = (((ByteArrayOutputStream)out).toByteArray());

2) Not working, getting java.io.IOException: Error: Header doesn't contain versioninfo

ByteArrayOutputStream outputStream = new ByteArrayOutputStream( );
    for (int i = 1; i <= reader.getNumberOfPages(); i++)
        {
            outputStream.write(reader.getPageContent(i));
        }
   PDDocument pdDocument = new PDDocument().load(outputStream.toByteArray( );)

Is there any other way to get byte array from PDFReader.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-23T20:07:13+0000

Let's take a the question from a different angle. It seems to me that you want to render a PDF page by page. If so, then your question is all wrong. Extracting the page content stream will not be sufficient as I already indicated: not a single renderer will be able to render such a stream because you don't pass any resources such as fonts, Form and Image XObjects,...

If you want to render separate pages from a PDF, you need to burst the document into separate single page full-blown PDF documents. These single page documents need to contain all the necessary information to render the page. This isn't memory friendly: suppose that you have a 100 KByte document of 10 pages where every page shows an 80 KByte logo, you'll end up with 10 documents that are each at least 80 KByte (times 10 makes already 800 KByte which is much more than the 10-page document where a single Image XObject is shared by the 10 pages).

You'd need to do something like this:

PdfReader reader = new PdfReader("source.pdf");
int n = reader.getNumberOfPages();
reader close();
ByteArrayOutputStream boas;
PdfStamper stamper;
for (int i = 0; i < n; ) {
    reader = new PdfReader("source.pdf");
    reader.selectPages(String.valueOf(++i));
    baos = new ByteArrayOutputStream();
    stamper = new PdfStamper(reader, baos);
    stamper.close();
    doSomethingWithBytes(baos.toByteArray);
}

In this case, baos.toByteArray() will contain the bytes of a valid PDF file. This wasn't the case in any of your attempts.

Categories

java - How to get byte array from iText PDFReader

java - How to get byte array from iText PDFReader

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags