Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


itext - How to insert invisible text into a PDF?

UPDATE: Please see https://softwarerecs.stackexchange.com/questions/71464/java-library-to-insert-invisible-text-into-a-pdf instead.

I want to insert invisible text into an existing PDF file, to make it searchable.

What library should I use?
I would appreciate links to specific API methods to use.

Free, ideally open source.
Thanks a lot!

(For the curious: I want to automatically OCR incoming scanned papers and make them searchable in an Alfresco repository.)


He who fights with dragons for too long becomes a dragon himself; gaze too long into the abyss, and the abyss will gaze back…

1 Answer


There are three options. My answers are iText-specific, but you should be able to translate the underlying methods to any sufficiently advanced PDF library.

  1. Text render mode 3: "No stroke, no fill". With iText: myPdfContentByte.setTextRenderingMode(PdfContentByte.TEXT_RENDER_MODE_INVISIBLE);
  2. Draw the text behind something. You're presumably using scanned page images. iText's myPdfStamper.getUnderContent(pageNum) makes this easy, and will let you draw the text under the scan. Other libraries that let you access a page's contents might require you to add your text 'in the raw' at the beginning of an existing content stream. You'll want to check out the "PDF Spec" (google that, you'll be fine) for details. Chapter 9 is all about text rendering.
  3. Draw the text outside the page's media or crop box. If you just want some random PDF-savvy search engine to turn up your page, this will work, but if you want people looking at the PDF to see the appropriate text selection box... not so much.
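For libraries that only give you raw access to a page's content stream, options 1 and 2 come down to prepending a handful of PDF text operators. The following is a minimal sketch of that idea in plain Java, not iText's API: the class and method names (InvisibleTextOps, invisibleText, escapePdfString) are made up for illustration. It assumes the page's resource dictionary already has a font registered under the name you pass in (here /F1) and that the OCR text fits that font's encoding (plain ASCII is the safe case). Writing the fragment into the actual PDF is left to whatever library you use.

```java
import java.util.Locale;

public class InvisibleTextOps {

    /** Escapes the characters that are special inside a PDF literal string. */
    public static String escapePdfString(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            // Backslash and parentheses must be backslash-escaped in "( ... )" strings.
            if (c == '\\' || c == '(' || c == ')') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /**
     * Builds a content-stream fragment that shows `text` at (x, y) in text
     * render mode 3 (neither fill nor stroke, i.e. invisible).
     * `fontResource` is a hypothetical resource name such as "F1" that is
     * assumed to exist in the page's /Resources /Font dictionary.
     */
    public static String invisibleText(String fontResource, double fontSize,
                                       double x, double y, String text) {
        // q ... Q saves and restores the graphics state, so the render mode
        // set here does not leak into text drawn later in the same stream.
        return String.format(Locale.ROOT,
                "q\nBT\n/%s %.2f Tf\n3 Tr\n%.2f %.2f Td\n(%s) Tj\nET\nQ\n",
                fontResource, fontSize, x, y, escapePdfString(text));
    }

    public static void main(String[] args) {
        // Print the fragment for one OCR'd line placed near the top-left of a page.
        System.out.print(invisibleText("F1", 12, 72, 700, "Invoice (scanned)"));
    }
}
```

Prepending such a fragment to the page's existing content stream places the text under the scanned image in drawing order, and the 3 Tr operator keeps it invisible even where nothing covers it. The coordinates and font size should come from the OCR engine's word boxes if you want the selection highlight to line up with the scan.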
