Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
394 views
in Technique[技术] by (71.8m points)

java - Why is the Gujarati-Indian text not rendered correctly using Arial Unicode MS?

This is a follow-up on this question How to export fonts in Gujarati-Indian Language to pdf?, @amedee-van-gasse, QA Engineer at iText asked me to post a question specific to itext with relevant mcve.

Why is this sequence of unicode u0ab9u0abfu0aaau0acdu0ab8 not rendered correctly?

It should be rendered like this:

????? , also tested with unicode-converter

However this code (example adapted form iText: Chapter 11: Choosing the right font)

public class FontTest {

    /** The resulting PDF file. */
    public static final String RESULT = "fontTest.pdf";
    /** the text to render. */
    public static final String TEST = "u0ab9u0abfu0aaau0acdu0ab8";

    public void createPdf(String filename) throws IOException, DocumentException {
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(filename));
        document.open();
        BaseFont bf = BaseFont.createFont(
            "ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        Font font = new Font(bf, 20);
        ColumnText column = new ColumnText(writer.getDirectContent());
        column.setSimpleColumn(36, 730, 569, 36);
        column.addElement(new Paragraph(TEST, font));
        column.go();
        document.close();
        System.out.println("DONE");
    }

    public static void main(String[] args) throws IOException, DocumentException {
        new FontTest().createPdf(RESULT);
    }
}

Generates this result:

pdf output

That looks different from

?????

I have test with itextpdf-5.5.4.jar,itextpdf-5.5.9.jar and also itext-2.1.7.js3.jar (distributed with jasper-reports)

The font used it the one distributes with MS Office ARIALUNI.TTF and it can be download from here Arial Unicode MS *Maybe there are some legal issues downloading see Mike 'Pomax' Kamermans comment

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Neither iText5 nor iText2 (which is a very outdated version by the way) support rendering of Indic scripts, no matter which font you select.

Rendering Indic scripts is not similar to any Latin scripts, because a long series of additional actions should be taken to get the correct result, e.g. some characters need to be reordered first according to the language rules.

This is a known issue to iText company.

There is a stub implementation for Gujaranti in iText5 called GujaratiLigaturizer, but the implementation is really poor and you cannot expect to get correct results with it.

You can try to process your string with this ligaturizer and then output the resultant string in the following way:

IndicLigaturizer g = new GujaratiLigaturizer();
String processed = g.process(inputString);
// proceed with the processed string

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...