I modified your code slightly; it now copies the text without changing its formatting.
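    // Imports needed by the two methods below
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.List;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;
    import org.apache.poi.xwpf.usermodel.XWPFRun;

    public static void main(String[] args) {
        // try-with-resources closes the streams and both documents, even on failure
        try (InputStream is = new FileInputStream("Japan.docx");
             XWPFDocument doc = new XWPFDocument(is);
             XWPFDocument newdoc = new XWPFDocument();
             FileOutputStream fos = new FileOutputStream("newJapan.docx")) {
            List<XWPFParagraph> paras = doc.getParagraphs();
            for (XWPFParagraph para : paras) {
                // Skip paragraphs that contain no text
                if (!para.getParagraphText().isEmpty()) {
                    XWPFParagraph newpara = newdoc.createParagraph();
                    copyAllRunsToAnotherParagraph(para, newpara);
                }
            }
            newdoc.write(fos);
        } catch (IOException e) {
            // FileNotFoundException is a subclass of IOException, so one handler covers both
            e.printStackTrace();
        }
    }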
    // Copy all runs from one paragraph to another, keeping the style unchanged
    private static void copyAllRunsToAnotherParagraph(XWPFParagraph oldPar, XWPFParagraph newPar) {
        // Fallback used when POI cannot determine a run's font size
        final int DEFAULT_FONT_SIZE = 10;
        for (XWPFRun run : oldPar.getRuns()) {
            String textInRun = run.getText(0);
            if (textInRun == null || textInRun.isEmpty()) {
                continue;
            }
            // getFontSize() returns -1 when no size is set on the run itself
            int fontSize = run.getFontSize();
            System.out.println("run text = '" + textInRun + "' , fontSize = " + fontSize);
            XWPFRun newRun = newPar.createRun();
            // Copy text
            newRun.setText(textInRun);
            // Apply the same style
            newRun.setFontSize(fontSize == -1 ? DEFAULT_FONT_SIZE : fontSize);
            newRun.setBold(run.isBold());
            newRun.setItalic(run.isItalic());
            newRun.setStrike(run.isStrike());
            // Font family and color are null when unset on the run; skip them rather than pass null
            if (run.getFontFamily() != null) {
                newRun.setFontFamily(run.getFontFamily());
            }
            if (run.getColor() != null) {
                newRun.setColor(run.getColor());
            }
        }
    }
There's still a small problem with fontSize. Sometimes POI can't determine the size of a run (I print the value to the console to trace it) and returns -1. It reads the size correctly when I set it myself (say, I select some paragraphs in Word and set the font manually, either the size or the font family). But when it processes other POI-generated text, it sometimes returns -1, presumably because the size is not stored on the run itself. So I introduced a default font size (10 in the example above) that is applied whenever POI returns -1.
A similar issue seems to come up with the Calibri font family. In my tests, however, POI fell back to Arial by default, so I did not add a default fontFamily the way I did for fontSize.
The other font properties (bold, italic, etc.) work well.
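The fallback rule for fontSize can be pulled out into small helpers, so the same rule is easy to extend to other properties. This is only a minimal sketch in plain Java with no POI dependency: the class name FontFallback, both method names and the fallback values are hypothetical, and treating a null or empty font family as "unset" is an assumption about what your POI version reports.

```java
// Hypothetical helpers for illustration; these are not part of Apache POI.
final class FontFallback {
    private FontFallback() {}

    // Returns the reported size, or the fallback when the -1 sentinel is reported.
    static int resolveFontSize(int reportedSize, int fallbackSize) {
        return reportedSize == -1 ? fallbackSize : reportedSize;
    }

    // Returns the reported family, or the fallback when it is null or empty
    // (assumed here to mean "not set on the run").
    static String resolveFontFamily(String reportedFamily, String fallbackFamily) {
        return (reportedFamily == null || reportedFamily.isEmpty())
                ? fallbackFamily
                : reportedFamily;
    }
}
```

Inside the copy method this would be used as newRun.setFontSize(FontFallback.resolveFontSize(run.getFontSize(), 10)), and likewise for the font family if you decide you need a default there.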
Most likely, all of these font problems come from the fact that in my tests the text was copied from a .doc file. If your input is a .doc file, open it in Word, choose "Save as..." and pick the .docx format. Then use only XWPFDocument instead of HWPFDocument in your program, and I suppose it will be okay.
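If you are not sure which kind of file you actually have (the extension can be wrong), you can check the first bytes before handing the file to POI. Below is a rough sketch using only the JDK; the class WordFormatSniffer and its methods are hypothetical names, not POI API. It only identifies the container: binary .doc files are OLE2 compound files and .docx files are ZIP archives, but other formats use the same containers, so treat the result as a hint. The Path overload relies on InputStream.readNBytes(int), which requires Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration; this is not part of Apache POI.
final class WordFormatSniffer {
    // First 8 bytes of an OLE2 compound file (the container of binary .doc)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // First 4 bytes of a ZIP local file header (the container of .docx)
    private static final byte[] ZIP_MAGIC = {'P', 'K', 0x03, 0x04};

    private WordFormatSniffer() {}

    // Classifies a file header as "ole2", "zip" or "unknown".
    static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "ole2";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "zip";
        }
        return "unknown";
    }

    // Reads up to 8 bytes from the file and classifies them.
    static String sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A result of "ole2" suggests the file needs the "Save as..." conversion described above (or HWPFDocument), while "zip" suggests it can be opened with XWPFDocument.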