This article collects typical usage examples of org.apache.pdfbox.pdmodel.common.COSObjectable in Java. If you are wondering how COSObjectable is used in practice, or are looking for working examples of it, the selected code examples here may help.
COSObjectable is an interface in the org.apache.pdfbox.pdmodel.common package. Seven code examples are shown below, sorted by popularity by default.
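As orientation before the examples: in PDFBox, COSObjectable is implemented by types that can hand back the low-level COS object they wrap, through getCOSObject(). The sketch below imitates that idea with locally defined stand-in types so that it runs without the library. The names CosObjectableLike, FakeDictionary, FakePage and labels are hypothetical and are not PDFBox classes or methods.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CosObjectableSketch {

    /** Hypothetical stand-in for a low-level dictionary object. */
    static final class FakeDictionary {
        final Map<String, String> entries = new LinkedHashMap<>();
    }

    /** Hypothetical stand-in for the interface: one accessor for the wrapped object. */
    interface CosObjectableLike {
        FakeDictionary getCOSObject();
    }

    /** Hypothetical high-level wrapper that owns one low-level dictionary. */
    static final class FakePage implements CosObjectableLike {
        private final FakeDictionary dict = new FakeDictionary();

        FakePage(String label) {
            dict.entries.put("Label", label);
        }

        @Override
        public FakeDictionary getCOSObject() {
            return dict;
        }
    }

    /** Works on any wrapper type, because it only needs the low-level object. */
    static List<String> labels(List<? extends CosObjectableLike> items) {
        List<String> result = new ArrayList<>();
        for (CosObjectableLike item : items) {
            result.add(item.getCOSObject().entries.get("Label"));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(labels(List.of(new FakePage("cover"), new FakePage("index"))));
    }
}
```

Code that accepts the interface type, as several of the examples below do with List<COSObjectable>, can treat different wrappers uniformly in this way.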
Example 1: testRemoveLikeStephanImproved
import org.apache.pdfbox.pdmodel.common.COSObjectable; // import the required package/class
/**
 * <a href="https://stackoverflow.com/questions/45812696/pdfbox-delete-comment-maintain-strikethrough">
 * PDFBox delete comment maintain strikethrough
 * </a>
 * <br/>
 * <a href="https://expirebox.com/files/3d955e6df4ca5874c38dbf92fc43b5af.pdf">
 * only_fields.pdf
 * </a>
 * <a href="https://file.io/DTvqhC">
 * (alternative download)
 * </a>
 * <p>
 * The OP only wanted the comment removed, not the strike-through. Thus, we must
 * not remove the annotation but merely the comment building attributes.
 * </p>
 */
@Test
public void testRemoveLikeStephanImproved() throws IOException {
    final COSName POPUP = COSName.getPDFName("Popup");
    // Both the input stream and the document are closed when the block exits.
    try (InputStream resource = getClass().getResourceAsStream("only_fields.pdf");
         PDDocument document = PDDocument.load(resource)) {
        PDPageTree allPages = document.getDocumentCatalog().getPages();
        List<COSObjectable> objectsToRemove = new ArrayList<>();
        for (int i = 0; i < allPages.getCount(); i++) {
            PDPage page = allPages.get(i);
            List<PDAnnotation> annotations = page.getAnnotations();
            for (PDAnnotation annotation : annotations) {
                if ("StrikeOut".equals(annotation.getSubtype())) {
                    COSDictionary annotationDict = annotation.getCOSObject();
                    // Remember the popup so it can be dropped from the page afterwards.
                    COSBase popup = annotationDict.getItem(POPUP);
                    annotationDict.removeItem(POPUP);
                    annotationDict.removeItem(COSName.CONTENTS); // plain text comment
                    annotationDict.removeItem(COSName.RC); // rich text comment
                    annotationDict.removeItem(COSName.T); // author
                    if (popup != null) {
                        objectsToRemove.add(popup);
                    }
                }
            }
            annotations.removeAll(objectsToRemove);
        }
        document.save(new File(RESULT_FOLDER, "only_fields-removeImproved.pdf"));
    }
}
Developer: mkl-public, Project: testarea-pdfbox2, Lines of code: 52, Source: RemoveStrikeoutComment.java
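The idea described in the Javadoc above, keeping the strike-through annotation while deleting only the entries that build the comment, can be illustrated without PDFBox. The following is a minimal sketch under assumptions: plain java.util.Map objects stand in for annotation dictionaries, the key names mirror the ones removed in the example, and StripCommentSketch and stripComments are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StripCommentSketch {

    /**
     * Removes the comment-related entries from every StrikeOut dictionary and
     * drops the popup dictionaries they referenced from the list.
     */
    static void stripComments(List<Map<String, Object>> annotations) {
        List<Object> popups = new ArrayList<>();
        for (Map<String, Object> annotation : annotations) {
            if ("StrikeOut".equals(annotation.get("Subtype"))) {
                Object popup = annotation.remove("Popup");
                annotation.remove("Contents"); // plain text comment
                annotation.remove("RC");       // rich text comment
                annotation.remove("T");        // author
                if (popup != null) {
                    popups.add(popup);
                }
            }
        }
        // Identity comparison: remove exactly the referenced popup objects.
        annotations.removeIf(a -> popups.stream().anyMatch(p -> p == a));
    }

    public static void main(String[] args) {
        Map<String, Object> popup = new HashMap<>(Map.of("Subtype", "Popup"));
        Map<String, Object> strike = new HashMap<>(Map.of(
                "Subtype", "StrikeOut", "Contents", "typo", "T", "reviewer"));
        strike.put("Popup", popup);
        List<Map<String, Object>> annotations = new ArrayList<>(List.of(strike, popup));
        stripComments(annotations);
        System.out.println(annotations);
    }
}
```

The sketch removes popups by object identity after the inner loop has finished, so the list is never modified while it is being iterated.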
Example 2: processPages
import org.apache.pdfbox.pdmodel.common.COSObjectable; // import the required package/class
/**
 * This will process all of the pages and the text that is in them.
 *
 * @param pages The pages object in the document.
 *
 * @throws IOException If there is an error parsing the text.
 */
protected void processPages( List<COSObjectable> pages ) throws IOException
{
    if( startBookmark != null )
    {
        startBookmarkPageNumber = getPageNumber( startBookmark, pages );
    }
    if( endBookmark != null )
    {
        endBookmarkPageNumber = getPageNumber( endBookmark, pages );
    }
    if( startBookmarkPageNumber == -1 && startBookmark != null &&
        endBookmarkPageNumber == -1 && endBookmark != null &&
        startBookmark.getCOSObject() == endBookmark.getCOSObject() )
    {
        //this is a special case where both the start and end bookmark
        //are the same but point to nothing. In this case
        //we will not extract any text.
        startBookmarkPageNumber = 0;
        endBookmarkPageNumber = 0;
    }
    Iterator<COSObjectable> pageIter = pages.iterator();
    while( pageIter.hasNext() )
    {
        PDPage nextPage = (PDPage)pageIter.next();
        PDStream contentStream = nextPage.getContents();
        currentPageNo++;
        if( contentStream != null )
        {
            COSStream contents = contentStream.getStream();
            processPage( nextPage, contents );
        }
    }
}
Developer: hemangandhi, Project: my-cv-site, Lines of code: 42, Source: FormattedReader.java
Example 3: processPages
import org.apache.pdfbox.pdmodel.common.COSObjectable; // import the required package/class
protected void processPages(List<COSObjectable> pages) throws IOException
{
    if (startBookmark != null)
    {
        startBookmarkPageNumber = getPageNumber(startBookmark, pages);
    }
    if (endBookmark != null)
    {
        endBookmarkPageNumber = getPageNumber(endBookmark, pages);
    }
    if (startBookmarkPageNumber == -1 && startBookmark != null &&
        endBookmarkPageNumber == -1 && endBookmark != null &&
        startBookmark.getCOSObject() == endBookmark.getCOSObject())
    {
        //this is a special case where both the start and end bookmark
        //are the same but point to nothing. In this case
        //we will not extract any text.
        startBookmarkPageNumber = 0;
        endBookmarkPageNumber = 0;
    }
    for (COSObjectable page : pages)
    {
        PDPage nextPage = (PDPage) page;
        PDStream contentStream = nextPage.getContents();
        currentPageNo++;
        if (contentStream != null)
        {
            COSStream contents = contentStream.getStream();
            processPage(nextPage, contents);
        }
    }
}
Developer: nemausus, Project: research-paper-parser, Lines of code: 36, Source: PDFParser.java
Example 4: sanitizeRecursiveNameTree
import org.apache.pdfbox.pdmodel.common.COSObjectable; // import the required package/class
private <T extends COSObjectable> void sanitizeRecursiveNameTree(PDNameTreeNode<T> efTree, Consumer<T> callback) {
    if (efTree == null)
        return;
    Map<String, T> _names;
    try {
        _names = efTree.getNames();
    } catch (IOException e) {
        LOGGER.error("Error in sanitizeRecursiveNameTree", e);
        return;
    }
    if (_names != null) {
        _names.values().forEach(callback);
    }
    if (efTree.getKids() == null)
        return;
    for (PDNameTreeNode<T> node : efTree.getKids()) {
        sanitizeRecursiveNameTree(node, callback);
    }
}
Developer: docbleach, Project: DocBleach, Lines of code: 22, Source: PdfBleach.java
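The traversal in this example follows a common shape: apply the callback to the names stored on the current node, then recurse into each child, treating a missing names map or a missing child list as "nothing to do". A minimal library-free sketch of that shape follows. Node, NameTreeWalkSketch and forEachValue are hypothetical stand-ins and are not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class NameTreeWalkSketch {

    /** Hypothetical stand-in for a name tree node; either field may be null. */
    static final class Node<T> {
        final Map<String, T> names;
        final List<Node<T>> kids;

        Node(Map<String, T> names, List<Node<T>> kids) {
            this.names = names;
            this.kids = kids;
        }
    }

    /** Visits the values of this node first, then recurses into each child. */
    static <T> void forEachValue(Node<T> node, Consumer<T> callback) {
        if (node == null) {
            return;
        }
        if (node.names != null) {
            node.names.values().forEach(callback);
        }
        if (node.kids == null) {
            return;
        }
        for (Node<T> kid : node.kids) {
            forEachValue(kid, callback);
        }
    }

    public static void main(String[] args) {
        Node<String> leafA = new Node<>(Map.of("a", "file-a"), null);
        Node<String> leafB = new Node<>(Map.of("b", "file-b"), null);
        Node<String> root = new Node<>(null, List.of(leafA, leafB));
        List<String> seen = new ArrayList<>();
        forEachValue(root, seen::add);
        System.out.println(seen);
    }
}
```

Passing the action as a Consumer keeps the walk independent of what is done with each value, which is the same design choice the example makes with its callback parameter.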
Example 5: getPageNumber
import org.apache.pdfbox.pdmodel.common.COSObjectable; // import the required package/class
private int getPageNumber( PDOutlineItem bookmark, List<COSObjectable> allPages ) throws IOException
{
    int pageNumber = -1;
    PDPage page = bookmark.findDestinationPage( document );
    if( page != null )
    {
        pageNumber = allPages.indexOf( page )+1;//use one based indexing
    }
    return pageNumber;
}
Developer: hemangandhi, Project: my-cv-site, Lines of code: 11, Source: FormattedReader.java
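One detail of the method above is worth spelling out: java.util.List.indexOf returns -1 when the element is absent, so indexOf(page) + 1 yields 0 rather than -1 for a page that is not in the list, while -1 is returned only when no destination page was found at all. The sketch below reproduces that arithmetic with plain strings standing in for pages. OneBasedIndexSketch and oneBasedIndex are hypothetical names.

```java
import java.util.List;

final class OneBasedIndexSketch {

    /** Mirrors the shape of getPageNumber: -1 for a null target, else indexOf + 1. */
    static <T> int oneBasedIndex(T target, List<T> all) {
        int number = -1;
        if (target != null) {
            // indexOf returns -1 when the target is absent, so the result is 0 in that case.
            number = all.indexOf(target) + 1;
        }
        return number;
    }

    public static void main(String[] args) {
        List<String> pages = List.of("p1", "p2", "p3");
        System.out.println(oneBasedIndex("p2", pages));      // present: 2
        System.out.println(oneBasedIndex("missing", pages)); // absent: 0
        System.out.println(oneBasedIndex(null, pages));      // null target: -1
    }
}
```

Callers that need to tell "no destination" apart from "destination not in this list" therefore have to check for both -1 and 0.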
Example 6: processEmbeddedDocNames
import org.apache.pdfbox.pdmodel.common.COSObjectable; // import the required package/class
private void processEmbeddedDocNames(
        Map<String, COSObjectable> embeddedFileNames,
        EmbeddedDocumentExtractor embeddedExtractor) throws IOException,
        SAXException, TikaException {
    if (embeddedFileNames == null) {
        return;
    }
    for (Map.Entry<String, COSObjectable> ent : embeddedFileNames.entrySet()) {
        PDComplexFileSpecification spec = (PDComplexFileSpecification) ent
                .getValue();
        PDEmbeddedFile file = spec.getEmbeddedFile();
        Metadata metadata = new Metadata();
        // TODO: other metadata?
        metadata.set(Metadata.RESOURCE_NAME_KEY, ent.getKey());
        metadata.set(Metadata.CONTENT_TYPE, file.getSubtype());
        metadata.set(Metadata.CONTENT_LENGTH, Long.toString(file.getSize()));
        if (embeddedExtractor.shouldParseEmbedded(metadata)) {
            TikaInputStream stream = TikaInputStream.get(file.createInputStream());
            try {
                embeddedExtractor.parseEmbedded(stream, new EmbeddedContentHandler(
                        handler), metadata, false);
            } finally {
                stream.close();
            }
        }
    }
}
Developer: kolbasa, Project: OCRaptor, Lines of code: 30, Source: PDF2XHTML.java
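The try/finally block in the example above guarantees that the stream is closed even if parsing throws. Since Java 7 the same guarantee can be written more compactly with try-with-resources. The sketch below shows the pattern with a standard-library stream only. TrackingStream and firstByte are hypothetical names used to make the close observable, and nothing here is Tika or PDFBox API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class CloseStreamSketch {

    /** Hypothetical stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Reads the first byte; the stream is closed even if read() throws. */
    static int firstByte(InputStream in) throws IOException {
        try (InputStream stream = in) {
            return stream.read();
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream stream = new TrackingStream(new byte[] {42});
        System.out.println(firstByte(stream) + " closed=" + stream.closed);
    }
}
```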
Example 7: getPageNumber
import org.apache.pdfbox.pdmodel.common.COSObjectable; // import the required package/class
private int getPageNumber(PDOutlineItem bookmark, List<COSObjectable> allPages) throws IOException
{
    int pageNumber = -1;
    PDPage page = bookmark.findDestinationPage(document);
    if (page != null)
    {
        pageNumber = allPages.indexOf(page) + 1;//use one based indexing
    }
    return pageNumber;
}
Developer: nemausus, Project: research-paper-parser, Lines of code: 11, Source: PDFParser.java
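Note: The org.apache.pdfbox.pdmodel.common.COSObjectable examples in this article were compiled from source code and documentation hosting platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their developers, and copyright in the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.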