In his update the OP nearly has it right; there are merely two errors:
He tries to read the InputStream parameter content twice:
CMSTypedData msg = new CMSProcessableByteArray(IOUtils.toByteArray(content));
[...]
Attribute attr = new Attribute(CMSAttributes.messageDigest,
        new DERSet(new DEROctetString(md.digest(IOUtils.toByteArray(content)))));
Thus, all data had already been read from the stream before the second attempt, which consequently returned an empty byte[]. So the message digest attribute contained a wrong hash value. (A short stand-alone demonstration of this stream-exhaustion effect follows after the second point.)
He creates the final CMS container in a convoluted way:
return new CMSSignedData(msg, s.getEncoded()).getEncoded();
Reducing the latter to what is actually needed, it turns out that there is no need for the CMSTypedData msg anymore. Thus, the former error is implicitly resolved as well.
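To make the first of those errors tangible: once an InputStream has been drained, a second IOUtils.toByteArray call returns an empty array, and that empty array is what ended up being hashed. A small stand-alone demonstration (the class and variable names are purely illustrative):

import java.io.ByteArrayInputStream;
import java.io.InputStream;

import org.apache.commons.io.IOUtils;

public class StreamReadTwiceDemo
{
    public static void main(String[] args) throws Exception
    {
        InputStream content = new ByteArrayInputStream("some PDF bytes".getBytes());

        byte[] first = IOUtils.toByteArray(content);   // reads the complete stream content
        byte[] second = IOUtils.toByteArray(content);  // the stream is already exhausted here

        System.out.println(first.length);              // 14
        System.out.println(second.length);             // 0 - hashing this yields a wrong digest
    }
}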
After re-arranging the digest calculation to the top of the method, additionally switching to SHA256 (as SHA1 is deprecated in many contexts, I prefer to use a different hash algorithm), and allowing for a certificate chain instead of a single certificate, the method looks like this:
// Digest generation step: hash the content to sign up front;
// only this hash, not the content itself, enters the CMS container
MessageDigest md = MessageDigest.getInstance("SHA256", "BC");
byte[] digest = md.digest(IOUtils.toByteArray(content));

// Separate signature container creation step
List<Certificate> certList = Arrays.asList(chain);
JcaCertStore certs = new JcaCertStore(certList);

CMSSignedDataGenerator gen = new CMSSignedDataGenerator();

// The externally calculated document hash goes into the messageDigest signed attribute
Attribute attr = new Attribute(CMSAttributes.messageDigest,
        new DERSet(new DEROctetString(digest)));

ASN1EncodableVector v = new ASN1EncodableVector();
v.add(attr);

SignerInfoGeneratorBuilder builder = new SignerInfoGeneratorBuilder(new BcDigestCalculatorProvider())
        .setSignedAttributeGenerator(new DefaultSignedAttributeTableGenerator(new AttributeTable(v)));

AlgorithmIdentifier sha256withRSA = new DefaultSignatureAlgorithmIdentifierFinder().find("SHA256withRSA");

CertificateFactory certFactory = CertificateFactory.getInstance("X.509");
InputStream in = new ByteArrayInputStream(chain[0].getEncoded());
X509Certificate cert = (X509Certificate) certFactory.generateCertificate(in);

gen.addSignerInfoGenerator(builder.build(
        new BcRSAContentSignerBuilder(sha256withRSA,
                new DefaultDigestAlgorithmIdentifierFinder().find(sha256withRSA))
                .build(PrivateKeyFactory.createKey(pk.getEncoded())),
        new JcaX509CertificateHolder(cert)));

gen.addCertificates(certs);

// Detached CMS signature: no encapsulated content
CMSSignedData s = gen.generate(new CMSAbsentContent(), false);
return s.getEncoded();
(CreateSignature method signWithSeparatedHashing)
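The chain (a Certificate[]) and pk (a PrivateKey) used above are assumed to be available, e.g. as member variables of the signing class. Merely as a sketch of one way to obtain them, they might be loaded from a PKCS#12 keystore like this (file name, password, and alias selection here are illustrative assumptions, not part of the original code):

KeyStore keystore = KeyStore.getInstance("PKCS12");
char[] pin = "changeit".toCharArray();
try (InputStream keystoreStream = new FileInputStream("keystore.p12"))
{
    keystore.load(keystoreStream, pin);
}
String alias = keystore.aliases().nextElement();            // assumes a single key entry
Certificate[] chain = keystore.getCertificateChain(alias);
PrivateKey pk = (PrivateKey) keystore.getKey(alias, pin);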
Used in a fairly minimal signing code frame
void sign(PDDocument document, OutputStream output, SignatureInterface signatureInterface) throws IOException
{
    PDSignature signature = new PDSignature();
    signature.setFilter(PDSignature.FILTER_ADOBE_PPKLITE);
    signature.setSubFilter(PDSignature.SUBFILTER_ADBE_PKCS7_DETACHED);
    signature.setName("Example User");
    signature.setLocation("Los Angeles, CA");
    signature.setReason("Testing");
    signature.setSignDate(Calendar.getInstance());
    document.addSignature(signature);

    // Save the document incrementally with a signature placeholder and
    // expose the bytes to be signed
    ExternalSigningSupport externalSigning =
            document.saveIncrementalForExternalSigning(output);
    // Create the CMS signature container externally ...
    byte[] cmsSignature = signatureInterface.sign(externalSigning.getContent());
    // ... and inject it into the placeholder
    externalSigning.setSignature(cmsSignature);
}
(CreateSignature method sign)
like this
try (   InputStream resource = getClass().getResourceAsStream("test.pdf");
        OutputStream result = new FileOutputStream(new File(RESULT_FOLDER, "testSignedWithSeparatedHashing.pdf"));
        PDDocument pdDocument = PDDocument.load(resource)   )
{
    sign(pdDocument, result, data -> signWithSeparatedHashing(data));
}
(CreateSignature test method testSignWithSeparatedHashing)
results in properly signed PDFs, at least as proper as the certificates and private key in question are for the task at hand.
One remark: the OP used IOUtils.toByteArray(content) (and so do I in the code above). But considering the OP's starting remark

"what if the PDF file is too big to be uploaded? ex: 100mb"

doing so is not such a great idea, as it loads the whole file into memory at once merely for hashing. If one really wants to keep the resource footprint of one's application low, one should instead read the stream a few KB at a time, consecutively digest the data using MessageDigest.update, and only call MessageDigest.digest at the end to obtain the resulting hash value.
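A minimal sketch of such a streaming variant (the helper method name and buffer size are merely illustrative; as in the code above, the BouncyCastle provider is assumed to be registered):

byte[] digestStream(InputStream content) throws IOException, GeneralSecurityException
{
    MessageDigest md = MessageDigest.getInstance("SHA256", "BC");
    byte[] buffer = new byte[8192];                    // read a few KB at a time
    int bytesRead;
    while ((bytesRead = content.read(buffer)) != -1)
    {
        md.update(buffer, 0, bytesRead);               // digest each chunk as it arrives
    }
    return md.digest();                                // finalize only once at the end
}

In signWithSeparatedHashing, the line byte[] digest = md.digest(IOUtils.toByteArray(content)); would then simply become byte[] digest = digestStream(content);.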