In Java, the "default" AES/GCM provider SunJCE will - during the decryption process - internally buffer 1) encrypted bytes used as input or 2) decrypted bytes produced as result. Application code doing decryption will notice that Cipher.update(byte[])
return an empty byte array and Cipher.update(ByteBuffer, ByteBuffer)
return written length 0. Then when the process completes, Cipher.doFinal()
will return all decoded bytes.
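To make the behavior concrete, here is a tiny self-contained demo I threw together (not code from my project; the all-zero IV is only acceptable here because the key is thrown away immediately):

    import java.nio.charset.StandardCharsets;

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;

    public class GcmBufferingDemo {
        public static void main(String[] args) throws Exception {
            SecretKey key = KeyGenerator.getInstance("AES").generateKey();
            GCMParameterSpec spec = new GCMParameterSpec(128, new byte[12]); // fixed IV, demo only

            Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
            enc.init(Cipher.ENCRYPT_MODE, key, spec);
            byte[] ciphertext = enc.doFinal("Hello, GCM!".getBytes(StandardCharsets.UTF_8));

            Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
            dec.init(Cipher.DECRYPT_MODE, key, spec);
            byte[] fromUpdate = dec.update(ciphertext); // buffered internally, nothing comes out
            byte[] fromFinal = dec.doFinal();           // all plaintext shows up here
            System.out.println("update():  " + (fromUpdate == null ? 0 : fromUpdate.length) + " bytes");
            System.out.println("doFinal(): " + fromFinal.length + " bytes");
        }
    }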
My first question is: which bytes are being buffered, number 1 or number 2 above?
I assume the buffering occurs only during decryption and not during encryption because, firstly, the problems that arise from this buffering (described shortly) never occur in my Java client, which encrypts files read from disk; they always occur on the server side, which receives those files and does the decryption. Secondly, it is said so here. Judging only by my own experience I cannot be sure, because my client uses a CipherOutputStream. The client does not explicitly invoke methods on the Cipher instance, so I cannot deduce whether internal buffering is used or not since I cannot see what the update and final methods return.
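The client-side write path looks roughly like this (simplified and renamed by me; the real code is in the repository linked under UPDATE #2):

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;

    import javax.crypto.Cipher;
    import javax.crypto.CipherOutputStream;

    public class ClientUpload {
        // The Cipher is initialized for encryption elsewhere and only handed to the stream,
        // so the client never observes what update()/doFinal() return.
        static void send(Path file, OutputStream socketOut, Cipher encryptCipher) throws Exception {
            try (InputStream in = Files.newInputStream(file);
                 CipherOutputStream out = new CipherOutputStream(socketOut, encryptCipher)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            } // closing the CipherOutputStream triggers doFinal() and writes the GCM tag
        }
    }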
My real problems arise when the encrypted files I transmit from client to server become large. By large I mean over 100 MB.
What happens then is that Cipher.update() throws an OutOfMemoryError, obviously due to the internal buffer growing and growing.
Also, despite the internal buffering and no result bytes being returned from Cipher.update(), Cipher.getOutputSize(int) continuously reports an ever-growing target buffer length. Hence my application code is forced to allocate an ever-growing ByteBuffer that is fed into Cipher.update(ByteBuffer, ByteBuffer). If I try to cheat and pass in a byte buffer with a smaller capacity, the update method throws a ShortBufferException #1. Knowing that I create huge byte buffers for no use is quite demoralizing.
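The growth is easy to reproduce with nothing but the cipher itself. This stripped-down snippet (again not from my project; the zero IV and the garbage "ciphertext" are just filler, and doFinal() is never reached) prints an ever-increasing getOutputSize() while update() keeps writing 0 bytes:

    import java.nio.ByteBuffer;

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;

    public class GrowingOutputSizeDemo {
        public static void main(String[] args) throws Exception {
            SecretKey key = KeyGenerator.getInstance("AES").generateKey();
            Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
            dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, new byte[12]));

            byte[] chunk = new byte[1024 * 1024]; // 1 MB of "ciphertext" per round
            for (int i = 1; i <= 5; i++) {
                // The reported size covers everything buffered so far plus the new chunk,
                // so it grows by roughly 1 MB per iteration even though update() writes nothing.
                int needed = dec.getOutputSize(chunk.length);
                ByteBuffer out = ByteBuffer.allocate(needed);
                int written = dec.update(ByteBuffer.wrap(chunk), out);
                System.out.println("round " + i + ": getOutputSize=" + needed
                        + ", update wrote " + written + " bytes");
            }
        }
    }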
Given that internal buffering is the root of all evil, the obvious solution for me is to split the files into chunks of, say, 1 MB each - I never have problems sending small files, only large ones. But I struggle to understand why the internal buffering happens in the first place.
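For completeness, this is roughly the chunking workaround I have in mind. It is only a sketch built on my own assumptions: one fresh IV per chunk (GCM must never reuse an IV under the same key), 128-bit tags, and no framing, so the receiver is assumed to know the chunk boundaries and the counter:

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.ByteBuffer;

    import javax.crypto.Cipher;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;

    public class ChunkedGcmEncryptor {
        private static final int CHUNK_SIZE = 1024 * 1024; // 1 MB per GCM invocation

        static void encrypt(SecretKey key, InputStream plain, OutputStream out) throws Exception {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            byte[] buf = new byte[CHUNK_SIZE];
            long counter = 0;
            int n;
            while ((n = plain.read(buf)) != -1) {
                // 12-byte IV: 4 zero bytes followed by a big-endian chunk counter.
                byte[] iv = ByteBuffer.allocate(12).putLong(4, counter++).array();
                cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
                out.write(cipher.doFinal(buf, 0, n)); // each chunk carries its own tag
            }
        }
    }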
The previously linked SO answer says that GCM's authentication tag is "added at the end of the ciphertext", but that it "does not have to be put at the end", and that this practice is what "messes up the online nature of GCM decryption".
Why is it that putting the tag at the end only messes up the server's job of doing decryption?
Here is how my reasoning goes. To compute an authentication tag, or MAC if you will, the client uses some kind of hash function. Apparently, MessageDigest.update() does not use an ever-growing internal buffer.
Then, on the receiving end, can the server not do the very same thing? For starters, he could decrypt the bytes, albeit unauthenticated ones, feed them into the update function of his hash algorithm, and when the tag arrives, finish the digest and verify the MAC the client sent.
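Just to show what I mean by the "update function" of a hash algorithm, this is the kind of incremental processing I picture (plain SHA-256 used purely as an illustration; I am aware GCM's GHASH is a different construction):

    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.security.MessageDigest;

    public class IncrementalDigest {
        // Hashes a file of any size with a constant-size buffer: nothing accumulates
        // inside the MessageDigest besides its fixed-size internal state.
        static byte[] sha256(Path file) throws Exception {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            try (InputStream in = Files.newInputStream(file)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    md.update(buf, 0, n);
                }
            }
            return md.digest();
        }
    }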
I'm not a crypto guy so please speak to me as if I am both dumb and crazy but loving enough to care some for =) I thank you wholeheartedly for the time taken to read through this question and perhaps even shed some light!
UPDATE #1
I don't use AAD (Additional Authenticated Data).
UPDATE #2
I wrote software that demonstrates AES/GCM encryption in Java, as well as the Secure Remote Password protocol (SRP) and binary file transfers in Java EE. The front-end client is written in JavaFX and can be used to dynamically change the encryption configuration or send files in chunks. At the end of a file transfer, some statistics are presented about the time taken to transfer the file and the time taken by the server to decrypt it. The repository also contains a document with some of my own GCM- and Java-related research.
Enjoy: https://github.com/MartinanderssonDotcom/secure-login-file-transfer/
#1
It is interesting to note that if my server, which does the decryption, does not handle the cipher itself but instead uses a CipherInputStream, then no OutOfMemoryError is thrown. Instead, the client manages to transfer all bytes over the wire, but somewhere during the decryption the request thread hangs indefinitely and I can see that one Java thread (might be the same thread) fully utilizes a CPU core, all while the file on disk remains inaccessible with a reported file size of 0. Then, after a tremendously long time, the Closeable source is closed and my catch clause manages to catch an IOException caused by: "javax.crypto.AEADBadTagException: Input too short - need tag".
What makes this situation weird is that transmitting smaller files works flawlessly with the exact same piece of code - so obviously the tag can be properly verified. The problem must have the same root cause as when using the cipher explicitly, i.e. an ever-growing internal buffer. I cannot track on the server how many bytes were successfully read/deciphered because as soon as the reading of the cipher input stream begins, compiler reordering or other JIT optimizations make all my logging statements evaporate into thin air. They are [apparently] not executed at all.
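For reference, the server-side reading loop that exhibits this behaviour looks more or less like the following (simplified and renamed by me; the real code is in the repository linked under UPDATE #2):

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;

    import javax.crypto.Cipher;
    import javax.crypto.CipherInputStream;

    public class ServerDownload {
        // The request body is wrapped in a CipherInputStream; for large files the
        // read() call below never returns and one core spins at 100 %.
        static void receive(InputStream requestBody, Cipher decryptCipher, Path target) throws Exception {
            try (CipherInputStream in = new CipherInputStream(requestBody, decryptCipher);
                 OutputStream file = Files.newOutputStream(target)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    file.write(buf, 0, n);
                }
            }
        }
    }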
Note that this GitHub project and its associated blog post say that CipherInputStream is broken. But the tests provided by that project do not fail for me when using Java 8u25 and the SunJCE provider. And as has already been said, everything works for me as long as I use small files.