Updated Answer
You want to avoid all sorts of DoS attacks (on line length, on the size of the file, etc.). But at the end of the function, you're converting the entire file into one single String! Assume that you limit each line to 8 KB: what happens if somebody sends you a file with a huge number of 8 KB lines? The line-reading part will pass, but when you finally combine everything into a single String, that String will eat up all available memory.
So since you end up converting everything into one single String anyway, limiting the line size alone doesn't help, nor is it safe. You have to limit the size of the entire file.
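To make the point concrete, here is a minimal sketch that uses only the JDK and enforces a limit on the total content rather than on individual lines. The class name TotalSizeLimit, the method readAtMost and the 2048-char chunk size are made up for this illustration, and the limit is counted in chars rather than bytes:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class TotalSizeLimit {
    /** Reads everything from reader, failing once more than maxChars chars have arrived. */
    static StringBuilder readAtMost(Reader reader, int maxChars) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] chunk = new char[2048]; // chunk size is an arbitrary choice
        int n;
        while ((n = reader.read(chunk)) != -1) {
            // Check before appending, so the builder never grows past the limit
            if (out.length() + n > maxChars) {
                throw new IOException("input exceeds " + maxChars + " characters");
            }
            out.append(chunk, 0, n);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // 1,000 lines of 8,000 chars: every single line would pass an 8 KB per-line check
        String big = ("x".repeat(8_000) + "\n").repeat(1_000);
        try {
            readAtMost(new StringReader(big), 1_000_000);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether to throw or to silently truncate when the limit is hit is a design choice; throwing at least lets the caller tell a complete input from an oversized one.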
Secondly, what you're really trying to do is read the data in chunks. So you're using BufferedReader and reading it line by line. But what you actually want in the end is some way of reading the file piece by piece. Instead of reading one line at a time, why not read 2 KB at a time?
BufferedReader, as its name suggests, has a buffer inside it, and you can configure the size of that buffer. Let's say you create a BufferedReader with a buffer size of 2 KB:
BufferedReader reader = new BufferedReader(..., 2048);
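Now if the InputStream that you wrap (via an InputStreamReader) in the BufferedReader has 100 KB of data, BufferedReader will automatically read it about 2 KB at a time. So it will read the stream roughly 50 times, 2 KB each (50 x 2 KB = 100 KB). Similarly, if you create the BufferedReader with a 10 KB buffer size, it will read the input roughly 10 times (10 x 10 KB = 100 KB). The buffer size is counted in chars rather than bytes, so these numbers are approximate for multi-byte encodings.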
BufferedReader already does the work of reading your file chunk by chunk, so you don't need to add an extra line-by-line layer on top of it. Just focus on the end result: if the file turns out to be too big (larger than the available RAM), how are you going to convert it into a String at the end?
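The buffering behaviour described above can be observed directly. The following is a small JDK-only sketch; BufferSizeDemo, CountingReader and drain are hypothetical names invented for this demonstration. It wraps a source in a reader that counts how often BufferedReader actually asks it for data:

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class BufferSizeDemo {
    /** Hypothetical helper: counts how often the wrapped reader is asked for data. */
    static final class CountingReader extends FilterReader {
        int dataReads = 0;   // calls that returned at least one char
        int largestRead = 0; // largest number of chars returned by a single call

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                dataReads++;
                largestRead = Math.max(largestRead, n);
            }
            return n;
        }
    }

    /** Drains source through a BufferedReader of the given size, one char at a time. */
    static CountingReader drain(String source, int bufferSize) throws IOException {
        CountingReader counter = new CountingReader(new StringReader(source));
        try (BufferedReader reader = new BufferedReader(counter, bufferSize)) {
            while (reader.read() != -1) {
                // discard the data; only the access pattern on the source matters here
            }
        }
        return counter;
    }

    public static void main(String[] args) throws IOException {
        String data = "x".repeat(100 * 1024); // 100 K chars
        CountingReader c = drain(data, 2048);
        System.out.println(c.dataReads + " reads, largest " + c.largestRead + " chars");
    }
}
```

With a 2048-char buffer and 100 K chars of input, this reports 50 data-returning reads of 2048 chars each, even though the caller pulls a single char at a time. Note that a bulk read into an array at least as large as the buffer can bypass the buffer entirely, so the pattern depends on how the BufferedReader is consumed.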
One better way is to just pass things around as a CharSequence. That's what Android does: in many places in the Android APIs, you will see that they return CharSequence. Since StringBuilder also implements the CharSequence interface, Android can internally use a String, a StringBuilder, or some other optimized string class based on the size and nature of the input. So you could directly return the StringBuilder object itself once you've read everything, rather than converting it to a String. This is safer with large data, because toString() copies the whole content into a new String, so for a moment you hold the data twice. Keep in mind that StringBuilder still keeps its content in one contiguous internal array, which it grows by reallocating and copying, so returning it avoids the extra copy but does not remove the need for an overall size limit.
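As a rough sketch of that idea (CharSequenceResult and readAll are illustrative names, not part of any library), a method can declare CharSequence as its return type and hand back the builder it filled:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class CharSequenceResult {
    /** Reads all of reader and returns the builder itself, typed as CharSequence. */
    static CharSequence readAll(Reader reader) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] chunk = new char[2048];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            out.append(chunk, 0, n);
        }
        return out; // no toString() call here, so no second copy of the data
    }

    public static void main(String[] args) throws IOException {
        CharSequence text = readAll(new StringReader("first line\nsecond line\n"));
        // Callers work with the content through the CharSequence interface
        System.out.println(text.length());
        System.out.println(text.subSequence(0, 10));
    }
}
```

The trade-off is that the returned object is still mutable underneath, so callers that need an immutable value will end up calling toString() themselves.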
So overall:
- Limit the overall file size, since you're going to deal with the entire content at some point. Forget about limiting or splitting lines.
- Read in chunks
Using Apache Commons IO, here is how you would read data from a BoundedInputStream into a StringBuilder, in blocks instead of lines (the reader below uses a 2 KB buffer):
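// import java.io.BufferedReader;
// import java.io.InputStreamReader;
// import java.nio.charset.StandardCharsets;
// import org.apache.commons.io.output.StringBuilderWriter;
// import org.apache.commons.io.input.BoundedInputStream;
// import org.apache.commons.io.IOUtils;
// Cap the total number of bytes that can ever be read from the stream
BoundedInputStream boundedInput = new BoundedInputStream(originalInput, <max-file-size>);
// UTF-8 is an assumption here; use whatever charset your data is actually in
BufferedReader reader = new BufferedReader(new InputStreamReader(boundedInput, StandardCharsets.UTF_8), 2048);
StringBuilder output = new StringBuilder();
StringBuilderWriter writer = new StringBuilderWriter(output);
IOUtils.copy(reader, writer); // copies data from "reader" => "writer"
return output;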
Original Answer
Use BoundedInputStream from the Apache Commons IO library. Your work becomes much easier.
The following code will do what you want:
public static String getContentFromInputStream(InputStream inputStream) {
inputStream = new BoundedInputStream(inputStream, <number-of-bytes>);
// The rest of the code stays the same
You simply wrap your InputStream in a BoundedInputStream and specify a maximum size. BoundedInputStream will take care of limiting reads to that maximum size.
Or you can do this when you're creating the reader:
BufferedReader bufferedReader = new BufferedReader(
new InputStreamReader(
new BoundedInputStream(inputStream, <no-of-bytes>)
)
);
Basically, what we're doing here is limiting the read size at the InputStream layer itself, rather than doing it while reading lines. So you end up with a reusable component, BoundedInputStream, which limits reading at the InputStream layer and which you can use wherever you want.
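If you can't pull in Commons IO, the same idea can be approximated with the JDK alone. The class below is a simplified, hypothetical stand-in written for this answer, not the Commons IO implementation. It only covers the read methods; skip() and mark/reset are left untouched and would need the same treatment in real code:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for illustration; not the Commons IO class. */
final class SimpleBoundedInputStream extends FilterInputStream {
    private long remaining; // bytes we are still willing to hand out

    SimpleBoundedInputStream(InputStream in, long maxBytes) {
        super(in);
        this.remaining = maxBytes;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // report end of stream once the limit is reached
        }
        int b = super.read();
        if (b != -1) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        // Never ask the underlying stream for more than the remaining budget
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream(new byte[10_000]);
        try (InputStream in = new SimpleBoundedInputStream(source, 4_096)) {
            System.out.println(in.readAllBytes().length);
        }
    }
}
```

Like any wrapper that simply reports end of stream at the limit, this truncates silently, so the caller can't tell a complete input from one that was cut off unless it checks for that separately.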
Edit: Added footnote
Edit 2: Added updated answer based on comments