Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
252 views
in Technique[技术] by (71.8m points)

java - Files.readAllBytes vs Files.lines getting MalformedInputException

I would have thought that the following two approaches to read a file should behave equally. But they don't. The second approach is throwing a MalformedInputException.

public static void main(String[] args) {    
    try {
        String content = new String(Files.readAllBytes(Paths.get("_template.txt")));
        System.out.println(content);
    } catch (IOException e) {
        e.printStackTrace();
    }

    try(Stream<String> lines = Files.lines(Paths.get("_template.txt"))) {
        lines.forEach(System.out::println);
    } catch (IOException e) {
        e.printStackTrace();
    }
}

This is the stack trace:

Exception in thread "main" java.io.UncheckedIOException: java.nio.charset.MalformedInputException: Input length = 1
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:574)
    at java.util.Iterator.forEachRemaining(Iterator.java:115)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at Test.main(Test.java:19)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:571)
    ... 4 more

What is the difference here, and how do I fix it?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

This has to do with character encoding. Computers only deal with numbers. To store text, the characters in the text have to be converted to and from numbers, using some scheme. That scheme is called the character encoding. There are many different character encodings; some of the well-known standard character encodings are ASCII, ISO-8859-1 and UTF-8.

In the first example, you read all the bytes (numbers) in the file and then convert them to characters by passing them to the constructor of class String. This will use the default character encoding of your system (whatever it is on your operating system) to convert the bytes to characters.

In the second example, where you use Files.lines(...), the UTF-8 character encoding will be used, according to the documentation. When a sequence of bytes is found in the file that is not a valid UTF-8 sequence, you'll get a MalformedInputException.

The default character encoding of your system may or may not be UTF-8, so that can explain a difference in behaviour.

You'll have to find out what character encoding is used for the file, and then explicitly use that. For example:

String content = new String(Files.readAllBytes(Paths.get("_template.txt")),
        StandardCharsets.ISO_8859_1);

Second example:

Stream<String> lines = Files.lines(Paths.get("_template.txt"),
        StandardCharsets.ISO_8859_1);

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...