hasNextLine() calls findWithinHorizon(), which in turn calls findPatternInBuffer(), searching for a match of the line pattern, defined as .*(\r\n|[\n\r\u2028\u2029\u0085])|.+$
The strange thing is that with both ways of constructing a Scanner (from a FileInputStream or from a File), findPatternInBuffer returns a positive match, regardless of file size, if the file contains for instance the 0x0A line terminator. But when the file also contains a non-ASCII byte (i.e. >= 0x80), using FileInputStream returns true while using File returns false.
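To check that the pattern itself is not the culprit, it can be tried in isolation on an already decoded string. This is only a minimal sketch: the class name LinePatternDemo and the method firstLine are made up for illustration, and the pattern string is my transcription of the line pattern described above, not something read from Scanner at runtime.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinePatternDemo {
    // Transcribed line pattern; an assumption, not obtained from Scanner itself.
    static final Pattern LINE =
        Pattern.compile(".*(\r\n|[\n\r\u2028\u2029\u0085])|.+$");

    // Returns the first match found in the input, or null if there is none.
    public static String firstLine(String input) {
        Matcher m = LINE.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // "a", a line feed, then a non-ASCII character after the terminator.
        String line = firstLine("a\n\u00e9");
        System.out.println("match found: " + (line != null));
    }
}
```

Once the input consists of characters, a non-ASCII character after the terminator does not get in the way of the match, which suggests the difference arises earlier, while the bytes are being decoded.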
Very simple test case:
Create a file that contains just the character "a":
# hexedit file
00000000 61 0A a.
# java Test.java
using File: true
using FileInputStream: true
Now edit the file with hexedit so that it reads:
# hexedit file
00000000 61 0A 80 a..
# java Test.java
using File: false
using FileInputStream: true
The Java test code contains nothing beyond what is already in the question:
import java.io.*;
import java.lang.*;
import java.util.*;

public class Test {
    public static void main(String[] args) {
        try {
            // Scanner built directly from a File
            File file1 = new File("file");
            Scanner s1 = new Scanner(file1);
            System.out.println("using File: " + s1.hasNextLine());
            // Scanner built from a FileInputStream over the same file
            File file2 = new File("file");
            Scanner s2 = new Scanner(new FileInputStream(file2));
            System.out.println("using FileInputStream: " + s2.hasNextLine());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
So, it turns out this is a charset issue. In fact, changing the test to:
Scanner s1 = new Scanner(file1, "latin1");
we get:
# java Test
using File: true
using FileInputStream: true
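To reproduce this without hexedit, here is a minimal sketch that writes the same three bytes (61 0A 80) to a temporary file and compares the variants. The class name CharsetDemo and its helper methods are made up for illustration, and UTF-8 is passed explicitly on the assumption that it was the default charset in the runs above. My understanding, which I have not checked against every JDK version, is that the File-based Scanner decodes strictly and gives up at the malformed byte, while the stream-based one substitutes a replacement character. Scanner does not throw read errors but records them, so ioException() is printed to show what happened.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.util.Scanner;

public class CharsetDemo {
    // Hypothetical helper: writes the bytes 61 0A 80 to a temporary file.
    public static File writeSample() throws IOException {
        File f = File.createTempFile("scanner-demo", ".bin");
        f.deleteOnExit();
        Files.write(f.toPath(), new byte[] {0x61, 0x0A, (byte) 0x80});
        return f;
    }

    // Scanner built from a File with an explicit charset name.
    public static boolean viaFile(File f, String charset) throws IOException {
        try (Scanner s = new Scanner(f, charset)) {
            boolean has = s.hasNextLine();
            // Read errors are swallowed by Scanner; this exposes the last one, if any.
            System.out.println("  ioException: " + s.ioException());
            return has;
        }
    }

    // Scanner built from a FileInputStream with an explicit charset name.
    public static boolean viaStream(File f, String charset) throws IOException {
        try (InputStream in = new FileInputStream(f);
             Scanner s = new Scanner(in, charset)) {
            return s.hasNextLine();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = writeSample();
        System.out.println("File, UTF-8: " + viaFile(f, "UTF-8"));
        System.out.println("File, ISO-8859-1: " + viaFile(f, "ISO-8859-1"));
        System.out.println("FileInputStream, UTF-8: " + viaStream(f, "UTF-8"));
    }
}
```

If the input encoding is not known in advance, passing a single-byte charset such as ISO-8859-1 (the "latin1" used above) avoids the decoding failure, since every byte value maps to a character there. Checking ioException() after an unexpected false from hasNextLine() is a cheap way to tell a genuine end of input from a decoding error.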