I want to read UTF-8 input in Perl, whether it comes from standard input or from a file, using the diamond operator: while(<>){...}.
So my script should be callable in these two ways, as usual, giving the same output:
./script.pl utf8.txt
cat utf8.txt | ./script.pl
But the outputs differ! Only the second call (using cat) seems to work as designed, reading UTF-8 properly. Here is the script:
#!/usr/bin/perl -w
binmode STDIN, ':utf8';
binmode STDOUT, ':utf8';
while(<>){
my @chars = split //, $_;
print "$_\n" foreach(@chars);
}
How can I make it read UTF-8 correctly in both cases? I would like to keep using the diamond operator <> for reading, if possible.
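One commonly suggested approach is the open pragma, which sets default I/O layers instead of applying binmode to individual handles. The following is a minimal sketch, assuming that the files named on the command line are opened by <> within the scope of the pragma and therefore pick up the :utf8 layer. The helper name chars_of is hypothetical and only there to keep the loop readable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: this pragma is used in place of the two binmode calls.
# ":std" applies the layer to STDIN, STDOUT and STDERR; ":utf8" becomes
# the default layer for files opened in this scope, which is expected
# to include the files that <> opens from @ARGV.
use open qw(:std :utf8);

# Hypothetical helper: split one line into its individual characters.
sub chars_of {
    my ($line) = @_;
    return split //, $line;
}

# Reads from the files named in @ARGV, or from STDIN if there are none.
while (my $line = <>) {
    print "$_\n" foreach chars_of($line);
}
```

With this in place, both ./script.pl utf8.txt and cat utf8.txt | ./script.pl should go through the same decoding step, since neither depends on binmode STDIN alone.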
EDIT:
I realized I should probably describe the different outputs. My input file contains this sequence: a\xCA\xA7b. The method with cat correctly outputs:
a
\xCA\xA7
b
But the other method gives me this:
a
\xC3\x8A
\xC2\xA7
b
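The second output is what one would expect if the file's bytes are never decoded on input: each byte is then treated as a separate character (U+00CA and U+00A7) and encoded again as UTF-8 on output. The sketch below reproduces both outputs with the Encode module. It is only an illustration of that hypothesis, not the original script; hex_escape and as_printed are hypothetical helper names.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: render a byte string as \xHH escapes for display.
sub hex_escape {
    my ($bytes) = @_;
    return join '', map { sprintf('\\x%02X', ord($_)) } split //, $bytes;
}

# Hypothetical helper: show the bytes each character becomes when it is
# written through a UTF-8 encoding output layer.
sub as_printed {
    my (@chars) = @_;
    return join ' ', map { hex_escape(encode('UTF-8', $_)) } @chars;
}

# The bytes stored in the input file.
my $raw = "a" . "\xCA\xA7" . "b";

# Decoded on input: \xCA\xA7 becomes the single character U+02A7.
my @decoded = split //, decode('UTF-8', $raw);

# Not decoded on input: every byte is treated as its own character.
my @undecoded = split //, $raw;

print as_printed(@decoded),   "\n";   # \x61 \xCA\xA7 \x62
print as_printed(@undecoded), "\n";   # \x61 \xC3\x8A \xC2\xA7 \x62
```

The first line matches the cat case and the second matches the file-argument case, which suggests that the layer set with binmode STDIN is not applied to the file opened by <>.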