Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
249 views
in Technique[技术] by (71.8m points)

java - why does "STRING".getBytes() work different according to the Operation System

I am running the code below and I am getting different outcome from "some_string".getBytes() depending if I am in Windows or Unix. The issue happens with any string (I tried a very simple ABC and same problem.

See the differences below printed in console.

The code below is well-tested using Java 7. If you copy it entirely it will run.

Additionally, see the difference in Hexadecimal in the two images below. The first two images shows the file created in Windows. You can see the hexadecimal values with ANSI and EBCDIC respectively. The third image, the black one, is from Unix. You can see the hexadecimal (-c option) and the character readable in which I believe it is EBCDIC.

So, my straight question is: why does such code work different since I am just using Java 7 in both case? Should I check any especific property in somewhere? Maybe, Java in Windows get certain default format and in Unix it get another. If so, which property must I check or settup?

enter image description here

Unix Console:

$ ./java -cp /usr/test.jar test.mainframe.read.test.TestGetBytes
H = 76 - L
< wasn't found

Windows Console:

H = 60 - <
H1 = 69 - E
H2 = 79 - O
H3 = 77 - M
H4 = 62 - >
End of Message found

The entire code:

package test.mainframe.read.test;

import java.util.ArrayList;

public class TestGetBytes {

       public static void main(String[] args) {
              try {
                     ArrayList ipmMessage = new ArrayList();
                     ipmMessage.add(newLine());

                     //Windows Path
                     writeMessage("C:/temp/test_bytes.ipm", ipmMessage);
                     reformatFile("C:/temp/test_bytes.ipm");
                     //Unix Path
                     //writeMessage("/usr/temp/test_bytes.ipm", ipmMessage);
                     //reformatFile("/usr/temp/test_bytes.ipm");
              } catch (Exception e) {

                     System.out.println(e.getMessage());
              }
       }

       public static byte[] newLine() {
              return "<EOM>".getBytes();
       }

       public static void writeMessage(String fileName, ArrayList ipmMessage)
                     throws java.io.FileNotFoundException, java.io.IOException {

              java.io.DataOutputStream dos = new java.io.DataOutputStream(
                           new java.io.FileOutputStream(fileName, true));
              for (int i = 0; i < ipmMessage.size(); i++) {
                     try {
                           int[] intValues = (int[]) ipmMessage.get(i);
                           for (int j = 0; j < intValues.length; j++) {
                                  dos.write(intValues[j]);
                           }
                     } catch (ClassCastException e) {
                           byte[] byteValues = (byte[]) ipmMessage.get(i);
                           dos.write(byteValues);
                     }
              }
              dos.flush();
              dos.close();

       }

       // reformat to U1014
       public static void reformatFile(String filename)
                     throws java.io.FileNotFoundException, java.io.IOException {
              java.io.FileInputStream fis = new java.io.FileInputStream(filename);
              java.io.DataInputStream br = new java.io.DataInputStream(fis);

              int h = br.read();
              System.out.println("H = " + h + " - " + (char)h);

              if ((char) h == '<') {// Check for <EOM>

                     int h1 = br.read();
                     System.out.println("H1 = " + h1 + " - " + (char)h1);
                     int h2 = br.read();
                     System.out.println("H2 = " + h2 + " - " + (char)h2);
                     int h3 = br.read();
                     System.out.println("H3 = " + h3 + " - " + (char)h3);
                     int h4 = br.read();
                     System.out.println("H4 = " + h4 + " - " + (char)h4);
                     if ((char) h1 == 'E' && (char) h2 == 'O' && (char) h3 == 'M'
                                  && (char) h4 == '>') {
                           System.out.println("End of Message found");
                     }
                     else{
                           System.out.println("EOM not found but < was found");
                     }
              }
              else{
                     System.out.println("< wasn't found");
              }
       }
}
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You are not specifying a charset when calling getBytes(), so it uses the default charset of the underlying platform (or of Java itself if specified when Java is started). This is stated in the String documentation:

public byte[] getBytes()

Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array.

getBytes() has an overloaded version that lets you specify a charset in your code.

public byte[] getBytes(Charset charset)

Encodes this String into a sequence of bytes using the given charset, storing the result into a new byte array.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...