Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.0k views
in Technique[技术] by (71.8m points)

unicode - Convert GB2312 to UTF-8

I have a text file that contains localized language strings that is currently encoded in GB2312 (simplified Chinese), but all of my other language files are in UTF-8. I am finding it very difficult to work with this file, as none of my text editors will work properly with it and keep corrupting it. Are there any tools to convert this to UTF-8, and are there any downsides to doing this? Would it be better to just keep it as GB2312 and use a different editor (if so, can you recommend one)?

Update: I'm using Windows XP (English install).

Update #2: I've tried using Notepad++ and Notepad2 to edit the GB2312 files, but both are unable to read the files and corrupt them.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can try this online service that uses the Open Source iconv utility.
You can also install Charco, a command-line version of it on your machine.

For GB2312, you can use CP936 as the encoding.

If you are a .Net developer you can make a small tool that does just that.
I've struggled with this as well and found that it was actually simple to solve from a programmatic point of view.

All you need is something like this (I tested it and it works):

In C#

static void Main(string[] args) {
    string infile = args[0];
    string outfile = args[1];

    using (StreamReader sr = new StreamReader(infile, Encoding.GetEncoding(936))) {
        using (StreamWriter sw = new StreamWriter(outfile, false, Encoding.UTF8)) {
            sw.Write(sr.ReadToEnd());
            sw.Close();
        }
        sr.Close();
    }
}

In VB.Net

Private Shared Sub Main(ByVal args() As String)
    Dim infile As String = args(0)
    Dim outfile As String = args(1)
    Dim sr As StreamReader = New StreamReader(infile, Encoding.GetEncoding(936))
    Dim sw As StreamWriter = New StreamWriter(outfile, false, Encoding.UTF8)
    sw.Write(sr.ReadToEnd)
    sw.Close
    sr.Close
End Sub

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...