Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
654 views
in Technique[技术] by (71.8m points)

c# - Out of memory exception while loading large json file from disk

I have a 1.2 GB json file which when deserialized ought to give me a list with 15 mil objects.

The machine on which I'm trying to deserialize the same is a windows 2012 server(64 bit) with 16 core and 32 GB Ram.

The application has been built with target of x64.

Inspite of this when I try to read the json doc and convert it to list of objects I'm getting Out of memory exception. when I look at task manager I find that only 5GB memory has been used.

The codes i tried are as below..

a.

 string plays_json = File.ReadAllText("D:\Hun\enplays.json");

                plays = JsonConvert.DeserializeObject<List<playdata>>(plays_json);

b.

 string plays_json = "";
        using (var reader = new StreamReader("D:\Hun\enplays.json"))
        {
            plays_json = reader.ReadToEnd();
            plays = JsonConvert.DeserializeObject<List<playdata>>(plays_json);
        }

c.

 using (StreamReader sr = File.OpenText("D:\Hun\enplays.json"))
        {
            StringBuilder sb = new StringBuilder();
            sb.Append(sr.ReadToEnd());
            plays_json = sb.ToString();
            plays = JsonConvert.DeserializeObject<List<playdata>>(plays_json);
        }

All help is sincerely appreciated

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The problem is that you are reading your entire huge file into memory and then trying to deserialize it all at once into a huge list. You should be using a StreamReader to process your file incrementally. Example (b) in your question doesn't cut it, even though you are using a StreamReader there, because you are still reading the entire file via ReadToEnd(). You should be doing something like this instead:

using (StreamReader sr = new StreamReader("D:\Hun\enplays.json"))
using (JsonTextReader reader = new JsonTextReader(sr))
{
    var serializer = new JsonSerializer();

    while (reader.Read())
    {
        if (reader.TokenType == JsonToken.StartObject)
        {
            // Deserialize each object from the stream individually and process it
            var playdata = serializer.Deserialize<playdata>(reader);

            ProcessPlayData(playdata);
        }
    }
}

The ProcessPlayData method should process a single playdata object and then ideally write the result to a file or a database rather than an in-memory list (otherwise you may find yourself back in the same situation again). If you must store the results of processing each item into an in-memory list, then you might want to consider using a linked list or a similar structure that does not try to allocate memory in one contiguous block and does not need to reallocate and copy when it needs to expand.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...