Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
215 views
in Technique[技术] by (71.8m points)

python - combine duplicate keys in json

i have a json that looks like this:

{
  "course1": [
    {
      "courseName": "test",
      "section": "123",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "456",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "789",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "1011",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1213",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1415",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ]
}

and i want to combine any block/object/list (i don't know what it called), that they have the same key value. like this:

{
  "course1": [
    {
      "courseName": "test",
      "section": "123",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "456",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "789",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "1011",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1213",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "1415",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ]
}

how can i do this using regular expression in python? or any regular expression query?

also, i tried to use json.dumps() and work my way from there but for some reason when i use it with any json that contains Arabic characters it freaks out and messes up the whole thing. so i'm stuck with regular expression unfortunately.

and thank you for your help :)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

stdlib json offers a hook to allow decoding objects with duplicate keys. This simple "extend" hook should work for your example data:

def myhook(pairs):
    d = {}
    for k, v in pairs:
        if k not in d:
          d[k] = v
        else:
          d[k] += v
    return d

mydata = json.loads(bad_json, object_pairs_hook=myhook)

Although there's nothing in the JSON specification to disallow duplicate keys, it SHOULD probably be avoided in the first place:

1.1. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

...

  1. Objects

An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The names within an object SHOULD be unique.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...