Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Flatten a dictionary of dictionaries (2 levels deep) of lists

I'm trying to wrap my brain around this but it's not flexible enough.

In my Python script I have a dictionary of dictionaries of lists. (Actually it gets a little deeper but that level is not involved in this question.) I want to flatten all this into one long list, throwing away all the dictionary keys.

Thus I want to transform

{1: {'a': [1, 2, 3], 'b': [0]},
 2: {'c': [4, 5, 1], 'd': [3, 8]}}

to

[1, 2, 3, 0, 4, 5, 1, 3, 8]

I could probably set up a map-reduce to iterate over items of the outer dictionary to build a sublist from each subdictionary and then concatenate all the sublists together.

But that seems inefficient for large data sets, because of the intermediate data structures (sublists) that will get thrown away. Is there a way to do it in one pass?

Barring that, I would be happy to accept a two-level implementation that works... my map-reduce is rusty!
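For comparison, the map-reduce approach described above might look roughly like the sketch below. It assumes Python 3, and the names `flatten_map_reduce` and `sublists` are made up for illustration. It builds one intermediate sublist per inner dictionary and a new list on every concatenation, which is the overhead the question hopes to avoid.

```python
from functools import reduce

def flatten_map_reduce(outer):
    # Map step: build one intermediate sublist per inner dictionary.
    sublists = map(
        lambda inner: [x for values in inner.values() for x in values],
        outer.values(),
    )
    # Reduce step: concatenate the sublists; each + creates a new list.
    return reduce(lambda acc, sub: acc + sub, sublists, [])

data = {1: {'a': [1, 2, 3], 'b': [0]},
        2: {'c': [4, 5, 1], 'd': [3, 8]}}
print(flatten_map_reduce(data))
```

On Python 3.7 or later, where dicts keep insertion order, this should print the list shown in the question.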

Update: For those who are interested, below is the code I ended up using.

Note that although I asked above for a list as output, what I really needed was a sorted list; i.e. the output of the flattening could be any iterable that can be sorted.

def genSessions(d):
    """Given the ipDict, return an iterator that provides all the sessions,
    one by one, converted to tuples."""
    # Python 3 spelling; on Python 2 use .itervalues() to avoid building lists.
    for uaDict in d.values():
        for sessions in uaDict.values():
            for session in sessions:
                yield tuple(session)

...

import operator

# Flatten dict of dicts of lists of sessions into a list of sessions.
# Sort that list by start time (element 0 of each session tuple).
sessionsByStartTime = sorted(genSessions(ipDict), key=operator.itemgetter(0))
# Then make another copy sorted by end time (element 1 of each session tuple).
sessionsByEndTime = sorted(sessionsByStartTime, key=operator.itemgetter(1))
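To see the generator-plus-sort pattern end to end, here is a small self-contained sketch for Python 3. The sample data (`ip_dict`, the addresses, the agent names, and the `[start, end]` session layout) is hypothetical and only meant to mirror the shape described in the question; `gen_sessions` is a renamed copy of the generator above.

```python
import operator

def gen_sessions(d):
    # Walk both dictionary levels lazily and yield each session as a tuple.
    for ua_dict in d.values():
        for sessions in ua_dict.values():
            for session in sessions:
                yield tuple(session)

# Hypothetical data: IP -> user agent -> list of [start, end] sessions.
ip_dict = {
    '10.0.0.1': {'agentA': [[5, 9], [1, 12]]},
    '10.0.0.2': {'agentB': [[3, 4]]},
}

# sorted() consumes the generator directly; no flattened list is built first.
by_start = sorted(gen_sessions(ip_dict), key=operator.itemgetter(0))
by_end = sorted(by_start, key=operator.itemgetter(1))
print(by_start)  # [(1, 12), (3, 4), (5, 9)]
print(by_end)    # [(3, 4), (5, 9), (1, 12)]
```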

Thanks again to all who helped.

[Update: replaced nthGetter() with operator.itemgetter(), thanks to @intuited.]


1 Answer


I hope you realize that, before Python 3.7, any order you see in a dict is accidental: it's there only because some order has to be picked when the dict is shown on screen, and there's absolutely no guarantee. (From Python 3.7 on, dicts preserve insertion order.)

Setting aside the order in which the various sublists get concatenated,

# Python 3 spelling; on Python 2 use .itervalues() instead of .values().
[x for d in thedict.values()
   for alist in d.values()
   for x in alist]

does what you want without any inefficiency or intermediate lists.
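If the final result does not need to be a list (for example, when it is passed straight to `sorted()`), a lazy variant is possible. The sketch below is one way to do it, assuming Python 3 and using `itertools.chain.from_iterable`; the function name `iter_flat` is made up for illustration.

```python
from itertools import chain

def iter_flat(thedict):
    # Lazily yield every element of every innermost list, one at a time.
    return chain.from_iterable(
        alist for d in thedict.values() for alist in d.values()
    )

thedict = {1: {'a': [1, 2, 3], 'b': [0]},
           2: {'c': [4, 5, 1], 'd': [3, 8]}}

flat = list(iter_flat(thedict))          # materialize only if a list is needed
ordered = sorted(iter_flat(thedict))     # or feed the iterator straight to sorted()
print(ordered)  # [0, 1, 1, 2, 3, 3, 4, 5, 8]
```

The order of `flat` follows the dict iteration order, so it matches the example in the question only where dicts preserve insertion order (Python 3.7 and later).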

