Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to select a subset of a dictionary by index?

I have an h5py file that contains information about a dataset. There are n items in the dataset and k keys. For example, for each item I have stored a value for the keys bbox, number_keypoints, etc. As the dataset is too large for me, I want to randomly sample from it and create a smaller h5py or JSON file.

Say I want to sample items [1, 6, 16]. I then want to take those indices from every key (I hope it is clear what I am trying to do).

Here is my idea:

import h5py
with h5py.File(my_file, "r") as f:
    arr = [1, 6, 16]
    f = {key: value for i, (key, value) in enumerate(f.items()) if i in arr}

Unfortunately, this doesn't work. Can anyone help me here?

Question from: https://stackoverflow.com/questions/65921820/how-to-select-a-subset-of-a-dictionary-by-index

1 Answer

You can use what the h5py guide calls fancy indexing:

Say you have a dataset ds holding the numbers 1 to 10, and you want the elements at the indices in arr = [2, 4, 5]. Then sub_ds = ds[arr] gives you an array of length 3 containing the values at those indices (here 3, 5 and 6, since indexing is zero-based).
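To make this concrete, here is a minimal sketch of that indexing step. The temporary file demo.h5 and the dataset name "ds" are made up for illustration. As far as I know, h5py expects the index list to be in increasing order, so sort it first if yours is not.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical demo file holding one dataset with the numbers 1 to 10.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("ds", data=np.arange(1, 11))

with h5py.File(path, "r") as f:
    ds = f["ds"]
    arr = [2, 4, 5]   # zero-based indices, in increasing order
    sub_ds = ds[arr]  # fancy indexing reads only these elements

print(sub_ds)  # [3 5 6]
```

Note that sub_ds is a plain NumPy array, so it stays usable after the file has been closed.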

If you have a list of keys called keys, you can modify your code as follows. You can use f.keys() to build that list only if the root of the file contains just datasets and no groups; otherwise you will get an error.

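import h5py

with h5py.File(my_file, "r") as f:
    arr = [1, 6, 16]  # indices to keep, in increasing order
    keys = list(f.keys())  # assumes the root holds only datasets, no groups
    # Apply the same index list to every dataset.
    f_subset = {key: f[key][arr] for key in keys}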
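Since the question asks for a random sample written to a smaller file, the same idea can be extended roughly as below. This is only a sketch under a few assumptions: the helper name sample_h5, the file names and the toy data are invented for the example, every dataset is assumed to sit directly under the root, and all datasets are assumed to share the same length n along their first axis.

```python
import os
import tempfile

import h5py
import numpy as np


def sample_h5(src_path, dst_path, n_samples, seed=0):
    """Copy n_samples randomly chosen items of every dataset to a new file."""
    rng = np.random.default_rng(seed)
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        keys = list(src.keys())          # assumes only datasets under the root
        n_items = src[keys[0]].shape[0]  # assumes all datasets have n items
        picked = rng.choice(n_items, size=n_samples, replace=False)
        idx = sorted(picked.tolist())    # increasing order for fancy indexing
        for key in keys:
            dst.create_dataset(key, data=src[key][idx])
    return idx


# Toy source file standing in for the real dataset (hypothetical values).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "full.h5")
dst_path = os.path.join(workdir, "small.h5")
with h5py.File(src_path, "w") as f:
    f.create_dataset("bbox", data=np.arange(80).reshape(20, 4))
    f.create_dataset("number_keypoints", data=np.arange(20))

idx = sample_h5(src_path, dst_path, n_samples=3)

# Read the sampled file back into memory to inspect it.
with h5py.File(dst_path, "r") as f:
    small = {key: f[key][:] for key in f.keys()}
```

Sampling without replacement and sorting the result keeps the index list free of duplicates and in increasing order. Returning idx lets you record which items ended up in the smaller file.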
