Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Pandas can't read hdf5 file created with h5py

I get a pandas error when I try to read HDF5 files that I created with h5py. Am I doing something wrong?

import h5py
import numpy as np
import pandas as pd
h5_file = h5py.File('test.h5', 'w')
h5_file.create_dataset('zeros', data=np.zeros(shape=(3, 5)), dtype='f')
h5_file.close()
pd_file = pd.read_hdf('test.h5', 'zeros')

This raises an error: TypeError: cannot create a storer if the object is not existing nor a value are passed

I also tried setting the key to '/zeros' (as I would when reading the file with h5py), with no luck.

If I use pandas.HDFStore to read it in, I get an empty store back:

>>> store = pd.HDFStore('test.h5')
>>> store
<class 'pandas.io.pytables.HDFStore'>
File path: test.h5
Empty

I have no trouble reading the newly created file back with h5py:

h5_back = h5py.File('test.h5', 'r')
h5_back['/zeros']
<HDF5 dataset "zeros": shape (3, 5), type "<f4">

Using these versions:

Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 23 2015, 02:52:03) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin

pd.__version__
'0.16.2'
h5py.__version__
'2.5.0'

Many thanks in advance, Masha



1 Answer


I've worked a little on the pytables module in pandas.io, and from what I know, pandas' interaction with HDF files is limited to specific structures that pandas itself understands. To see what these look like, you can try

import pandas as pd
import numpy as np
# A 2-D array needs a DataFrame (not a Series); pandas writes it via PyTables
pd.DataFrame(np.zeros((3, 5), dtype=np.float32)).to_hdf('test.h5', key='test')

If you open 'test.h5' in HDFView, you will see a path /test with 4 items that are needed to recreate the DataFrame.
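If HDFView is not at hand, the same layout can be listed from Python with h5py. This is only a minimal sketch: it assumes PyTables is installed (pandas needs it for to_hdf), and the temporary file path and the variable name items are chosen here for illustration.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd

# Write a small DataFrame with pandas (this goes through PyTables).
path = os.path.join(tempfile.mkdtemp(), 'test.h5')
pd.DataFrame(np.zeros((3, 5), dtype=np.float32)).to_hdf(path, key='test')

# Open the same file with h5py and list what pandas stored under /test.
with h5py.File(path, 'r') as f:
    items = sorted(f['test'].keys())

print(items)
```

The names printed are the pieces pandas stores to rebuild the frame, which is why a bare dataset written by h5py is not recognized by read_hdf.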

(Screenshot: HDFView of test.h5)

So I think your only option for reading in NumPy arrays is to read them in directly (for example with h5py) and then convert them to pandas objects.
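That route might look like the sketch below. It reuses the dataset name 'zeros' from the question; the temporary file path and the variable names are assumptions made for the example.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd

# Write a plain HDF5 dataset with h5py, as in the question.
path = os.path.join(tempfile.mkdtemp(), 'test.h5')
with h5py.File(path, 'w') as h5_file:
    h5_file.create_dataset('zeros', data=np.zeros(shape=(3, 5)), dtype='f')

# Read the dataset back as a NumPy array, then wrap it in a DataFrame.
with h5py.File(path, 'r') as h5_back:
    zeros = h5_back['zeros'][()]  # [()] copies the whole dataset into memory

df = pd.DataFrame(zeros)
print(df.shape)
```

The array is copied out before the file is closed, so the DataFrame stays usable afterwards. Row and column labels are not stored in a file like this, so they have to be supplied when building the DataFrame if they are needed.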

