python - numpy.genfromtxt produces array of what looks like tuples, not a 2D array—why?

asked Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

I'm running genfromtxt as shown below:

import numpy as np

# radii_indices is a list of column indices defined elsewhere
date_conv = lambda x: str(x).replace(":", "/")
time_conv = lambda x: str(x)

a = np.genfromtxt("input.txt", delimiter=',', skip_header=4,
      usecols=[0, 1] + radii_indices, converters={0: date_conv, 1: time_conv})

Where input.txt is from this gist.

When I look at the results, it is a 1D array not a 2D array:

>>> np.shape(a)
(918,)

It seems to be an array of tuples instead:

>>> a[0]
('06/03/2006', '08:27:23', 6.4e-05, 0.000336, 0.001168, 0.002716, 0.004274, 0.004658, 0.003756, 0.002697, 0.002257, 0.002566, 0.003522, 0.004471, 0.00492, 0.005602, 0.006956, 0.008442, 0.008784, 0.006976, 0.003917, 0.001494, 0.000379, 6.4e-05)

If I remove the converters specification from the genfromtxt call it works fine and produces a 2D array:

>>> np.shape(a)
(918, 24)

question from:https://stackoverflow.com/questions/9534408/numpy-genfromtxt-produces-array-of-what-looks-like-tuples-not-a-2d-array-why

1 Answer

深蓝 · Answer 1 · 2021-10-06T05:45:38+0000

What is returned is called a structured ndarray; see e.g. http://docs.scipy.org/doc/numpy/user/basics.rec.html. You get one because your data is not homogeneous, i.e. not all elements have the same type: it contains both strings (the first two columns) and floats. A regular NumPy array has to be homogeneous, with a single dtype shared by every element.

The structured array 'solves' this constraint of homogeneity by storing each row as one record, which is displayed like a tuple. That is why the returned array is 1D: it is a single series of records, but each record (row) consists of several fields, so you can still regard it as rows and columns. The columns are accessible by field name as a['nameofcolumn'], e.g. a['Julian_Day'].
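To make the "1D array of records" idea concrete, here is a minimal sketch that builds a small structured array by hand. The field names (date, time, r0, r1) and the second row are made up for illustration; they are not taken from input.txt.

```python
import numpy as np

# Hypothetical field names and values, for illustration only.
dtype = [("date", "U10"), ("time", "U8"), ("r0", "f8"), ("r1", "f8")]
rows = [
    ("06/03/2006", "08:27:23", 6.4e-05, 0.000336),
    ("06/03/2006", "08:42:10", 7.1e-05, 0.000341),
]
a = np.array(rows, dtype=dtype)

print(a.shape)        # (2,): one record per row, so the array is 1D
print(a.dtype.names)  # ('date', 'time', 'r0', 'r1')
print(a["date"])      # one whole column, selected by field name
print(a[0]["r1"])     # one field of the first record
```

Each field keeps its own type, so a["r0"] is an ordinary float array while a["date"] is an array of strings.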

When you remove the converters for the first two columns, genfromtxt treats all the data as being of the same type, so a normal 2D ndarray is returned. The default type is float (you can change it with the dtype argument), which means that values that cannot be parsed as floats, such as the date and time strings, come back as nan.
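The difference between the two cases can be reproduced on a tiny in-memory file. This is only a sketch: the two sample rows are made up, and it uses dtype=None rather than converters to get mixed column types, which is a different route to the same kind of result. The encoding argument assumes NumPy 1.14 or later.

```python
import io
import numpy as np

# Made-up sample rows; the real input.txt is not reproduced here.
text = (
    "06:03:2006,08:27:23,6.4e-05,0.000336\n"
    "06:03:2006,08:42:10,7.1e-05,0.000341\n"
)

# Only the float columns: every value has the same type, so the result is 2D.
numeric = np.genfromtxt(io.StringIO(text), delimiter=",", usecols=[2, 3])
print(numeric.shape)  # (2, 2)

# Strings and floats together: dtype=None lets genfromtxt choose a type per
# column, which produces a 1D structured array.
mixed = np.genfromtxt(io.StringIO(text), delimiter=",", dtype=None,
                      encoding="utf-8")
print(mixed.shape)        # (2,)
print(mixed.dtype.names)  # auto-generated field names
```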

EDIT: If you want to make use of the column names, you can use the names argument (and set skip_header to only three, so that the header line with the names is read):

a2 = np.genfromtxt("input.txt", delimiter=',', skip_header=3, names=True, dtype=None,
                   usecols=[0, 1] + radii_indices, converters={0: date_conv, 1: time_conv})

Then you can do e.g.:

>>> a2['Dateddmmyyyy']
array(['06/03/2006', '06/03/2006', '18/03/2006', '19/03/2006',
       '19/03/2006', '19/03/2006', '19/03/2006', '19/03/2006',
       '19/03/2006', '19/03/2006'], 
      dtype='|S10')
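If the goal is still a plain 2D float array, one option is to keep the structured array and pull the numeric fields out of it afterwards. The sketch below assumes NumPy 1.16 or later, where numpy.lib.recfunctions.structured_to_unstructured is available; the field names and values are again hypothetical.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical structured array standing in for the genfromtxt result.
a = np.array(
    [("06/03/2006", "08:27:23", 6.4e-05, 0.000336),
     ("06/03/2006", "08:42:10", 7.1e-05, 0.000341)],
    dtype=[("date", "U10"), ("time", "U8"), ("r0", "f8"), ("r1", "f8")],
)

# Select only the float fields, then convert them to an ordinary 2D array.
values = rfn.structured_to_unstructured(a[["r0", "r1"]])
print(values.shape)  # (2, 2)
print(values.dtype)  # float64
```

The date and time columns stay available as a["date"] and a["time"], so nothing is lost by splitting the data this way.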
