Python newbie here. I am comfortable with pandas, but I know next to nothing about arrays.
I have a dataframe df with four columns, namely: Name, colA, Income, colB.
My primary goal is to carry out data analysis on this dataset. The challenge is that two of its columns, colA and colB, contain arrays (see below).
The key thing to know: colA and colB are features extracted from image data.
I want to do the following:
- Convert the arrays in colA and colB to regular columns, like Name and Income.
- Map the converted arrays (the new columns) back to the corresponding row/index number in the original dataframe.
- Assign column names to the new columns, such as colA1, colA2, colA3, colB1, colB2, colB3, colB4, ... (so that one can tell which original column each new column was derived from).
df
index, Name, colA ,Income, colB
1. Peter, [[[3,4],[3,9],[3,0],[2,1]]] , 32100, [[3,4,1,3,1],[1,2,2,2,1],[6,5,0,1,1],[1,2,1,1,1]]
2. John , [[[1,2],[3,5],[1,0],[0,1]]] , 43256, [[5,4,2,3,4],[5,1,2,2,5],[7,5,0,1,2],[4,2,1,1,3]]
3. Mark , [[[5,8],[5,9],[1,0],[1,4]]] , 29811, [[4,4,1,3,2],[6,2,2,2,8],[6,1,0,1,3],[9,2,1,9,9]]
4. Jane , [[[8,4],[1,2],[5,3],[1,8]]] ,134500, [[3,4,7,3,7],[1,2,5,6,2],[6,5,1,3,2],[9,2,3,2,5]]
5. Jill , [[[6,6],[2,1],[1,1],[5,6]]] ,233120, [[5,4,5,3,9],[1,2,5,2,0],[0,5,0,4,2],[1,5,1,6,1]]
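To make the example reproducible, the table above can be rebuilt roughly as follows. This is only a sketch: it assumes the arrays are stored as nested Python lists in object columns, and that the index runs from 1 to 5 as shown. If the real cells hold NumPy arrays, the shapes printed at the end should be the same.

```python
import numpy as np
import pandas as pd

# Assumption: each cell of colA / colB is a nested Python list
# (this is how the values are printed in the table above).
df = pd.DataFrame(
    {
        "Name": ["Peter", "John", "Mark", "Jane", "Jill"],
        "colA": [
            [[[3, 4], [3, 9], [3, 0], [2, 1]]],
            [[[1, 2], [3, 5], [1, 0], [0, 1]]],
            [[[5, 8], [5, 9], [1, 0], [1, 4]]],
            [[[8, 4], [1, 2], [5, 3], [1, 8]]],
            [[[6, 6], [2, 1], [1, 1], [5, 6]]],
        ],
        "Income": [32100, 43256, 29811, 134500, 233120],
        "colB": [
            [[3, 4, 1, 3, 1], [1, 2, 2, 2, 1], [6, 5, 0, 1, 1], [1, 2, 1, 1, 1]],
            [[5, 4, 2, 3, 4], [5, 1, 2, 2, 5], [7, 5, 0, 1, 2], [4, 2, 1, 1, 3]],
            [[4, 4, 1, 3, 2], [6, 2, 2, 2, 8], [6, 1, 0, 1, 3], [9, 2, 1, 9, 9]],
            [[3, 4, 7, 3, 7], [1, 2, 5, 6, 2], [6, 5, 1, 3, 2], [9, 2, 3, 2, 5]],
            [[5, 4, 5, 3, 9], [1, 2, 5, 2, 0], [0, 5, 0, 4, 2], [1, 5, 1, 6, 1]],
        ],
    },
    index=[1, 2, 3, 4, 5],
)

# Shape of one cell in each array column.
shape_a = np.asarray(df.loc[1, "colA"]).shape  # 1 x 4 x 2 -> 8 values per row
shape_b = np.asarray(df.loc[1, "colB"]).shape  # 4 x 5     -> 20 values per row
print(shape_a, shape_b)
```

So each row carries 8 values in colA and 20 values in colB, which matches the number of values per row in the desired output.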
Desired output:
# the new df should look something like the example below, or something more appropriate for data analysis in a pandas dataframe
index, Name, Income, colA1, colA2, colA3, colA4, colA5, ..., colB1, colB2, colB3, colB4, ...
1. Peter, 32100, 3,4,3,9,3,0,2,1, 3,4,1,3,1,1,2,2,2,1,6,5,0,1,1,1,2,1,1,1
2. John , 43256, 1,2,3,5,1,0,0,1, 5,4,2,3,4,5,1,2,2,5,7,5,0,1,2,4,2,1,1,3
3. Mark , 29811, 5,8,5,9,1,0,1,4, 4,4,1,3,2,6,2,2,2,8,6,1,0,1,3,9,2,1,9,9
4. Jane , 134500, 8,4,1,2,5,3,1,8, 3,4,7,3,7,1,2,5,6,2,6,5,1,3,2,9,2,3,2,5
5. Jill , 233120, 6,6,2,1,1,1,5,6, 5,4,5,3,9,1,2,5,2,0,0,5,0,4,2,1,5,1,6,1
Unfortunately, I don't have any trial code to share because I don't know my way around arrays. Thanks in advance for any help.
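For anyone reading along, one possible way to get the desired layout is sketched below. It is not a definitive solution: `expand_array_column` is a hypothetical helper name, and the sketch assumes every cell in a given column holds the same number of elements and that flattening in row-major order (as in the desired output) is acceptable. Only the first two rows of the sample data are used to keep it short.

```python
import numpy as np
import pandas as pd


def expand_array_column(df, col):
    """Hypothetical helper: flatten each array in df[col] into columns col1, col2, ..."""
    # np.asarray accepts nested lists as well as NumPy arrays; ravel() flattens
    # in row-major order. Assumes all cells have the same number of elements.
    flat = np.stack([np.asarray(cell).ravel() for cell in df[col]])
    names = [f"{col}{i}" for i in range(1, flat.shape[1] + 1)]
    # Reusing df.index keeps each new row tied to its original row.
    return pd.DataFrame(flat, columns=names, index=df.index)


# Two-row sample taken from the table in the question.
df = pd.DataFrame(
    {
        "Name": ["Peter", "John"],
        "colA": [
            [[[3, 4], [3, 9], [3, 0], [2, 1]]],
            [[[1, 2], [3, 5], [1, 0], [0, 1]]],
        ],
        "Income": [32100, 43256],
        "colB": [
            [[3, 4, 1, 3, 1], [1, 2, 2, 2, 1], [6, 5, 0, 1, 1], [1, 2, 1, 1, 1]],
            [[5, 4, 2, 3, 4], [5, 1, 2, 2, 5], [7, 5, 0, 1, 2], [4, 2, 1, 1, 3]],
        ],
    },
    index=[1, 2],
)

array_cols = ["colA", "colB"]
expanded = [expand_array_column(df, c) for c in array_cols]

# Drop the original array columns and attach the flattened ones side by side.
result = pd.concat([df.drop(columns=array_cols)] + expanded, axis=1)
print(result.columns.tolist())
```

With this naming, colA yields colA1 through colA8 and colB yields colB1 through colB20, so the prefix shows where each new column came from. Whether 28 separate feature columns are the most useful shape for analysis depends on what the image features represent.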