Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How can we concatenate two columns based on names?

I was working with a MultiIndex DataFrame (which I find unbelievably complicated to work with). I flattened the MultiIndex columns into a single level with this line of code.

df_append.columns = df_append.columns.map('|'.join).str.strip('|')
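
To illustrate what that flattening line does, here is a minimal sketch on a toy frame. The data and the name `df_toy` are made up for illustration; only the flattening line itself comes from the question.

```python
import pandas as pd

# Toy frame with two-level columns; the values are hypothetical.
cols = pd.MultiIndex.from_tuples([
    ("IDRSSD", ""),
    ("RCFD3531", "TRDG ASSETS-US TREAS SECS IN DOM OFF"),
    ("qyear", ""),
])
df_toy = pd.DataFrame([[37, 1.5, "2020Q1"]], columns=cols)

# Join the two levels with '|', then strip the separator that an
# empty second level would otherwise leave dangling.
df_toy.columns = df_toy.columns.map("|".join).str.strip("|")

flat_columns = list(df_toy.columns)
print(flat_columns)
# ['IDRSSD', 'RCFD3531|TRDG ASSETS-US TREAS SECS IN DOM OFF', 'qyear']
```

Note that `str.strip('|')` only removes the separator; any whitespace inside a level name survives the flattening.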

Now, when I print columns, I get this.

Index(['IDRSSD', 'RCFD3531|TRDG ASSETS-US TREAS SECS IN DOM OFF',
       'RCFD3532|TRDG ASSETS-US GOV AGC CORP OBLGS',
       'RCFD3533|TRDG ASSETS-SECS ISSD BY ST  POL SUB',
       'TEXTF660|3RD ITEMIZED AMT FOR OTHR TRDG ASTS',
       'Unnamed: 115_level_0|Unnamed: 115_level_1',
       'Unnamed: 133_level_0|Unnamed: 133_level_1',
       'Unnamed: 139_level_0|Unnamed: 139_level_1',
       'Unnamed: 20_level_0|Unnamed: 20_level_1',
       'Unnamed: 87_level_0|Unnamed: 87_level_1', 'file', 'schedule_code',
       'year', 'qyear'],
      dtype='object', length=202)

I am trying to concatenate two columns into one single column, like this.

df_append['period'] = df_append['IDRSSD'].astype(str) + '-' + df_append['qyear'].astype(str)

Here is the error that I am seeing.

Traceback (most recent call last):

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)

  File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'IDRSSD'


The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "<ipython-input-153-92d2e8486595>", line 1, in <module>
    df_append['period'] = df_append['IDRSSD'].astype(str) + '-' + df_append['qyear'].astype(str)

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2902, in __getitem__
    indexer = self.columns.get_loc(key)

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
    raise KeyError(key) from err

KeyError: 'IDRSSD'

To me, it looks like I have a column named 'IDRSSD' and a column named 'qyear', but Python disagrees. Or, perhaps I am misinterpreting the error message. Can I get these two columns concatenated into one, or is this impossible to do? Thanks everyone.
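
One way to check whether the label really matches is to test membership and look at the `repr()` of each column name, since printed output can hide stray whitespace. The sketch below assumes a hypothetical frame, `df_demo`, whose column name carries a trailing space; the question does not establish that this is the actual cause, so treat it as a diagnostic pattern rather than a diagnosis.

```python
import pandas as pd

# Hypothetical frame: 'IDRSSD ' has a trailing space that is easy to miss.
df_demo = pd.DataFrame({"IDRSSD ": [37, 242], "qyear": ["2020Q1", "2020Q2"]})

# The exact label is not present, which is what triggers a KeyError.
has_exact = "IDRSSD" in df_demo.columns

# repr() makes hidden characters in near-matching names visible.
near_matches = [repr(c) for c in df_demo.columns if "IDRSSD" in str(c)]

# Stripping whitespace from every string label lets the lookup succeed.
df_demo.columns = [c.strip() if isinstance(c, str) else c for c in df_demo.columns]
df_demo["period"] = df_demo["IDRSSD"].astype(str) + "-" + df_demo["qyear"].astype(str)
periods = df_demo["period"].tolist()
```

It is also worth confirming that the frame being printed is the same object, in the same state, as the one being indexed when the error is raised.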

Question from: https://stackoverflow.com/questions/65852903/how-can-we-concatenate-two-columns-based-on-names


1 Answer


I tried the method below. It worked for me.

1.) First, convert both columns to strings:

df_append['IDRSSD'] = df_append['IDRSSD'].astype(str)
df_append['qyear'] = df_append['qyear'].astype(str)

2.) Now join both columns into one, using '-' as the separator:

df_append['period'] = df_append[['IDRSSD', 'qyear']].apply(lambda x: '-'.join(x), axis=1)
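
The two steps above can be sketched end to end on a toy frame. The sample values are hypothetical, and the `str.cat` line is an alternative added here for comparison, not part of the original answer.

```python
import pandas as pd

# Toy stand-in for df_append; the values are made up for illustration.
df_append = pd.DataFrame({"IDRSSD": [37, 242], "qyear": ["2020Q1", "2020Q2"]})

# Step 1: convert both columns to strings.
df_append["IDRSSD"] = df_append["IDRSSD"].astype(str)
df_append["qyear"] = df_append["qyear"].astype(str)

# Step 2: join the two columns row by row with '-' as the separator.
df_append["period"] = df_append[["IDRSSD", "qyear"]].apply(lambda x: "-".join(x), axis=1)

# Alternative (not from the answer): vectorized concatenation with str.cat.
period_alt = df_append["IDRSSD"].str.cat(df_append["qyear"], sep="-")

print(df_append["period"].tolist())
# ['37-2020Q1', '242-2020Q2']
```

Either form still looks the columns up by name, so it raises the same KeyError if the label is not actually present in `df_append.columns`.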


