I am fairly new to Dask. I have a large CSV file and a large list, and the number of rows in the CSV equals the length of the list. I am trying to create a new column in a Dask dataframe from that list. In pandas this is straightforward, but in Dask I am having a hard time creating the new column. I am avoiding pandas because my data is 15GB+.
Please see my attempts below.
CSV data:
name,text,address
john,some text here,MD
tim,some text here too,WA
Code tried:
import dask.dataframe as dd
import numpy as np
ls = ['one','two']
ddf = dd.read_csv('../data/test.csv')
ddf.head()
Try #1:
ddf['new'] = ls # TypeError: Column assignment doesn't support type list
Try #2: What should be passed here for condlist?
ddf['new'] = np.select(choicelist=ls) # TypeError: _select_dispatcher() missing 1 required positional argument: 'condlist'
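On the `condlist` question in Try #2: as far as I understand it, `np.select` expects `condlist` to be a list of boolean arrays, one per entry in `choicelist`, and it picks a choice for each element based on which condition is true. That makes it a tool for conditional labelling rather than for attaching an existing list as a column. A minimal sketch, where the array `x` and the labels are made-up values and not part of my data:

```python
import numpy as np

# Hypothetical sample values, unrelated to the CSV above.
x = np.array([1, 5, 10])

# condlist: one boolean array per entry in choicelist, each the same length as x.
condlist = [x < 3, x > 7]
choicelist = ['low', 'high']

# Elements that match no condition receive the default value.
labels = np.select(condlist, choicelist, default='mid')
print(labels.tolist())  # ['low', 'mid', 'high']
```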
Looking for this output:
   name                text address  new
0  john      some text here      MD  one
1   tim  some text here too      WA  two