Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
407 views
in Technique[技术] by (71.8m points)

python - Convert Pandas dataframe to Dask dataframe

Suppose I have pandas dataframe as:

df=pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})

When I convert it into dask dataframe what should name and divisions parameter consist of:

from dask import dataframe as dd 
sd=dd.DataFrame(df.to_dict(),divisions=1,meta=pd.DataFrame(columns=df.columns,index=df.index))

TypeError: init() missing 1 required positional argument: 'name'

Edit : Suppose I create a pandas dataframe like:

pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})

Similarly how to create dask dataframe as it needs three additional arguments as name,divisions and meta.

sd=dd.Dataframe({'a':[1,2,3],'b':[4,5,6]},name=,meta=,divisions=)

Thank you for your reply.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I think you can use dask.dataframe.from_pandas:

from dask import dataframe as dd 
sd = dd.from_pandas(df, npartitions=3)
print (sd)
dd.DataFrame<from_pa..., npartitions=2, divisions=(0, 1, 2)>

EDIT:

I find solution:

import pandas as pd
import dask.dataframe as dd
from dask.dataframe.utils import make_meta

df=pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})

dsk = {('x', 0): df}

meta = make_meta({'a': 'i8', 'b': 'i8'}, index=pd.Index([], 'i8'))
d = dd.DataFrame(dsk, name='x', meta=meta, divisions=[0, 1, 2])
print (d)
dd.DataFrame<x, npartitions=2, divisions=(0, 1, 2)>

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...