python - How can I filter lines on load in Pandas read_csv function?

1 Answer

深蓝 · Answer 1 · 2021-10-16T21:24:13+0000

read_csv has no option to filter rows by their values before the CSV file is loaded into a pandas object.

You can either load the whole file and then filter it with df[df['field'] > constant] or, if the file is very large and you are worried about running out of memory, read it in chunks and apply the filter to each chunk before concatenating, e.g.:

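import pandas as pd
constant = 10  # placeholder threshold; replace with your own value
# chunksize makes read_csv return an iterator of DataFrames rather than one DataFrame
iter_csv = pd.read_csv('file.csv', chunksize=1000)
# filter each chunk as it is read, then combine only the rows that passed
df = pd.concat([chunk[chunk['field'] > constant] for chunk in iter_csv])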

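You can vary the chunksize to suit your available memory: larger chunks mean fewer iterations but more rows held in memory at once. See the pandas I/O documentation on iterating through files chunk by chunk for more details.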
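If you need this pattern in more than one place, it can be wrapped in a small helper. The following is only a sketch: read_csv_filtered, keep, and the sample data are hypothetical names made up for illustration and are not part of pandas. It assumes the filter can be written as a function that takes a chunk and returns a boolean mask.

```python
import io

import pandas as pd


def read_csv_filtered(source, keep, chunksize=1000, **read_csv_kwargs):
    """Read a CSV chunk by chunk, keeping only rows where keep(chunk) is True.

    `keep` is a hypothetical callback: it receives each chunk (a DataFrame)
    and must return a boolean mask with one entry per row.
    """
    reader = pd.read_csv(source, chunksize=chunksize, **read_csv_kwargs)
    kept = [chunk[keep(chunk)] for chunk in reader]
    if not kept:
        # No chunks were produced, so there is nothing to concatenate.
        return pd.DataFrame()
    # ignore_index=True renumbers the rows 0..n-1 in the combined result.
    return pd.concat(kept, ignore_index=True)


# Made-up in-memory data standing in for 'file.csv'.
sample = io.StringIO("field,name\n1,a\n5,b\n3,c\n8,d\n")
result = read_csv_filtered(sample, lambda chunk: chunk["field"] > 2, chunksize=2)
print(result)
```

Only one chunk plus the rows kept so far are held in memory at any time, so the peak usage depends on chunksize and on how selective the filter is, not on the size of the file.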
