Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


CSV content filtering by list elements in Python

I am stuck trying to get the right result from a simple piece of Python code (I am a Python beginner). Given a CSV input file (ListInput.csv): pKT, pET, pUT,

and another csv file which contains features of many of these elements (Table.csv):

pBR,156,AATGGT,673,HHHTTTT,
pUT,54,CCATGTACCTAT,187,PRPTP,
pHTM,164,GGTATAG,971,WYT,
pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,

............ and so on

I aim to get a selection based on the first csv file elements in order to get a csv file as output (WorkingList.txt), in this case the expected result would be:

pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,
pUT,54,CCATGTACCTAT,187,PRPTP,

I wrote the following script, which does not give any errors but ends up with an empty output file. I have been trying to understand why for a couple of days with no success. Any help is greatly appreciated.

#!/usr/bin/python
import csv

v = open('ListInput.csv', 'rt')
csv_v = csv.reader(v)

vt = open('Table.csv', 'rt')
csv_vt = csv.reader(vt)

with open("WorkingList.txt", "a+t") as myfile:
    pass


for el in csv_v:
    for var in csv_vt:
        if el == var[0]:
            myfile.write(var)

myfile.close()


1 Answer


First problem:

The inner loop consumes the csv_vt iterator completely during the first iteration of the outer loop, so every later iteration finds nothing left to read. You need to call:

vt.seek(0)

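to rewind the file before each run of the inner loop. This leaves an O(n^2) search algorithm, but at least it reads the table every time. Note that the comparison also has to use el[0] rather than el, because each row returned by csv.reader is a list.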
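A small sketch of why the inner loop only runs once. It uses an in-memory `io.StringIO` object as a stand-in for Table.csv (an assumption made so the example is self-contained); the rows are taken from the question.

```python
import csv
import io

# In-memory stand-in for Table.csv (rows taken from the question).
table = io.StringIO("pUT,54,CCATGTACCTAT,187,PRPTP,\npKT,12,GCATACAGGAC,349,,\n")
reader = csv.reader(table)

first_pass = [row[0] for row in reader]   # consumes the iterator
second_pass = [row[0] for row in reader]  # nothing left to read

table.seek(0)                             # rewind the underlying file
third_pass = [row[0] for row in reader]   # rows are available again

print(first_pass, second_pass, third_pass)
```

The second pass is empty because the reader has already reached the end of the file; only after `seek(0)` does it yield rows again.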

Second problem:

You are opening and closing myfile in the with block. When you reach your for loop, myfile is already closed because you have left the with block (that is the guarantee the with block gives you).

If you had not had the first problem, you would have run into an "operation on closed file" error when trying to write the output.

I'd rewrite the last part within the with block and remove the close().
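To illustrate the second problem, here is a minimal sketch that writes to a scratch file in a temporary directory (the path is hypothetical and only used for this demonstration):

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "WorkingList.txt")

with open(path, "w") as myfile:
    myfile.write("inside the block\n")   # fine: the file is open here

closed_after_block = myfile.closed        # the with block has closed the file

try:
    myfile.write("outside the block\n")   # the file is no longer open
    error = None
except ValueError as exc:
    error = str(exc)                      # message mentions the closed file

print(closed_after_block, error)
```

Any write has to happen while the with block is still active, which is why the loop belongs inside it.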

Third problem:

You cannot write a list to a file directly; you have to create a csv.writer object first and let it format each row.
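A short sketch of the difference, again using `io.StringIO` as a stand-in for the output file. The `lineterminator="\n"` argument is a choice made here to keep the output easy to read; the default of `csv.writer` is `"\r\n"`.

```python
import csv
import io

# One parsed row, as csv.reader would return it for the pKT line.
row = ["pKT", "12", "GCATACAGGAC", "349", "", ""]

out = io.StringIO()          # in-memory stand-in for the output file
try:
    out.write(row)           # write() expects a string, not a list
    write_error = None
except TypeError as exc:
    write_error = exc

writer = csv.writer(out, lineterminator="\n")
writer.writerow(row)         # joins the fields with commas and adds the line ending
result = out.getvalue()

print(repr(write_error), repr(result))
```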

So to sum it up, you could solve all problems plus the performance problem with the following code:

#!/usr/bin/python
import csv

v = open('ListInput.csv', 'rt')
csv_v = csv.reader(v)

with open('Table.csv', 'rt') as vt:
    csv_vt = csv.reader(vt)
    # create a dictionary to speed up lookup
    # read the table only once
    vdict = {var[0]:var for var in csv_vt}

with open("WorkingList.txt", "w", newline="") as myfile:  # for Python 3.x
## with open("WorkingList.txt", "wb") as myfile:  # for Python 2
    cw = csv.writer(myfile)
    for el in csv_v:
        # el is a list of fields; skip blank rows and look up its first field
        if el and el[0] in vdict:
            cw.writerow(vdict[el[0]])

v.close()

vdict is the lookup table that replaces your inner loop. It only works if the "keys" (the first column of Table.csv) are unique, which seems to be the case given your input samples.
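If the keys were not unique, a dictionary comprehension would silently keep only the last row for each key. One possible variation, which is not part of the answer above, groups the rows per key with `collections.defaultdict`. The second pKT row below is invented purely to show a duplicate, and `rows_by_key`, `wanted`, and `selected` are names chosen for this sketch.

```python
import csv
import io
from collections import defaultdict

# Stand-in for Table.csv; the second pKT row is made up to show a duplicate key.
table = io.StringIO(
    "pKT,12,GCATACAGGAC,349,,\n"
    "pET,87,GTGACGGTA,506,PPMK,\n"
    "pKT,99,AAAA,1,,\n"
)

rows_by_key = defaultdict(list)
for var in csv.reader(table):
    if var:                              # skip blank lines
        rows_by_key[var[0]].append(var)  # keep every row for this key

wanted = ["pKT", "pUT"]                  # pUT is absent from this small table
selected = [row for key in wanted for row in rows_by_key.get(key, [])]

for row in selected:
    print(row)
```

Using `.get(key, [])` avoids creating empty entries for keys that are not in the table.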

