Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Iterating over a file object in Python does not work, but readlines() does (though it is inefficient)

In the following code, if I use:

for line in fin:

the inner loop only does any work for 'a'.

But if I use:

wordlist = fin.readlines()
for line in wordlist:

then it runs for every letter from a through z.

But readlines() reads the whole file into memory at once, which I don't want.

How can I avoid this?

def avoids():
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    num_words = {}

    fin = open('words.txt')

    for char in alphabet:
      num_words[char] = 0
      for line in fin:
        not_found = True
        word = line.strip()
        if word.lower().find(char.lower()) != -1:
          num_words[char] += 1
    fin.close()
    return num_words

1 Answer


The loop for line in fin can only be run once per open file object. After the first pass the file is exhausted, and you can't read it again unless you reset the file pointer with fin.seek(0). That is why only 'a' gets counted: the inner loop has nothing left to read for 'b' onward. By contrast, fin.readlines() gives you a list, which you can iterate over as many times as you like.
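As a rough illustration of the exhaustion and the seek(0) reset described above, here is a minimal sketch. It uses io.StringIO as a stand-in for open('words.txt') so it runs without a file, and the function name avoids_with_seek and the sample words are made up for this example.

```python
import io

# io.StringIO stands in for open('words.txt'); the three words are sample data.
fin = io.StringIO("apple\nbanana\ncherry\n")

first_pass = [line.strip() for line in fin]    # reads every line
second_pass = [line.strip() for line in fin]   # file is exhausted: nothing left

fin.seek(0)                                    # rewind to the start of the file
third_pass = [line.strip() for line in fin]    # reads every line again


def avoids_with_seek(fin, alphabet='abcdefghijklmnopqrstuvwxyz'):
    """Hypothetical variant of the question's function that rewinds per letter."""
    num_words = {}
    for char in alphabet:
        num_words[char] = 0
        fin.seek(0)                            # reset before each pass over the file
        for line in fin:
            if char in line.strip().lower():
                num_words[char] += 1
    return num_words


counts = avoids_with_seek(fin)
fin.close()
```

This keeps memory use low, but it still reads the file once per letter (26 passes), so a single-pass approach is usually the better fix.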


I think a simple refactor with Counter (Python 2.7+) could save you this headache, because it needs only one pass over the file:

from collections import Counter
with open('file') as fin:
    result = Counter()
    for line in fin:
        result += Counter(set(line.strip().lower()))

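This counts, for each character, the number of words in your file (one word per line) that contain it, which I believe is what your original code does. Please correct me if I'm wrong. Note that it counts every character that appears, not just a through z, and letters that never appear are simply absent from the result.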
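To make the role of set() concrete, here is a small sketch of the same loop over an in-memory list of lines (the sample words are made up for illustration). Each word contributes at most 1 to any character, however many times that character occurs in the word.

```python
from collections import Counter

# Sample lines standing in for the contents of the file, one word per line.
lines = ["Apple\n", "banana\n", "cherry\n"]

result = Counter()
for line in lines:
    # set() removes repeated characters, so "banana" adds 1 to 'a', not 3.
    result += Counter(set(line.strip().lower()))

words_with_a = result['a']   # "apple" and "banana"
words_with_p = result['p']   # "apple" only, counted once despite the double p
words_with_z = result['z']   # a Counter returns 0 for keys it has never seen
```

Calling result.update(set(...)) instead of result += Counter(...) should give the same counts without building a temporary Counter for each line.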

You could also do this easily with a defaultdict (Python 2.5+):

from collections import defaultdict
with open('file') as fin:
    result = defaultdict(int)
    for line in fin:
        chars = set(line.strip().lower())
        for c in chars:
            result[c] += 1

And finally, kicking it old-school with a plain dict and setdefault (I don't even know when setdefault was introduced):

fin = open('file')
result = dict()
for line in fin:
    chars = set(line.strip().lower())
    for c in chars:
        result[c] = result.setdefault(c,0) + 1

fin.close()
