Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
172 views
in Technique[技术] by (71.8m points)

python - I want to run a for loop that creates a list, then loops to scan for changes to it in a 2nd list

I have a pre-existing condition so I am trying to find a vaccine, many of the websites say "check back for updates" this checks for changes in the site hash and sends an email when a change is detected. I found an old python 2 script that checked for a change in a site, it would print that to the console, so I adapted it to Python 3 (successfully!) and added code to send an email (successfully!) but now I want it to do the whole thing to a list of URLs, and I am struggling (this is the first real piece of code I have ever worked on, I'm halfway through the Udemy Python Masterclass).

I want it to iterate through a list of URLs with a for loop, put all of the url hashs into a list, then wait 60 seconds, do it again, and compare the 2 lists, if one of the sites has a change, I want that URL emailed to me. I know I am close but I would truly appreciate some help <3 I think it all works except for the code around the 2 #TODO comments.

def get_hash(current_url):  # this function works
    opener = urllib.request.build_opener()
    opener.add_headers = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11')]
    response = opener.open(current_url)
    the_page = response.read()
    return hashlib.sha224(the_page).hexdigest()


current_hash = get_hash()  # This works - Get the current hash, which is what the website is now.


while True:  # run forever, I am struggling below here...
    for provider in providers_list:  # Loops through URL list
        provider_url = provider  # set the current URL to a variable
        provider_hash_1 = get_hash(provider_url)  # set changed provider url to variable

        # TODO: define the old vs new somehow

        providers_hash_1 = []
        providers_hash_2 = []

        # TODO: I'm STRUGGLING

        # email everyone in list the provider url, everything after this works.
        if provider_hash_2 != provider_hash_1:  # get hash of url(i) != previous_hash(i)
            print("{0} changed at {1}".format(provider_url, datetime.datetime.now()))
            # send email
            context = ssl.create_default_context()  # creates secure SSL context
            email_message = """
            Subject: Stay Healthy!

            Try {}""".format(provider_url)
            with smtplib.SMTP_SSL("smtp.gmail.com", 465, context=context) as server:
                server.login("[email protected]", "PASSWORD")
                server.sendmail("[email protected]", "[email protected]",
                                email_message)

    time.sleep(60)

NEW CODE BELOW

providers_dict = {
    'provider_url':
        [
            "https://vaccine.heb.com/",
            "https://request.austinregionalclinic.com/",
            "https://www.haysinformed.com/covid-19",
            "https://www.hillcountrymemorial.org/covid-19-updates/",
            "http://www.tsfm.cc/",
            "https://www.bellcountyhealth.org/covid-19_vaccine/index.php",
            "https://www.brookshirebrothers.com/covid-19",
            "https://www.dshs.texas.gov/coronavirus/immunize/vaccine-hubs.aspx",
            "http://www.umhtx.org/patient/covid-10-update/"
        ],
    'provider_cache': [],
    'provider_hash': []
}

provider_cache = {provider_url: None for provider_url in providers_dict}


def get_hash(current_url):
    opener = urllib.request.build_opener()
    opener.add_headers = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11')]
    response = opener.open(current_url)
    the_page = response.read()
    return hashlib.sha224(the_page).hexdigest()


while True:
    for provider_url, provider_hash in provider_cache.items():
        current_hash = get_hash(provider_url)  # Get the current hash, which is what the website is now

        # email everyone in list the provider url
        if provider_hash != current_hash:  # get hash of url(i) != previous_hash(i)
            provider_cache[provider_url] = current_hash
            print("{0} changed at {1}".format(provider_url, datetime.datetime.now()))

            # send email
            

    time.sleep(60)
question from:https://stackoverflow.com/questions/65931707/i-want-to-run-a-for-loop-that-creates-a-list-then-loops-to-scan-for-changes-to

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

So what you need is some sort of cache to store hashes in. I would suggest that the provider_list and providers_hash_1 & 2 are exchanged for one dictionary.

Say for instance you have a list of providers like:

providers = ["URL://provider1.com", "URL://provider2.com", ...]

Then before entering your end-less while you could do something like the following for initialisation:

provider_cache = {provider_url: None for provider_url in providers}

In the inner for loop, you can now iterate over the items of the dictionary like:

for provider_url, provider_hash in provider_cache.items():
    current_hash = get_hash(provider_url)

    if provider_hash != current_hash:
        provider_cache[provider_url] = current_hash

        [And do whatever else you need to do for notifications, etc.]

Hope this eases the task of making a notification system. Did not want to type up the full solution but rather the headlines.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...