Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

twitter - tweepy Streaming API : full text

I am using the tweepy Streaming API to collect tweets containing a particular hashtag. The problem is that I cannot extract the full text of a tweet from the Streaming API: only the first 140 characters are available, and the rest is truncated.

Here is the code:

from datetime import datetime
from time import sleep

import tweepy

# CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET
# are the Twitter app credentials, defined elsewhere.
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)


def analyze_status(text):
    # True if the tweet looks like a retweet ("RT" within the first three characters)
    return 'RT' in text[0:3]


class MyStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        # Skip retweets; save and print everything else
        if not analyze_status(status.text):
            with open('fetched_tweets.txt', 'a', encoding='utf-8') as tf:
                tf.write(status.text + '\n\n')

            print(status.text)

    def on_error(self, status_code):
        # status_code is an int, so convert it before concatenating
        print("Error Code : " + str(status_code))


def test_rate_limit(api, wait=True, buffer=.1):
    """
    Tests whether the rate limit of the last request has been reached.
    :param api: The `tweepy` api instance.
    :param wait: A flag indicating whether to wait for the rate limit reset
                 if the rate limit has been reached.
    :param buffer: A buffer time in seconds that is added on to the waiting
                   time as an extra safety margin.
    :return: True if it is ok to proceed with the next request. False otherwise.
    """
    # Get the number of remaining requests
    # (assumes tweepy 3.x, where last_response is a requests.Response)
    headers = api.last_response.headers
    remaining = int(headers['x-rate-limit-remaining'])
    # Check if we have reached the limit
    if remaining == 0:
        limit = int(headers['x-rate-limit-limit'])
        reset = int(headers['x-rate-limit-reset'])
        # Convert the epoch timestamp to a local datetime
        reset = datetime.fromtimestamp(reset)
        # Let the user know we have reached the rate limit
        print("0 of {} requests remaining until {}.".format(limit, reset))

        if wait:
            # Determine the delay and sleep
            delay = (reset - datetime.now()).total_seconds() + buffer
            print("Sleeping for {}s...".format(delay))
            sleep(delay)
            # We have waited for the rate limit reset. OK to proceed.
            return True
        else:
            # We have reached the rate limit. The user needs to handle the rate limit manually.
            return False

    # We have not reached the rate limit
    return True


myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener,
                         tweet_mode='extended')

# `async` is a reserved word since Python 3.7; tweepy 3.7+ names this argument `is_async`
myStream.filter(track=['#bitcoin'], is_async=True)

Does anyone have a solution?


1 Answer

tweet_mode=extended will have no effect in this code, since the Streaming API does not support that parameter. If a Tweet contains longer text, it will contain an additional object in the JSON response called extended_tweet, which will in turn contain a field called full_text.

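In that case, you'll want something like print(status.extended_tweet['full_text']) to extract the longer text. tweepy exposes extended_tweet as a plain dict, so full_text is read by key rather than as an attribute. Tweets of 140 characters or fewer carry no extended_tweet at all, so fall back to status.text for those.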
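As a rough sketch of that lookup, the helper below returns the long text when it is present and the short text otherwise. The name get_full_text is hypothetical (it is not part of tweepy), and the retweet branch is an extra assumption: it relies on the payload carrying the original tweet under retweeted_status. The SimpleNamespace objects only stand in for tweepy Status objects so the snippet runs without a Twitter connection.

```python
from types import SimpleNamespace


def get_full_text(status):
    """Return the untruncated text of a streamed tweet (hypothetical helper)."""
    # For a retweet, the complete original text lives on retweeted_status.
    if hasattr(status, "retweeted_status"):
        status = status.retweeted_status
    # extended_tweet is only present when the tweet exceeds 140 characters;
    # it is a plain dict, hence the key lookup.
    extended = getattr(status, "extended_tweet", None)
    if extended:
        return extended["full_text"]
    return status.text


# Stand-in objects that mimic the attributes of a tweepy Status
short = SimpleNamespace(text="short tweet")
long_ = SimpleNamespace(
    text="a truncated te...",
    extended_tweet={"full_text": "a truncated text, now complete"},
)

print(get_full_text(short))
print(get_full_text(long_))
```

Inside on_status, the same helper could replace the direct use of status.text, for example text = get_full_text(status), before writing to the file.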

