Python scrape.Bill Class Code Examples


This article collects typical usage examples of the pupa.scrape.Bill class in Python. If you are wondering what the Bill class does, how to use it, or what real-world usage looks like, the curated examples below should help.



Twenty code examples of the Bill class are shown below, sorted by popularity by default. You can upvote the examples you find useful; your feedback helps the system recommend better Python code samples.
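
Before diving into the examples, here is a minimal sketch of the workflow they all share: construct a Bill with its required identifier, legislative session, and title, attach at least one source, and yield it from a scraper. The session name, sponsor, abstract, and URL below are placeholder assumptions for illustration only; they are not taken from any of the projects cited in the examples.

from pupa.scrape import Bill

def minimal_bill_example():
    # identifier, legislative_session and title are the core Bill fields
    bill = Bill(
        identifier='HB 1',
        legislative_session='2017',        # assumed session name
        title='An example act',
        chamber='lower',
        classification='bill',
    )
    # every bill needs at least one source URL (placeholder URL here)
    bill.add_source('https://example.com/HB1')
    # optional enrichments, mirrored in the examples below
    bill.add_sponsorship('Jane Doe', classification='primary',
                         entity_type='person', primary=True)
    bill.add_abstract('A short synopsis.', note='synopsis')
    yield bill

The examples that follow exercise this same API from real scrapers and importer tests in the pupa and openstates ecosystems.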

Example 1: test_fix_bill_id

def test_fix_bill_id():
    j = create_jurisdiction()
    j.legislative_sessions.create(name='1900', identifier='1900')

    org1 = ScrapeOrganization(name='House', classification='lower')
    bill = ScrapeBill('HB 1', '1900', 'Test Bill ID',
                      classification='bill', chamber='lower')

    oi = OrganizationImporter('jid')

    oi.import_data([org1.as_dict()])

    from pupa.settings import IMPORT_TRANSFORMERS
    IMPORT_TRANSFORMERS['bill'] = {
        'identifier': lambda x: re.sub(r'([A-Z]*)\s*0*([-\d]+)', r'\1 \2', x, 1)
    }

    bi = BillImporter('jid', oi, DumbMockImporter())
    bi.import_data([bill.as_dict()])

    ve = ScrapeVoteEvent(legislative_session='1900', motion_text='passage',
                         start_date='1900-04-02', classification='passage:bill',
                         result='fail', bill_chamber='lower', bill='HB1',
                         identifier='4',
                         bill_action='passage',
                         organization=org1._id)

    VoteEventImporter('jid', DumbMockImporter(), oi, bi).import_data([
        ve.as_dict(),
    ])

    IMPORT_TRANSFORMERS['bill'] = {}

    ve = VoteEvent.objects.get()
    assert ve.bill.identifier == 'HB 1'
Author: opencivicdata | Project: pupa | Lines: 35 | Source file: test_vote_event_importer.py


Example 2: scrape_bill

    def scrape_bill(self, chamber, session, bill_id, session_id):
        """
        Scrapes documents, actions, vote counts and votes for
        a given bill.
        """
        bill_json_url = 'https://apps.azleg.gov/api/Bill/?billNumber={}&sessionId={}&' \
                        'legislativeBody={}'.format(bill_id, session_id, self.chamber_map[chamber])
        response = self.get(bill_json_url)
        # print(response.content)
        page = json.loads(response.content.decode('utf-8'))

        bill_title = page['ShortTitle']
        bill_id = page['Number']
        internal_id = page['BillId']
        bill_type = self.get_bill_type(bill_id)
        bill = Bill(
            bill_id,
            legislative_session=session,
            chamber=chamber,
            title=bill_title,
            classification=bill_type,
        )

        bill = self.scrape_actions(bill, page, chamber)
        bill = self.scrape_versions(bill, internal_id)
        bill = self.scrape_sponsors(bill, internal_id)
        bill = self.scrape_subjects(bill, internal_id)

        bill_url = 'https://apps.azleg.gov/BillStatus/BillOverview/{}?SessionId={}'.format(
                    internal_id, session_id)
        bill.add_source(bill_url)

        bill = self.sort_bill_actions(bill)

        yield bill
Author: neelneelpurk | Project: openstates | Lines: 35 | Source file: bills.py


Example 3: get_bill

    def get_bill(self, bill_id, **kwargs):
        url = kwargs.pop('url')
        agenda_item = kwargs.pop('agenda_item')
        _type = self.get_type(bill_id)
        bill = Bill(bill_id, self.session, type=_type, **kwargs)
        bill.add_source(url, note='detail')
        return bill
Author: ChaelCodes | Project: scrapers-us-municipal | Lines: 7 | Source file: bills.py


Example 4: scrape

    def scrape(self):

        for i, page in enumerate(self.searchLegislation()) :
            for legislation_summary in self.parseSearchResults(page) :
                title = legislation_summary['Title'].strip()
                if title == "":
                    continue

                if legislation_summary['Type'].lower() in ('order', 
                                                           'claim', 
                                                           'communication', 
                                                           'report', 
                                                           'oath of office') :
                    continue
                else :
                    bill_type = legislation_summary['Type'].lower()

                bill_session = self.session(legislation_summary['Intro\xa0Date'])

                bill = Bill(identifier=legislation_summary['Record #'],
                            legislative_session=bill_session,
                            title=title,
                            classification=bill_type,
                            from_organization=self.jurisdiction.name)

                bill.add_source(legislation_summary['url'])

                bill, votes = self.addDetails(bill, legislation_summary['url'])

                yield bill
                for vote in votes :
                    yield vote
Author: rshorey | Project: scrapers-us-municipal | Lines: 32 | Source file: bills.py


Example 5: scrape_bill

    def scrape_bill(self, chamber, session, bill_id, session_id):
        bill_json_url = 'https://apps.azleg.gov/api/Bill/?billNumber={}&sessionId={}&' \
                        'legislativeBody={}'.format(bill_id, session_id, self.chamber_map[chamber])
        response = self.get(bill_json_url)
        page = json.loads(response.content.decode('utf-8'))

        if not page:
            self.warning('null page for %s', bill_id)
            return

        bill_title = page['ShortTitle']
        bill_id = page['Number']
        internal_id = page['BillId']
        bill_type = self.get_bill_type(bill_id)
        bill = Bill(
            bill_id,
            legislative_session=session,
            chamber=chamber,
            title=bill_title,
            classification=bill_type,
        )

        self.scrape_actions(bill, page, chamber)
        self.scrape_versions_and_documents(bill, internal_id)
        self.scrape_sponsors(bill, internal_id)
        self.scrape_subjects(bill, internal_id)
        yield from self.scrape_votes(bill, page)

        bill_url = 'https://apps.azleg.gov/BillStatus/BillOverview/{}?SessionId={}'.format(
                    internal_id, session_id)
        bill.add_source(bill_url)

        self.sort_bill_actions(bill)

        yield bill
Author: sunlightlabs | Project: openstates | Lines: 35 | Source file: bills.py


Example 6: test_bill_sponsor_by_identifier

def test_bill_sponsor_by_identifier():
    create_jurisdiction()
    org = create_org()

    bill = ScrapeBill('HB 1', '1900', 'Axe & Tack Tax Act',
                      classification='tax bill', chamber='lower')
    bill.add_sponsorship_by_identifier(name="SNODGRASS",
                                       classification='sponsor',
                                       entity_type='person',
                                       primary=True,
                                       identifier="TOTALLY_REAL_ID",
                                       scheme="TOTALLY_REAL_SCHEME")

    oi = OrganizationImporter('jid')
    pi = PersonImporter('jid')

    zs = ScrapePerson(name='Zadock Snodgrass')
    zs.add_identifier(identifier='TOTALLY_REAL_ID',
                      scheme='TOTALLY_REAL_SCHEME')
    pi.import_data([zs.as_dict()])
    za_db = Person.objects.get()
    Membership.objects.create(person_id=za_db.id,
                              organization_id=org.id)

    BillImporter('jid', oi, pi).import_data([bill.as_dict()])

    obj = Bill.objects.get()
    (entry,) = obj.sponsorships.all()
    assert entry.person.name == "Zadock Snodgrass"
Author: rshorey | Project: pupa | Lines: 29 | Source file: test_bill_importer.py


Example 7: get_bill

    def get_bill(self, bill_id, **kwargs):
        if bill_id == '1':
            assert kwargs == {'extra': 'param'}
            raise self.ContinueScraping
        else:
            assert bill_id == '2'
            assert kwargs == {}
            b = Bill('1', self.session, 'title')
            b.add_source('http://example.com')
            return b
Author: paultag | Project: pupa | Lines: 10 | Source file: test_scraper.py


Example 8: toy_bill

def toy_bill():
    b = Bill(
        identifier="HB 2017",
        legislative_session="2012A",
        title="A bill for an act to raise the cookie budget by 200%",
        from_organization="Foo Senate",
        classification="bill",
    )
    b.add_source("http://uri.example.com/", note="foo")
    return b
Author: pombredanne | Project: pupa | Lines: 10 | Source file: test_bill_scrape.py


Example 9: scrape_bill

    def scrape_bill(self, row, chamber, session):
        bill_id = row['LegislationNumber']

        # TODO: re-evaluate if these should be separate bills
        if 'SA' in bill_id or 'HA' in bill_id:
            self.warning('skipping amendment %s', bill_id)
            return

        bill_type = self.classify_bill(bill_id)
        bill = Bill(identifier=bill_id,
                    legislative_session=session,
                    chamber=chamber,
                    title=row['LongTitle'],
                    classification=bill_type)
        if row['Synopsis']:
            bill.add_abstract(row['Synopsis'], 'synopsis')
        if row['ShortTitle']:
            bill.add_title(row['ShortTitle'], 'short title')
        if row['SponsorPersonId']:
            self.add_sponsor_by_legislator_id(bill, row['SponsorPersonId'], 'primary')

        # TODO: Is there a way get additional sponsors and cosponsors, and versions/fns via API?
        html_url = 'https://legis.delaware.gov/BillDetail?LegislationId={}'.format(
            row['LegislationId']
        )
        bill.add_source(html_url, note='text/html')

        html = self.lxmlize(html_url)

        # Additional Sponsors: '//label[text()="Additional Sponsor(s):"]/following-sibling::div/a'
        additional_sponsors = html.xpath('//label[text()="Additional Sponsor(s):"]'
                                         '/following-sibling::div/a/@href')
        for sponsor_url in additional_sponsors:
            sponsor_id = sponsor_url.replace('https://legis.delaware.gov/LegislatorDetail?'
                                             'personId=', '')
            self.add_sponsor_by_legislator_id(bill, sponsor_id, 'primary')

        # CoSponsors: '//label[text()="Co-Sponsor(s):"]/following-sibling::div/a'
        cosponsors = html.xpath('//label[text()="Co-Sponsor(s):"]/'
                                'following-sibling::div/a/@href')
        for sponsor_url in cosponsors:
            sponsor_id = sponsor_url.replace('https://legis.delaware.gov/LegislatorDetail?'
                                             'personId=', '')
            self.add_sponsor_by_legislator_id(bill, sponsor_id, 'cosponsor')

        versions = html.xpath('//label[text()="Original Text:"]/following-sibling::div/a/@href')
        for version_url in versions:
            media_type = self.mime_from_link(version_url)
            version_name = 'Bill Text'
            # on_duplicate='error'
            bill.add_version_link(version_name, version_url, media_type=media_type)

        fiscals = html.xpath('//div[contains(@class,"fiscalNote")]/a/@href')
        for fiscal in fiscals:
            self.scrape_fiscal_note(bill, fiscal)

        self.scrape_actions(bill, row['LegislationId'])
        yield from self.scrape_votes(bill, row['LegislationId'], session)

        yield bill
Author: cliftonmcintosh | Project: openstates | Lines: 60 | Source file: bills.py


Example 10: test_bill_chamber_param

def test_bill_chamber_param():
    create_jurisdiction()
    org = create_org()

    bill = ScrapeBill('HB 1', '1900', 'Axe & Tack Tax Act',
                      classification='tax bill', chamber='lower')

    oi = OrganizationImporter('jid')
    BillImporter('jid', oi).import_data([bill.as_dict()])

    assert Bill.objects.get().from_organization_id == org.id
Author: Vanuan | Project: pupa | Lines: 11 | Source file: test_bill_importer.py


Example 11: test_vote_event_bill_actions_two_stage

def test_vote_event_bill_actions_two_stage():
    # this test is very similar to what we're testing in test_vote_event_bill_actions w/
    # ve3 and ve4: two vote events that reference the same bill action won't conflict w/
    # the OneToOneField, but in this case we do it in two stages so that the conflict is
    # caught even if the votes weren't in the same scrape
    j = create_jurisdiction()
    j.legislative_sessions.create(name='1900', identifier='1900')
    org1 = ScrapeOrganization(name='House', classification='lower')
    bill = ScrapeBill('HB 1', '1900', 'Axe & Tack Tax Act', from_organization=org1._id)

    bill.add_action(description='passage', date='1900-04-02', chamber='lower')

    ve1 = ScrapeVoteEvent(legislative_session='1900', motion_text='passage',
                          start_date='1900-04-02', classification='passage:bill',
                          result='pass', bill_chamber='lower', bill='HB 1',
                          bill_action='passage',
                          organization=org1._id)
    ve2 = ScrapeVoteEvent(legislative_session='1900', motion_text='passage',
                          start_date='1900-04-02', classification='passage:bill',
                          result='pass', bill_chamber='lower', bill='HB 1',
                          bill_action='passage',
                          organization=org1._id)
    # disambiguate them
    ve1.pupa_id = 'one'
    ve2.pupa_id = 'two'

    oi = OrganizationImporter('jid')
    oi.import_data([org1.as_dict()])

    bi = BillImporter('jid', oi, DumbMockImporter())
    bi.import_data([bill.as_dict()])

    # first imports just fine
    VoteEventImporter('jid', DumbMockImporter(), oi, bi).import_data([
        ve1.as_dict(),
    ])
    votes = list(VoteEvent.objects.all())
    assert len(votes) == 1
    assert votes[0].bill_action is not None

    # when second is imported, ensure that action stays pinned to first just as it would
    # have if they were both in same import
    VoteEventImporter('jid', DumbMockImporter(), oi, bi).import_data([
        ve1.as_dict(),
        ve2.as_dict(),
    ])
    votes = list(VoteEvent.objects.all())
    assert len(votes) == 2
    assert votes[0].bill_action is not None
    assert votes[1].bill_action is None
Author: opencivicdata | Project: pupa | Lines: 50 | Source file: test_vote_event_importer.py


Example 12: scrape_bill_info

    def scrape_bill_info(self, session, chambers):
        info_url = "ftp://ftp.cga.ct.gov/pub/data/bill_info.csv"
        data = self.get(info_url)
        page = open_csv(data)

        chamber_map = {'H': 'lower', 'S': 'upper'}

        for row in page:
            bill_id = row['bill_num']
            chamber = chamber_map[bill_id[0]]

            if chamber not in chambers:
                continue

            # assert that the bill data is from this session, CT is tricky
            assert row['sess_year'] == session

            if re.match(r'^(S|H)J', bill_id):
                bill_type = 'joint resolution'
            elif re.match(r'^(S|H)R', bill_id):
                bill_type = 'resolution'
            else:
                bill_type = 'bill'

            bill = Bill(identifier=bill_id,
                        legislative_session=session,
                        title=row['bill_title'],
                        classification=bill_type,
                        chamber=chamber)
            bill.add_source(info_url)

            for introducer in self._introducers[bill_id]:
                bill.add_sponsorship(name=str(introducer),
                                     classification='primary',
                                     primary=True,
                                     entity_type='person')

            try:
                for subject in self._subjects[bill_id]:
                    bill.subject.append(subject)

                self.bills[bill_id] = [bill, chamber]

                yield from self.scrape_bill_page(bill)
            except SkipBill:
                self.warning('no such bill: ' + bill_id)
                pass
Author: cliftonmcintosh | Project: openstates | Lines: 47 | Source file: bills.py


Example 13: test_full_vote_event

def test_full_vote_event():
    j = Jurisdiction.objects.create(id='jid', division_id='did')
    j.legislative_sessions.create(name='1900', identifier='1900')
    sp1 = ScrapePerson('John Smith', primary_org='lower')
    sp2 = ScrapePerson('Adam Smith', primary_org='lower')
    org = ScrapeOrganization(name='House', classification='lower')
    bill = ScrapeBill('HB 1', '1900', 'Axe & Tack Tax Act', from_organization=org._id)
    vote_event = ScrapeVoteEvent(legislative_session='1900', motion_text='passage',
                                 start_date='1900-04-01', classification='passage:bill',
                                 result='pass', bill_chamber='lower', bill='HB 1',
                                 organization=org._id)
    vote_event.set_count('yes', 20)
    vote_event.yes('John Smith')
    vote_event.no('Adam Smith')

    oi = OrganizationImporter('jid')
    oi.import_data([org.as_dict()])

    pi = PersonImporter('jid')
    pi.import_data([sp1.as_dict(), sp2.as_dict()])

    mi = MembershipImporter('jid', pi, oi, DumbMockImporter())
    mi.import_data([sp1._related[0].as_dict(), sp2._related[0].as_dict()])

    bi = BillImporter('jid', oi, pi)
    bi.import_data([bill.as_dict()])

    VoteEventImporter('jid', pi, oi, bi).import_data([vote_event.as_dict()])

    assert VoteEvent.objects.count() == 1
    ve = VoteEvent.objects.get()
    assert ve.legislative_session == LegislativeSession.objects.get()
    assert ve.motion_classification == ['passage:bill']
    assert ve.bill == Bill.objects.get()
    count = ve.counts.get()
    assert count.option == 'yes'
    assert count.value == 20
    votes = list(ve.votes.all())
    assert len(votes) == 2
    for v in ve.votes.all():
        if v.voter_name == 'John Smith':
            assert v.option == 'yes'
            assert v.voter == Person.objects.get(name='John Smith')
        else:
            assert v.option == 'no'
            assert v.voter == Person.objects.get(name='Adam Smith')
Author: anukat2015 | Project: pupa | Lines: 46 | Source file: test_vote_event_importer.py


Example 14: parse_bill

    def parse_bill(self, chamber, session, special, link):
        bill_num = link.text.strip()
        type_abbr = re.search('type=(B|R|)', link.attrib['href']).group(1)

        if type_abbr == 'B':
            btype = ['bill']
        elif type_abbr == 'R':
            btype = ['resolution']

        bill_id = "%s%s %s" % (utils.bill_abbr(chamber), type_abbr, bill_num)

        url = utils.info_url(chamber, session, special, type_abbr, bill_num)
        page = self.get(url).text
        page = lxml.html.fromstring(page)
        page.make_links_absolute(url)

        xpath = '/'.join([
            '//div[contains(@class, "BillInfo-ShortTitle")]',
            'div[@class="BillInfo-Section-Data"]',
        ])
        title = page.xpath(xpath).pop().text_content().strip()
        if not title:
            return
        bill = Bill(bill_id, legislative_session=session, title=title, chamber=chamber,
                    classification=btype)
        bill.add_source(url)

        self.parse_bill_versions(bill, page)

        self.parse_history(bill, chamber, utils.history_url(chamber, session, special,
                           type_abbr, bill_num))

        # only fetch votes if votes were seen in history
        # if vote_count:
        yield from self.parse_votes(
            bill,
            utils.vote_url(chamber, session, special, type_abbr, bill_num),
        )

        # Dedupe sources.
        sources = bill.sources
        for source in sources:
            if 1 < sources.count(source):
                sources.remove(source)

        yield bill
Author: neelneelpurk | Project: openstates | Lines: 46 | Source file: bills.py


Example 15: test_bill_update

def test_bill_update():
    create_jurisdiction()
    create_org()

    bill = ScrapeBill('HB 1', '1900', 'First Bill')

    oi = OrganizationImporter('jid')
    _, what = BillImporter('jid', oi).import_item(bill.as_dict())
    assert what == 'insert'
    _, what = BillImporter('jid', oi).import_item(bill.as_dict())
    assert what == 'noop'

    # ensure no new object was created
    assert Bill.objects.count() == 1

    # test basic update
    bill = ScrapeBill('HB 1', '1900', '1st Bill')
    _, what = BillImporter('jid', oi).import_item(bill.as_dict())
    assert what == 'update'
    assert Bill.objects.get().title == '1st Bill'
Author: Vanuan | Project: pupa | Lines: 20 | Source file: test_bill_importer.py


Example 16: handle_list_item

    def handle_list_item(self, item):
        bill_id = item.text.strip()
        title = item.xpath("string(../following-sibling::td[1])").strip()
        sponsor = item.xpath("string(../following-sibling::td[2])").strip()
        bill_url = item.attrib['href'] + '/ByCategory'

        if bill_id.startswith(('SB ', 'HB ', 'SPB ', 'HPB ')):
            bill_type = 'bill'
        elif bill_id.startswith(('HR ', 'SR ')):
            bill_type = 'resolution'
        elif bill_id.startswith(('HJR ', 'SJR ')):
            bill_type = 'joint resolution'
        elif bill_id.startswith(('SCR ', 'HCR ')):
            bill_type = 'concurrent resolution'
        elif bill_id.startswith(('SM ', 'HM ')):
            bill_type = 'memorial'
        else:
            raise ValueError('Failed to identify bill type.')

        bill = Bill(bill_id, self.kwargs['session'], title,
                    chamber='lower' if bill_id[0] == 'H' else 'upper',
                    classification=bill_type)
        bill.add_source(bill_url)

        # normalize id from HB 0004 to H4
        subj_bill_id = re.sub(r'(H|S)\w+ 0*(\d+)', r'\1\2', bill_id)
        bill.subject = list(self.kwargs['subjects'][subj_bill_id])

        sponsor = re.sub(r'^(?:Rep|Sen)\.\s', "", sponsor)
        for sp in sponsor.split(', '):
            bill.add_sponsorship(sp, 'primary', 'person', True)

        yield from self.scrape_page_items(BillDetail, url=bill_url, obj=bill)

        yield bill
Author: OrlandoSentinel | Project: TabsOnTallahassee | Lines: 35 | Source file: bills.py


Example 17: test_bill_sponsor_limit_lookup

def test_bill_sponsor_limit_lookup():
    create_jurisdiction()
    org = create_org()

    bill = ScrapeBill('HB 1', '1900', 'Axe & Tack Tax Act',
                      classification='tax bill', chamber='lower')
    bill.add_sponsorship_by_identifier(name="SNODGRASS",
                                       classification='sponsor',
                                       entity_type='person',
                                       primary=True,
                                       identifier="TOTALLY_REAL_ID",
                                       scheme="TOTALLY_REAL_SCHEME")

    oi = OrganizationImporter('jid')
    pi = PersonImporter('jid')

    zs = ScrapePerson(name='Zadock Snodgrass', birth_date="1800-01-01")
    zs.add_identifier(identifier='TOTALLY_REAL_ID',
                      scheme='TOTALLY_REAL_SCHEME')
    pi.import_data([zs.as_dict()])

    za_db = Person.objects.get()
    Membership.objects.create(person_id=za_db.id,
                              organization_id=org.id)

    zs2 = ScrapePerson(name='Zadock Snodgrass', birth_date="1900-01-01")
    zs2.add_identifier(identifier='TOTALLY_REAL_ID',
                       scheme='TOTALLY_REAL_SCHEME')

    # This is contrived and perhaps broken, but we're going to check this.
    # We *really* don't want to *ever* cross jurisdiction bounds.
    PersonImporter('another-jurisdiction').import_data([zs.as_dict()])

    BillImporter('jid', oi, pi).import_data([bill.as_dict()])

    obj = Bill.objects.get()
    (entry,) = obj.sponsorships.all()
    assert entry.person.name == "Zadock Snodgrass"
    assert entry.person.birth_date == "1800-01-01"
Author: rshorey | Project: pupa | Lines: 39 | Source file: test_bill_importer.py


Example 18: handle_page

    def handle_page(self):
        bills = self.doc.xpath('//ul[@class="linkSect"]/li')
        for bill in bills:
            link = bill.getchildren()[0]
            bill_id = str(link.text_content())

            if not bill_id.startswith(('S', 'H')):
                continue

            # create a bill
            desc = bill.xpath('text()')[0].strip()
            chamber = {
                'H': 'lower',
                'S': 'upper',
            }[bill_id[0]]
            bill_type = {
                'B': 'bill',
                'J': 'joint resolution',
                'R': 'resolution'
            }[bill_id[1]]
            bill = Bill(bill_id, self.kwargs['session'], desc,
                        chamber=chamber, classification=bill_type)

            bill_url = link.get('href')
            sponsor_url = BASE_URL + URL_PATTERNS['sponsors'].format(
                self.kwargs['session_id'],
                bill_id.replace(' ', ''),
            )

            list(self.scrape_page_items(BillSponsorPage, url=sponsor_url, obj=bill))
            yield from self.scrape_page_items(BillDetailPage, url=bill_url, obj=bill)
            bill.subject = self.kwargs['subjects'][bill_id]
            bill.add_source(bill_url)
            yield bill

        next_url = self.doc.xpath('//a/b[text()="More..."]/../@href')
        if next_url:
            yield from self.scrape_page_items(BillListPage, url=next_url[0], **self.kwargs)
Author: neelneelpurk | Project: openstates | Lines: 38 | Source file: bills.py


Example 19: _recursively_process_bills

    def _recursively_process_bills(
            self, request_session, chamber, session, first_item=1):
        '''
        Once a search has been initiated, this function will save a
        Bill object for every Paper from the given chamber
        '''

        url = 'http://legislature.maine.gov/LawMakerWeb/searchresults.asp'
        r = request_session.get(url, params={'StartWith': first_item})
        r.raise_for_status()

        bills = lxml.html.fromstring(r.text).xpath('//tr/td/b/a')
        if bills:
            for bill in bills:
                bill_id_slug = bill.xpath('./@href')[0]
                bill_url = 'http://legislature.maine.gov/LawMakerWeb/{}'.format(bill_id_slug)
                bill_id = bill.text[:2] + " " + bill.text[2:]

                bill = Bill(
                    identifier=bill_id,
                    legislative_session=session,
                    title="",
                    chamber=chamber,
                )
                bill.add_source(bill_url)

                yield from self.scrape_bill(bill, chamber)
                yield bill

            # Make a recursive call to this function, for the next page
            PAGE_SIZE = 25
            yield from self._recursively_process_bills(
                request_session=request_session,
                chamber=chamber,
                session=session,
                first_item=first_item + PAGE_SIZE
            )
Author: cliftonmcintosh | Project: openstates | Lines: 37 | Source file: bills.py


Example 20: scrape_bill_list

    def scrape_bill_list(self, chamber, session, url):
        if 'joint_resolution' in url:
            bill_type = 'joint resolution'
        elif 'resolution' in url:
            bill_type = 'resolution'
        elif 'bill' in url:
            bill_type = 'bill'

        try:
            data = self.get(url).text
        except scrapelib.HTTPError:
            self.warning('skipping URL %s' % url)
            return
        doc = lxml.html.fromstring(data)
        doc.make_links_absolute(url)
        bill_list = doc.xpath('//ul[@class="infoLinks"]/li/div[@class="row-fluid"]')
        for b in bill_list:
            bill_url = b.xpath('./div[@class="span3"]/a/@href')[0]
            bill_id = bill_url.rsplit('/', 1)[-1]
            bill_id = bill_id.upper()

            title = b.xpath(
                './div[@class="span6"]/text()'
            )[0].replace(' - Relating to: ', '').strip()

            bill = Bill(
                bill_id,
                legislative_session=session,
                title=title,
                chamber=chamber,
                classification=bill_type,
            )
            bill.subject = list(set(self.subjects[bill_id]))
            yield from self.scrape_bill_history(bill, bill_url, chamber)

            yield bill
Author: cliftonmcintosh | Project: openstates | Lines: 36 | Source file: bills.py



Note: The pupa.scrape.Bill class examples in this article were compiled from source code and documentation platforms such as GitHub and MSDocs. The code snippets are selected from open-source projects contributed by various developers, and copyright remains with the original authors. Refer to the License of the corresponding project before redistributing or using the code; do not reproduce without permission.

