Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - beautifulsoup won't recognize lxml

I'm trying to use lxml as the parser for BeautifulSoup because the default parser is much slower, but I'm getting this error:

    soup = BeautifulSoup(html, "lxml")
  File "/home/rob/python/stock/local/lib/python2.7/site-packages/bs4/__init__.py", line 152, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

I have uninstalled and reinstalled both lxml and BeautifulSoup many times, but the parser is still not found. I have also tried reinstalling lxml's dependencies, with the same result.

I even created a new virtual environment and installed everything from scratch, and I still get this error.

Does anyone have any idea what's going on here?

Edits

I'm using the latest versions of bs4 and lxml with Python 2.7.x on Ubuntu desktop.

I can import lxml, but "from lxml import etree" fails with:

  File "<stdin>", line 1, in <module>
ImportError: /usr/lib/x86_64-linux-gnu/libxml2.so.2: version `LIBXML2_2.9.0' not found (required by /home/rob/python/stock/local/lib/python2.7/site-packages/lxml/etree.so)

I have libxml installed, but I'm not sure which version; I have installed and reinstalled the latest. I also tried installing 2.9.0 manually, and nothing changed.


1 Answer


It looks like lxml was not installed successfully. To install the system libraries it depends on under Ubuntu, run

sudo apt-get install libxslt1-dev libxml2

Then, inside the virtualenv:

pip install --upgrade lxml
pip install cssselect
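
To check whether the reinstall actually worked, something like the following sketch may help. It tries to load lxml.etree (the import that fails in the question) and, if that succeeds, reports the libxml2 version lxml is running against and the one it was compiled against. The helper names lxml_status and pick_parser are made up for this example, and falling back to Python's built-in "html.parser" is only an assumption about what you would want while lxml is broken.

```python
from __future__ import print_function  # keeps print() consistent on Python 2.7

import importlib


def lxml_status():
    """Return (ok, detail) describing whether lxml's C extension loads.

    Hypothetical helper for this example, not part of bs4 or lxml.
    """
    try:
        etree = importlib.import_module("lxml.etree")
    except ImportError as exc:
        # A missing lxml and a libxml2 shared-library mismatch like the one
        # in the question both surface as ImportError.
        return False, str(exc)
    runtime = ".".join(str(n) for n in etree.LIBXML_VERSION)
    compiled = ".".join(str(n) for n in etree.LIBXML_COMPILED_VERSION)
    return True, "libxml2 runtime %s, compiled against %s" % (runtime, compiled)


def pick_parser():
    """Return the parser name to pass to BeautifulSoup.

    Falls back to the slower built-in parser when lxml cannot be loaded.
    """
    ok, _ = lxml_status()
    return "lxml" if ok else "html.parser"


if __name__ == "__main__":
    ok, detail = lxml_status()
    print("lxml.etree importable:", ok)
    print(detail)
    print("parser to pass to BeautifulSoup:", pick_parser())
```

If the first line prints False, the detail line should show the same ImportError as in the question, which means the lxml build and the system libxml2 still do not match. Once it prints True, BeautifulSoup(html, pick_parser()) should select lxml.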
