[Notes] Installing and running pyspider on Mac

Mac crifan

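While working on:

[Solved] Writing a Python crawler to scrape Autohome brand, series, and model data

I needed to install and run pyspider on my Mac.

First, the install: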

<code>➜  AutocarData pip install pyspider
Collecting pyspider
  Downloading https://files.pythonhosted.org/packages/d0/97/d6062c928f53d899ff2a8538fed11d4d425ba3d27c96248a2c601c1c9fef/pyspider-0.3.10.tar.gz (110kB)
    100% |████████████████████████████████| 112kB 50kB/s
Collecting Flask&gt;=0.10 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/77/32/e3597cb19ffffe724ad4bf0beca4153419918e7fa4ba6a34b04ee4da3371/Flask-0.12.2-py2.py3-none-any.whl (83kB)
    100% |████████████████████████████████| 92kB 24kB/s
Collecting Jinja2&gt;=2.7 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl (126kB)
    100% |████████████████████████████████| 133kB 35kB/s
Collecting chardet&gt;=2.2 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
    100% |████████████████████████████████| 143kB 41kB/s
Collecting cssselect&gt;=0.9 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/7b/44/25b7283e50585f0b4156960691d951b05d061abf4a714078393e51929b30/cssselect-1.0.3-py2.py3-none-any.whl
Collecting lxml (from pyspider)
  Downloading https://files.pythonhosted.org/packages/18/95/abf8204fbbc9a01e0e156029cd1ee974237b5798b9e84477df6c4fabfbd2/lxml-4.2.1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (8.8MB)
    20% |██████▋                         | 1.8MB 110kB/s eta 0:01:04^C
Operation cancelled by user
</code>

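I cancelled this on purpose.

The install was pulling in a lot of packages, Flask among them. A globally installed Flask could conflict with the Flask projects I had developed earlier, so I decided to install pyspider inside a virtual environment instead.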

<code>➜  AutocarData pipenv install
Creating a virtualenv for this project…
Using /usr/local/opt/python/bin/python3.6 (3.6.4) to create virtualenv…
⠋Already using interpreter /usr/local/opt/python/bin/python3.6
Using base prefix '/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6'
New python executable in /Users/crifan/.local/share/virtualenvs/AutocarData-xI-iqIq4/bin/python3.6
Also creating executable in /Users/crifan/.local/share/virtualenvs/AutocarData-xI-iqIq4/bin/python
Installing setuptools, pip, wheel...done.

Virtualenv location: /Users/crifan/.local/share/virtualenvs/AutocarData-xI-iqIq4
Creating a Pipfile for this project…
Pipfile.lock not found, creating…
Locking [dev-packages] dependencies…
Locking [packages] dependencies…
Updated Pipfile.lock (625834)!
Installing dependencies from Pipfile.lock (625834)…
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 0/0 — 00:00:00
To activate this project's virtualenv, run the following:
 $ pipenv shell
➜  AutocarData pipenv shell
Spawning environment shell (/bin/zsh). Use 'exit' to leave.
. /Users/crifan/.local/share/virtualenvs/AutocarData-xI-iqIq4/bin/activate
➜  AutocarData . /Users/crifan/.local/share/virtualenvs/AutocarData-xI-iqIq4/bin/activate
➜  AutocarData which python
/Users/crifan/.local/share/virtualenvs/AutocarData-xI-iqIq4/bin/python
</code>
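The `which python` check at the end of the log can also be made from inside Python. The following is a minimal sketch, not part of the original session: the helper name `in_virtualenv` is hypothetical, and it relies only on the standard `sys` attributes that virtual environments set.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases make sys.base_prefix differ from sys.prefix;
    # older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("in virtualenv:", in_virtualenv())
```

Run inside the `pipenv shell`, the printed interpreter path should point into the virtualenv directory shown in the log.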

Then install pyspider inside the virtual environment:

<code>➜  AutocarData pip install pyspider
Collecting pyspider
  Cache entry deserialization failed, entry ignored
  Using cached https://files.pythonhosted.org/packages/d0/97/d6062c928f53d899ff2a8538fed11d4d425ba3d27c96248a2c601c1c9fef/pyspider-0.3.10.tar.gz
Collecting Flask&gt;=0.10 (from pyspider)
  Cache entry deserialization failed, entry ignored
  Using cached https://files.pythonhosted.org/packages/77/32/e3597cb19ffffe724ad4bf0beca4153419918e7fa4ba6a34b04ee4da3371/Flask-0.12.2-py2.py3-none-any.whl
Collecting Jinja2&gt;=2.7 (from pyspider)
  Cache entry deserialization failed, entry ignored
  Using cached https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl
Collecting chardet&gt;=2.2 (from pyspider)
  Cache entry deserialization failed, entry ignored
  Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting cssselect&gt;=0.9 (from pyspider)
  Cache entry deserialization failed, entry ignored
  Using cached https://files.pythonhosted.org/packages/7b/44/25b7283e50585f0b4156960691d951b05d061abf4a714078393e51929b30/cssselect-1.0.3-py2.py3-none-any.whl
Collecting lxml (from pyspider)
  Cache entry deserialization failed, entry ignored
  Downloading https://files.pythonhosted.org/packages/a4/7c/0c333ccdaa04628b4df46d36b8a700d7810ffecd1371de796e2403fe9380/lxml-4.2.1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (8.7MB)
    100% |████████████████████████████████| 8.7MB 214kB/s
Collecting pycurl (from pyspider)
  Downloading https://files.pythonhosted.org/packages/77/d9/d272b38e6e25d2686e22f6058820298dadead69340b1c57ff84c87ef81f0/pycurl-7.43.0.1.tar.gz (195kB)
    100% |████████████████████████████████| 204kB 101kB/s
Collecting requests&gt;=2.2 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/49/df/50aa1999ab9bde74656c2919d9c0c085fd2b3775fd3eca826012bef76d8c/requests-2.18.4-py2.py3-none-any.whl (88kB)
    100% |████████████████████████████████| 92kB 71kB/s
Collecting Flask-Login&gt;=0.2.11 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/c1/ff/bd9a4d2d81bf0c07d9e53e8cd3d675c56553719bbefd372df69bf1b3c1e4/Flask-Login-0.4.1.tar.gz
Collecting u-msgpack-python&gt;=1.6 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/fa/41/4b831124d203fbb1be8e1b37f77ace336d9bb5661872885fa94ca98ebf06/u_msgpack_python-2.5.0-py2.py3-none-any.whl
Collecting click&gt;=3.3 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/34/c1/8806f99713ddb993c5366c362b2f908f18269f8d792aff1abfd700775a77/click-6.7-py2.py3-none-any.whl (71kB)
    100% |████████████████████████████████| 71kB 66kB/s
Collecting six&gt;=1.5.0 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
Collecting tblib&gt;=1.3.0 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/4a/82/1b9fba6e93629a8557f9784cd8f1ae063c8762c26446367a6764edd328ce/tblib-1.3.2-py2.py3-none-any.whl
Collecting wsgidav&gt;=2.0.0 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/0a/5b/ee5f62917b51756a76b2edf8a7334ef8d659c79af3aaccf16cbd1319936a/WsgiDAV-2.3.0-py2.py3-none-any.whl (140kB)
    100% |████████████████████████████████| 143kB 80kB/s
Collecting tornado&lt;=4.5.3,&gt;=3.2 (from pyspider)
  Downloading https://files.pythonhosted.org/packages/e3/7b/e29ab3d51c8df66922fea216e2bddfcb6430fb29620e5165b16a216e0d3c/tornado-4.5.3.tar.gz (484kB)
    100% |████████████████████████████████| 491kB 86kB/s
Collecting pyquery (from pyspider)
  Downloading https://files.pythonhosted.org/packages/09/c7/ce8c9c37ab8ff8337faad3335c088d60bed4a35a4bed33a64f0e64fbcf29/pyquery-1.4.0-py2.py3-none-any.whl
Collecting Werkzeug&gt;=0.7 (from Flask&gt;=0.10-&gt;pyspider)
  Downloading https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl (322kB)
    100% |████████████████████████████████| 327kB 99kB/s
Collecting itsdangerous&gt;=0.21 (from Flask&gt;=0.10-&gt;pyspider)
Collecting MarkupSafe&gt;=0.23 (from Jinja2&gt;=2.7-&gt;pyspider)
Collecting idna&lt;2.7,&gt;=2.5 (from requests&gt;=2.2-&gt;pyspider)
  Downloading https://files.pythonhosted.org/packages/27/cc/6dd9a3869f15c2edfab863b992838277279ce92663d334df9ecf5106f5c6/idna-2.6-py2.py3-none-any.whl (56kB)
    100% |████████████████████████████████| 61kB 111kB/s
Collecting urllib3&lt;1.23,&gt;=1.21.1 (from requests&gt;=2.2-&gt;pyspider)
  Downloading https://files.pythonhosted.org/packages/63/cb/6965947c13a94236f6d4b8223e21beb4d576dc72e8130bd7880f600839b8/urllib3-1.22-py2.py3-none-any.whl (132kB)
    100% |████████████████████████████████| 133kB 98kB/s
Collecting certifi&gt;=2017.4.17 (from requests&gt;=2.2-&gt;pyspider)
  Downloading https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl (150kB)
    100% |████████████████████████████████| 153kB 139kB/s
Collecting defusedxml (from wsgidav&gt;=2.0.0-&gt;pyspider)
  Downloading https://files.pythonhosted.org/packages/87/1c/17f3e3935a913dfe2a5ca85fa5ccbef366bfd82eb318b1f75dadbf0affca/defusedxml-0.5.0-py2.py3-none-any.whl
Building wheels for collected packages: pyspider, pycurl, Flask-Login, tornado
  Running setup.py bdist_wheel for pyspider ... done
  Stored in directory: /Users/crifan/Library/Caches/pip/wheels/39/60/ec/9ba1af9e0798333d32198784880b8cc5b22f00a81801c6fcec
  Running setup.py bdist_wheel for pycurl ... done
  Stored in directory: /Users/crifan/Library/Caches/pip/wheels/77/c8/b6/bed2606b4ae3cf738c99c111d88ce33d8ae82171c40cbddbf0
  Running setup.py bdist_wheel for Flask-Login ... done
  Stored in directory: /Users/crifan/Library/Caches/pip/wheels/39/10/74/d68194e28d5f7a83de5f66e5b2deff5ccbb424fe45e6b0e927
  Running setup.py bdist_wheel for tornado ... done
  Stored in directory: /Users/crifan/Library/Caches/pip/wheels/72/bf/f4/b68fa69596986881b397b18ff2b9af5f8181233aadcc9f76fd
Successfully built pyspider pycurl Flask-Login tornado
Installing collected packages: click, Werkzeug, itsdangerous, MarkupSafe, Jinja2, Flask, chardet, cssselect, lxml, pycurl, idna, urllib3, certifi, requests, Flask-Login, u-msgpack-python, six, tblib, defusedxml, wsgidav, tornado, pyquery, pyspider
Successfully installed Flask-0.12.2 Flask-Login-0.4.1 Jinja2-2.10 MarkupSafe-1.0 Werkzeug-0.14.1 certifi-2018.4.16 chardet-3.0.4 click-6.7 cssselect-1.0.3 defusedxml-0.5.0 idna-2.6 itsdangerous-0.24 lxml-4.2.1 pycurl-7.43.0.1 pyquery-1.4.0 pyspider-0.3.10 requests-2.18.4 six-1.11.0 tblib-1.3.2 tornado-4.5.3 u-msgpack-python-2.5.0 urllib3-1.22 wsgidav-2.3.0
➜  AutocarData pipenv graph
pyspider==0.3.10
  - chardet [required: &gt;=2.2, installed: 3.0.4]
  - click [required: &gt;=3.3, installed: 6.7]
  - cssselect [required: &gt;=0.9, installed: 1.0.3]
  - Flask [required: &gt;=0.10, installed: 0.12.2]
    - click [required: &gt;=2.0, installed: 6.7]
    - itsdangerous [required: &gt;=0.21, installed: 0.24]
    - Jinja2 [required: &gt;=2.4, installed: 2.10]
      - MarkupSafe [required: &gt;=0.23, installed: 1.0]
    - Werkzeug [required: &gt;=0.7, installed: 0.14.1]
  - Flask-Login [required: &gt;=0.2.11, installed: 0.4.1]
    - Flask [required: Any, installed: 0.12.2]
      - click [required: &gt;=2.0, installed: 6.7]
      - itsdangerous [required: &gt;=0.21, installed: 0.24]
      - Jinja2 [required: &gt;=2.4, installed: 2.10]
        - MarkupSafe [required: &gt;=0.23, installed: 1.0]
      - Werkzeug [required: &gt;=0.7, installed: 0.14.1]
  - Jinja2 [required: &gt;=2.7, installed: 2.10]
    - MarkupSafe [required: &gt;=0.23, installed: 1.0]
  - lxml [required: Any, installed: 4.2.1]
  - pycurl [required: Any, installed: 7.43.0.1]
  - pyquery [required: Any, installed: 1.4.0]
    - cssselect [required: &gt;0.7.9, installed: 1.0.3]
    - lxml [required: &gt;=2.1, installed: 4.2.1]
  - requests [required: &gt;=2.2, installed: 2.18.4]
    - certifi [required: &gt;=2017.4.17, installed: 2018.4.16]
    - chardet [required: &lt;3.1.0,&gt;=3.0.2, installed: 3.0.4]
    - idna [required: &gt;=2.5,&lt;2.7, installed: 2.6]
    - urllib3 [required: &lt;1.23,&gt;=1.21.1, installed: 1.22]
  - six [required: &gt;=1.5.0, installed: 1.11.0]
  - tblib [required: &gt;=1.3.0, installed: 1.3.2]
  - tornado [required: &lt;=4.5.3,&gt;=3.2, installed: 4.5.3]
  - u-msgpack-python [required: &gt;=1.6, installed: 2.5.0]
  - wsgidav [required: &gt;=2.0.0, installed: 2.3.0]
    - defusedxml [required: Any, installed: 0.5.0]

</code>
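As a lighter alternative to reading the full `pipenv graph` output, the installed versions of a few key packages can be queried from Python. This is a sketch under assumptions: `installed_versions` is a hypothetical helper, and it uses `importlib.metadata`, which requires Python 3.8 or later (newer than the 3.6.4 interpreter in the log above).

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the install log above.
    wanted = ["pyspider", "pycurl", "tornado", "Flask"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```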

Run it:

pyspider

[Solved] pyspider fails to run: ImportError pycurl libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other)

Then another error appeared:

[Solved] pyspider fails to run: Error Could not create web server listening on port 25555

[Summary]

Steps to install pyspider on Mac:

Simply run the following, ideally inside a virtual environment as above:

<code>pip install pyspider
</code>

Then run:

<code>pyspider
</code>

That is all.

If this error appears at runtime:

ImportError pycurl libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other)

the fix is:

<code>pip uninstall pycurl
export PYCURL_SSL_LIBRARY=openssl
export LDFLAGS=-L/usr/local/opt/openssl/lib
export CPPFLAGS=-I/usr/local/opt/openssl/include
pip install pycurl --compile --no-cache-dir
</code>
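Since the error is raised when `pycurl` is imported, a quick way to confirm the rebuild worked is to try the import and look at the version string. The sketch below assumes nothing beyond `pycurl` itself; `check_pycurl` is a hypothetical helper name. `pycurl.version` is a string that lists the libcurl and SSL library versions pycurl is running against.

```python
def check_pycurl():
    """Try to import pycurl and return (ok, detail).

    On success, detail is pycurl's version string; on failure it is the
    ImportError message (for example the SSL backend mismatch above).
    """
    try:
        import pycurl
    except ImportError as exc:
        return False, str(exc)
    return True, pycurl.version


if __name__ == "__main__":
    ok, detail = check_pycurl()
    print("pycurl import:", "ok" if ok else "failed")
    print(detail)
```

After a successful rebuild, the version string should mention OpenSSL rather than the import failing.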

If you later run into:

Error Could not create web server listening on port 25555

the fix is:

On Mac, find out which process is holding port 25555, then kill it:

<code>➜  AutocarData lsof -i:25555
COMMAND     PID   USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
phantomjs 46971 crifan   12u  IPv4 0xe4d24cdcaf5e481f      0t0  TCP *:25555 (LISTEN)
➜  AutocarData kill  46971
</code>
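Before starting pyspider again, it can help to confirm that the port is actually free. The following is a minimal sketch using only the standard `socket` module; `port_in_use` is a hypothetical helper, and it only checks for a TCP listener on the local host, which matches the `lsof` output above. It does not tell you which process owns the port, so `lsof` is still needed for that.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    port = 25555  # the port named in the error message above
    state = "in use" if port_in_use(port) else "free"
    print(f"port {port} is {state}")
```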

Please credit the source when reposting: 在路上 » [Notes] Installing and running pyspider on Mac
