Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python 3.x - Add numpy.get_include() argument to setuptools without preinstalled numpy

I am currently developing a python package that uses cython and numpy and I want the package to be installable using the pip install command from a clean python installation. All dependencies should be installed automatically. I am using setuptools with the following setup.py:

import setuptools

my_c_lib_ext = setuptools.Extension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)

setuptools.setup(
    name="my_lib",
    version="0.0.1",
    author="Me",
    author_email="[email protected]",
    description="Some python library",
    packages=["my_lib"],
    ext_modules=[my_c_lib_ext],
    setup_requires=["cython >= 0.29"],
    install_requires=["numpy >= 1.15"],
    classifiers=[
        "Programming Language :: Python :: 3",
        "Operating System :: OS Independent"
    ]
)

This has worked great so far. The pip install command downloads cython for the build and is able to build my package and install it together with numpy.

Now I want to improve the performance of my cython code, which requires some changes to my setup.py. I need to add include_dirs=[numpy.get_include()] to either the call of setuptools.Extension(...) or setuptools.setup(...), which means that I also need to import numpy. (See http://docs.cython.org/en/latest/src/tutorial/numpy.html and Make distutils look for numpy header files in the correct place for the rationale.)

This is bad. Now the user cannot call pip install from a clean environment, because import numpy will fail. The user needs to pip install numpy before installing my library. Even if I move "numpy >= 1.15" from install_requires to setup_requires the installation fails, because the import numpy is evaluated earlier.

Is there a way to evaluate the include_dirs at a later point of the installation, for example, after the dependencies from setup_requires or install_requires have been resolved? I would really like to have all dependencies resolved automatically, and I don't want the user to type multiple pip install commands.

The following snippet works, but it is not officially supported because it uses an undocumented (and private) method:

class NumpyExtension(setuptools.Extension):
    # setuptools calls this function after installing dependencies
    def _convert_pyx_sources_to_lang(self):
        import numpy
        self.include_dirs.append(numpy.get_include())
        super()._convert_pyx_sources_to_lang()

my_c_lib_ext = NumpyExtension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)

The article How to Bootstrap numpy installation in setup.py proposes using a cmdclass with custom build_ext class. Unfortunately, this breaks the build of the cython extension because cython also customizes build_ext.

1 Answer

First question: when is numpy needed? It is needed during the setup (i.e. when the build_ext functionality is called) and after installation, when the module is used. That means numpy should be in both setup_requires and install_requires.
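
As a small illustration of that point, the requirement lists could be written as below. This is only a sketch: the names NUMPY_REQ and setup_kwargs are made up for the example, and the version pins are the ones from the question.

```python
# Hypothetical helper names; the version pins are taken from the question.
NUMPY_REQ = "numpy >= 1.15"

setup_kwargs = dict(
    # needed while building the extension (headers via numpy.get_include())
    setup_requires=["cython >= 0.29", NUMPY_REQ],
    # needed at run time, when the compiled module is imported
    install_requires=[NUMPY_REQ],
)
```

These keyword arguments would then be passed on as setuptools.setup(..., **setup_kwargs).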

There are the following alternatives to solve the issue for the setup:

  1. using PEP 517/518 (which is more straightforward IMO)
  2. using the setup_requires argument of setup and postponing the import of numpy until setup's requirements are satisfied (which is not the case at the start of setup.py's execution)

PEP 517/518-solution:

Put a pyproject.toml file next to setup.py, with the following content:

[build-system]
requires = ["setuptools", "wheel", "Cython>=0.29", "numpy >= 1.15"]

which defines packages needed for building, and then install using pip install . in the folder with setup.py. A disadvantage of this method is that python setup.py install no longer works, as it is pip that reads pyproject.toml. However, I would use this approach whenever possible.


Postponing import

This approach is more complicated and somewhat hacky, but it also works without pip.

First, let's take a look at the unsuccessful attempts so far:

pybind11-trick

@chrisb's "pybind11"-trick: with the help of an indirection, one delays the call to import numpy until numpy is present during the setup phase, i.e.:

class get_numpy_include(object):

    def __str__(self):
        import numpy
        return numpy.get_include()
...
my_c_lib_ext = setuptools.Extension(
    ...
    include_dirs=[get_numpy_include()]
)

Clever! The problem: it doesn't work with the Cython-compiler: somewhere down the line, Cython passes the get_numpy_include-object to os.path.join(...,...) which checks whether the argument is really a string, which it obviously isn't.
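
The failure can be reproduced without Cython or numpy. The sketch below is an illustration, not the code path Cython actually takes: LazyInclude is a hypothetical stand-in for get_numpy_include, and its path is made up. String formatting goes through __str__ and works, while os.path.join rejects the object:

```python
import os


class LazyInclude(object):
    """Hypothetical stand-in for get_numpy_include; returns a made-up path."""

    def __str__(self):
        return "/made/up/numpy/include"


lazy = LazyInclude()

# %-formatting calls __str__, so building a compiler flag works.
flag = "-I%s" % lazy

# os.path.join only accepts str, bytes or os.PathLike objects.
try:
    os.path.join(lazy, "arrayobject.h")
    join_error = None
except TypeError as exc:
    join_error = exc
```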

This could be fixed by inheriting from str, but the above shows the dangers of the approach in the long run - it doesn't use the designed mechanics, is brittle and may easily fail in the future.
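
A quick experiment shows why the str-based variant is fragile as well. In this sketch (LazyIncludeStr and its path are hypothetical), os.path.join no longer raises, but it works on the underlying string value, which is empty, and never consults __str__:

```python
import os


class LazyIncludeStr(str):
    """Hypothetical str subclass; the real value of the string is empty."""

    def __str__(self):
        return "/made/up/numpy/include"


lazy = LazyIncludeStr()

# %-formatting still goes through __str__.
flag = "-I%s" % lazy

# No TypeError any more, but the made-up include path is silently lost.
joined = os.path.join(lazy, "arrayobject.h")
```

So whether the trick works depends on how each consumer happens to turn the object into text.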

the classical build_ext-solution

It looks as follows:

...
import builtins
from setuptools.command.build_ext import build_ext as _build_ext

class build_ext(_build_ext):
    def finalize_options(self):
        _build_ext.finalize_options(self)
        # Prevent numpy from thinking it is still in its setup process.
        # The builtins module is used because __builtins__ may be a dict
        # instead of a module, depending on how setup.py is executed.
        builtins.__NUMPY_SETUP__ = False
        import numpy
        self.include_dirs.append(numpy.get_include())

setuptools.setup(
    ...
    cmdclass={'build_ext':build_ext},
    ...
)

Yet this solution doesn't work with Cython extensions either, because pyx-files don't get recognized.

The real question is, how did pyx-files get recognized in the first place? The answer is this part of setuptools.command.build_ext:

...
try:
    # Attempt to use Cython for building extensions, if available
    from Cython.Distutils.build_ext import build_ext as _build_ext
    # Additionally, assert that the compiler module will load
    # also. Ref #1229.
    __import__('Cython.Compiler.Main')
except ImportError:
    _build_ext = _du_build_ext
...

That means setuptools tries to use Cython's build_ext if possible, and because the import of the module is delayed until build_ext is called, it finds Cython present.

The situation is different when setuptools.command.build_ext is imported at the beginning of setup.py: Cython isn't present yet, and a fallback without Cython functionality is used.
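
The timing effect can be simulated with the standard library alone. In this sketch, late_build_dep_demo is a hypothetical module standing in for Cython, and choose_build_ext only mimics the try/except shown above; writing the module to a temporary directory plays the role of setup_requires installing the package partway through the run:

```python
import importlib
import pathlib
import sys
import tempfile

MODULE = "late_build_dep_demo"  # hypothetical stand-in for Cython


def choose_build_ext():
    """Mimic the try/except ImportError selection described above."""
    try:
        importlib.import_module(MODULE)
        return "cython-aware build_ext"
    except ImportError:
        return "plain build_ext"


# Decision taken "at the top of setup.py": the dependency is not there yet.
early = choose_build_ext()

# Simulate setup_requires making the dependency importable later on.
site_dir = tempfile.mkdtemp()
pathlib.Path(site_dir, MODULE + ".py").write_text("")
sys.path.insert(0, site_dir)
importlib.invalidate_caches()

# Decision taken only when the command is actually needed.
late = choose_build_ext()
```

Here early ends up as the plain fallback and late as the Cython-aware variant, which is the difference between importing setuptools.command.build_ext eagerly and lazily.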

combining the pybind11-trick and the classical solution

So let's add an indirection, so we don't have to import setuptools.command.build_ext directly at the beginning of setup.py:

...
import builtins

# factory function
def my_build_ext(pars):
    # import delayed until setuptools asks for the command,
    # so Cython (from setup_requires) is already available:
    from setuptools.command.build_ext import build_ext as _build_ext

    # include_dirs adjusted:
    class build_ext(_build_ext):
        def finalize_options(self):
            _build_ext.finalize_options(self)
            # Prevent numpy from thinking it is still in its setup process:
            builtins.__NUMPY_SETUP__ = False
            import numpy
            self.include_dirs.append(numpy.get_include())

    # object returned:
    return build_ext(pars)
...
setuptools.setup(
    ...
    cmdclass={'build_ext' : my_build_ext},
    ...
)
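
Why a plain function can stand in for a command class: as far as I can tell, the command object is created by calling the registered cmdclass entry with the distribution, so any callable with that signature is accepted. The following is a simplified simulation under that assumption, not setuptools code. FakeDistribution, the import_log list and the include path are all hypothetical:

```python
class FakeDistribution:
    """Hypothetical, heavily simplified stand-in for a distribution object."""

    def __init__(self, cmdclass):
        self.cmdclass = cmdclass
        self.command_obj = {}

    def get_command_obj(self, name):
        # Assumption: the registered callable is invoked with the distribution.
        if name not in self.command_obj:
            self.command_obj[name] = self.cmdclass[name](self)
        return self.command_obj[name]


import_log = []  # records when the "delayed import" actually happens


def my_build_ext(dist):
    # Stands in for the delayed import of setuptools.command.build_ext.
    import_log.append("build_ext base imported")

    class build_ext:
        def __init__(self, dist):
            self.distribution = dist
            self.include_dirs = []

        def finalize_options(self):
            # Stands in for numpy.get_include(); the path is made up.
            self.include_dirs.append("/made/up/numpy/include")

    return build_ext(dist)


dist = FakeDistribution({"build_ext": my_build_ext})
log_before = list(import_log)  # nothing imported at registration time

cmd = dist.get_command_obj("build_ext")
cmd.finalize_options()
```

Registering the factory costs nothing; the import only happens once the command is requested. Code that expects cmdclass values to be real classes would not accept such a factory, which is why this remains a hack.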
