CEP 14 - A new recipe format – part 2 - the allowed keys & values
Title | A new recipe format – part 2 - the allowed keys & values |
Status | Accepted |
Author(s) | Wolf Vollprecht <wolf@prefix.dev> |
Created | May 23, 2023 |
Updated | Jan 22, 2024 |
Discussion | https://github.com/conda-incubator/ceps/pull/56 |
Implementation | https://github.com/prefix-dev/rattler-build |
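Abstract
We propose a new recipe format that is heavily inspired by conda-build. The main change is a pure YAML format without arbitrary Jinja or comments with semantic meaning.
This document builds upon CEP 13, which defines the YAML syntax for recipes. In this CEP, we define the keys and values that are allowed in a recipe.
Motivation
The conda-build format is currently under-specified. For the new format, we try to list and document all values and types in a single document. This is part of a larger effort (see CEP 13 for the new YAML syntax).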
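History
A discussion was started on what a new recipe spec should look like. Fragments of this discussion can be found here: https://github.com/mamba-org/conda-specs/blob/7b53425caa11357487cba3fa9c7397744edb41f8/proposed_specs/recipe.md
The reasons for a new spec are:
- Make it easier to parse ("pure YAML"). conda-build uses a mix of comments and Jinja to achieve a great deal of flexibility, but this makes recipes hard to parse with a computer
- Iron out some inconsistencies around multiple outputs (build vs. build/script and more)
- Remove any need for recursive parsing and solving
- Cater to the needs of automation and dependency tree analysis via a deterministic format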
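Major differences from conda-build
Outputs and the top-level package are mutually exclusive, and outputs have exactly the same structure as the top-level keys. If outputs exist, the top-level keys are "merged" with the output keys (e.g. for the about section).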
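Format
Schema version
The implicit version of the YAML schema for a recipe is the integer 1. To distinguish between the "old" and the new format, we use the file name: the old format is meta.yaml and the new format is recipe.yaml. The version can be set explicitly by adding a schema_version key to the recipe.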
# optional, since implicitly defaults to 1
schema_version: 1 # integer
To benefit from autocompletion and other LSP features in editors, we can add a schema URL to the recipe.
# yaml-language-server: $schema=https://raw.githubusercontent.com/prefix-dev/recipe-format/73cd2eed94c576213c5f25ab57adf6d8c83e792a/schema.json
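Context section
The context section is a dictionary with key-value pairs that can be used for string interpolation. The right-hand side of each key-value pair is a scalar (bool, number, or string).
Variables can reference other variables from the context section.
[!NOTE] The YAML standard does not enforce an ordering of keys. We expect parsers to parse maps (especially the context section) in the order in which they are defined in the file. However, to stay conformant with the YAML standard, we do not require this behavior for a recipe to be valid. Implementations therefore need to ensure that a topological sort is done before string interpolation.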
# The context dictionary defines arbitrary key-value pairs for Jinja interpolation
# and replaces the {% set ... %} commands commonly used in recipes
context:
  variable: test
  # note that we can reference previous values, that means that they are rendered in order
  other_variable: test_${{ variable }}
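The ordering step from the note above can be sketched as follows. This is a minimal illustration, assuming Python 3.9+ for `graphlib`; the helper name `context_order` and the regex-based detection of references are assumptions made for this example and not part of the CEP. A real implementation would more likely ask its Jinja parser which variables an expression uses.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical helpers: find `${{ ... }}` expressions and the identifiers inside them.
_EXPRESSION = re.compile(r"\$\{\{(.*?)\}\}")
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def context_order(context):
    """Return the context keys so that each key comes after the keys it references."""
    graph = {}
    for key, value in context.items():
        dependencies = set()
        if isinstance(value, str):
            for expression in _EXPRESSION.findall(value):
                for name in _IDENTIFIER.findall(expression):
                    # Simplification: any identifier that is also a context key
                    # counts as a reference (string literals are not excluded).
                    if name in context and name != key:
                        dependencies.add(name)
        graph[key] = dependencies
    # static_order() yields dependencies first and raises CycleError on cycles.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    context = {"other_variable": "test_${{ variable }}", "variable": "test"}
    print(context_order(context))
```

With this ordering, interpolation can proceed key by key even if the YAML parser did not preserve the order in which the keys were written.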
Package section
Required for recipes without an outputs section. This is the only required section (for single-output recipes).
package:
  # The name of the package
  name: string
  # The version of the package, following the conda version spec
  # Note that versions are always strings (1.23 is _not_ a valid version since it is a `float`. Needs to be quoted.)
  version: string
Build section
build:
  # the build number
  number: integer # defaults to 0
  # the build string. This is usually omitted (can use `${{ hash }}` variable here)
  string: string # defaults to a build string made from "package hash & build number"
  # A list of Jinja conditions under which to skip the build of this package (they are joined by `or`)
  # What is valid in an `if:` condition is valid
  skip: [list of expressions]
  # whether the package is a noarch package, and if yes, whether it is "generic" or "python"
  # defaults to null ("arch" package)
  noarch: Option<OneOf<"generic" | "python">>
  # script can be a single string or a list of strings
  # if script is a single string and ends with `.sh` or `.bat`, then we interpret it as a file
  script: string | [string] | Script
  # merge the build and host environments (used in many R packages on Windows)
  # was `merge_build_host`
  merge_build_and_host_envs: bool (defaults to false)
  # include files even if they are already in the environment
  # as part of some other host dependency
  always_include_files: [path]
  # do not soft- or hard-link these files, but always copy them (was `no_link`)
  always_copy_files: [glob]
  variant:
    # Keys to forcibly use for the variant computation (even if they are not in the dependencies)
    use_keys: [string]
    # Keys to forcibly ignore for the variant computation (even if they are in the dependencies)
    ignore_keys: [string]
    # used to prefer this variant less
    # note: was `track_features`
    down_prioritize_variant: integer (negative, defaults to 0)
  # settings concerning only Python
  python:
    # List of strings, where each string follows this format:
    # PythonEntryPoint: `bsdiff4 = bsdiff4.cli:main_bsdiff4`
    entry_points: [PythonEntryPoint]
    # Specifies if python.app should be used as the entrypoint on macOS
    # was `osx_is_app`
    use_python_app_entrypoint: bool (defaults to false) # macOS only!
    # used on conda-forge, still needed?
    preserve_egg_dir: bool (default false)
    # skip compiling pyc for some files (was `skip_compile_pyc`)
    skip_pyc_compilation: [glob]
  # settings concerning the prefix detection in files
  prefix_detection:
    # force the file type of the given files to be TEXT or BINARY
    # for prefix replacement
    force_file_type:
      # force TEXT file type
      # was `???`
      text: [glob]
      # force binary file type
      # was `???`
      binary: [glob]
    # ignore all or specific files for prefix replacement
    # was `ignore_prefix_files`
    ignore: bool | [path] (defaults to false)
    # whether to detect binary files with prefix or not
    # was `detect_binary_files_with_prefix`
    ignore_binary_files: bool (defaults to true on Unix and (always) false on Windows)
  # settings for shared libraries and executables
  dynamic_linking:
    # linux only, list of rpaths (was rpath)
    rpaths: [path] (defaults to ['lib/'])
    # whether to relocate binaries or not. If this is a list of paths, then
    # only the listed paths are relocated
    binary_relocation: bool (defaults to true) | [glob]
    # Allow linking against libraries that are not in the run requirements
    # (was `missing_dso_whitelist`)
    missing_dso_allowlist: [glob]
    # Allow runpath / rpath to point to these locations outside of the environment
    # (was `runpath_whitelist`)
    rpath_allowlist: [glob]
    # what to do when detecting overdepending
    overdepending_behavior: OneOf<"ignore" | "error"> # (defaults to "error")
    # what to do when detecting overlinking
    overlinking_behavior: OneOf<"ignore" | "error"> # (defaults to "error")
  # REMOVED:
  # pre-link: string (was deprecated for a long time)
  # Whether to include the recipe or not in the final package - should be specified on command line or other config file?
  # noarch_python: bool
  # features: list
  # msvc_compiler: str
  # requires_features: dict
  # provides_features: dict
  # preferred_env: str
  # preferred_env_executable_paths: list
  # disable_pip: bool
  # marked as "still experimental"
  # pin_depends: Enum<"record" | "strict">
  # overlinking_ignore_patterns: [glob]
  # defaults to patchelf (only cudatoolkit is using `lief` for some reason)
  # rpaths_patcher: None
  # post-link: path
  # pre-unlink: path
  # pre-link: path
Script section
script:
  # the interpreter to use for the script
  interpreter: string # defaults to bash on UNIX and cmd.exe on Windows
  # the script environment. You can use Jinja to pass through environment variables
  # with the `env` key (`${{ env.get("MYVAR") }}`).
  env: {string: string}
  # secrets that are set as env variables but never shown in the logs or the environment
  # The variables are taken from the parent environment by name (e.g. `MY_SECRET`)
  secrets: [string]
  # The file to use as the script. Automatically adds `bat` or `sh` to the filename
  # on Windows or UNIX respectively (if no file extension is given).
  file: string # build.sh or build.bat
  # A string or list of strings that is the script contents (mutually exclusive with `file`)
  content: string | [string]
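The three accepted shapes of the script key (string, list of strings, or Script mapping) can be normalized into a single Script mapping. The following is a minimal sketch of that idea; the function name `normalize_script` and the choice to always store `content` as a list are assumptions made for this example, not something the CEP prescribes.

```python
def normalize_script(script):
    """Turn `string | [string] | Script` into a Script-like dict (illustrative only)."""
    if isinstance(script, dict):
        # `file` and `content` are mutually exclusive according to the Script section.
        if "file" in script and "content" in script:
            raise ValueError("`file` and `content` are mutually exclusive")
        return dict(script)
    if isinstance(script, str):
        # A single string ending in `.sh` or `.bat` is interpreted as a file.
        if script.endswith((".sh", ".bat")):
            return {"file": script}
        return {"content": [script]}
    if isinstance(script, list):
        return {"content": list(script)}
    raise TypeError(f"unsupported script value: {script!r}")


if __name__ == "__main__":
    print(normalize_script("build.sh"))
    print(normalize_script(["cmake -GNinja ..", "cmake --build ."]))
```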
Source section
source: [SourceElement]
The different source elements are defined as follows.
URL source
# url pointing to the source tar.gz|zip|tar.bz2|... (this can be a list of mirrors that point to the same file)
url: url | [url]
# destination folder in work directory
target_directory: path
# rename the downloaded file to this name
file_name: string
# hash of the file
sha256: hex string
# legacy md5 sum of the file (test both, prefer sha256)
md5: hex string
# relative path from recipe file
patches: [path]
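Checking a downloaded file against the sha256 key could look like the sketch below. It only uses the Python standard library; the function name `sha256_matches` and the chunked reading are choices made for this example and are not taken from any particular build tool.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the file at `path` has the given SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large source archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

The legacy md5 key could be checked the same way with `hashlib.md5`.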
Local source
A path can be
- a directory ("../bla")
- a path to an archive ("../bla.tar.gz")
- a path to a file ("../bla.txt")
# file, absolute or relative to recipe file
path: path
# if there is a gitignore, adhere to it and ignore files that are matched by the ignore rules
# i.e. only copy the subset of files that are not ignored by the rules
use_gitignore: bool (defaults to true)
# destination folder
target_directory: path
# rename the downloaded file to this name
file_name: string
# absolute or relative path from recipe file
patches: [path]
Git source
# URL to the git repository or path to local git repository
git: url | path
# the following 3 keys are mutually exclusive (branch, tag, and rev)
# branch to checkout to
branch: string
# tag to checkout to
tag: string
# revision to checkout to (hash or ref)
rev: string
# depth of the git clone (mutually exclusive with rev)
depth: signed integer (defaults to -1 -> not shallow)
# should this use git-lfs?
lfs: bool (defaults to false)
# destination folder in work directory
target_directory: path
# absolute or relative path from recipe file
patches: [path]
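The exclusivity rules stated in the comments above (at most one of branch, tag, and rev; depth not combined with rev) can be checked with a few lines of code. This is a sketch under the assumption that a git source has already been parsed into a plain dictionary; `validate_git_source` is a hypothetical helper name.

```python
def validate_git_source(source):
    """Validate a parsed git source mapping and return the reference key in use, if any."""
    if "git" not in source:
        raise ValueError("a git source needs a `git` key")
    references = [key for key in ("branch", "tag", "rev") if key in source]
    if len(references) > 1:
        raise ValueError(f"mutually exclusive keys used together: {references}")
    if "rev" in source and "depth" in source:
        raise ValueError("`depth` is mutually exclusive with `rev`")
    return references[0] if references else None


if __name__ == "__main__":
    print(validate_git_source({"git": "https://github.com/mamba-org/mamba", "tag": "2023.04.06"}))
```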
Removed source definitions
The SVN and HG (Mercurial) source definitions are removed as they are no longer relevant.
Requirements section
requirements:
  # build time dependencies, in the build_platform architecture
  build: [PackageSelector]
  # dependencies to link against, in the target_platform architecture
  host: [PackageSelector]
  # the section below is copied into the index.json and required at package installation
  run: [PackageSelector]
  # constrain optional packages (was `run_constrained`)
  run_constraints: [PackageSelector]
  # the run exports of this package
  run_exports: [PackageSelector] OR RunExports
  # the run exports to ignore when calculating the requirements
  ignore_run_exports:
    # ignore run exports by name (e.g. `libgcc-ng`)
    by_name: [string]
    # ignore run exports that come from the specified packages
    from_package: [string]
PackageSelector
A PackageSelector in the recipe is currently defined as a string with at most two spaces, like so:
<name> <version> <build_string>
# examples:
python
python 3.8
python 3.8 h1234567_0
python >=3.8,<3.9
python >=3.8,<3.9 h1234567_0
python 3.9.*
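[!NOTE] MatchSpec in conda defines many more options. For now, we stick to the conda-build definition. For example, the conda MatchSpec allows specifying a channel, the build number (in square brackets), and many other things. We might unify these later.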
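Splitting a PackageSelector into its up to three parts is straightforward. The sketch below illustrates only the string layout described above; the function name `parse_package_selector` is made up for this example, and no validation of the version constraint or build string is attempted.

```python
def parse_package_selector(selector):
    """Split `<name> <version> <build_string>` into a 3-tuple; missing parts are None."""
    parts = selector.split()
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected one to three space-separated fields: {selector!r}")
    # Pad with None so that the caller always gets (name, version, build_string).
    name, version, build_string = (parts + [None, None])[:3]
    return name, version, build_string


if __name__ == "__main__":
    for example in ("python", "python 3.8", "python >=3.8,<3.9 h1234567_0"):
        print(parse_package_selector(example))
```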
RunExports section
The different kinds of run exports that can be specified are:
# strong run exports go from build -> host & -> run
strong: [PackageSelector]
# weak run exports go from host -> run
weak: [PackageSelector]
# strong constraints adds a run constraint from build -> run_constraints (was `strong_constrains`)
strong_constraints: [PackageSelector]
# weak constraints adds a run constraint from host -> run_constraints (was `weak_constrains`)
weak_constraints: [PackageSelector]
# noarch run exports go from host -> run for `noarch` builds
noarch: [PackageSelector]
Test section
The current form of the test section, shown below, is removed in this spec. Note: the test section also has weird implicit behavior related to the run_test.sh, run_test.bat, run_test.py, and run_test.pl script files, which are run as part of the tests.
test:
  # files (from recipe directory) to include with the tests
  files: [glob]
  # files (from the work directory) to include with the tests
  source_files: [glob]
  # requirements at test time, in the target_platform architecture
  requires: [PackageSelector]
  # commands to execute
  commands: [string]
  # imports to execute with python (e.g. `import <string>`)
  imports: [string]
  # downstream packages that should be tested against this package
  downstreams: [PackageSelector]
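The new test section consists of a list of test elements. Each element is executed independently and can have different requirements. Multiple types of test elements are defined, such as the command test element, the python test element, and the downstream test element.
Previously, the test section was written to a single folder (info/test). In the new format, we propose writing each test element to a separate folder (info/tests/<index>). This allows us to run each test element independently.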
tests: [TestElement]
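Command test element
The command test element is rendered to a single folder that contains a test_time_dependencies.json file with two keys (build and run), which hold the raw "PackageSelector" strings. The script is rendered as script.json, which contains the interpreter, env, and other keys (as defined in the Script section). The files are copied into the info/tests/<index> folder.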
# script to execute
# reuse script definition from above
script: string | [string] | Script
# optional extra requirements
requirements:
  # extra requirements with build_platform architecture (emulators, ...)
  build: [PackageSelector]
  # extra run dependencies
  run: [PackageSelector]
# extra files to add to the package for the test
files:
  # files from $SRC_DIR
  source: [glob]
  # files from $RECIPE_DIR
  recipe: [glob]
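Rendering a command test element into its own folder might look roughly like the sketch below. The file names test_time_dependencies.json and script.json and the folder info/tests/<index> come from the text above; the exact JSON layout, the function name `render_command_test`, and the handling of a plain string or list script are assumptions made for illustration. Copying the extra files is left out.

```python
import json
from pathlib import Path


def render_command_test(package_dir, index, element):
    """Write a command test element to `<package_dir>/info/tests/<index>` (illustrative)."""
    folder = Path(package_dir) / "info" / "tests" / str(index)
    folder.mkdir(parents=True, exist_ok=True)

    # Keep the raw PackageSelector strings under the two keys `build` and `run`.
    requirements = element.get("requirements", {})
    dependencies = {
        "build": list(requirements.get("build", [])),
        "run": list(requirements.get("run", [])),
    }
    (folder / "test_time_dependencies.json").write_text(json.dumps(dependencies, indent=2))

    # Assumption: a string or list script is stored under the `content` key.
    script = element["script"]
    if not isinstance(script, dict):
        script = {"content": script}
    (folder / "script.json").write_text(json.dumps(script, indent=2))
    return folder
```

Because each element gets its own folder, a test runner can pick up one index and run it without looking at the others.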
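Python test element
The python test element renders a test_import.py file that contains the imports to test. It also automatically runs the pip check command to check for missing dependencies.
python:
  # list of imports to try
  imports: [string]
  pip_check: bool # defaults to true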
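Generating the contents of test_import.py from the imports list could be as simple as the following sketch. The function name `render_import_test` and the one-import-per-line layout are assumptions for this example; the CEP only states that the file contains the imports to test.

```python
def render_import_test(imports):
    """Return the text of a test_import.py file that imports each listed module."""
    lines = [f"import {module}" for module in imports]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_import_test(["libmambapy", "libmambapy.bindings"]), end="")
```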
Downstream test element
The downstream test element renders to a test_downstream.json file that contains the downstream key with the raw "PackageSelector" string.
downstream: PackageSelector
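Outputs section
conda-build has very hard-to-understand behavior with multiple outputs. We propose some drastic simplifications. Each output in the new format has the same keys as a "top-level" recipe.
Values from the top-level build, source, and about sections are (deeply) merged into each of the outputs.
The top-level package field is replaced by a top-level recipe field. The version from the top-level recipe is also merged into each output. The top-level name is ignored (but could be used, for example, for the feedstock-name in the conda-forge case).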
# note: instead of `package`, it says `recipe` here
recipe:
  name: string # mostly ignored, could be used for feedstock name
  version: string # merged into each output if not overwritten by output
outputs:
  - package:
      name: string
      version: string (defaults to top-level version)
    build:
      script: ...
    requirements:
      # same definition as top level
    # same definitions as on top level, by default merged from outer recipe
    about:
    source:
    tests:
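Before building, the outputs are topologically sorted by their dependencies. Each output acts as an independent recipe.
[!NOTE] Previous versions contained an idea for "cache-only" outputs. We have moved that to a future CEP.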
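The merge of top-level values into each output can be sketched as a recursive dictionary merge. This is an illustration only: the helper names `deep_merge` and `expand_outputs` are made up, and the rule that an output's lists and scalars replace the inherited value is an assumption, since the text above does not spell out how non-mapping values are combined.

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`; override values win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: lists and scalars from the output replace the inherited value.
            merged[key] = copy.deepcopy(value)
    return merged


def expand_outputs(recipe):
    """Return one self-contained recipe dict per output (illustrative sketch)."""
    inherited = {key: recipe[key] for key in ("build", "source", "about") if key in recipe}
    top_version = recipe.get("recipe", {}).get("version")
    expanded = []
    for output in recipe.get("outputs", []):
        merged = deep_merge(inherited, output)
        package = merged.setdefault("package", {})
        if top_version is not None:
            # The top-level version applies unless the output sets its own.
            package.setdefault("version", top_version)
        expanded.append(merged)
    return expanded
```

After this step each expanded output can be handled like a single-output recipe, which is what makes the later topological sort and variant computation per output possible.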
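Aside: variant computation for multiple outputs
Once the merge of the nodes is complete, multiple outputs are treated as independent recipes. Each variant is therefore computed individually for each output.
Another tricky issue is that packages can be "strongly" connected by a pin_subpackage(name, exact=True) constraint. In this case, the pinned package should also be part of the "variant configuration" for the output and be zipped appropriately.
For example, we can have three outputs: libmamba, libmambapy, and mamba.
- libmamba -> creates a single variant, as it is a low-level C++ library
- libmambapy -> creates multiple packages, one per Python version
- mamba -> creates multiple packages (one per Python + libmambapy version)
This has historically been an issue in conda-build, as it did not treat pin_subpackage edges as "variants" and sometimes created the same hash for two different outputs. For example, conda-build could create only one foofoo package where two variants were expected, each exactly pinning a different libfoo package.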
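Why exact pins need to be part of the variant can be illustrated with a toy hash. The sketch below is not the hashing algorithm of conda-build or of any other tool; `variant_hash`, `output_variant`, the `pin:` key prefix, and the example build strings are all hypothetical and only demonstrate the idea that the pinned package must feed into the hash input.

```python
import hashlib
import json


def variant_hash(variant):
    """Toy hash: first 7 hex characters of the SHA-1 of the sorted variant mapping."""
    payload = json.dumps(variant, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:7]


def output_variant(own_keys, exact_pins):
    """Combine an output's own variant keys with its exactly pinned subpackages."""
    variant = dict(own_keys)
    for name, build_string in exact_pins.items():
        # Treat each exact pin like an extra variant key.
        variant[f"pin:{name}"] = build_string
    return variant


if __name__ == "__main__":
    # Two hypothetical foofoo builds that pin different libfoo builds.
    first = output_variant({"python": "3.10"}, {"libfoo": "1.10-h1111111_0"})
    second = output_variant({"python": "3.10"}, {"libfoo": "1.11-h2222222_0"})
    print(variant_hash(first), variant_hash(second))
    # Without the pins, both builds share the same hash input.
    print(variant_hash({"python": "3.10"}))
```

With the pins included, the two hash inputs differ, so the two builds are distinguished; without them, both collapse onto one hash, which is the conda-build behavior described above.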
About section
about:
  # a summary of what this package does
  summary: string
  # a longer description of what this package does (should we allow referencing files here?)
  description: string
  # the license of the package in SPDX format
  license: string (SPDX enforced)
  # the license files
  license_file: path | [path] (relative paths are found in source directory _or_ recipe directory)
  # URL that points to the license – just stored as metadata
  license_url: url
  # URL to the homepage (used to be `home`)
  homepage: url
  # URL to the repository (used to be `dev_url`)
  repository: url
  # URL to the documentation (used to be `doc_url`)
  documentation: url
  # REMOVED:
  # prelink_message:
  # license_family: string (deprecated due to SPDX)
  # identifiers: [string]
  # tags: [string]
  # keywords: [string]
  # doc_source_url: url
Extra section
# a free form YAML dictionary
extra:
  <key>: <value>
Example recipe
What follows is an example recipe, using the YAML syntax discussed in https://github.com/conda-incubator/ceps/pull/54
context:
  name: xtensor
  version: 0.24.6
package:
  name: ${{ name|lower }}
  version: ${{ version }}
source:
  url: https://github.com/xtensor-stack/xtensor/archive/${{ version }}.tar.gz
  sha256: f87259b51aabafdd1183947747edfff4cff75d55375334f2e81cee6dc68ef655
build:
  number: 0
  skip:
    # note that the value is a minijinja expression
    - osx or win
requirements:
  build:
    - ${{ compiler('cxx') }}
    - cmake
    - if: unix
      then: make
  host:
    - xtl >=0.7,<0.8
  run:
    - xtl >=0.7,<0.8
  run_constraints:
    - xsimd >=8.0.3,<10
tests:
  - script:
      - if: unix
        then:
          - test -d ${PREFIX}/include/xtensor
          - test -f ${PREFIX}/include/xtensor/xarray.hpp
          - test -f ${PREFIX}/share/cmake/xtensor/xtensorConfig.cmake
          - test -f ${PREFIX}/share/cmake/xtensor/xtensorConfigVersion.cmake
      - if: win
        then:
          - if not exist %LIBRARY_PREFIX%\include\xtensor\xarray.hpp (exit 1)
          - if not exist %LIBRARY_PREFIX%\share\cmake\xtensor\xtensorConfig.cmake (exit 1)
          - if not exist %LIBRARY_PREFIX%\share\cmake\xtensor\xtensorConfigVersion.cmake (exit 1)
  # compile a test package
  - if: unix
    then:
      files:
        - testfiles/cmake/*
      requirements:
        build:
          - ${{ compiler('cxx') }}
          - cmake
          - ninja
      script: |
        cd testfiles/cmake/
        mkdir build; cd build
        cmake -GNinja ..
        cmake --build .
        ./target/hello_world
  - downstream: xtensor-python
  - downstream: xtensor-blas
  - python:
      imports:
        - xtensor_python
        - xtensor_python.numpy_adapter
Or a multi-output package:
context:
  name: mamba
  libmamba_version: "1.4.2"
  libmambapy_version: "1.4.2"
  # can also reference previous variables here
  mamba_version: ${{ libmamba_version }}
  release: "2023.04.06"
  libmamba_version_split: ${{ libmamba_version.split('.') }}
# we can leave this out
# package:
#   name: mamba-split
# this is inherited by every output
source:
  url: https://github.com/mamba-org/mamba/archive/refs/tags/${{ release }}.tar.gz
  sha256: bc1ec3de0dd8398fcc6f524e6607d9d8f6dfeeedb2208ebe0f2070c8fd8fdd83
build:
  number: 0
outputs:
  - package:
      name: libmamba
      version: ${{ libmamba_version }}
    build:
      script: ${{ "build_mamba.sh" if unix else "build_mamba.bat" }}
    requirements:
      build:
        - ${{ compiler('cxx') }}
        - cmake
        - ninja
        - ${{ "python" if win }}
      host:
        - libsolv >=0.7.19
        - libcurl
        - openssl
        - libarchive
        - nlohmann_json
        - cpp-expected
        - reproc-cpp >=14.2.1
        - spdlog
        - yaml-cpp
        - cli11
        - fmt
        - if: win
          then: winreg
      run_exports:
        - ${{ pin_subpackage('libmamba', max_pin='x.x') }}
      ignore_run_exports:
        from_package:
          - spdlog
          - if: win
            then: python
    tests:
      - script:
          - if: unix
            then:
              - test -d ${PREFIX}/include/mamba
              - test -f ${PREFIX}/include/mamba/version.hpp
              - test -f ${PREFIX}/lib/cmake/libmamba/libmambaConfig.cmake
              - test -f ${PREFIX}/lib/cmake/libmamba/libmambaConfigVersion.cmake
              - test -e ${PREFIX}/lib/libmamba${SHLIB_EXT}
            else:
              - if not exist %LIBRARY_PREFIX%\include\mamba\version.hpp (exit 1)
              - if not exist %LIBRARY_PREFIX%\lib\cmake\libmamba\libmambaConfig.cmake (exit 1)
              - if not exist %LIBRARY_PREFIX%\lib\cmake\libmamba\libmambaConfigVersion.cmake (exit 1)
              - if not exist %LIBRARY_PREFIX%\bin\libmamba.dll (exit 1)
              - if not exist %LIBRARY_PREFIX%\lib\libmamba.lib (exit 1)
          - if: unix
            then:
              - cat $PREFIX/include/mamba/version.hpp | grep "LIBMAMBA_VERSION_MAJOR ${{ libmamba_version_split[0] }}"
              - cat $PREFIX/include/mamba/version.hpp | grep "LIBMAMBA_VERSION_MINOR ${{ libmamba_version_split[1] }}"
              - cat $PREFIX/include/mamba/version.hpp | grep "LIBMAMBA_VERSION_PATCH ${{ libmamba_version_split[2] }}"
  - package:
      name: libmambapy
      version: ${{ libmambapy_version }}
    build:
      script: ${{ "build_mamba.sh" if unix else "build_mamba.bat" }}
      string: py${{ CONDA_PY }}h${{ PKG_HASH }}_${{ PKG_BUILDNUM }}
    requirements:
      build:
        - ${{ compiler('cxx') }}
        - cmake
        - ninja
        - if: build_platform != target_platform
          then:
            - python
            - cross-python_${{ target_platform }}
            - pybind11
            - pybind11-abi
      host:
        - python
        - pip
        - pybind11
        - pybind11-abi
        - openssl
        - yaml-cpp
        - cpp-expected
        - spdlog
        - fmt
        - termcolor-cpp
        - nlohmann_json
        - ${{ pin_subpackage('libmamba', exact=True) }}
      run:
        - python
        - ${{ pin_subpackage('libmamba', exact=True) }}
      run_exports:
        - ${{ pin_subpackage('libmambapy', max_pin='x.x') }}
      ignore_run_exports:
        from_package:
          - spdlog
    tests:
      - python:
          imports:
            - libmambapy
            - libmambapy.bindings
      - script:
          - python -c "import libmambapy._version; assert libmambapy._version.__version__ == '${{ libmambapy_version }}'"
  - package:
      name: mamba
      # version: always the same as top-level
    build:
      script: ${{ "build_mamba.sh" if unix else "build_mamba.bat" }}
      string: py${{ CONDA_PY }}h${{ PKG_HASH }}_${{ PKG_BUILDNUM }}
      python:
        entry_points:
          - mamba = mamba.mamba:main
    requirements:
      build:
        - if: build_platform != target_platform
          then:
            - python
            - cross-python_${{ target_platform }}
      host:
        - python
        - pip
        - openssl
        - ${{ pin_subpackage('libmambapy', exact=True) }}
      run:
        - python
        - conda >=4.14,<23.4
        - ${{ pin_subpackage('libmambapy', exact=True) }}
    tests:
      - python:
          imports: [mamba]
      - requirements:
          run:
            - pip
        script:
          - mamba --help
          # check dependencies with pip
          - pip check
          - if: win
            then:
              - if exist %PREFIX%\condabin\mamba.bat (exit 0) else (exit 1)
          - if: linux
            then:
              - test -f ${PREFIX}/etc/profile.d/mamba.sh
              # these tests work when run on win, but for some reason not during conda build
              - mamba create -n test_py2 python=2.7 --dry-run
              - mamba install xtensor xsimd -c conda-forge --dry-run
          - if: unix
            then:
              - test -f ${PREFIX}/condabin/mamba
          # for some reason tqdm doesn't have a proper colorama dependency so pip check fails
          # but that's completely unrelated to mamba
          - python -c "import mamba._version; assert mamba._version.__version__ == '${{ mamba_version }}'"
about:
  homepage: https://github.com/mamba-org/mamba
  license: BSD-3-Clause
  license_file: LICENSE
  summary: A fast drop-in alternative to conda, using libsolv for dependency resolution
  description: Just a package manager
  repository: https://github.com/mamba-org/mamba
extra:
  recipe-maintainers:
    - the_maintainer_bot