# Pipeline configuration
There are several ways to configure PyPPL:
- Load a configuration dictionary before you start writing your pipeline:

    ```python
    from pyppl import load_config
    # you will need profile names
    load_config({'default': {'forks': 10}})
    # your awesome pipeline goes here
    ```
- Load a configuration file before you start writing your pipeline:

    ```toml
    # pyppl.toml
    [default]
    forks = 10
    ```

    ```python
    from pyppl import load_config
    # you will need profile names
    load_config('pyppl.toml')
    # your awesome pipeline goes here
    ```
By default, `PyPPL` loads configurations from `~/.PyPPL.toml`, `./.PyPPL.toml` and environment variables that start with `PYPPL_`.
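For the environment variables, here is a minimal sketch. Note that the `PYPPL_<profile>_<item>` naming used below is an assumption; only the `PYPPL_` prefix is documented above, so verify the exact format against your PyPPL version.

```python
import os

# ASSUMPTION: variables are named PYPPL_<profile>_<item>;
# only the PYPPL_ prefix is documented.
os.environ['PYPPL_default_forks'] = '10'

# import after setting the variables so they are seen
# when configurations are loaded
from pyppl import PyPPL
```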
!!! hint
    `PyPPL` uses [`toml`](https://github.com/toml-lang/toml) as its configuration language.
- Specify configuration at runtime:

    ```python
    from pyppl import PyPPL
    PyPPL(config = {'default': {'forks': 10}})
    # too complicated? use kwargs:
    PyPPL(forks = 10) # it will automatically go into the 'default' profile
    # You can also load configuration from a file at runtime
    PyPPL(config_files = ['pyppl.toml'])
    PyPPL(logger = {'level': 'debug'})
    # still too complicated? use an underscore to connect the item and sub-item:
    PyPPL(logger_level = 'debug')
    ```
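Configurations may carry multiple profiles, which is why the profile names above matter; a non-default profile can then be selected when the pipeline runs. A minimal sketch (the `sge` profile name and the process `pXXX` are illustrative, and the process definition itself is elided):

```python
from pyppl import PyPPL, Proc

pXXX = Proc()  # a hypothetical process; inputs/outputs elided

PyPPL(config = {'default': {'forks': 10},
                'sge': {'forks': 100, 'runner': 'sge'}}) \
    .start(pXXX) \
    .run('sge')  # select the 'sge' profile at runtime
```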
!!! note
    Some configuration items can be overridden by runtime configurations, and some can't.
1. If a configuration item is set as a `Proc` property explicitly, it won't be overridden:
```python
pXXX.forks = 20
PyPPL(forks = 10)
# will not change forks of pXXX
# pXXX.forks == 20
# But if forks is initialized in the constructor
pYYY = Proc(forks = 20)
PyPPL(forks = 10)
# pYYY.forks == 10
```
2. If a configuration item is a dictionary, it will be updated (merged key by key):
```python
pXXX.envs.a = 1
pXXX.envs.b = 2
PyPPL(envs_a = 3)
# pXXX.envs == {'a': 3, 'b': 2}
```
3. Otherwise, the properties of ALL processes in the pipeline will be overwritten by the runtime configurations, as shown in the sketch below.
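A minimal sketch of the third case (`pZZZ` is a hypothetical process whose `forks` is never set explicitly):

```python
from pyppl import Proc, PyPPL

pZZZ = Proc()      # forks not set explicitly, so the default (1) applies
PyPPL(forks = 10)
# pZZZ.forks == 10 # the runtime configuration overwrites it for all such processes
```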
## Full configurations

The full set of configuration items and their defaults:
```python
from multiprocessing import cpu_count

from diot import Diot

dict(default = dict(
    # plugins
    plugins = ['no:flowchart', 'no:report'],
    # logger options
    logger = dict(
        file = None,
        theme = 'green_on_black',
        level = 'info',
        leveldiffs = []
    ),
    # The cache option, True/False/'export'
    cache = True,
    # Whether to expand a directory to check the signature
    dirsig = True,
    # How to deal with the errors: retry, ignore, halt or terminate
    # halt: halt the whole pipeline, no submitting new jobs
    # terminate: just terminate the job itself
    errhow = 'terminate',
    # How many times to retry the jobs once an error occurs
    errntry = 3,
    # How many jobs to run simultaneously
    forks = 1,
    # Default shell/language
    lang = 'bash',
    # Number of threads used to build jobs and to check job cache status
    nthread = min(int(cpu_count() / 2), 16),
    # Where the cache file and workdir are located
    ppldir = './workdir',
    # Select the runner
    runner = 'local',
    # The tag of the process
    tag = 'notag',
    # The template engine (name)
    template = 'liquid',
    # The template environment
    envs = Diot(),
    # Configurations for plugins
    plugin_config = {},
))
```
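For reference, the same defaults can be written into a configuration file; a sketch in `toml` (only the items with simple literal values are shown):

```toml
[default]
cache = true
dirsig = true
errhow = 'terminate'
errntry = 3
forks = 1
lang = 'bash'
ppldir = './workdir'
runner = 'local'
tag = 'notag'
template = 'liquid'

[default.logger]
theme = 'green_on_black'
level = 'info'
```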
!!! note
    For `plugins` and `logger` configurations:

    - They should ALWAYS be specified in the `default` profile.
    - `plugins` passed at runtime can only affect plugin APIs that run at runtime. For example, if a plugin hook only runs at `Proc` definition, it won't be affected by runtime configurations.
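To make the first point concrete, a sketch of loading `plugins` and `logger` settings in the `default` profile before any process is defined (the values are illustrative):

```python
from pyppl import load_config

# plugins and logger go in the 'default' profile, loaded BEFORE
# processes are defined so definition-time plugin hooks see them
load_config({'default': {
    'plugins': ['no:flowchart', 'no:report'],
    'logger': {'level': 'debug'},
}})

# define your processes after this point
```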