Configuration
This project uses configuration files in the YAML format to define the workflow.
Disclaimer: We are still in alpha, so this section is likely to change.
A config file consists of multiple sections. The general section describes
the version and name of the config.
general:
  version: 0.0.1
  name: Benchmark
The input section describes the data that is to be processed; its layout
might change in the near future. The first
subsection, files, is a list of files given as relative paths,
absolute paths or global paths (e.g. xrootd). Entries can include wildcards.
input:
  files:
    - data/L1Ntuple_*.root
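To illustrate how a wildcard entry in files could be resolved for local paths, here is a minimal sketch using Python's glob module. The helper name expand_input_files and its base_dir argument are hypothetical and not part of the project; global paths such as xrootd URLs are not covered.

```python
import glob
import os
import tempfile


def expand_input_files(patterns, base_dir="."):
    """Expand local wildcard patterns into a sorted list of matching paths.

    Hypothetical helper: relative patterns are resolved against base_dir,
    absolute patterns are used unchanged. Remote paths (e.g. xrootd URLs)
    are not handled here.
    """
    matches = []
    for pattern in patterns:
        # os.path.join discards base_dir when the pattern is already absolute
        matches.extend(glob.glob(os.path.join(base_dir, pattern)))
    return sorted(matches)


# Build a throwaway directory that mimics the layout used in the config
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "data"))
for name in ("L1Ntuple_1.root", "L1Ntuple_2.root", "other.root"):
    open(os.path.join(demo_dir, "data", name), "w").close()

found = expand_input_files(["data/L1Ntuple_*.root"], base_dir=demo_dir)
found_names = [os.path.basename(path) for path in found]
print(found_names)
```

Only the two files matching the pattern are returned; other.root is ignored.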
The second subsection, sample, is used to describe the data: the name of the
dataset, the title and the run number. The name is typically used in file and histogram names,
while the title is meant to be used in string representations
(e.g. titles/legends of histograms). If pileup reweighting is required, the
pileup_file parameter needs to be set.
input:
  ...
  sample:
    name: Data
    title: 2016 Data
  pileup_file: ""
  run_number: 276243
The trigger subsection describes which trigger is to be used for this dataset. Its name and title play the same roles as the sample name and title.
input:
  ...
  trigger:
    name: SingleMu
    title: Single Muon
The analysis section describes which analyzers are to be run.
Global parameters include flags and binning for the analyzers (do_fit,
pu_bins). These can also be specified later separately for each analyzer if
required.
analysis:
  do_fit: False
  pu_bins: 0,13,20,999
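As an illustration of how a pu_bins value might be interpreted, the sketch below assumes the value reaches the code as a comma-separated string and that consecutive numbers are bin edges. Both function names are hypothetical, and the source does not say whether the upper edge of a bin is inclusive.

```python
def parse_pu_bins(pu_bins):
    """Turn a string such as "0,13,20,999" into a list of integer bin edges."""
    edges = [int(edge) for edge in pu_bins.split(",")]
    if edges != sorted(edges):
        raise ValueError("pile-up bin edges must be in ascending order")
    return edges


def bin_ranges(edges):
    """Pair consecutive edges into (low, high) ranges."""
    return list(zip(edges[:-1], edges[1:]))


edges = parse_pu_bins("0,13,20,999")
ranges = bin_ranges(edges)
print(ranges)  # [(0, 13), (13, 20), (20, 999)]
```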
The analyzers subsection of analysis is a list of all analyzers to be run.
These analyzers have to satisfy the same API as
cmsl1t.analyzers.BaseAnalyzer and be visible in the PYTHONPATH.
analysis:
  ...
  analyzers:
    - cmsl1t.analyzers.demo_analyzer
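Because analyzers are referenced by dotted module path, they must be importable from the current PYTHONPATH. The following sketch shows one way such a path could be resolved with importlib; load_analyzer_module is a hypothetical helper, not the project's actual loader, and a standard-library module stands in for a real analyzer so the example runs anywhere.

```python
import importlib


def load_analyzer_module(dotted_path):
    """Import a module by dotted path; raise a readable error if it fails."""
    try:
        return importlib.import_module(dotted_path)
    except ImportError as error:
        raise ImportError(
            "Cannot import analyzer '{}'; is it on PYTHONPATH?".format(dotted_path)
        ) from error


# A standard-library module stands in for cmsl1t.analyzers.demo_analyzer
module = load_analyzer_module("json.decoder")
print(module.__name__)
```

A misspelled or unreachable module path fails at import time, which is usually the first thing to check when an analyzer is not picked up.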
Modifiers are a way to enrich the event content by attaching objects to the
event itself. For example, cmsl1t.recalc.met.l1MetNot28 reads in
event.caloTowers and creates a new object, event.l1MetNot28, that can
then be accessed by all analyzers.
analysis:
  ...
  modifiers:
    - cmsl1t.recalc.met.l1MetNot28:
        in: event.caloTowers
        out: event.l1MetNot28
    - cmsl1t.recalc.met.l1MetNot28HF:
        in: event.caloTowers
        out: event.l1MetNot28HF
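The modifier pattern (read one object from the event, attach a derived object under a new name) can be sketched as follows. This is only an illustration: apply_modifier, attribute_name and sum_energies are hypothetical names, the event is a plain SimpleNamespace, and the calculation is a stand-in rather than the real l1MetNot28 algorithm.

```python
from types import SimpleNamespace


def attribute_name(path):
    """Strip the leading 'event.' from a config value like 'event.caloTowers'."""
    prefix = "event."
    if not path.startswith(prefix):
        raise ValueError("expected a path starting with 'event.': " + path)
    return path[len(prefix):]


def apply_modifier(event, func, in_path, out_path):
    """Read the input object from the event, transform it, attach the result."""
    source = getattr(event, attribute_name(in_path))
    setattr(event, attribute_name(out_path), func(source))
    return event


def sum_energies(towers):
    """Stand-in calculation; NOT the real l1MetNot28 algorithm."""
    return sum(towers)


# Toy event with a list of numbers in place of real calorimeter towers
event = SimpleNamespace(caloTowers=[1.5, 2.0, 0.5])
apply_modifier(event, sum_energies, "event.caloTowers", "event.l1MetNot28")
print(event.l1MetNot28)  # 4.0
```

After the modifier has run, any later consumer of the event can read event.l1MetNot28 without recomputing it.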
Next, you can specify if you want progress information (e.g. a progress bar)
and how often this information is updated (report_every in units of
events).
analysis:
  ...
  progress_bar:
    report_every: 1000
  # or to switch it off
  # progress_bar:
  #   enable: False
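To make the meaning of report_every concrete, the sketch below lists the event counts at which an update would be issued, assuming an update happens after every report_every processed events. The function report_points is hypothetical and does not reflect how the project implements its progress bar.

```python
def report_points(n_events, report_every):
    """Return the event counts at which a progress update would be issued."""
    if report_every <= 0:
        raise ValueError("report_every must be positive")
    return [i for i in range(1, n_events + 1) if i % report_every == 0]


points = report_points(3500, 1000)
print(points)  # [1000, 2000, 3000]
```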
Finally, the output section describes where the output, usually ROOT files,
is stored. The template entry is a list of path components that are
joined to create the full output path. The template expects the named
parameters date, sample_name, run_number and trigger_name,
which are automatically filled by the config parser.
output:
  # template is a list here that is joined (os.path.join) in the config
  # parser
  template:
    - benchmark/new
    - "{date}_{sample_name}_run-{run_number}_{trigger_name}"
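The joining and filling of the template can be sketched with os.path.join and str.format. The function build_output_path is hypothetical, and the date value and its format are assumptions made for this example; the source does not state how the config parser formats the date.

```python
import os


def build_output_path(template, **params):
    """Join the template components and fill in the named parameters."""
    joined = os.path.join(*template)
    return joined.format(**params)


template = [
    "benchmark/new",
    "{date}_{sample_name}_run-{run_number}_{trigger_name}",
]
output_path = build_output_path(
    template,
    date="20160101",  # assumed date format, for illustration only
    sample_name="Data",
    run_number=276243,
    trigger_name="SingleMu",
)
print(output_path)
```

On a POSIX system this prints benchmark/new/20160101_Data_run-276243_SingleMu. A missing parameter raises a KeyError, which makes typos in the template names easy to spot.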
A complete example would look something like this:
version: 0.0.1
name: Benchmark
input:
  files:
    - data/L1Ntuple_*.root
  sample:
    name: Data
    title: 2016 Data
  trigger:
    name: SingleMu
    title: Single Muon
  pileup_file: ""
  run_number: 276243
analysis:
  do_fit: False
  pu_type: 0PU12,13PU19,20PU
  pu_bins: 0,13,20,999
  analyzers:
    - cmsl1t.analyzers.demo_analyzer
  modifiers:
    - cmsl1t.recalc.met.l1MetNot28:
        in: event.caloTowers
        out: event.l1MetNot28
    - cmsl1t.recalc.met.l1MetNot28HF:
        in: event.caloTowers
        out: event.l1MetNot28HF
output:
  # template is a list here that is joined (os.path.join) in the config parser
  template:
    - benchmark/new
    - "{date}_{sample_name}_run-{run_number}_{trigger_name}"