ConfigurationΒΆ
This project uses configuration files in the YAML format to define the workflow.
Disclaimer: We are still in alpha, this section is likely to change.
A config file consists of muliple sections. The general
section describes
the version and name of the config.
general:
version: 0.0.1
name: Benchmark
The input section describes the data that is to be processed and might be
changed in the near future. The first
subsection, files
, is a list of files that can be either relative paths,
absolute paths or global paths (e.g. xrootd) and can include wildcards.
input:
files:
- data/L1Ntuple_*.root
The second subsection, sample
is used to describe the data: The name of the
dataset, the title and the run number. The name is likely used in file and histogram names,
while the title is meant to be used in string representations
(e.g titles/legends of histograms). If pileup reweighting is required, the
pileup_file
parameter needs to be set.
input:
...
sample:
name: Data
title: 2016 Data
pileup_file: ""
run_number: 276243
The trigger subsection describes which trigger is to be used for this dataset. As the sample name and title, the trigger counterparts play a similar role.
input:
...
trigger:
name: SingleMu
title: Single Muon
The analysis
section describes which analyzers are to be run.
Global parameters include flags and binning for the analyzers (do_fit
,
pu_bins
). These can also be specified later separately for each analyzer if
required.
analysis:
do_fit: False
pu_bins: 0,13,20,999
The analyzers subsection of analysis
is a list of all analyzers to be run.
These analyzers have to satisfy the same API as
cmsl1t.analyzers.BaseAnalyzer
and be visible in the PYTHONPATH.
analysis:
...
analyzers:
- cmsl1t.analyzers.demo_analyzer
Modifiers are a way to enrich the event content by attaching objects to the
event itself. E.g. cmsl1t.recalc.met.l1MetNot28
reads in
event.caloTowers
and creates a new object, event.l1MetNot28
, that can
then be accessed by all analyzers.
analysis:
...
modifiers:
- cmsl1t.recalc.met.l1MetNot28:
in: event.caloTowers
out: event.l1MetNot28
- cmsl1t.recalc.met.l1MetNot28HF:
in: event.caloTowers
out: event.l1MetNot28HF
Next, you can specify if you want progress information (e.g. a progress bar)
and how often this information is updated (report_every
in units of
events).
analysis:
...
progress_bar:
report_every: 1000
# or to switch it off
# progress_bar:
# enable: False
And finally the output section describes where the output, usually ROOT files,
is stored. The `template
entry is composed of a list of paths that are
joined to create the full output file. The template expects the following named
parameters:
date
sample_name
run_number
trigger_name
which are automatically filled by the config parser
output:
# template is a list here that is joined (os.path.join) in the config
# parser
template:
- benchmark/new
- "{date}_{sample_name}_run-{run_number}_{trigger_name}"
So a complete example would look something like that:
version: 0.0.1
name: Benchmark
input:
files:
- data/L1Ntuple_*.root
sample:
name: Data
title: 2016 Data
trigger:
name: SingleMu
title: Single Muon
pileup_file: ""
run_number: 276243
analysis:
do_fit: False
pu_type: 0PU12,13PU19,20PU
pu_bins: 0,13,20,999
analyzers:
- cmsl1t.analyzers.demo_analyzer
modifiers:
- cmsl1t.recalc.met.l1MetNot28:
in: event.caloTowers
out: event.l1MetNot28
- cmsl1t.recalc.met.l1MetNot28HF:
in: event.caloTowers
out: event.l1MetNot28HF
output:
# template is a list here that is joined (os.path.join) in the config parser
template:
- benchmark/new
- "{date}_{sample_name}_run-{run_number}_{trigger_name}"