Scripts¶
The project contains scripts that make the analysis easier by automating as much work as possible in steering the XAMPP framework. Most scripts are simple wrappers for XAMPPplotting scripts. Using them is not required, but it can make your life a bit easier.
The scripts are located in XAMPPmonoH/python and in XAMPPmonoH/util.
You first have to make sure that you have an Athena TestArea set up (e.g. by executing the setup script). Then you can run the steering scripts to produce histograms and plots.
Common tasks¶
These scripts automate the most common tasks like histogram creation and plotting.
Download the XAMPP ntuples of a production¶
This script helps to download all samples that match a common identifier, e.g.
- group.phys-exotics.*.0L0604*
- group.phys-exotics.*.1L0604*
- group.phys-exotics.*.2L0604*
Note: You need to have set up rucio (lsetup rucio) before using this script.
usage: downloadXAMPPntuples [-h] [-d DESTINATION] [-l] [-w WRITEFILE]
identifier
Positional Arguments¶
identifier | Identifier for dataset, e.g. group.phys-exotics.*.0L0500a.*XAMPP |
Named Arguments¶
-d, --destination | Directory where datasets should be downloaded to. Default: “./” |
-l, --listOnly | Only list datasets and do not download. Default: False |
-w, --writeFile | Write dataset list to file located at this path. |
Example: Download all ntuples of the 0-lepton region with tag 0604 into the current directory:
python MonoHSteeringScripts/getXAMPPntuples.py group.phys-exotics.*.0L0604*XAMPP
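To preview which dataset names a wildcard identifier would select, the matching can be sketched in Python with the standard fnmatch module. This is only an illustration under stated assumptions: the dataset names below are made up, the helper select_datasets is hypothetical, and the real script presumably resolves the identifier through rucio rather than a local list.

```python
from fnmatch import fnmatchcase


def select_datasets(identifier, dataset_names):
    """Return the dataset names matching a shell-style wildcard identifier."""
    return [name for name in dataset_names if fnmatchcase(name, identifier)]


# Hypothetical dataset names, for illustration only.
datasets = [
    "group.phys-exotics.mc16_13TeV.123456.0L0604a_XAMPP",
    "group.phys-exotics.mc16_13TeV.123456.1L0604a_XAMPP",
    "group.phys-exotics.data17_13TeV.periodX.0L0604b_XAMPP",
]

selected = select_datasets("group.phys-exotics.*.0L0604*XAMPP", datasets)
print(selected)
```

Only the two names containing 0L0604 are selected; the 1-lepton dataset is skipped.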
Check that download/production of XAMPP ntuples of a production was complete¶
The script validateXAMPPntuples.py is a wrapper for XAMPPplotting’s CheckMetaData.py script. It compares the metadata with the information stored in AMI to see whether the local set of XAMPP ntuples is complete. It also checks the PRW file information stored in the ntuples against the information in AMI (MC only).
It is assumed that you have stored the XAMPP ntuples in separate folders for 0-lepton, 1-lepton and 2-lepton ntuples, so you have to run the script once for each lepton multiplicity, which is specified by -l 0L/1L/2L.
The script checks all four groups of ntuples (data1516, mc16a, data17, mc16d) and generates LaTeX/PDF files documenting the completeness as tables. It takes quite some time to run, usually more than 10 minutes. It also writes text files that store the luminosity of the data files. The most efficient way to inspect them is tail lumi_15.txt (only the last line is relevant). The three numbers are “lumi in local ntuples”, “reference lumi according to GRL for local ntuples” and the difference between the two.
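If you prefer to read the luminosity summary programmatically instead of using tail, a minimal sketch could look as follows. It assumes that the three numbers are the last three whitespace-separated tokens of the last non-empty line; the actual layout of the lumi files may differ, and the function name and sample lines are hypothetical.

```python
def read_lumi_summary(lines):
    """Parse the last non-empty line into (local, reference, difference).

    Assumption: the three numbers are the last three whitespace-separated
    tokens of that line.
    """
    non_empty = [line for line in lines if line.strip()]
    local, reference, difference = (float(tok) for tok in non_empty[-1].split()[-3:])
    return local, reference, difference


# Made-up file content, for illustration only.
example_lines = [
    "1.5 1.5 0.0\n",
    "32.9 33.0 -0.1\n",
    "\n",
]
summary = read_lumi_summary(example_lines)
print(summary)
```

For a real file, the lines could be obtained with open("lumi_15.txt").readlines().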
In case you want to run the XAMPPplotting/python/CheckMetaData.py script without the wrapper, please make sure that you include both the input config and the sample list containing the names of the DxAODs used to create the samples listed in the input config. Otherwise it is not clear what should be compared on AMI.
Currently the EXOT derivation sample lists are the reference. In case you want to use a set of sample lists belonging to another derivation, you have to specify it using the --deriv argument.
Note: You need to have set up rucio and pyami (lsetup rucio pyami) before using this script. Best practice is to run this command before the AthAnalysis setup is chosen. Otherwise conflicts may arise during execution.
Note: Also make sure that you use the appropriate datasets in XAMPPmonoH/data/SampleLists for the production you are checking (e.g. by checking out the appropriate tag for the production before running this script).
Note: It does not hurt to look at the script in a text editor before executing it. For example, if you only want to validate mc16a and not data, you can comment out the lines that would call XAMPPplotting/python/CheckMetaData.py for data ntuples.
usage: validateXAMPPntuples [-h] -l {0L,1L,2L} [--deriv {EXOT,HIGG}]
[-d DESTINATION]
[-i [INPUTCONFIGS [INPUTCONFIGS ...]]]
[-o OUTPUTPATH] [-t]
Named Arguments¶
-l, -L, --leptonRegion | Possible choices: 0L, 1L, 2L. Choose the lepton region (0L, 1L or 2L) |
--deriv | Possible choices: EXOT, HIGG. Derivation which should be used for checking. Default: “EXOT” |
-d, --destination | Directory where datasets are stored which should be checked. Input config is automatically created. |
-i, --inputConfigs | Input configs which should be used instead of automatically creating input config for files in directory. Default: [] |
-o, --outputPath | Directory where result will be stored. Default: “result_NtupleValidation” |
-t, --nolatex | Do not create a latex file. Default: False |
Example: Check the completeness of the downloaded ntuples of the 0-lepton region with tag 0604 in the directory XAMPPmonoH-00-06-04.0lep:
python MonoHSteeringScripts/validateXAMPPntuples.py -d XAMPPmonoH-00-06-04.0lep/ -o validation_0604_0lep -l 0L --deriv HIGG
Create Input Configs¶
This script creates input configs for the Mono-h(bb) analysis based on the settings defined in XAMPPmonoH/data/inputConfig_config.json. There are two ways to run this script:
- Create input configs from ntuples located in a local directory: use the option -i to point to the input directory.
- Create input configs from ntuples located on a localgroupdisk: do NOT use -i, but specify with -p a pattern that all datasets of interest have in common. With -e one can specify which patterns must not be part of the dataset name. Note that the pattern elements must be separated by spaces (" "). If the ntuples are not located on your own localgroupdisk, specify the groupdisk using -r. Example:
python createInputConfig.py -o /project/etp5/amatic/MonoH/XAMPP/source/XAMPPplotting/data/InputConf/MonoH/ZeroLepton/0L0601/mc16d --addPRWFiles -p "group.phys-exotics mc16 1L0601 r10201 XAMPP" -e "0L0601a" -r LRZ-LMU_LOCALGROUPDISK
Note: The script supports the creation of input configs for different campaigns. Use the --campaigns argument to select which campaign(s), among the possible choices mc16a, mc16d and mc16e, should be processed from the set of input files.
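The pattern logic of -p and -e can be pictured with a small Python sketch. It assumes that a dataset is accepted when its name contains every space-separated pattern element and none of the excluded patterns; this is an interpretation of the description above, not the script's actual code, and the dataset names are made up.

```python
def matches(dataset, pattern, exclude=()):
    """True if `dataset` contains all pattern elements and no excluded pattern.

    Assumption: pattern elements are separated by whitespace and are matched
    as plain substrings of the dataset name.
    """
    required = pattern.split()
    has_all = all(element in dataset for element in required)
    has_excluded = any(element in dataset for element in exclude)
    return has_all and not has_excluded


# Hypothetical dataset names, for illustration only.
datasets = [
    "group.phys-exotics.mc16_13TeV.123456.1L0601_r10201_XAMPP",
    "group.phys-exotics.mc16_13TeV.123456.1L0601a_r10201_XAMPP",
    "group.phys-exotics.mc16_13TeV.123456.1L0601_r9364_XAMPP",
]
pattern = "group.phys-exotics mc16 1L0601 r10201 XAMPP"
kept = [d for d in datasets if matches(d, pattern, exclude=["1L0601a"])]
print(kept)
```

Here only the first dataset survives: the second one contains the excluded pattern and the third one lacks the r10201 element.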
Script for creating Data input configs
usage: createInputConfig [-h] [-i INDIR] [-o OUTDIR] [-c CONFIG]
[--campaigns {mc16a,mc16d,mc16e,truth} [{mc16a,mc16d,mc16e,truth} ...]]
[-v] [--excludeExtensions] [-p PATTERN]
[-e [EXCLUDE [EXCLUDE ...]]] [-r RSE]
Named Arguments¶
-i, --input | Folder with files (must be an absolute path); to be used only when reading local directory Default: “” |
-o, --output | Output directory to put file list(s) into Default: “./InputConfigs/MonoH/” |
-c, --config | Path to configuration file in JSON format. Default: “/xampp/build/x86_64-slc6-gcc62-opt/data/XAMPPmonoH/inputConfig_config.json” |
--campaigns | Possible choices: mc16a, mc16d, mc16e, truth Choose which data and mc campaigns should be processed for input config creation Default: [‘mc16a’, ‘mc16d’, ‘mc16e’] |
-v, --verbose | Print out in verbose mode Default: False |
--excludeExtensions | Exclude files with multiple e-tags or s-tags Default: False |
-p, --pattern | specify a pattern which is part of dataset name; to be used only when reading RSE Default: “” |
-e, --exclude | specify a pattern which must not be part of dataset name. argument 0 is used for RSE. Default: [] |
-r, --RSE | specify RSE storage element which should be read |
Example: Create input config for all 0 lepton ntuples:
python MonoHSteeringScripts/createInputConfig.py -i /ptmp/mpp/pgadow/monoH/data/0700/0lep/ -o XAMPPplotting/data/InputConf/MonoH/MPPMU/XAMPPmonoH-00-07-00.SR
Create DSConfigs¶
Automatically create DS configs to be used in XAMPPplotting to create plots out of the histograms.
The names of the histograms are defined in the configuration file XAMPPmonoH/data/DSConfig_config.json. This is also the file that defines the colour scheme of the plots. Usually only the background samples are defined in this file.
If additional samples, e.g. some signals, should be appended to the DS config, the option --additional can be used to add the content of other files (e.g. those stored in XAMPPmonoH/data/DSConfig_signal) to the DSConfig.
Note: The --inputConfigs option takes the path to the folder with your data input configs (absolute path required!) and is used to automatically calculate the luminosity for scaling the MC correctly in the plots. If this option is chosen, the luminosity set by the -l option will be overwritten.
Script for creating DSconfigs for XAMPPplotting
usage: createDSConfig [-h] -i INPUT [-o OUTPUT] [-c CONFIG]
[-a ADDITIONAL [ADDITIONAL ...]] [-l LUMI]
[--inputConfigs [INPUTCONFIGS [INPUTCONFIGS ...]]]
Named Arguments¶
-i, --input | Folder with root files from WriteDefaultHistos (must be an absolute path) |
-o, --output | Output file for the DSConfig Default: “./DSConfig_MonoH.py” |
-c, --config | Configuration file containing information about MC samples. Default: “/xampp/build/x86_64-slc6-gcc62-opt/data/XAMPPmonoH/DSConfig_config.json” |
-a, --additional | Path to file(s) containing additional entries for DSConfig file, e.g. data-driven background estimates. |
-l, --lumi | Luminosity (overwritten by --inputConfigs and auto lumi calculation) Default: “80.” |
--inputConfigs | Path to folder containing input configs or input config(s) to calculate luminosity automatically. Need to give absolute paths here (no ../) Default: [] |
Example: Create DSConfig for background ntuples and include signals defined in XAMPPmonoH/data/DSConfig_signal/signal.conf:
python XAMPPmonoH/python/createDSConfig.py -a XAMPPmonoH/data/DSConfig_signal/signal.conf -i /ptmp/mpp/pgadow/Cluster/OUTPUT/2018-11-22/monoHbb_output/
Produce plots¶
This script produces plots from a collection of XAMPP ntuples almost automatically.
It scans the input ntuples, creates the input configs required to create histograms, and creates the DSConfigs required to create plots from the histograms.
To allow for quick identification of the plotting output, it is possible to define a job name using the option -n.
All other settings are configured using a configuration file. The default configuration file is XAMPPmonoH/data/runHistos_config.json. It is possible to edit this file before running the script using the option -e.
The default settings of this file are:
{
  "inputBasePath": "XAMPPplotting/data/InputConf/MonoH/MPPMU",
  "outputBasePath": "/ptmp/mpp/pgadow/monoH/plottingOutput/",
  "tag": "00-08-00",
  "ntuplesPath": {
    "SR": "/ptmp/mpp/pgadow/monoH/data/XAMPPmonoH-00-07-00.0lep/",
    "CR1": "",
    "CR2ee": "/ptmp/mpp/pgadow/monoH/data/XAMPPmonoH-00-07-01.2lep/",
    "CR2mumu": "/ptmp/mpp/pgadow/monoH/data/XAMPPmonoH-00-07-01.2lep/",
    "PH": ""
  },
  "regions": ["SR"],
  "samples": ["data15", "data16", "data17", "data18", "stops", "stopt", "stopWt", "ttbar", "VHbb", "Wenu_cl", "Wenu_hf", "Wenu_hpt", "Wenu_l", "Wmunu_cl", "Wmunu_hf", "Wmunu_hpt", "Wmunu_l", "Wtaunu_cl", "Wtaunu_hf", "Wtaunu_hpt", "Wtaunu_l", "WW", "WZ", "Zee_cl", "Zee_hf", "Zee_hpt", "Zee_l", "Zmumu_cl", "Zmumu_hf", "Zmumu_hpt", "Zmumu_l", "Ztautau_cl", "Ztautau_hf", "Ztautau_hpt", "Ztautau_l", "Znunu_cl", "Znunu_hf", "Znunu_hpt", "Znunu_l", "ZZ"],
  "signals": ["zp2hdm", "zphxxbb", "shxxbb", "2HDMa", "monoSbb"],
  "doBlinding": true,
  "doSystematics": false,
  "doPRW": true,
  "doFitInputs": false,
  "skipInputConfigCreation": false,
  "campaigns": ["mc16a", "mc16d"],
  "luminosity": "auto",
  "engine": "LOCAL"
}
Explanation of settings:
Settings name | explanation |
---|---|
inputBasePath | Input configs created on-the-fly will be stored here |
outputBasePath | Histogram and plot output will be stored here |
tag | Input config filename: XAMPPmonoH-<tag>.<identifier> |
regions | Regions to be processed (SR, CR1, CR2ee, CR2mumu, PH) |
samples | Process these background samples (filters input conf.) |
signals | Process these signal samples (filters input configs) |
doBlinding | Blind signal region mass window? true / false |
doSystematics | Run with systematics? true / false |
doPRW | Run with PRW? true / false (should be always true) |
doFitInputs | Create strongly reduced set of output histograms |
campaigns | Process these MC campaigns (match dataXX in samples!) |
luminosity | Which luminosity to scale MC (“auto”: calc from input) |
engine | Which batch system (or “LOCAL”) for processing? |
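Before launching a job it can be useful to check that every requested region actually has an ntuple path configured (in the default settings, CR1 and PH are empty). The following is a minimal sketch of such a check, not part of the project: the helper regions_without_ntuples is hypothetical and the config is a shortened stand-in with placeholder paths.

```python
import json

# Shortened stand-in for runHistos_config.json with placeholder paths.
config_text = """
{
  "ntuplesPath": {"SR": "/path/to/0lep/", "CR1": "", "CR2ee": "/path/to/2lep/"},
  "regions": ["SR", "CR1"],
  "campaigns": ["mc16a", "mc16d"]
}
"""


def regions_without_ntuples(config):
    """Return the requested regions whose ntuplesPath entry is missing or empty."""
    paths = config.get("ntuplesPath", {})
    return [region for region in config.get("regions", []) if not paths.get(region)]


config = json.loads(config_text)
missing = regions_without_ntuples(config)
print(missing)
```

With a real file, the config could be loaded with json.load(open(path)) instead of json.loads.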
Produce histograms for mono-h(bb) analysis.
usage: runHistos [-h] [-n NAME] [-c CONFIG] [-e]
Named Arguments¶
-n, --name | String to be added to the output directory to help with bookkeeping of plots. Default: “” |
-c, --config | Path to configuration file for histogram/plot creation. Default: “/xampp/build/x86_64-slc6-gcc62-opt/data/XAMPPmonoH/runHistos_config.json” |
-e, --edit | Modify config file for this script prior to execution with a text editor. Default: False |
Example: Create plots and view + edit configuration file before running script:
python XAMPPmonoH/python/runHistos.py -e
Get yields from XAMPPplotting histograms¶
This script calculates event yields from any ROOT histogram, including those created by XAMPPplotting’s WriteDefaultHistos.
usage: getYields [-h] [--useOverflow] [--printAllVariables] [--csv]
[-o OUTPUTPATH] [-l LUMINOSITY]
[--resolvedVariable RESOLVEDVARIABLE]
[--mergedVariable MERGEDVARIABLE]
[inputFiles [inputFiles ...]]
Positional Arguments¶
inputFiles | input files |
Named Arguments¶
--useOverflow | take into account overflow and underflow bins Default: False |
--printAllVariables | print all variables Default: False |
--csv | save output as csv file Default: False |
-o, --outputPath | name of the directory with output yields Default: “yields” |
-l, --luminosity | Luminosity for backgrounds to be scaled to Default: 36.1 |
--resolvedVariable | name of the variable for the resolved fit input histogram Default: “m_jj” |
--mergedVariable | name of the variable for the merged fit input histogram Default: “m_J” |
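The effect of --useOverflow can be illustrated without ROOT. The sketch below assumes the usual ROOT bin layout, where the first entry of the list is the underflow bin and the last entry is the overflow bin; the function integrate and the bin contents are made up for illustration and do not reproduce the script's code.

```python
def integrate(bin_contents, use_overflow=False):
    """Sum histogram bin contents.

    `bin_contents` follows the ROOT layout: index 0 is the underflow bin and
    the last index is the overflow bin.
    """
    if use_overflow:
        return sum(bin_contents)
    return sum(bin_contents[1:-1])


# Made-up bin contents: underflow, three regular bins, overflow.
bins = [1.0, 10.0, 20.0, 5.0, 2.0]
yield_default = integrate(bins)
yield_with_overflow = integrate(bins, use_overflow=True)
print(yield_default, yield_with_overflow)
```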
Convert XAMPPplotting output histograms to WSMaker input format¶
In order to run WSMaker easily we have a script to convert the XAMPPplotting output format into WSMaker input format.
Script for converting XAMPPplotting output to WSMaker input
usage: XAMPPplottingToWSMaker [-h] [-i INPUT_DIR] [-o OUTPUT_DIR] [-l LUMI]
[-c] [--mee] [--mmumu]
Named Arguments¶
-i, --input | Folder with XAMPPplotting output files Default: “./” |
-o, --output | Output directory to put WSMaker input file Default: “./WSMakerInput.root” |
-l, --lumi | Integrated luminosity to be applied as signal and background scaling. |
-c, --charge | Use muon charge instead of higgs candidate mass. Default: False |
--mee | Use dielectron mass instead of higgs candidate mass. Default: False |
--mmumu | Use dimuon mass instead of higgs candidate mass. Default: False |
Currently it expects an input file format that matches the output of runHistos.py (i.e. file names like WW.root).
When systematics jobs are run, it is important to switch on the jet flavour splitting for W+jets and Z+jets samples (pick the dedicated RunConfig region files).
Special care must be taken with W+jets and Z+jets input! Currently this script works only if all W+jets ROOT files and all Z+jets ROOT files have been merged beforehand. For instance, if the W+jets background is split into Wenu_l.root, Wenu_cl.root, Wenu_hf.root, Wenu_hpt.root and the corresponding munu and taunu files, all these files must first be merged into W.root using hadd. The same applies to the Z+jets splittings. Only the resulting W.root and Z.root files must then be passed to this script.
Data input files must also be prepared before running this script. Assume there is one folder with data15.root and data16.root files and another folder with a data17.root file, and that this script is run once for each folder. Then the data15.root and data16.root files first need to be merged using hadd into a file called data.root, while the data17.root file just needs to be renamed to data.root:
# data1516
hadd data.root data{15,16}.root
rm data{15,16}.root
# data17
mv data17.root data.root
# W/Z+jets
hadd W.root W{enu,munu,taunu}_{cl,hf,hpt,l}.root
hadd Z.root Z{ee,mumu,tautau,nunu}_{cl,hf,hpt,l}.root
rm W{enu,munu,taunu}_{cl,hf,hpt,l}.root
rm Z{ee,mumu,tautau,nunu}_{cl,hf,hpt,l}.root
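Since hadd and rm act on the brace-expanded file names, it can help to list those names first and check that none are missing. The following sketch reproduces the expansion in Python; the helper names are hypothetical and nothing here is part of the project scripts.

```python
from itertools import product


def split_files(prefix, decays, flavours=("cl", "hf", "hpt", "l")):
    """List the split file names, e.g. Wenu_cl.root, in brace-expansion order."""
    return ["{}{}_{}.root".format(prefix, decay, flavour)
            for decay, flavour in product(decays, flavours)]


def missing_files(expected, present):
    """Return the expected file names that are absent from `present`."""
    present = set(present)
    return [name for name in expected if name not in present]


w_files = split_files("W", ["enu", "munu", "taunu"])
z_files = split_files("Z", ["ee", "mumu", "tautau", "nunu"])
print(len(w_files), len(z_files))
```

In a real directory, missing_files(w_files, os.listdir(".")) would report which W+jets inputs are absent before merging.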
It often happens that some data runs are missing (sorry ATLAS operations). In this case, the integrated luminosity of the data at hand must be determined. A convenient way to do this is to run the script https://gitlab.cern.ch/atlas-mpp-xampp/XAMPPplotting/blob/master/python/CheckMetaData.py with the --lumi option. The result must then be passed to this script via the -l option.
**What next?**
In case some systematics still need to be renamed, an example is provided in XAMPPmonoH/utils/formatSystematicHistos.C.
Currently the fitting code is developed here:
https://gitlab.cern.ch/atlas-phys/exot/jdm/ANA-EXOT-2018-46/fitting-and-limits/WSMaker_MonoH
Bookkeeping¶
The scripts in XAMPPmonoH/python/bookkeeping are designed to simplify the bookkeeping of
- sample lists: these lists are stored in XAMPPmonoH/data/SampleLists and contain the DxAODs used for ntuple creation
- ntuple lists: these lists are circulated to the analysis team once a new ntuple production is finished and contain the XAMPP ntuples required for analysis with XAMPPplotting
Create new MC sample list¶
This script works both for data and MC and creates sample lists for both AODs and all derivations that are defined in the configuration files:
- XAMPPmonoH/data/samplelist.txt: list of all DSIDs and sample names, with the option to put a veto (removes the sample from the DAOD lists) and the option to limit a sample to a specific derivation (if multiple derivations are used)
- XAMPPmonoH/data/samplelist_config.json: configuration file with settings for sample list creation, e.g. output folder, good run lists, all derivations for which sample lists should be created, rtags, MC campaign, signal keywords, blacklisted samples, and fastsim / fullsim switches
usage: CreateSampleList [-h] [-c CONFIG] [--appyVetoForAOD] [--skipCheckGRL]
[--noDataLists] [--pTags PTAGS [PTAGS ...]]
Named Arguments¶
-c, --config | Path to configuration file Default: “/xampp/build/x86_64-slc6-gcc62-opt/data/XAMPPmonoH/samplelist_config.json” |
--appyVetoForAOD | Skip samples with veto flag on the samplelist.txt also for AOD lists. Default: False |
--skipCheckGRL | Skip check if the GRLs in the config file are up to date. Default: False |
--noDataLists | Don’t update data sample lists, only update MC data lists (e.g. if you have defined period containers and don’t want those overwritten). Default: False |
--pTags | Option to choose specific p-tags (type e.g. --pTags p3840 p3841 to get the MC samples with p3840 and the data samples with p3841). This is useful when more than one derivation version is available. |
Check sample list for completeness¶
This script can be used to check a sample list for completeness.
A sample list containing data is checked against the good run list (GRL) for completeness.
A sample list containing MC is checked against the content of XAMPPmonoH/data/samplelist.txt for completeness.
This script also checks whether you have duplicate DSIDs in your sample list (i.e. the list with DxAOD names that are used for ntuple production). The file names of the sample list must follow this format: “data[15,16,17]_[derivationname].txt” or “mc16[a,d,e]_[signals,bkgs]_[derivationname].txt”. Completeness is checked against the GRL for data, or against the content of XAMPPmonoH/data/samplelist.txt for MC.
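The required file name format can be checked up front with a regular expression. The sketch below is one possible reading of the two formats quoted above, assuming that the derivation name contains no underscore; the function parse_sample_list_name and the example file names are hypothetical.

```python
import re

# One alternative per documented format:
#   data[15,16,17]_[derivationname].txt
#   mc16[a,d,e]_[signals,bkgs]_[derivationname].txt
SAMPLE_LIST_NAME = re.compile(
    r"^(?:data(?P<year>15|16|17)"
    r"|mc16(?P<campaign>[ade])_(?P<kind>signals|bkgs))"
    r"_(?P<derivation>[^_]+)\.txt$"
)


def parse_sample_list_name(file_name):
    """Return the parsed name parts, or None if the name does not fit the format."""
    match = SAMPLE_LIST_NAME.match(file_name)
    return match.groupdict() if match else None


for name in ["data17_EXOT27.txt", "mc16a_bkgs_HIGG5D1.txt", "mc16a_HIGG5D1.txt"]:
    print(name, parse_sample_list_name(name))
```

The third name is rejected because it lacks the signals/bkgs part.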
usage: CheckSampleList [-h] [--GRL GRL] [--derivation DERIVATION] inputFile
Positional Arguments¶
inputFile | your input file with data/MC lists |
Named Arguments¶
--GRL | your input GRL Default: “” |
--derivation | your input derivation (choose HIGG or EXOT), leave blank for automatic detection Default: “” |
Check XAMPP ntuple list for completeness¶
This script can be used to check if the content of a filelist with XAMPP ntuples is complete with respect to a set of sample lists (e.g. those sample lists that were used to create the ntuples during production).
This script checks whether a text file containing a list of XAMPP ntuples (i.e. the output of XAMPPmonoH) contains all samples specified in a set of sample lists with the names of the DxAODs that went into the ntuple production. Typical use case: The analysis team has just finished an ntuple production on the grid and compiled a text file containing all ntuples for download using “rucio download --ndownloader 5 `cat textfile.txt`”. Before sending the list to the team, you want to use this script to check whether this text file is complete with respect to the sample lists containing the DxAODs that went into the ntuple production.
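The idea behind such a completeness check can be sketched as a set difference of DSIDs. This is an illustration only: it assumes that each MC dataset name carries a six-digit DSID enclosed by dots, the names below are made up, and the real script may compare the lists differently.

```python
import re

# Assumption: the DSID is a six-digit field enclosed by dots in the dataset name.
DSID = re.compile(r"\.(\d{6})\.")


def dsids(names):
    """Collect the DSIDs found in a list of dataset names."""
    found = set()
    for name in names:
        match = DSID.search(name)
        if match:
            found.add(match.group(1))
    return found


def missing_dsids(sample_list, ntuple_list):
    """Return the DSIDs present in the sample list but absent from the ntuple list."""
    return sorted(dsids(sample_list) - dsids(ntuple_list))


# Hypothetical names, for illustration only.
sample_list = [
    "mc16_13TeV.364100.SampleA.deriv.DAOD_EXOT27.e1_s1_r1_p1",
    "mc16_13TeV.364101.SampleB.deriv.DAOD_EXOT27.e1_s1_r1_p1",
]
ntuple_list = ["group.phys-exotics.mc16_13TeV.364100.0L0604a_XAMPP"]

print(missing_dsids(sample_list, ntuple_list))
```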
usage: CheckNtupleList [-h] [--sampleLists [SAMPLELISTS [SAMPLELISTS ...]]]
inputFile
Positional Arguments¶
inputFile | your input file listing XAMPP ntuples |
Named Arguments¶
--sampleLists | sample lists with DxAODs used as a reference to check the ntuple file list for completeness (be sure that those are complete!) Default: [] |
Systematics¶
Some scripts exist that plot the systematic up/down variations in support-note-plot quality and create tables listing the magnitude of the up/down variations:
https://gitlab.cern.ch/atlas-mpp-xampp/XAMPPmonoH/tree/master/XAMPPmonoH/python/studysystematics
Other¶
Update DSID proc list¶
The background modeling uncertainties are assigned based on a process list that matches DSIDs to processes: https://gitlab.cern.ch/atlas-mpp-xampp/XAMPPmonoH/blob/master/XAMPPmonoH/data/mc16_dsid_proc.txt
If a sample is not included in this list, the framework will crash when processing this sample.
To simplify the task of updating this list, a very basic script that extracts the samples and DSIDs from the sample lists exists:
https://gitlab.cern.ch/atlas-mpp-xampp/XAMPPmonoH/blob/master/XAMPPmonoH/python/assignDSIDtoProc.py
Signal significance studies¶
There are some scripts that can be used to study the signal significance: