AGC Analysis Task Versions
The table below gives a brief overview of the AGC versions. Each version corresponds to a slightly altered analysis task.
Version | Datasets | Cuts | Systematics | Machine Learning |
---|---|---|---|---|
0 | CMS 2015 Open Data (POET) | Exactly one lepton with \(p_T>25\) GeV; at least four jets with \(p_T>25\) GeV; at least one jet with \(b\)-tag > 0.5 | \(t\bar{t}\) sample variations, | None |
1 | CMS 2015 Open Data (NanoAOD) | Exactly one lepton with \(p_T>25\) GeV; at least four jets with \(p_T>25\) GeV; at least one jet with \(b\)-tag > 0.5 | \(t\bar{t}\) sample variations, | None |
2 (WIP) | CMS 2015 Open Data (NanoAOD) | Exactly one lepton with \(p_T>30\) GeV; at least four jets with \(p_T>30\) GeV; at least one jet with \(b\)-tag > 0.5 (see Cuts for additional cuts) | \(t\bar{t}\) sample variations, | BDT to predict jet-parton assignment in \(t\bar{t}\) events |
Reference Implementation Versions
This section is specific to the implementation in the main repository.
The table below gives a brief overview of the tags of the reference implementation, including implementation-specific minor versions and patches. Major versions (0, 1, and 2) correspond to differences in the analysis task (described above), while minor versions are reserved for individual implementations to assign for small changes and patches. The reference implementation for each major task version (0, 1, 2) is always the latest tag within that series.
Tag | Version | Available Pipelines | Systematics | Dependency Management |
---|---|---|---|---|
0.1.0 | 0 | Pure | Systematic variations within | Functions used in |
0.2.0 | 0 | Pure | Systematic variations within | Functions used in |
1.0.0 | 1 | Pure | Systematic variations within | Functions used in |
1.1.0 | 1 | Pure | Systematic variations within | Functions used in |
2.0.0 (WIP) | 2 | Pure | Systematic variations within | Modules are shipped to |
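As an illustration of the versioning rule described above, the following minimal sketch picks the latest tag in each major series. The tag list is copied from the table; the helper names `parse_tag` and `latest_tag_per_task` are hypothetical and not part of the reference implementation.

```python
from collections import defaultdict

# Tags listed in the table above; the "(WIP)" marker is dropped for parsing.
TAGS = ["0.1.0", "0.2.0", "1.0.0", "1.1.0", "2.0.0"]


def parse_tag(tag):
    """Split a 'major.minor.patch' tag into a tuple of integers."""
    major, minor, patch = (int(part) for part in tag.split("."))
    return major, minor, patch


def latest_tag_per_task(tags):
    """Map each analysis task version (the major number) to its newest tag."""
    by_major = defaultdict(list)
    for tag in tags:
        by_major[parse_tag(tag)[0]].append(tag)
    # Compare numerically rather than as strings, so 1.10.0 sorts after 1.9.0.
    return {major: max(group, key=parse_tag)
            for major, group in sorted(by_major.items())}


if __name__ == "__main__":
    print(latest_tag_per_task(TAGS))
```

With the tags above, task versions 0, 1, and 2 map to `0.2.0`, `1.1.0`, and `2.0.0` respectively.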
Datasets
The datasets used for the CMS \(t\bar{t}\) notebook are from the 2015 CMS Open Data release. Versions 0.1.0 and 0.2.0 use ntuples generated using the Physics Objects Extractor Tool (POET).
All versions >=1.0.0 use NanoAOD instead. The NanoAOD files were generated from the 2015 CMS Open Data release using this pull request of CMSSW: cms-sw/cmssw#39040. To set this up, run the following commands:
source /cvmfs/cms.cern.ch/cmsset_default.sh
scram list CMSSW_10_6_
scram project CMSSW_10_6_30
cd CMSSW_10_6_30/
cmsenv
cd src/
git cms-merge-topic 39040
ls -al
scram build -j5
From this point, for data, you can use:
cmsDriver.py --python_filename doublemuon_cfg.py --eventcontent NANOAOD --customise Configuration/DataProcessing/Utils.addMonitoring --datatier NANOAOD --fileout file:doublemuon_nanoaod.root --conditions 106X_dataRun2_v36 --step NANO --filein file:doublemuon_miniaod.root --era Run2_25ns,run2_nanoAOD_106X2015 --no_exec --data -n -1
For MC, you can use:
cmsDriver.py --python_filename nanoaod15_cfg.py --eventcontent NANOAODSIM --customise Configuration/DataProcessing/Utils.addMonitoring --datatier NANOAODSIM --fileout file:nanoaod15.root --conditions 102X_mcRun2_asymptotic_v8 --step NANO --filein file:miniaod2015.root --era Run2_25ns,run2_nanoAOD_106X2015 --no_exec --mc -n -1
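The two commands share the `--customise`, `--step`, `--era`, `--no_exec`, and `-n` options. They differ in event content, data tier, conditions, the `--data`/`--mc` flag, and the file names. The sketch below makes that split explicit by assembling either command from a shared part and a per-type part. It is only a convenience wrapper, assuming that option order does not matter to `cmsDriver.py`; the function name `cmsdriver_command` is hypothetical.

```python
import shlex

# Options shared by the data and MC commands quoted above.
COMMON = [
    "--customise", "Configuration/DataProcessing/Utils.addMonitoring",
    "--step", "NANO",
    "--era", "Run2_25ns,run2_nanoAOD_106X2015",
    "--no_exec",
    "-n", "-1",
]

# Options that differ between the two commands, copied from the text.
PER_TYPE = {
    "data": {"tier": "NANOAOD", "conditions": "106X_dataRun2_v36", "flag": "--data"},
    "mc": {"tier": "NANOAODSIM", "conditions": "102X_mcRun2_asymptotic_v8", "flag": "--mc"},
}


def cmsdriver_command(kind, cfg_name, infile, outfile):
    """Return the cmsDriver.py argument list for kind 'data' or 'mc'."""
    opts = PER_TYPE[kind]
    return [
        "cmsDriver.py",
        "--python_filename", cfg_name,
        "--eventcontent", opts["tier"],
        "--datatier", opts["tier"],
        "--fileout", f"file:{outfile}",
        "--conditions", opts["conditions"],
        "--filein", f"file:{infile}",
        *COMMON,
        opts["flag"],
    ]


if __name__ == "__main__":
    cmd = cmsdriver_command("mc", "nanoaod15_cfg.py",
                            "miniaod2015.root", "nanoaod15.root")
    # Print a copy-pastable shell command instead of executing it.
    print(shlex.join(cmd))
```

The script only prints the command, so it can be inspected before being run inside the CMSSW environment set up above.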
The code used to generate and subsequently merge these files is located in the following repository: ekauffma/produce-nanoAODs
The datasets used are the same regardless of format (MiniAOD or NanoAOD). The list of datasets, separated by process, is included below:
ttbar:
  nominal:
  scale variation:
  ME variation:
  PS variation:
    19999: Powheg + Herwig++, 443 files, 810 GB -> converted
single top:
W+jets:
data:
More information about datasets can be found in analysis-grand-challenge/datasets/cms-open-data-2015/.
Cuts
For versions 0.1.0, 0.2.0, and 1.0.0, the cuts used are the following:
Leptons (electrons and muons) must have \(p_T>25\) GeV
Events must contain exactly one lepton
Jets must have \(p_T>25\) GeV
Events must have at least four jets
Jets are considered \(b\)-tagged if they have a \(b\)-tag score over B_TAG_THRESHOLD=0.5.
Events must have at least one \(b\)-tagged jet
4j1b Region: Events must have exactly one \(b\)-tagged jet
4j2b Region: Events must have two or more \(b\)-tagged jets
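The selection listed above can be sketched per event in plain Python. This is a toy illustration, not the reference implementation (which operates on columnar data). The function name `select_region` and the input layout are hypothetical, and the sketch assumes that \(b\)-tagged jets are counted only among jets passing the \(p_T\) cut.

```python
B_TAG_THRESHOLD = 0.5
PT_MIN = 25.0  # GeV, applied to both leptons and jets


def select_region(lepton_pts, jets):
    """Return '4j1b', '4j2b', or None (event rejected) for a single event.

    lepton_pts: list of lepton pT values in GeV (electrons and muons together)
    jets: list of (pT in GeV, b-tag score) tuples
    """
    good_leptons = [pt for pt in lepton_pts if pt > PT_MIN]
    good_jets = [(pt, btag) for pt, btag in jets if pt > PT_MIN]
    # Assumption: b-tagged jets are counted among the pT-selected jets only.
    n_btagged = sum(1 for _, btag in good_jets if btag > B_TAG_THRESHOLD)

    if len(good_leptons) != 1 or len(good_jets) < 4 or n_btagged < 1:
        return None
    return "4j1b" if n_btagged == 1 else "4j2b"


if __name__ == "__main__":
    jets = [(60.0, 0.9), (45.0, 0.1), (40.0, 0.7), (30.0, 0.2)]
    print(select_region([35.0], jets))
```

In the example, one lepton and four jets pass the \(p_T\) cut and two jets exceed the \(b\)-tag threshold, so the event falls into the 4j2b region.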
In subsequent versions, the selection is modified to better reflect common practices in CMS, using the following cuts:
Leptons (electrons and muons) must have \(p_T>30\) GeV, \(|\eta|<2.1\), and sip3d<4 (significance of 3d impact parameter)
For electrons, we also require cutBased==4 (tight)
For muons, we also require tightId and pfRelIso04_all<0.15 (PF relative isolation dR=0.4, total (deltaBeta corrections))
Events must contain exactly one lepton
Jets must have \(p_T>30\) GeV and \(|\eta|<2.4\) as well as satisfy isTightLeptonVeto
Events must have at least four jets
Jets are considered \(b\)-tagged if they have a \(b\)-tag score over B_TAG_THRESHOLD=0.5.
Events must have at least one \(b\)-tagged jet
4j1b Region: Events must have exactly one \(b\)-tagged jet
4j2b Region: Events must have two or more \(b\)-tagged jets
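The updated selection adds object-level quality requirements before the event-level counting. The sketch below is a per-event toy version in plain Python, not the reference implementation. Objects are represented as dictionaries whose keys mirror the variable names in the list above; the key `btag`, the boolean `isTightLeptonVeto` field, and all function names are illustrative assumptions. It also assumes that jets are required to be central (\(|\eta|<2.4\)).

```python
B_TAG_THRESHOLD = 0.5


def good_electron(el):
    """pT > 30 GeV, |eta| < 2.1, sip3d < 4, tight cut-based ID."""
    return bool(el["pt"] > 30 and abs(el["eta"]) < 2.1
                and el["sip3d"] < 4 and el["cutBased"] == 4)


def good_muon(mu):
    """pT > 30 GeV, |eta| < 2.1, sip3d < 4, tight ID, relative isolation < 0.15."""
    return bool(mu["pt"] > 30 and abs(mu["eta"]) < 2.1 and mu["sip3d"] < 4
                and mu["tightId"] and mu["pfRelIso04_all"] < 0.15)


def good_jet(jet):
    """pT > 30 GeV, central, passing the tight lepton-veto jet ID."""
    # Assumption: jets are required to be central (|eta| < 2.4).
    return bool(jet["pt"] > 30 and abs(jet["eta"]) < 2.4
                and jet["isTightLeptonVeto"])


def select_region(electrons, muons, jets):
    """Return '4j1b', '4j2b', or None (event rejected) for a single event."""
    n_leptons = sum(map(good_electron, electrons)) + sum(map(good_muon, muons))
    selected = [jet for jet in jets if good_jet(jet)]
    n_btagged = sum(1 for jet in selected if jet["btag"] > B_TAG_THRESHOLD)

    if n_leptons != 1 or len(selected) < 4 or n_btagged < 1:
        return None
    return "4j1b" if n_btagged == 1 else "4j2b"


if __name__ == "__main__":
    muon = {"pt": 40.0, "eta": 0.5, "sip3d": 1.0,
            "tightId": True, "pfRelIso04_all": 0.05}
    jets = [{"pt": 50.0, "eta": 1.0, "isTightLeptonVeto": True, "btag": score}
            for score in (0.9, 0.1, 0.2, 0.3)]
    print(select_region([], [muon], jets))
```

Keeping the object-level predicates separate from the event-level counting makes it straightforward to change a single threshold without touching the region definitions.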