# Accessing refitted PVs in Turbo data

Looking into [Michael's refitted PV problem][email].

First thing is to reproduce. Find a 2016 LFN (the `dst-dump` is in a private repo):

```
/lhcb/LHCb/Collision16/CHARMMULTIBODY.MDST/00076512/0003/00076512_00031113_1.charmmultibody.mdst
```

Then write the minimal DaVinci options.

```python
from Configurables import DaVinci

DaVinci().InputType = 'MDST'
DaVinci().RootInTES = '/Event/Turbo'
DaVinci().Turbo = True
DaVinci().DataType = '2016'
```

Interactively explore.

```
$ lb-run DaVinci/v44r10p1 ipython -i ~/public/run_all.py davinci.py root://bohr3226.tier2.hep.manchester.ac.uk:1094//dpm/tier2.hep.manchester.ac.uk/home/lhcb/lhcb/LHCb/Collision16/CHARMMULTIBODY.MDST/00076512/0003/00076512_00031113_1.charmmultibody.mdst
```

I like to check the paths known to the link manager of the packed Turbo locations, as these will show unpacked locations regardless of whether the unpacking is working or not (as it's often the culprit). Example:

```python
In [1]: TES['/Event/Turbo/pPhys/Vertices'].linkMgr().link(0).path()
Out[1]: '/Event/Charmmultibody/Turbo/Hlt2CharmHadXiccp2D0Pp_D02KmPipTurbo/decayVertices'
```

The vertices in 2016 are packed under `/Event/Turbo/pPhys/RecVertices`, so we can scan events and see what locations we have.

```python
In [2]: def link_manager_paths(data_object):
   ...:     if not data_object:
   ...:         return []
   ...:     mgr = data_object.linkMgr()
   ...:     nlinks = mgr.size()
   ...:     return [mgr.link(i).path() for i in range(nlinks)]

In [4]: vertices = set()

In [5]: recvertices = set()

In [6]: for _ in range(30000):
   ...:     vertices.update(link_manager_paths(TES['/Event/Turbo/pPhys/Vertices']))
   ...:     recvertices.update(link_manager_paths(TES['/Event/Turbo/pPhys/RecVertices']))
   ...:     appMgr.run(1)
   ...:

In [7]: recvertices
Out[7]:
{'/Event/Turbo/Hlt2CharmHadD02KmPipTurbo/_ReFitPVs',
 '/Event/Turbo/Primary'}
```

This confirms Michael's findings: the only refitted PVs being packed in HLT2 are those belonging to the `D02KmPip` line.

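
To turn a set of link manager paths like the one above into a list of lines that own refitted PVs, a small helper is enough. This is only a sketch: the name `lines_with_refitted_pvs` is made up here, and it assumes every refitted PV container sits at `<prefix>/<line name>/_ReFitPVs`, which is the layout seen in `Out[7]`.

```python
def lines_with_refitted_pvs(paths, suffix='/_ReFitPVs'):
    """Return the sorted HLT2 line names that own a refitted PV container.

    Assumes each container path looks like '<prefix>/<line name>/_ReFitPVs',
    as in the link manager output above.
    """
    lines = set()
    for path in paths:
        if path.endswith(suffix):
            # The line name is the path component just before the suffix
            lines.add(path[:-len(suffix)].rsplit('/', 1)[-1])
    return sorted(lines)


# The set of locations found in the event scan above
recvertices = {
    '/Event/Turbo/Hlt2CharmHadD02KmPipTurbo/_ReFitPVs',
    '/Event/Turbo/Primary',
}
refitted_lines = lines_with_refitted_pvs(recvertices)
```

Here `refitted_lines` contains only `Hlt2CharmHadD02KmPipTurbo`, which is the single line Michael found.
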
So, something's wrong! There are two possible causes for this symptom:

1. The refitted PVs are being created in HLT2, but not persisted.
2. The refitted PVs are not being created in HLT2, so there's nothing to persist.

We can check whether PV refitting was enabled using TCKsh.

```
$ lb-run Moore/latest iTCKsh
```

Pick a 2016 TCK, then probe the associated algorithms' `ReFitPVs` property:

```python
In [1]: tcks_16 = [int(k, 16) for k, v in getTCKs() if k.startswith('0x2') and k[-4:-2] == '16' and 'Physics' in v]

In [2]: getProperties(tcks_16[0], '.*Xic0.*LTUNB.*', 'ReFitPVs')
Out[2]:
{'Hlt2CharmHadXic0ToPpKmKmPip_LTUNBTurboXic0ToPpKmKmPip_LTUNBTurboCombiner': {'ReFitPVs': 'False'},
 'Hlt2CharmHadXic0ToPpKmKmPip_LTUNBTurboXic0ToPpKmKmPip_LTUNBTurboFilter': {'ReFitPVs': 'True'},
 'Hlt2CharmHadXic0ToPpKmKmPip_LTUNBTurboXic0ToPpKmKmPip_LTUNBTurboTisTosTagger': {'ReFitPVs': 'False'}}
```

Indeed, the refitting flag is set on an algorithm used in the line, so why don't we see the PVs? What cuts is the algorithm applying (guessing it's a `FilterDesktop` as it has `Filter` in the name)?

```
In [3]: getProperties(tcks_16[0], '.*Hlt2CharmHadXic0ToPpKmKmPip_LTUNBTurboXic0ToPpKmKmPip_LTUNBTurboFilter', 'Code'
   ...: )
Out[3]: {'Hlt2CharmHadXic0ToPpKmKmPip_LTUNBTurboXic0ToPpKmKmPip_LTUNBTurboFilter': {'Code': 'in_range( 2396.0 , M , 2770.0 )'}}
```

Aha. It's not applying any selection that would request the associated PV.

Algorithms like `FilterDesktop`, used here, and `CombineParticles` (that inherit from `DVAlgorithm`) compute particle-to-PV relations only when the relations are requested. PVs are refitted at the point at which the relations are requested. So no refitted PVs are created unless something in the algorithm triggers the relations to be computed.

So, Michael's problem is real, and the refitted PVs are lost. This is 'expected behaviour', but rather counterintuitive.

Are other lines affected? Most PV-related cuts contain `BPV`.
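
The lazy behaviour described above can be illustrated with a toy model. None of this is the real LHCb API: `ToyDVAlgorithm`, `best_pv`, `mass_cut` and `bpv_cut` are hypothetical names, and the sketch only mimics the idea that the refit happens when a particle-to-PV relation is first requested.

```python
class ToyDVAlgorithm(object):
    """Toy stand-in for a DVAlgorithm-like class; not the real LHCb API."""

    def __init__(self, refit_pvs):
        self.refit_pvs = refit_pvs
        self.refitted_pvs = []  # what would end up persisted as _ReFitPVs
        self._relations = {}    # particle -> best PV, filled lazily

    def best_pv(self, particle):
        """Compute the particle-to-PV relation on first request only."""
        if particle not in self._relations:
            pv = 'PV'  # placeholder for the chosen primary vertex
            if self.refit_pvs:
                # The refit only happens here, when a relation is requested
                pv = 'refitted PV for {0}'.format(particle)
                self.refitted_pvs.append(pv)
            self._relations[particle] = pv
        return self._relations[particle]


def mass_cut(alg, particle):
    """Like in_range(lo, M, hi): never asks for the best PV."""
    return True


def bpv_cut(alg, particle):
    """Like a BPV* functor: needs the best PV, so it requests the relation."""
    return alg.best_pv(particle) is not None


# Both algorithms have the refit flag set, but only one requests a relation
mass_only = ToyDVAlgorithm(refit_pvs=True)
mass_cut(mass_only, 'Xic0')

with_bpv = ToyDVAlgorithm(refit_pvs=True)
bpv_cut(with_bpv, 'D0')
```

After running this, `mass_only.refitted_pvs` is still empty while `with_bpv.refitted_pvs` holds one entry, which is the same pattern as the mass-only `Filter` above producing no `_ReFitPVs` container.
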
We can get a rough idea of which lines are using `ReFitPVs` 'correctly'.

```
In [4]: info = getProperties(tcks_16[0], '.*', 'ReFitPVs|Code|MotherCut')

In [5]: for alg_name, props in info.items():
   ...:     if props.get('ReFitPVs', False) == 'True':
   ...:         code = props.get('Code', '')
   ...:         mother_cut = props.get('MotherCut', '')
   ...:         cut = code or mother_cut
   ...:         if 'BPV' in cut:
   ...:             print alg_name, cut
Hlt2CharmHadD02HH_D0ToKmPipCombiner : (VFASPF(VCHI2PDOF) < 10.0)& (BPVVDCHI2> 25.0 )& (BPVDIRA > lcldira )
Hlt2CharmHadD02HH_D0ToPimPipCombiner : (VFASPF(VCHI2PDOF) < 10.0)& (BPVVDCHI2> 25.0 )& (BPVDIRA > lcldira )
Hlt2CharmHadD02HH_D0ToKmKpCombiner : (VFASPF(VCHI2PDOF) < 10.0)& (BPVVDCHI2> 25.0 )& (BPVDIRA > lcldira )
Hlt2CharmHadD02HH_D0ToKmKp_LTUNBCombiner : (VFASPF(VCHI2PDOF) < 10.0)& (BPVDIRA > lcldira )& (BPVLTIME() > 0.00025 )
Hlt2CharmHadD02HH_D0ToKmPip_LTUNBCombiner : (VFASPF(VCHI2PDOF) < 10.0)& (BPVDIRA > lcldira )& (BPVLTIME() > 0.00025 )
Hlt2CharmHadD02HH_D0ToPimPip_LTUNBCombiner : (VFASPF(VCHI2PDOF) < 10.0)& (BPVDIRA > lcldira )& (BPVLTIME() > 0.00025 )
```

There's the combiner used by the `D02KmPip` line, for which we saw refitted PVs earlier.

Inverting the `'BPV' in cut` conditional will give us the lines which are probably affected.

```
In [6]: for alg_name, props in info.items():
   ...:     if props.get('ReFitPVs', False) == 'True':
   ...:         code = props.get('Code', '')
   ...:         mother_cut = props.get('MotherCut', '')
   ...:         cut = code or mother_cut
   ...:         if 'BPV' not in cut:
   ...:             print '{:<80}: {}'.format(alg_name, cut)
   ...:
   ...:
Hlt2CharmHadDspToPimPipPipTurboDs2HHHFilter : in_range( 1889.0 , M , 2049.0 )
Hlt2CharmHadDstp2D0Pip_D02KmKp_LTUNBTurboD0_TAG_CPVCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadXic0ToPpKmKmPipTurboXic0ToPpKmKmPipTurboFilter : in_range( 2396.0 , M , 2770.0 )
Hlt2CharmHadDstp2D0Pip_D02KmPip_LTUNBTurboD0_TAG_CPVCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadDstp2D0Pip_D02KpPimCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadXic2HHH_XicpToKmPpPipMassFilter : in_range( 2392.0 , M , 2543.0 )
Hlt2CharmHadDstp2D0Pip_D02KS0KS0_KS0DDTurboD02V0V0_TAG_CPVCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadLcpToPpKpPimTurboLc2HHHFilter : in_range( 2211.0 , M , 2362.0 )
Hlt2CharmHadDpToKmKpPipTurboDpm2HHHFilter : in_range( 1789.0 , M , 1949.0 )
Hlt2CharmHadDpToKmKpKpTurboDpm2HHHFilter : in_range( 1789.0 , M , 1949.0 )
Hlt2CharmHadDspToKpKpPimTurboDs2HHHFilter : in_range( 1889.0 , M , 2049.0 )
Hlt2CharmHadLcpToPpPimPipTurboLc2HHHFilter : in_range( 2211.0 , M , 2362.0 )
Hlt2CharmHadDstp2D0Pip_D02PimPip_LTUNBTurboD0_TAG_CPVCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadDspToKmKpPipMassFilter : in_range( 1889.0 , M , 2049.0 )
Hlt2CharmHadD0_TAG_CPV_Dstp2D0Pip_D02KmPipCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadDstp2D0Pip_D02KS0KS0_KS0LL_KS0DDTurboD02V0V0_TAG_CPVCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadXic0ToPpKmKmPip_LTUNBTurboXic0ToPpKmKmPip_LTUNBTurboFilter : in_range( 2396.0 , M , 2770.0 )
Hlt2CharmHadDspToKmKpKpTurboDs2HHHFilter : in_range( 1889.0 , M , 2049.0 )
Hlt2CharmHadLc2HHH_LcpToKmPpPipMassFilter : in_range( 2211.0 , M , 2362.0 )
Hlt2CharmHadDpToKpKpPimTurboDpm2HHHFilter : in_range( 1789.0 , M , 1949.0 )
Hlt2CharmHadDpToPimPipPipTurboDpm2HHHFilter : in_range( 1789.0 , M , 1949.0 )
Hlt2CharmHadDstp2D0Pip_D02KpPim_LTUNBTurboD0_TAG_CPVCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadDpToKmPipPip_ForKPiAsymTurboDpm2KPiPi_ForKPiAsymFilter : in_range( 1819.0 , M , 1919.0 )
Hlt2CharmHadDspToKmPipPipTurboDs2HHHFilter : in_range( 1889.0 , M , 2049.0 )
Hlt2CharmHadDstp2D0Pip_D02PimPipCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadDspToKpPimPipTurboDs2HHHFilter : in_range( 1889.0 , M , 2049.0 )
Hlt2CharmHadDpToKmPipPipMassFilter : in_range( 1789.0 , M , 1949.0 )
Hlt2CharmHadDspToKmKpPip_LTUNBTurboDs2HHH_LTUNBFilter : in_range( 1889.0 , M , 2049.0 )
Hlt2CharmHadDpToKmPipPip_LTUNBTurboDpm2HHH_LTUNBFilter : in_range( 1789.0 , M , 1949.0 )
Hlt2CharmHadDpToKpPimPipTurboDpm2HHHFilter : in_range( 1789.0 , M , 1949.0 )
Hlt2CharmHadDstp2D0Pip_D02KS0KS0_KS0LLTurboD02V0V0_TAG_CPVCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadDstp2D0Pip_D02KmKpCombiner : (VFASPF(VCHI2PDOF) < 25.0)& in_range( -9.57018, (M - M1 - M2), 20.42982 )
Hlt2CharmHadLcpToPpKmPip_LTUNBTurboLc2HHH_LTUNBFilter : in_range( 2211.0 , M , 2543.0 )
Hlt2CharmHadLcpToPpKmKpTurboLc2HHHFilter : in_range( 2211.0 , M , 2362.0 )
```

The only functors I see are `M`, `M1`, `M2`, and `VFASPF(VCHI2PDOF)`. None of these will trigger an access to the P2PV relations, and so won't trigger a PV refit.

Lessons for the future:

* Have line authors understand what `ReFitPVs` does, and try to spot mis-use in MRs[^1].
* Even better: rework algorithms to behave in a more obvious way when the `ReFitPVs` flag is set.
* Analysts should check the data thoroughly as soon as it's available.

This 'feature' has been around since at least 2016, probably throughout the whole of Run 2.

[email]: https://groups.cern.ch/group/lhcb-davinci/Lists/Archive/Flat.aspx?RootFolder=%2Fgroup%2Flhcb%2Ddavinci%2FLists%2FArchive%2FCan%27t%20find%20refitted%20PVs%20on%20Turbo%20mdsts&FolderCTID=0x01200200B795ECDBD8943F4E90E5F02A017196C9

[^1]: One hack is to prepend/append `BPVVALID` to every `Code` or `MotherCut` string, which will trigger the refitting (assuming every line does need at least one PV).
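
The hack from the footnote could be scripted roughly as follows. This is a sketch under assumptions: the helper names `force_pv_relation` and `patch_cuts` are made up, the functor is spelled `BPVVALID` exactly as in the footnote (the spelling needed in a real configuration is not checked here), and the `info` dictionary is a hand-written stand-in for what `getProperties` returned above, with the combiner's cut assumed to live in `MotherCut`.

```python
def force_pv_relation(cut, guard='BPVVALID'):
    """Prepend a PV-requiring functor to a LoKi cut string.

    `guard` is copied from the footnote; its exact spelling in a real
    configuration is an assumption.
    """
    if not cut:
        return guard
    return '{0} & ({1})'.format(guard, cut)


def patch_cuts(info):
    """Return patched cuts for algorithms that set ReFitPVs but use no BPV functor."""
    patched = {}
    for alg_name, props in info.items():
        if props.get('ReFitPVs') != 'True':
            continue
        cut = props.get('Code') or props.get('MotherCut') or ''
        if 'BPV' not in cut:
            patched[alg_name] = force_pv_relation(cut)
    return patched


# Two entries copied from the TCK dumps above (property names assumed)
info = {
    'Hlt2CharmHadLcpToPpKmKpTurboLc2HHHFilter': {
        'ReFitPVs': 'True',
        'Code': 'in_range( 2211.0 , M , 2362.0 )',
    },
    'Hlt2CharmHadD02HH_D0ToKmPipCombiner': {
        'ReFitPVs': 'True',
        'MotherCut': '(VFASPF(VCHI2PDOF) < 10.0)& (BPVVDCHI2> 25.0 )& (BPVDIRA > lcldira )',
    },
}
patched = patch_cuts(info)
```

Only the mass-only filter is rewritten; the combiner already uses `BPV` functors and is left alone. Wrapping the original cut in parentheses keeps its meaning unchanged whatever operators it contains.
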