HLT crashes when export MALLOC_CONF=junk:true is set. #44956

Closed
VinInn opened this issue May 12, 2024 · 30 comments · Fixed by #45209

Comments

@VinInn
Contributor

VinInn commented May 12, 2024

As reported in #44940 (comment)

reproduced on lxplus8-gpu.cern.ch.

use the script in #44940 and just add
export MALLOC_CONF=junk:true

one will get a long list of

%MSG
%MSG-e EcalRecHitError:  EcalRecHitProducer:hltEcalRecHit  12-May-2024 11:39:20 CEST Run: 380466 Event: 490512497
No intercalib const found for xtal 2779096485! something wrong with EcalIntercalibConstants in your DB?
%MSG
%MSG-e EcalLaserDbService:  EcalRecHitProducer:hltEcalRecHit  12-May-2024 11:39:20 CEST Run: 380466 Event: 490512497
 DetId is NOT in ECAL

and then a segfault

@cmsbuild
Contributor

cmsbuild commented May 12, 2024

cms-bot internal usage

@cmsbuild
Contributor

A new Issue was created by @VinInn.

@makortel, @Dr15Jones, @smuzaffar, @antoniovilela, @rappoccio, @sextonkennedy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@VinInn
Contributor Author

VinInn commented May 12, 2024

Does not seem to happen in MC Relvals

@VinInn
Contributor Author

VinInn commented May 12, 2024

running "single thread" this is the stack-trace

Thread 12 (Thread 0x7fe7d1fff700 (LWP 3848515) "cmsRun"):
#0  0x00007fe8cd166301 in poll () from /usr/lib64/libc.so.6
#1  0x00007fe8c29de2ff in full_read.constprop () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#2  0x00007fe8c2991afc in edm::service::InitRootHandlers::stacktraceFromThread() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#3  0x00007fe8c2992460 in sig_dostack_then_abort () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00007fe82e9c0974 in HLTRecHitInAllL1RegionsProducer<EcalRecHit>::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginRecoEgammaEgammaHLTProducersPlugins.so
#6  0x00007fe8cfbbb47f in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#7  0x00007fe8cfb9fc2c in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#8  0x00007fe8cfb27f59 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#9  0x00007fe8cfb284c5 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#10 0x00007fe8cfcdbf78 in tbb::detail::d1::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreConcurrency.so
#11 0x00007fe8ce2ca95b in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x7fe811635e00, waiter=..., this=0x7fe8caedbe80) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/task_dispatcher.h:322
#12 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x7fe8caedbe80) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/task_dispatcher.h:458
#13 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/arena.cpp:137
#14 tbb::detail::r1::market::process (this=<optimized out>, j=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/market.cpp:599
#15 0x00007fe8ce2ccb0e in tbb::detail::r1::rml::private_worker::run (this=0x7fe8c7f7a100) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/private_server.cpp:271
#16 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x7fe8c7f7a100) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/private_server.cpp:221
#17 0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#18 0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 11 (Thread 0x7fe7d4dff700 (LWP 3848514) "cmsRun"):
#0  0x00007fe8cd417da6 in do_futex_wait.constprop () from /usr/lib64/libpthread.so.0
#1  0x00007fe8cd417e98 in __new_sem_wait_slow.constprop.0 () from /usr/lib64/libpthread.so.0
#2  0x00007fe8adbd8dda in ?? () from /usr/lib64/libcuda.so.1
#3  0x00007fe8adbe8373 in ?? () from /usr/lib64/libcuda.so.1
#4  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 10 (Thread 0x7fe7f9b17700 (LWP 3848298) "cmsRun"):
#0  0x00007fe8cd13b918 in nanosleep () from /usr/lib64/libc.so.6
#1  0x00007fe8cd169178 in usleep () from /usr/lib64/libc.so.6
#2  0x00007fe8c051cdfa in FedRawDataInputSource::readSupervisor() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libEventFilterUtilities.so
#3  0x00007fe8cdaa3a73 in std::execute_native_thread_routine (__p=0x7fe8058ae2e0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#4  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 9 (Thread 0x7fe80ab57700 (LWP 3848262) "cmsRun"):
#0  0x00007fe8cd13b918 in nanosleep () from /usr/lib64/libc.so.6
#1  0x00007fe8cd13b81e in sleep () from /usr/lib64/libc.so.6
#2  0x00007fe8c051239a in evf::FastMonitoringService::snapshotRunner() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libEventFilterUtilities.so
#3  0x00007fe8cdaa3a73 in std::execute_native_thread_routine (__p=0x7fe805aa42e0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#4  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 8 (Thread 0x7fe8211ff700 (LWP 3848219) "cmsRun"):
#0  0x00007fe8cd41545c in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007fe849bf2f9e in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tsl::thread::EigenEnvironment::Task*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#2  0x00007fe849bf3563 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#3  0x00007fe849bf0c78 in std::_Function_handler<void (), tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#4  0x00007fe83c494e32 in tsl::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_framework.so.2
#5  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#6  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 7 (Thread 0x7fe82636d700 (LWP 3848218) "cmsRun"):
#0  0x00007fe8cd41545c in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007fe849bf2f9e in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tsl::thread::EigenEnvironment::Task*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#2  0x00007fe849bf3563 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#3  0x00007fe849bf0c78 in std::_Function_handler<void (), tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#4  0x00007fe83c494e32 in tsl::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_framework.so.2
#5  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#6  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 6 (Thread 0x7fe826b6e700 (LWP 3848217) "cmsRun"):
#0  0x00007fe8cd41545c in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007fe849bf2f9e in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tsl::thread::EigenEnvironment::Task*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#2  0x00007fe849bf3563 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#3  0x00007fe849bf0c78 in std::_Function_handler<void (), tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_cc.so.2
#4  0x00007fe83c494e32 in tsl::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/external/el8_amd64_gcc12/lib/scram_x86-64-v3/libtensorflow_framework.so.2
#5  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#6  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 5 (Thread 0x7fe830d65700 (LWP 3848154) "cmsRun"):
#0  0x00007fe8cd41545c in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00007fe8c0518436 in FedRawDataInputSource::readWorker(unsigned int) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libEventFilterUtilities.so
#2  0x00007fe8cdaa3a73 in std::execute_native_thread_routine (__p=0x7fe8b55bde30) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#3  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#4  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 4 (Thread 0x7fe8861de700 (LWP 3848143) "cuda-EvtHandlr"):
#0  0x00007fe8cd166301 in poll () from /usr/lib64/libc.so.6
#1  0x00007fe8adbed89f in ?? () from /usr/lib64/libcuda.so.1
#2  0x00007fe8adcbbdcf in ?? () from /usr/lib64/libcuda.so.1
#3  0x00007fe8adbe8373 in ?? () from /usr/lib64/libcuda.so.1
#4  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 3 (Thread 0x7fe88de26700 (LWP 3848140) "cuda0000380000f"):
#0  0x00007fe8cd166301 in poll () from /usr/lib64/libc.so.6
#1  0x00007fe8adbed89f in ?? () from /usr/lib64/libcuda.so.1
#2  0x00007fe8adcbbdcf in ?? () from /usr/lib64/libcuda.so.1
#3  0x00007fe8adbe8373 in ?? () from /usr/lib64/libcuda.so.1
#4  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 2 (Thread 0x7fe88e6cc700 (LWP 3848133) "cmsRun"):
#0  0x00007fe8cd419672 in waitpid () from /usr/lib64/libpthread.so.0
#1  0x00007fe8c298eb37 in edm::service::cmssw_stacktrace_fork() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#2  0x00007fe8c2991a2a in edm::service::InitRootHandlers::stacktraceHelperThread() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#3  0x00007fe8cdaa3a73 in std::execute_native_thread_routine (__p=0x7fe8c2dbb650) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#4  0x00007fe8cd40f1ca in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fe8cd07be73 in clone () from /usr/lib64/libc.so.6
Thread 1 (Thread 0x7fe8cc599640 (LWP 3848049) "cmsRun"):
#0  0x00007fe8cd13b918 in nanosleep () from /usr/lib64/libc.so.6
#1  0x00007fe8cd13b81e in sleep () from /usr/lib64/libc.so.6
#2  0x00007fe8c298e9e0 in sig_pause_for_stacktrace () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00007ffc41330bba in clock_gettime ()
#5  0x00007fe8cd13658a in clock_gettime@GLIBC_2.2.5 () from /usr/lib64/libc.so.6
#6  0x00007fe8cf85a962 in edm::WallclockTimer::start() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreUtilities.so
#7  0x00007fe8cfb8104c in edm::SystemTimeKeeper::startModuleEvent(edm::StreamContext const&, edm::ModuleCallingContext const&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#8  0x00007fe8cfbbb017 in edm::stream::EDFilterAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#9  0x00007fe8cfb9fe6c in edm::WorkerT<edm::stream::EDFilterAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#10 0x00007fe8cfb27f59 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#11 0x00007fe8cfb284c5 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#12 0x00007fe8cfa98bae in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#13 0x00007fe8ce2d3281 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=0x7fe8caedbe00) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/task_dispatcher.h:322
#14 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=0x7fe8caedbe00) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/task_dispatcher.h:458
#15 tbb::detail::r1::task_dispatcher::execute_and_wait (t=<optimized out>, wait_ctx=..., w_ctx=...) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/task_dispatcher.cpp:168
#16 0x00007fe8cfaa941b in edm::FinalWaitingTask::wait() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#17 0x00007fe8cfab324d in edm::EventProcessor::processRuns() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#18 0x00007fe8cfab37b1 in edm::EventProcessor::runToCompletion() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_6_MULTIARCHS/lib/el8_amd64_gcc12/scram_x86-64-v3/libFWCoreFramework.so
#19 0x00000000004074ef in tbb::detail::d1::task_arena_function<main::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const ()
#20 0x00007fe8ce2bf9ad in tbb::detail::r1::task_arena_impl::execute (ta=..., d=warning: RTTI symbol not found for class 'tbb::detail::d1::task_arena_function<main::{lambda()#1}::operator()() const::{lambda()#1}, void>'
#21 0x0000000000408ed2 in main::{lambda()#1}::operator()() const ()
#22 0x000000000040517c in main ()

Current Modules:

Module: HLTEcalRecHitInAllL1RegionsProducer:hltRechitInRegionsECAL (crashed)
Module: HLTPrescaler:hltPreDiSC3018EIsoANDHEMass70

A fatal system signal has occurred: segmentation violation
[innocent@lxplus800]/tmp/innocent/CMSSW_14_0_6_MULTIARCHS/src%

@mmusich
Contributor

mmusich commented May 12, 2024

assign hlt

@cmsbuild
Contributor

New categories assigned: hlt

@Martin-Grunewald,@mmusich you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mmusich
Contributor

mmusich commented May 12, 2024

reproduced with:

#!/bin/bash -ex

scram p CMSSW CMSSW_14_0_5_patch1
cd CMSSW_14_0_5_patch1/src
eval `scramv1 runtime -sh`

export MALLOC_CONF=junk:true

https_proxy=http://cmsproxy.cms:3128 hltConfigFromDB --runNumber 380115 > hlt_run380115.py
cat <<@EOF >> hlt_run380115.py
from EventFilter.Utilities.EvFDaqDirector_cfi import EvFDaqDirector as _EvFDaqDirector
process.EvFDaqDirector = _EvFDaqDirector.clone(
  buBaseDir = '/eos/cms/store/group/tsg/FOG/error_stream/',
  runNumber = 380115
)
from EventFilter.Utilities.FedRawDataInputSource_cfi import source as _source
process.source = _source.clone(
  fileListMode = True,
  fileNames = (
  '/eos/cms/store/group/tsg/FOG/error_stream/run380115/run380115_ls0338_index000079_fu-c2b03-28-01_pid1451372.raw',
  '/eos/cms/store/group/tsg/FOG/error_stream/run380115/run380115_ls0338_index000104_fu-c2b03-28-01_pid1451372.raw'
  )
)
process.options.wantSummary = True

process.options.numberOfThreads = 32
process.options.numberOfStreams = 24
@EOF

mkdir run380115
cmsRun hlt_run380115.py &> crash_run380115.log

@cms-sw/ecal-dpg-l2 please take a look.

@mmusich
Contributor

mmusich commented May 13, 2024

> Does not seem to happen in MC Relvals

At least it does reproduce on HLT addOnTests on data, see log, preview of cms-sw/cms-bot#2228 output.

@thomreis
Contributor

What is the purpose of export MALLOC_CONF=junk:true?

@mmusich
Contributor

mmusich commented May 14, 2024

> What is the purpose of export MALLOC_CONF=junk:true?

Setting MALLOC_CONF=junk:true tells the memory allocator (jemalloc, which reads MALLOC_CONF) to fill memory with a junk byte pattern when it is allocated and when it is freed.
Code that reads memory before initializing it, or after freeing it, then sees the junk pattern instead of whatever happened to be there. This helps detect such bugs, because using the junk data will likely lead to a crash or to obviously incorrect results.
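A minimal sketch of the effect, assuming the fill byte 0xa5 that jemalloc documents for fresh allocations (the same byte used in the memset experiment later in this thread). This is a pure-Python simulation, not a jemalloc call: `junk_alloc` and the partially written buffer are hypothetical stand-ins for a producer that reserves more slots than it fills.

```python
import struct

JUNK_BYTE = 0xA5  # fill byte jemalloc documents for junk-filled allocations


def junk_alloc(n_words):
    """Simulate a junk-filled allocation of n_words 32-bit words."""
    return bytearray([JUNK_BYTE]) * (4 * n_words)


def read_ids(buf):
    """Reinterpret the buffer as little-endian unsigned 32-bit ids."""
    return list(struct.unpack(f"<{len(buf) // 4}I", buf))


# Hypothetical producer: reserves room for 5 ids but only writes 3 of them.
buf = junk_alloc(5)
struct.pack_into("<3I", buf, 0, 872443444, 872443548, 872443698)

ids = read_ids(buf)
print(ids)
```

The two slots that were never written read back as 2779096485, which is 0xA5A5A5A5, the id that appears in the error messages at the top of this issue.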

@fwyzard
Contributor

fwyzard commented May 14, 2024

> What is the purpose of export MALLOC_CONF=junk:true?

https://github.com/jemalloc/jemalloc/wiki/Use-Case:-Find-a-memory-corruption-bug

@mmusich
Contributor

mmusich commented May 14, 2024

Another reproducer (with a more recent release):

#!/bin/bash -ex

cd CMSSW_14_0_6_MULTIARCHS/src
eval `scramv1 runtime -sh`

export MALLOC_CONF=junk:true

https_proxy=http://cmsproxy.cms:3128 hltConfigFromDB --runNumber 380466 > hlt_run380466.py
cat <<@EOF >> hlt_run380466.py
from EventFilter.Utilities.EvFDaqDirector_cfi import EvFDaqDirector as _EvFDaqDirector
process.EvFDaqDirector = _EvFDaqDirector.clone(
   buBaseDir = '/eos/cms/store/group/tsg/FOG/error_stream/',
   runNumber = 380466
)
from EventFilter.Utilities.FedRawDataInputSource_cfi import source as _source
process.source = _source.clone(
   fileListMode = True,
   fileNames = (
   '/eos/cms/store/group/tsg/FOG/error_stream/run380466/run380466_ls0276_index000212_fu-c2b03-09-01_pid672001.raw',
   '/eos/cms/store/group/tsg/FOG/error_stream/run380466/run380466_ls0276_index000232_fu-c2b03-09-01_pid672001.raw',
   '/eos/cms/store/group/tsg/FOG/error_stream/run380466/run380466_ls0276_index000246_fu-c2b03-09-01_pid672001.raw'
   )
)

process.options.accelerators = ['cpu']

process.hltOnlineBeamSpotESProducer.timeThreshold = int(1e6)

process.options.wantSummary = True

process.options.numberOfThreads = 1
process.options.numberOfStreams = 0
@EOF


directory="run380466"

# Check if the directory exists
if [ -d "$directory" ]; then
    # If it exists, remove it
    rm -rf "$directory"
fi

# Create the directory
mkdir "$directory"

cmsRun hlt_run380466.py &> crash_run380466.log

The crash described in the issue happens here:

auto this_cell = subDetGeom->getGeometry(recHit.id());

adding a check on the existence of the cell

+         if(this_cell==nullptr)
+           continue;

we get past there, but then there is an exception:

----- Begin Fatal Exception 14-May-2024 10:53:58 CEST-----------------------
An exception of category 'PFEcalEndcapRecHitCreator' occurred while
   [0] Processing  Event run: 380466 lumi: 276 event: 490512491 stream: 0
   [1] Running path 'HLT_Diphoton24_16_eta1p5_R9IdL_AND_HET_AND_IsoTCaloIdT_v8'
   [2] Calling method for module PFRecHitProducer/'hltParticleFlowRecHitECALUnseeded'
Exception Message:
detid 2779096485not found in geometry
----- End Fatal Exception -------------------------------------------------

which matches the exception seen in the relval tests of cms-sw/cms-bot#2228 at cms-sw/cms-bot#2228 (comment).

The main question I have is if 2779096485 is an existing xtal or just junk.

@VinInn
Contributor Author

VinInn commented May 14, 2024 via email

@thomreis
Contributor

ECAL detIDs all start with an 8. So junk it is indeed.
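A quick way to check this is to decode the raw id. The sketch below assumes the usual DetId packing (detector code in bits 28-31, subdetector in bits 25-27, ECAL as detector 3 with barrel 1 and endcap 2); these constants are written from memory rather than taken from this thread, so treat them as assumptions.

```python
DET_OFFSET, DET_MASK = 28, 0xF        # assumed: detector field in bits 28-31
SUBDET_OFFSET, SUBDET_MASK = 25, 0x7  # assumed: subdetector field in bits 25-27
ECAL = 3                              # assumed detector code for ECAL


def decode(raw_id):
    """Split a raw 32-bit id into its hex form, detector and subdetector."""
    return {
        "hex": f"0x{raw_id:08X}",
        "det": (raw_id >> DET_OFFSET) & DET_MASK,
        "subdet": (raw_id >> SUBDET_OFFSET) & SUBDET_MASK,
    }


for raw_id in (838861323, 872443444, 2779096485):
    fields = decode(raw_id)
    print(raw_id, fields, "ECAL" if fields["det"] == ECAL else "not ECAL")
```

Under these assumptions the first two ids decode to the ECAL detector, while 2779096485 is 0xA5A5A5A5: every byte is the 0xa5 junk byte, and the detector field is not the ECAL code.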

@mmusich
Contributor

mmusich commented May 14, 2024

> ECAL detIDs all start with an 8. So junk it is indeed.

adding:

diff --git a/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc b/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc
index 56a9292da36..ff352a772f8 100644
--- a/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc
+++ b/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc
@@ -261,6 +261,18 @@ void EcalRecHitProducer::produce(edm::Event& evt, const edm::EventSetup& es) {
   LogInfo("EcalRecHitInfo") << "total # EB calibrated rechits: " << ebRecHits->size();
   LogInfo("EcalRecHitInfo") << "total # EE calibrated rechits: " << eeRecHits->size();
 
+  // Loop over EBRecHitCollection
+  for (const auto& ebRecHit : *ebRecHits) {
+    DetId detId = ebRecHit.detid(); // Get the DetId
+    std::cout << "EB DetId: " << detId.rawId() << std::endl; // Print the rawId of the DetId
+  }
+
+  // Loop over EERecHitCollection
+  for (const auto& eeRecHit : *eeRecHits) {
+    DetId detId = eeRecHit.detid(); // Get the DetId
+    std::cout << "EE DetId: " << detId.rawId() << std::endl; // Print the rawId of the DetId
+  }
+
   evt.put(ebRecHitToken_, std::move(ebRecHits));
   evt.put(eeRecHitToken_, std::move(eeRecHits));
 }

I get:

EB DetId: 838861323

[...]

EE DetId: 872443444
EE DetId: 872443548
EE DetId: 872443698
EE DetId: 872444201
EE DetId: 872444471
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485
EE DetId: 2779096485

so it looks like EcalRecHitProducer is filling the end of the EE rechit collection with junk.

@VinInn
Contributor Author

VinInn commented May 14, 2024

It is most probably leaving memory uninitialized or copying from uninitialized memory, since it is jemalloc that fills it with "junk".
I do not understand why valgrind is not reporting it...

@thomreis
Contributor

I need to check but I suspect that these extra junk RecHits are already in the input collection of the EcalRecHitProducer. The question is if there are extra digis already or if the junk gets added in the multifit or the EcalUncalibRecHitSoAToLegacy.

@mmusich
Contributor

mmusich commented May 14, 2024

> I need to check but I suspect that these extra junk RecHits are already in the input collection of the EcalRecHitProducer.

They are:

diff --git a/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc b/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc
index 56a9292da36..b7e5fd4ed66 100644
--- a/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc
+++ b/RecoLocalCalo/EcalRecProducers/plugins/EcalRecHitProducer.cc
@@ -146,11 +146,31 @@ void EcalRecHitProducer::produce(edm::Event& evt, const edm::EventSetup& es) {
     const auto& eeUncalibRecHits = evt.get(eeUncalibRecHitToken_);
     LogDebug("EcalRecHitDebug") << "total # EE uncalibrated rechits: " << eeUncalibRecHits.size();
 
+    // Loop over uncalib EERecHitCollection
+    for (const auto& eeRecHit : eeUncalibRecHits) {
+      DetId detId = eeRecHit.id(); // Get the DetId
+
+      // Check if the rawId corresponds to 2779096485
+      if (detId.rawId() == 2779096485) {
+        std::cout << "EE Uncalib -DetId: " << detId.rawId() << " - Line: " << __LINE__ << std::endl;
+      }
+    }
+
     // loop over uncalibrated rechits to make calibrated ones
     for (const auto& uncalibRecHit : eeUncalibRecHits) {
       worker_->run(evt, uncalibRecHit, *eeRecHits);
     }

yields:

EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155
EE Uncalib -DetId: 2779096485 - Line: 155

mmusich added a commit to mmusich/cmssw that referenced this issue May 14, 2024
@VinInn
Contributor Author

VinInn commented May 15, 2024

This is NOT enough to crash

diff --git a/RecoLocalCalo/EcalRecProducers/plugins/alpaka/EcalUncalibRecHitProducerPortable.cc b/RecoLocalCalo/EcalRecProducers/plugins/alpaka/EcalUncalibRecHitProducerPortable.cc
index be36fad7b19..c5cb1c169f4 100644
--- a/RecoLocalCalo/EcalRecProducers/plugins/alpaka/EcalUncalibRecHitProducerPortable.cc
+++ b/RecoLocalCalo/EcalRecProducers/plugins/alpaka/EcalUncalibRecHitProducerPortable.cc
@@ -191,6 +191,13 @@ namespace ALPAKA_ACCELERATOR_NAMESPACE {
     // output device collections
     OutputProduct uncalibRecHitsDevEB{ebDigisSize, queue};
     OutputProduct uncalibRecHitsDevEE{eeDigisSize, queue};
+    {
+      auto eb = uncalibRecHitsDevEB.buffer(); alpaka::memset(queue, eb, 0xa5);
+      auto ee = uncalibRecHitsDevEE.buffer(); alpaka::memset(queue, ee, 0xa5);
+      alpaka::wait(queue);
+    }

@thomreis
Contributor

Is alpaka::wait(queue); required after memset? There are two places where memset is used without the wait at the moment.
Adding it does not fix the crashes so just a general question.

@fwyzard
Contributor

fwyzard commented May 16, 2024

If the copy uses a host queue the wait is not needed, because the host queues are blocking by default. But it also does no harm, as it shouldn't do anything.

For a host-to-device copy using a device queue, it's needed before the data can be accessed on the device using a different stream.

For a device-to-host copy using a device queue, it's needed before the data can be accessed on the host.

That said, we have seen that for small memory copies, the GPU runtime seems to work OK also without the wait...
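As a loose analogy only (this is not the alpaka API), a single-worker executor behaves like a non-blocking queue: work is submitted in order, but the submit call returns before the work has run. The names `queue`, `fill` and `buffer` are hypothetical, and waiting on the returned future plays the role of `alpaka::wait(queue)`.

```python
from concurrent.futures import ThreadPoolExecutor
import time

buffer = bytearray(16)  # stand-in for a buffer the host wants to read


def fill(value):
    time.sleep(0.05)  # pretend the asynchronous operation takes a while
    buffer[:] = bytes([value]) * len(buffer)


# A single-worker executor plays the role of an asynchronous (device) queue:
# submitted work runs in order, but submit() returns before it has finished.
with ThreadPoolExecutor(max_workers=1) as queue:
    pending = queue.submit(fill, 0xA5)
    # Reading `buffer` at this point would race with fill().
    pending.result()  # analogue of waiting on the queue
    after_wait = bytes(buffer)

print(after_wait.hex())
```

Only after the wait is the buffer guaranteed to hold the filled value. A blocking queue would correspond to calling `fill(0xA5)` directly, in which case the extra wait has nothing left to do.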

@makortel
Contributor

> For a host-to-device copy using a device queue, it's needed before the data can be accessed on the device using a different stream.

Just to clarify: if the "different queue" happens because the second device-side access is in a different EDModule, the framework adds the necessary synchronization, so an explicit alpaka::wait() is not needed. (This is hopefully a much more common case than EDModules using multiple queues explicitly.)

@mmusich
Contributor

mmusich commented Jun 6, 2024

@thomreis @cms-sw/ecal-dpg-l2 please clarify if there is progress on this issue and if there is someone actively working on solving it.
Thank you

@VinInn
Contributor Author

VinInn commented Jun 10, 2024

I'm not sure what has already been verified; anyhow, if I add

diff --git a/EventFilter/EcalRawToDigi/plugins/EcalDigisFromPortableProducer.cc b/EventFilter/EcalRawToDigi/plugins/EcalDigisFromPortableProducer.cc
index d2c450f1ac2..cad2d1256f8 100644
--- a/EventFilter/EcalRawToDigi/plugins/EcalDigisFromPortableProducer.cc
+++ b/EventFilter/EcalRawToDigi/plugins/EcalDigisFromPortableProducer.cc
@@ -161,12 +161,16 @@ void EcalDigisFromPortableProducer::produce(edm::Event& event, edm::EventSetup c
   digisDataEB.resize(digisEBDataSize);
   digisDataEE.resize(digisEEDataSize);

+
   // copy data
   std::memcpy(digisIdsEB.data(), digisEBSoAView.id(), digisEBSize * sizeof(uint32_t));
   std::memcpy(digisIdsEE.data(), digisEESoAView.id(), digisEESize * sizeof(uint32_t));
   std::memcpy(digisDataEB.data(), digisEBSoAView.data()->data(), digisEBDataSize * sizeof(uint16_t));
   std::memcpy(digisDataEE.data(), digisEESoAView.data()->data(), digisEEDataSize * sizeof(uint16_t));

+  for (auto id : digisIdsEB) assert( id != 2779096485  );
+  for (auto id : digisIdsEE) assert( id != 2779096485  );
+
   digisEB->swap(digisIdsEB, digisDataEB);
   digisEE->swap(digisIdsEE, digisDataEE);

I get
cmsRun: src/EventFilter/EcalRawToDigi/plugins/EcalDigisFromPortableProducer.cc:172: virtual void EcalDigisFromPortableProducer::produce(edm::Event&, const edm::EventSetup&): Assertion `id != 2779096485' failed.

so the junk is already in the digis
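
For reference, 2779096485 is exactly 0xA5A5A5A5, and jemalloc's junk option fills newly allocated memory with the byte 0xa5, so this id is what a 32-bit word read from never-written memory looks like under MALLOC_CONF=junk:true. A minimal check, with helper names that are illustrative only and not taken from CMSSW:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value of a 32-bit word whose four bytes are all set to `fill`.
std::uint32_t word_filled_with(unsigned char fill) {
  std::uint32_t word = 0;
  std::memset(&word, fill, sizeof(word));
  return word;
}

// True if `id` has the bit pattern of allocation junk (every byte 0xa5),
// i.e. it was most likely read from memory that nobody ever wrote.
bool looks_like_alloc_junk(std::uint32_t id) { return id == word_filled_with(0xa5); }
```

Because all four bytes are identical the result does not depend on endianness, and `word_filled_with(0xa5)` equals 2779096485, the value used in the asserts above.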

@VinInn
Contributor Author

VinInn commented Jun 10, 2024

and the origin is here (please note the FIXME)

diff --git a/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc b/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc
index 374a5a9c2c8..9dbf0c8d44d 100644
--- a/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc
+++ b/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc
@@ -202,6 +202,7 @@ namespace ALPAKA_ACCELERATOR_NAMESPACE::ecal::raw {

           ElectronicsIdGPU eid{fed2dcc(fed), ttid, stripid, xtalid};
           auto const didraw = isBarrel ? compute_ebdetid(eid) : eid2did[eid.linearIndex()].rawid();
+          assert(didraw != 2779096485 );
           // FIXME: what kind of channels are these guys
           if (didraw == 0)
             continue;

cmsRun: src/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc:205: void alpaka_serial_sync::ecal::raw::Kernel_unpack::operator()(const TAcc&, const unsigned char*, const uint32_t*, const int*, PortableHostCollection<EcalDigiSoALayout<> >::View, PortableHostCollection<EcalDigiSoALayout<> >::View, PortableHostCollection<EcalElectronicsMappingSoALayout<> >::ConstView, uint32_t) const [with TAcc = alpaka::AccCpuSerial<std::integral_constant<long unsigned int, 1>, unsigned int>; = void; uint32_t = unsigned int; PortableHostCollection<EcalDigiSoALayout<> >::View = EcalDigiSoALayout<>::ViewTemplateFreeParams<128, false, true, false>; PortableHostCollection<EcalElectronicsMappingSoALayout<> >::ConstView = EcalElectronicsMappingSoALayout<>::ConstViewTemplateFreeParams<128, false, true, false>]: Assertion `didraw != 2779096485' failed.

@VinInn
Contributor Author

VinInn commented Jun 10, 2024

Needless to say that

diff --git a/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc b/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc
index 374a5a9c2c8..9d6e51fbd51 100644
--- a/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc
+++ b/EventFilter/EcalRawToDigi/plugins/alpaka/UnpackPortable.dev.cc
@@ -203,6 +203,7 @@ namespace ALPAKA_ACCELERATOR_NAMESPACE::ecal::raw {
           ElectronicsIdGPU eid{fed2dcc(fed), ttid, stripid, xtalid};
           auto const didraw = isBarrel ? compute_ebdetid(eid) : eid2did[eid.linearIndex()].rawid();
           // FIXME: what kind of channels are these guys
+          if (didraw == 2779096485 ) continue;
           if (didraw == 0)
             continue;

"cures" the symptom (not the CAUSE!)

The fact remains that eid2did[eid.linearIndex()].rawid() is reading from uninitialized memory.

@fwyzard
Contributor

fwyzard commented Jun 12, 2024

@thomreis this is a list of the invalid EE detids that I found running the HLT over 10k events:

invalid EE detid:   8873  fed: 601  dcc:  1  ttid: 10  stripid: 5  xtalid: 1
invalid EE detid:   8874  fed: 601  dcc:  1  ttid: 10  stripid: 5  xtalid: 2
invalid EE detid:   8875  fed: 601  dcc:  1  ttid: 10  stripid: 5  xtalid: 3
invalid EE detid:   8876  fed: 601  dcc:  1  ttid: 10  stripid: 5  xtalid: 4
invalid EE detid:   8877  fed: 601  dcc:  1  ttid: 10  stripid: 5  xtalid: 5
invalid EE detid:  10393  fed: 601  dcc:  1  ttid: 34  stripid: 3  xtalid: 1
invalid EE detid:  10394  fed: 601  dcc:  1  ttid: 34  stripid: 3  xtalid: 2
invalid EE detid:  10395  fed: 601  dcc:  1  ttid: 34  stripid: 3  xtalid: 3
invalid EE detid:  10396  fed: 601  dcc:  1  ttid: 34  stripid: 3  xtalid: 4
invalid EE detid:  10397  fed: 601  dcc:  1  ttid: 34  stripid: 3  xtalid: 5
invalid EE detid:  10401  fed: 601  dcc:  1  ttid: 34  stripid: 4  xtalid: 1
invalid EE detid:  10402  fed: 601  dcc:  1  ttid: 34  stripid: 4  xtalid: 2
invalid EE detid:  10403  fed: 601  dcc:  1  ttid: 34  stripid: 4  xtalid: 3
invalid EE detid:  10404  fed: 601  dcc:  1  ttid: 34  stripid: 4  xtalid: 4
invalid EE detid:  10405  fed: 601  dcc:  1  ttid: 34  stripid: 4  xtalid: 5
invalid EE detid:  10409  fed: 601  dcc:  1  ttid: 34  stripid: 5  xtalid: 1
invalid EE detid:  10410  fed: 601  dcc:  1  ttid: 34  stripid: 5  xtalid: 2
invalid EE detid:  10411  fed: 601  dcc:  1  ttid: 34  stripid: 5  xtalid: 3
invalid EE detid:  10412  fed: 601  dcc:  1  ttid: 34  stripid: 5  xtalid: 4
invalid EE detid:  10413  fed: 601  dcc:  1  ttid: 34  stripid: 5  xtalid: 5
invalid EE detid:  16618  fed: 602  dcc:  2  ttid:  3  stripid: 5  xtalid: 2
invalid EE detid:  16619  fed: 602  dcc:  2  ttid:  3  stripid: 5  xtalid: 3
invalid EE detid:  16620  fed: 602  dcc:  2  ttid:  3  stripid: 5  xtalid: 4
invalid EE detid:  16621  fed: 602  dcc:  2  ttid:  3  stripid: 5  xtalid: 5
invalid EE detid:  18473  fed: 602  dcc:  2  ttid: 32  stripid: 5  xtalid: 1
invalid EE detid:  18474  fed: 602  dcc:  2  ttid: 32  stripid: 5  xtalid: 2
invalid EE detid:  18475  fed: 602  dcc:  2  ttid: 32  stripid: 5  xtalid: 3
invalid EE detid:  18476  fed: 602  dcc:  2  ttid: 32  stripid: 5  xtalid: 4
invalid EE detid:  18477  fed: 602  dcc:  2  ttid: 32  stripid: 5  xtalid: 5
invalid EE detid:  25385  fed: 603  dcc:  3  ttid: 12  stripid: 5  xtalid: 1
invalid EE detid:  25386  fed: 603  dcc:  3  ttid: 12  stripid: 5  xtalid: 2
invalid EE detid:  25387  fed: 603  dcc:  3  ttid: 12  stripid: 5  xtalid: 3
invalid EE detid:  25388  fed: 603  dcc:  3  ttid: 12  stripid: 5  xtalid: 4
invalid EE detid:  25389  fed: 603  dcc:  3  ttid: 12  stripid: 5  xtalid: 5
invalid EE detid:  26537  fed: 603  dcc:  3  ttid: 30  stripid: 5  xtalid: 1
invalid EE detid:  26538  fed: 603  dcc:  3  ttid: 30  stripid: 5  xtalid: 2
invalid EE detid:  26539  fed: 603  dcc:  3  ttid: 30  stripid: 5  xtalid: 3
invalid EE detid:  26540  fed: 603  dcc:  3  ttid: 30  stripid: 5  xtalid: 4
invalid EE detid:  26541  fed: 603  dcc:  3  ttid: 30  stripid: 5  xtalid: 5
invalid EE detid:  33577  fed: 604  dcc:  4  ttid: 12  stripid: 5  xtalid: 1
invalid EE detid:  33578  fed: 604  dcc:  4  ttid: 12  stripid: 5  xtalid: 2
invalid EE detid:  33579  fed: 604  dcc:  4  ttid: 12  stripid: 5  xtalid: 3
invalid EE detid:  33580  fed: 604  dcc:  4  ttid: 12  stripid: 5  xtalid: 4
invalid EE detid:  33581  fed: 604  dcc:  4  ttid: 12  stripid: 5  xtalid: 5
invalid EE detid:  34729  fed: 604  dcc:  4  ttid: 30  stripid: 5  xtalid: 1
invalid EE detid:  34730  fed: 604  dcc:  4  ttid: 30  stripid: 5  xtalid: 2
invalid EE detid:  34731  fed: 604  dcc:  4  ttid: 30  stripid: 5  xtalid: 3
invalid EE detid:  34732  fed: 604  dcc:  4  ttid: 30  stripid: 5  xtalid: 4
invalid EE detid:  34733  fed: 604  dcc:  4  ttid: 30  stripid: 5  xtalid: 5
invalid EE detid:  41194  fed: 605  dcc:  5  ttid:  3  stripid: 5  xtalid: 2
invalid EE detid:  41195  fed: 605  dcc:  5  ttid:  3  stripid: 5  xtalid: 3
invalid EE detid:  41196  fed: 605  dcc:  5  ttid:  3  stripid: 5  xtalid: 4
invalid EE detid:  41197  fed: 605  dcc:  5  ttid:  3  stripid: 5  xtalid: 5
invalid EE detid:  43049  fed: 605  dcc:  5  ttid: 32  stripid: 5  xtalid: 1
invalid EE detid:  43050  fed: 605  dcc:  5  ttid: 32  stripid: 5  xtalid: 2
invalid EE detid:  43051  fed: 605  dcc:  5  ttid: 32  stripid: 5  xtalid: 3
invalid EE detid:  43052  fed: 605  dcc:  5  ttid: 32  stripid: 5  xtalid: 4
invalid EE detid:  43053  fed: 605  dcc:  5  ttid: 32  stripid: 5  xtalid: 5
invalid EE detid:  49833  fed: 606  dcc:  6  ttid: 10  stripid: 5  xtalid: 1
invalid EE detid:  49834  fed: 606  dcc:  6  ttid: 10  stripid: 5  xtalid: 2
invalid EE detid:  49835  fed: 606  dcc:  6  ttid: 10  stripid: 5  xtalid: 3
invalid EE detid:  49836  fed: 606  dcc:  6  ttid: 10  stripid: 5  xtalid: 4
invalid EE detid:  49837  fed: 606  dcc:  6  ttid: 10  stripid: 5  xtalid: 5
invalid EE detid:  51353  fed: 606  dcc:  6  ttid: 34  stripid: 3  xtalid: 1
invalid EE detid:  51354  fed: 606  dcc:  6  ttid: 34  stripid: 3  xtalid: 2
invalid EE detid:  51355  fed: 606  dcc:  6  ttid: 34  stripid: 3  xtalid: 3
invalid EE detid:  51356  fed: 606  dcc:  6  ttid: 34  stripid: 3  xtalid: 4
invalid EE detid:  51357  fed: 606  dcc:  6  ttid: 34  stripid: 3  xtalid: 5
invalid EE detid:  51361  fed: 606  dcc:  6  ttid: 34  stripid: 4  xtalid: 1
invalid EE detid:  51362  fed: 606  dcc:  6  ttid: 34  stripid: 4  xtalid: 2
invalid EE detid:  51363  fed: 606  dcc:  6  ttid: 34  stripid: 4  xtalid: 3
invalid EE detid:  51364  fed: 606  dcc:  6  ttid: 34  stripid: 4  xtalid: 4
invalid EE detid:  51365  fed: 606  dcc:  6  ttid: 34  stripid: 4  xtalid: 5
invalid EE detid:  51369  fed: 606  dcc:  6  ttid: 34  stripid: 5  xtalid: 1
invalid EE detid:  51370  fed: 606  dcc:  6  ttid: 34  stripid: 5  xtalid: 2
invalid EE detid:  51371  fed: 606  dcc:  6  ttid: 34  stripid: 5  xtalid: 3
invalid EE detid:  51372  fed: 606  dcc:  6  ttid: 34  stripid: 5  xtalid: 4
invalid EE detid:  51373  fed: 606  dcc:  6  ttid: 34  stripid: 5  xtalid: 5
invalid EE detid:  58698  fed: 607  dcc:  7  ttid: 21  stripid: 1  xtalid: 2
invalid EE detid:  58699  fed: 607  dcc:  7  ttid: 21  stripid: 1  xtalid: 3
invalid EE detid:  58700  fed: 607  dcc:  7  ttid: 21  stripid: 1  xtalid: 4
invalid EE detid:  58701  fed: 607  dcc:  7  ttid: 21  stripid: 1  xtalid: 5
invalid EE detid:  65753  fed: 608  dcc:  8  ttid:  3  stripid: 3  xtalid: 1
invalid EE detid:  65754  fed: 608  dcc:  8  ttid:  3  stripid: 3  xtalid: 2
invalid EE detid:  65755  fed: 608  dcc:  8  ttid:  3  stripid: 3  xtalid: 3
invalid EE detid:  65756  fed: 608  dcc:  8  ttid:  3  stripid: 3  xtalid: 4
invalid EE detid:  65757  fed: 608  dcc:  8  ttid:  3  stripid: 3  xtalid: 5
invalid EE detid:  65761  fed: 608  dcc:  8  ttid:  3  stripid: 4  xtalid: 1
invalid EE detid:  65762  fed: 608  dcc:  8  ttid:  3  stripid: 4  xtalid: 2
invalid EE detid:  65763  fed: 608  dcc:  8  ttid:  3  stripid: 4  xtalid: 3
invalid EE detid:  65764  fed: 608  dcc:  8  ttid:  3  stripid: 4  xtalid: 4
invalid EE detid:  65765  fed: 608  dcc:  8  ttid:  3  stripid: 4  xtalid: 5
invalid EE detid:  65769  fed: 608  dcc:  8  ttid:  3  stripid: 5  xtalid: 1
invalid EE detid:  65770  fed: 608  dcc:  8  ttid:  3  stripid: 5  xtalid: 2
invalid EE detid:  65771  fed: 608  dcc:  8  ttid:  3  stripid: 5  xtalid: 3
invalid EE detid:  65772  fed: 608  dcc:  8  ttid:  3  stripid: 5  xtalid: 4
invalid EE detid:  65773  fed: 608  dcc:  8  ttid:  3  stripid: 5  xtalid: 5
invalid EE detid:  65961  fed: 608  dcc:  8  ttid:  6  stripid: 5  xtalid: 1
invalid EE detid:  65962  fed: 608  dcc:  8  ttid:  6  stripid: 5  xtalid: 2
invalid EE detid:  65963  fed: 608  dcc:  8  ttid:  6  stripid: 5  xtalid: 3
invalid EE detid:  65964  fed: 608  dcc:  8  ttid:  6  stripid: 5  xtalid: 4
invalid EE detid:  65965  fed: 608  dcc:  8  ttid:  6  stripid: 5  xtalid: 5
invalid EE detid:  67289  fed: 608  dcc:  8  ttid: 27  stripid: 3  xtalid: 1
invalid EE detid:  67290  fed: 608  dcc:  8  ttid: 27  stripid: 3  xtalid: 2
invalid EE detid:  67291  fed: 608  dcc:  8  ttid: 27  stripid: 3  xtalid: 3
invalid EE detid:  67292  fed: 608  dcc:  8  ttid: 27  stripid: 3  xtalid: 4
invalid EE detid:  67293  fed: 608  dcc:  8  ttid: 27  stripid: 3  xtalid: 5
invalid EE detid:  67297  fed: 608  dcc:  8  ttid: 27  stripid: 4  xtalid: 1
invalid EE detid:  67298  fed: 608  dcc:  8  ttid: 27  stripid: 4  xtalid: 2
invalid EE detid:  67299  fed: 608  dcc:  8  ttid: 27  stripid: 4  xtalid: 3
invalid EE detid:  67300  fed: 608  dcc:  8  ttid: 27  stripid: 4  xtalid: 4
invalid EE detid:  67301  fed: 608  dcc:  8  ttid: 27  stripid: 4  xtalid: 5
invalid EE detid:  67305  fed: 608  dcc:  8  ttid: 27  stripid: 5  xtalid: 1
invalid EE detid:  67306  fed: 608  dcc:  8  ttid: 27  stripid: 5  xtalid: 2
invalid EE detid:  67307  fed: 608  dcc:  8  ttid: 27  stripid: 5  xtalid: 3
invalid EE detid:  67308  fed: 608  dcc:  8  ttid: 27  stripid: 5  xtalid: 4
invalid EE detid:  67309  fed: 608  dcc:  8  ttid: 27  stripid: 5  xtalid: 5
invalid EE detid:  67497  fed: 608  dcc:  8  ttid: 30  stripid: 5  xtalid: 1
invalid EE detid:  67498  fed: 608  dcc:  8  ttid: 30  stripid: 5  xtalid: 2
invalid EE detid:  67499  fed: 608  dcc:  8  ttid: 30  stripid: 5  xtalid: 3
invalid EE detid:  67500  fed: 608  dcc:  8  ttid: 30  stripid: 5  xtalid: 4
invalid EE detid:  67501  fed: 608  dcc:  8  ttid: 30  stripid: 5  xtalid: 5
invalid EE detid:  75082  fed: 609  dcc:  9  ttid: 21  stripid: 1  xtalid: 2
invalid EE detid:  75083  fed: 609  dcc:  9  ttid: 21  stripid: 1  xtalid: 3
invalid EE detid:  75084  fed: 609  dcc:  9  ttid: 21  stripid: 1  xtalid: 4
invalid EE detid:  75085  fed: 609  dcc:  9  ttid: 21  stripid: 1  xtalid: 5
invalid EE detid: 377513  fed: 646  dcc: 46  ttid: 10  stripid: 5  xtalid: 1
invalid EE detid: 377514  fed: 646  dcc: 46  ttid: 10  stripid: 5  xtalid: 2
invalid EE detid: 377515  fed: 646  dcc: 46  ttid: 10  stripid: 5  xtalid: 3
invalid EE detid: 377516  fed: 646  dcc: 46  ttid: 10  stripid: 5  xtalid: 4
invalid EE detid: 377517  fed: 646  dcc: 46  ttid: 10  stripid: 5  xtalid: 5
invalid EE detid: 379033  fed: 646  dcc: 46  ttid: 34  stripid: 3  xtalid: 1
invalid EE detid: 379034  fed: 646  dcc: 46  ttid: 34  stripid: 3  xtalid: 2
invalid EE detid: 379035  fed: 646  dcc: 46  ttid: 34  stripid: 3  xtalid: 3
invalid EE detid: 379036  fed: 646  dcc: 46  ttid: 34  stripid: 3  xtalid: 4
invalid EE detid: 379037  fed: 646  dcc: 46  ttid: 34  stripid: 3  xtalid: 5
invalid EE detid: 379041  fed: 646  dcc: 46  ttid: 34  stripid: 4  xtalid: 1
invalid EE detid: 379042  fed: 646  dcc: 46  ttid: 34  stripid: 4  xtalid: 2
invalid EE detid: 379043  fed: 646  dcc: 46  ttid: 34  stripid: 4  xtalid: 3
invalid EE detid: 379044  fed: 646  dcc: 46  ttid: 34  stripid: 4  xtalid: 4
invalid EE detid: 379045  fed: 646  dcc: 46  ttid: 34  stripid: 4  xtalid: 5
invalid EE detid: 379049  fed: 646  dcc: 46  ttid: 34  stripid: 5  xtalid: 1
invalid EE detid: 379050  fed: 646  dcc: 46  ttid: 34  stripid: 5  xtalid: 2
invalid EE detid: 379051  fed: 646  dcc: 46  ttid: 34  stripid: 5  xtalid: 3
invalid EE detid: 379052  fed: 646  dcc: 46  ttid: 34  stripid: 5  xtalid: 4
invalid EE detid: 379053  fed: 646  dcc: 46  ttid: 34  stripid: 5  xtalid: 5
invalid EE detid: 385258  fed: 647  dcc: 47  ttid:  3  stripid: 5  xtalid: 2
invalid EE detid: 385259  fed: 647  dcc: 47  ttid:  3  stripid: 5  xtalid: 3
invalid EE detid: 385260  fed: 647  dcc: 47  ttid:  3  stripid: 5  xtalid: 4
invalid EE detid: 385261  fed: 647  dcc: 47  ttid:  3  stripid: 5  xtalid: 5
invalid EE detid: 387113  fed: 647  dcc: 47  ttid: 32  stripid: 5  xtalid: 1
invalid EE detid: 387114  fed: 647  dcc: 47  ttid: 32  stripid: 5  xtalid: 2
invalid EE detid: 387115  fed: 647  dcc: 47  ttid: 32  stripid: 5  xtalid: 3
invalid EE detid: 387116  fed: 647  dcc: 47  ttid: 32  stripid: 5  xtalid: 4
invalid EE detid: 387117  fed: 647  dcc: 47  ttid: 32  stripid: 5  xtalid: 5
invalid EE detid: 394025  fed: 648  dcc: 48  ttid: 12  stripid: 5  xtalid: 1
invalid EE detid: 394026  fed: 648  dcc: 48  ttid: 12  stripid: 5  xtalid: 2
invalid EE detid: 394027  fed: 648  dcc: 48  ttid: 12  stripid: 5  xtalid: 3
invalid EE detid: 394028  fed: 648  dcc: 48  ttid: 12  stripid: 5  xtalid: 4
invalid EE detid: 394029  fed: 648  dcc: 48  ttid: 12  stripid: 5  xtalid: 5
invalid EE detid: 395177  fed: 648  dcc: 48  ttid: 30  stripid: 5  xtalid: 1
invalid EE detid: 395178  fed: 648  dcc: 48  ttid: 30  stripid: 5  xtalid: 2
invalid EE detid: 395179  fed: 648  dcc: 48  ttid: 30  stripid: 5  xtalid: 3
invalid EE detid: 395180  fed: 648  dcc: 48  ttid: 30  stripid: 5  xtalid: 4
invalid EE detid: 395181  fed: 648  dcc: 48  ttid: 30  stripid: 5  xtalid: 5
invalid EE detid: 402217  fed: 649  dcc: 49  ttid: 12  stripid: 5  xtalid: 1
invalid EE detid: 402218  fed: 649  dcc: 49  ttid: 12  stripid: 5  xtalid: 2
invalid EE detid: 402219  fed: 649  dcc: 49  ttid: 12  stripid: 5  xtalid: 3
invalid EE detid: 402220  fed: 649  dcc: 49  ttid: 12  stripid: 5  xtalid: 4
invalid EE detid: 402221  fed: 649  dcc: 49  ttid: 12  stripid: 5  xtalid: 5
invalid EE detid: 403369  fed: 649  dcc: 49  ttid: 30  stripid: 5  xtalid: 1
invalid EE detid: 403370  fed: 649  dcc: 49  ttid: 30  stripid: 5  xtalid: 2
invalid EE detid: 403371  fed: 649  dcc: 49  ttid: 30  stripid: 5  xtalid: 3
invalid EE detid: 403372  fed: 649  dcc: 49  ttid: 30  stripid: 5  xtalid: 4
invalid EE detid: 403373  fed: 649  dcc: 49  ttid: 30  stripid: 5  xtalid: 5
invalid EE detid: 409834  fed: 650  dcc: 50  ttid:  3  stripid: 5  xtalid: 2
invalid EE detid: 409835  fed: 650  dcc: 50  ttid:  3  stripid: 5  xtalid: 3
invalid EE detid: 409836  fed: 650  dcc: 50  ttid:  3  stripid: 5  xtalid: 4
invalid EE detid: 409837  fed: 650  dcc: 50  ttid:  3  stripid: 5  xtalid: 5
invalid EE detid: 411689  fed: 650  dcc: 50  ttid: 32  stripid: 5  xtalid: 1
invalid EE detid: 411690  fed: 650  dcc: 50  ttid: 32  stripid: 5  xtalid: 2
invalid EE detid: 411691  fed: 650  dcc: 50  ttid: 32  stripid: 5  xtalid: 3
invalid EE detid: 411692  fed: 650  dcc: 50  ttid: 32  stripid: 5  xtalid: 4
invalid EE detid: 411693  fed: 650  dcc: 50  ttid: 32  stripid: 5  xtalid: 5
invalid EE detid: 418473  fed: 651  dcc: 51  ttid: 10  stripid: 5  xtalid: 1
invalid EE detid: 418474  fed: 651  dcc: 51  ttid: 10  stripid: 5  xtalid: 2
invalid EE detid: 418475  fed: 651  dcc: 51  ttid: 10  stripid: 5  xtalid: 3
invalid EE detid: 418476  fed: 651  dcc: 51  ttid: 10  stripid: 5  xtalid: 4
invalid EE detid: 418477  fed: 651  dcc: 51  ttid: 10  stripid: 5  xtalid: 5
invalid EE detid: 419993  fed: 651  dcc: 51  ttid: 34  stripid: 3  xtalid: 1
invalid EE detid: 419994  fed: 651  dcc: 51  ttid: 34  stripid: 3  xtalid: 2
invalid EE detid: 419995  fed: 651  dcc: 51  ttid: 34  stripid: 3  xtalid: 3
invalid EE detid: 419996  fed: 651  dcc: 51  ttid: 34  stripid: 3  xtalid: 4
invalid EE detid: 419997  fed: 651  dcc: 51  ttid: 34  stripid: 3  xtalid: 5
invalid EE detid: 420001  fed: 651  dcc: 51  ttid: 34  stripid: 4  xtalid: 1
invalid EE detid: 420002  fed: 651  dcc: 51  ttid: 34  stripid: 4  xtalid: 2
invalid EE detid: 420003  fed: 651  dcc: 51  ttid: 34  stripid: 4  xtalid: 3
invalid EE detid: 420004  fed: 651  dcc: 51  ttid: 34  stripid: 4  xtalid: 4
invalid EE detid: 420005  fed: 651  dcc: 51  ttid: 34  stripid: 4  xtalid: 5
invalid EE detid: 420009  fed: 651  dcc: 51  ttid: 34  stripid: 5  xtalid: 1
invalid EE detid: 420010  fed: 651  dcc: 51  ttid: 34  stripid: 5  xtalid: 2
invalid EE detid: 420011  fed: 651  dcc: 51  ttid: 34  stripid: 5  xtalid: 3
invalid EE detid: 420012  fed: 651  dcc: 51  ttid: 34  stripid: 5  xtalid: 4
invalid EE detid: 420013  fed: 651  dcc: 51  ttid: 34  stripid: 5  xtalid: 5
invalid EE detid: 427338  fed: 652  dcc: 52  ttid: 21  stripid: 1  xtalid: 2
invalid EE detid: 427339  fed: 652  dcc: 52  ttid: 21  stripid: 1  xtalid: 3
invalid EE detid: 427340  fed: 652  dcc: 52  ttid: 21  stripid: 1  xtalid: 4
invalid EE detid: 427341  fed: 652  dcc: 52  ttid: 21  stripid: 1  xtalid: 5
invalid EE detid: 434393  fed: 653  dcc: 53  ttid:  3  stripid: 3  xtalid: 1
invalid EE detid: 434394  fed: 653  dcc: 53  ttid:  3  stripid: 3  xtalid: 2
invalid EE detid: 434395  fed: 653  dcc: 53  ttid:  3  stripid: 3  xtalid: 3
invalid EE detid: 434396  fed: 653  dcc: 53  ttid:  3  stripid: 3  xtalid: 4
invalid EE detid: 434397  fed: 653  dcc: 53  ttid:  3  stripid: 3  xtalid: 5
invalid EE detid: 434401  fed: 653  dcc: 53  ttid:  3  stripid: 4  xtalid: 1
invalid EE detid: 434402  fed: 653  dcc: 53  ttid:  3  stripid: 4  xtalid: 2
invalid EE detid: 434403  fed: 653  dcc: 53  ttid:  3  stripid: 4  xtalid: 3
invalid EE detid: 434404  fed: 653  dcc: 53  ttid:  3  stripid: 4  xtalid: 4
invalid EE detid: 434405  fed: 653  dcc: 53  ttid:  3  stripid: 4  xtalid: 5
invalid EE detid: 434409  fed: 653  dcc: 53  ttid:  3  stripid: 5  xtalid: 1
invalid EE detid: 434410  fed: 653  dcc: 53  ttid:  3  stripid: 5  xtalid: 2
invalid EE detid: 434411  fed: 653  dcc: 53  ttid:  3  stripid: 5  xtalid: 3
invalid EE detid: 434412  fed: 653  dcc: 53  ttid:  3  stripid: 5  xtalid: 4
invalid EE detid: 434413  fed: 653  dcc: 53  ttid:  3  stripid: 5  xtalid: 5
invalid EE detid: 434601  fed: 653  dcc: 53  ttid:  6  stripid: 5  xtalid: 1
invalid EE detid: 434602  fed: 653  dcc: 53  ttid:  6  stripid: 5  xtalid: 2
invalid EE detid: 434603  fed: 653  dcc: 53  ttid:  6  stripid: 5  xtalid: 3
invalid EE detid: 434604  fed: 653  dcc: 53  ttid:  6  stripid: 5  xtalid: 4
invalid EE detid: 434605  fed: 653  dcc: 53  ttid:  6  stripid: 5  xtalid: 5
invalid EE detid: 435929  fed: 653  dcc: 53  ttid: 27  stripid: 3  xtalid: 1
invalid EE detid: 435930  fed: 653  dcc: 53  ttid: 27  stripid: 3  xtalid: 2
invalid EE detid: 435931  fed: 653  dcc: 53  ttid: 27  stripid: 3  xtalid: 3
invalid EE detid: 435932  fed: 653  dcc: 53  ttid: 27  stripid: 3  xtalid: 4
invalid EE detid: 435933  fed: 653  dcc: 53  ttid: 27  stripid: 3  xtalid: 5
invalid EE detid: 435937  fed: 653  dcc: 53  ttid: 27  stripid: 4  xtalid: 1
invalid EE detid: 435938  fed: 653  dcc: 53  ttid: 27  stripid: 4  xtalid: 2
invalid EE detid: 435939  fed: 653  dcc: 53  ttid: 27  stripid: 4  xtalid: 3
invalid EE detid: 435940  fed: 653  dcc: 53  ttid: 27  stripid: 4  xtalid: 4
invalid EE detid: 435941  fed: 653  dcc: 53  ttid: 27  stripid: 4  xtalid: 5
invalid EE detid: 435945  fed: 653  dcc: 53  ttid: 27  stripid: 5  xtalid: 1
invalid EE detid: 435946  fed: 653  dcc: 53  ttid: 27  stripid: 5  xtalid: 2
invalid EE detid: 435947  fed: 653  dcc: 53  ttid: 27  stripid: 5  xtalid: 3
invalid EE detid: 435948  fed: 653  dcc: 53  ttid: 27  stripid: 5  xtalid: 4
invalid EE detid: 435949  fed: 653  dcc: 53  ttid: 27  stripid: 5  xtalid: 5
invalid EE detid: 436137  fed: 653  dcc: 53  ttid: 30  stripid: 5  xtalid: 1
invalid EE detid: 436138  fed: 653  dcc: 53  ttid: 30  stripid: 5  xtalid: 2
invalid EE detid: 436139  fed: 653  dcc: 53  ttid: 30  stripid: 5  xtalid: 3
invalid EE detid: 436140  fed: 653  dcc: 53  ttid: 30  stripid: 5  xtalid: 4
invalid EE detid: 436141  fed: 653  dcc: 53  ttid: 30  stripid: 5  xtalid: 5
invalid EE detid: 443722  fed: 654  dcc: 54  ttid: 21  stripid: 1  xtalid: 2
invalid EE detid: 443723  fed: 654  dcc: 54  ttid: 21  stripid: 1  xtalid: 3
invalid EE detid: 443724  fed: 654  dcc: 54  ttid: 21  stripid: 1  xtalid: 4
invalid EE detid: 443725  fed: 654  dcc: 54  ttid: 21  stripid: 1  xtalid: 5

On average there are more than 4 invalid detids per event, but they seem to be the same ones repeating.
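
As a cross-check on the list above: the number printed in the "detid" column is consistent with a bit-packing of the electronics coordinates, which suggests that it is the index used to look up the mapping rather than a geometric DetId. The packing below is inferred from the listed rows only (checked by hand on a handful of them); the function names and the field widths are assumptions, not taken from the CMSSW sources.

```cpp
#include <cassert>
#include <cstdint>

// Packing inferred from the list: dcc << 13 | ttid << 6 | stripid << 3 | xtalid.
// The masks follow from the shifts (7, 3 and 3 bits) and are an assumption.
constexpr std::uint32_t pack_electronics_index(std::uint32_t dcc,
                                               std::uint32_t ttid,
                                               std::uint32_t stripid,
                                               std::uint32_t xtalid) {
  return (dcc << 13) | ((ttid & 0x7F) << 6) | ((stripid & 0x7) << 3) | (xtalid & 0x7);
}

// In every listed row the DCC id equals the FED id minus 600.
constexpr std::uint32_t fed_to_dcc(std::uint32_t fed) { return fed - 600; }

// Spot checks against rows of the list.
static_assert(pack_electronics_index(fed_to_dcc(601), 10, 5, 1) == 8873, "first row");
static_assert(pack_electronics_index(fed_to_dcc(654), 21, 1, 5) == 443725, "last row");
```

If this reading is right, each listed entry is a (dcc, ttid, stripid, xtalid) combination that shows up in the raw data but has no entry in the mapping.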

@fwyzard
Contributor

fwyzard commented Jun 12, 2024

#45209 (14.1.x) and #45210 (14.0.x) fix this issue by explicitly initialising the EcalElectronicsMappingHost to contain null (invalid) detids, instead of relying on the value of uninitialised memory.

They do not address the fact that we find channels that seem to belong to invalid detids.
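
The idea of the fix can be sketched as follows. This is not the code of #45209 or #45210: `make_mapping`, `should_unpack` and `kInvalidDetId` are hypothetical names, and a plain array stands in for EcalElectronicsMappingHost. The only point is that every slot gets a defined "invalid" value before the known channels are written, so a lookup for an unmapped channel returns 0 and is skipped by the existing `didraw == 0` check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

constexpr std::uint32_t kInvalidDetId = 0;  // null detid, already skipped by the unpacker

// Build a lookup table of `size` slots from (index, rawid) pairs.
std::unique_ptr<std::uint32_t[]> make_mapping(
    std::size_t size,
    const std::vector<std::pair<std::size_t, std::uint32_t>>& known) {
  // `new T[n]` leaves the elements uninitialized, like the original buffer.
  std::unique_ptr<std::uint32_t[]> table(new std::uint32_t[size]);
  // The explicit initialisation: every slot starts out as "invalid".
  std::fill_n(table.get(), size, kInvalidDetId);
  // Only the channels that are actually mapped get a real detid.
  for (const auto& entry : known)
    table[entry.first] = entry.second;
  return table;
}

// Mirrors the `if (didraw == 0) continue;` check in the unpacking kernel.
bool should_unpack(const std::uint32_t* table, std::size_t index) {
  return table[index] != kInvalidDetId;
}
```

With the fill in place the result no longer depends on what the allocator left in the buffer, so MALLOC_CONF=junk:true and a zero-filling allocator behave the same way.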

@mmusich
Contributor

mmusich commented Jun 13, 2024

+hlt

tested explicitly with the following recipe:

cmsrel CMSSW_14_0_9_MULTIARCHS
cd CMSSW_14_0_9_MULTIARCHS/src
cmsenv
git cms-addpkg EventFilter/EcalRawToDigi
git cms-merge-topic 45210
scram b -j 20

and confirmed, using the reproducer at #44956 (comment) [*], that the segmentation fault (and all the preceding error messages) are gone.

[*]
#!/bin/bash -ex
export MALLOC_CONF=junk:true
https_proxy=http://cmsproxy.cms:3128 hltConfigFromDB --runNumber 380466 > hlt_run380466.py
cat <<@EOF >> hlt_run380466.py
from EventFilter.Utilities.EvFDaqDirector_cfi import EvFDaqDirector as _EvFDaqDirector
process.EvFDaqDirector = _EvFDaqDirector.clone(
    buBaseDir = '/eos/cms/store/group/tsg/FOG/error_stream/',
    runNumber = 380466
)
from EventFilter.Utilities.FedRawDataInputSource_cfi import source as _source
process.source = _source.clone(
    fileListMode = True,
    fileNames = (
    '/eos/cms/store/group/tsg/FOG/error_stream/run380466/run380466_ls0276_index000212_fu-c2b03-09-01_pid672001.raw',
    )
)
process.options.numberOfThreads = 1
process.options.numberOfStreams = 0
process.options.wantSummary = True
process.hltOnlineBeamSpotESProducer.timeThreshold = int(1e6)
del process.MessageLogger
process.load('FWCore.MessageService.MessageLogger_cfi')
@EOF

mkdir run380466
cmsRun hlt_run380466.py &> crash_run380466.log

@cmsbuild
Contributor

This issue is fully signed and ready to be closed.

mmusich added a commit to mmusich/hltScripts that referenced this issue Jun 13, 2024
6 participants