Error while exporting Metrics #2391

Open
Veeraraghavans opened this issue Nov 3, 2023 · 14 comments
Labels
bug Something isn't working

Comments

@Veeraraghavans

Hello team,

I'm trying to use OpenTelemetry C++ version 1.8.1 to export metrics from an Ubuntu 22.04 machine. My plugin code creates the agents and the provider used to export the metrics. When I try to create the metrics provider, I get an allocation error, and I'm not sure what is causing it.

terminate called after throwing an instance of 'std::bad_alloc'
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc  what():  std::bad_alloc

I analyzed the problem with gdbgui and found that the failure happens when MetaDataValidator is called: it triggers the regex construction and allocation shown below, which fails.

image

It would be great if anyone has an idea about this. I have been stuck on it for a while, so any input is welcome. I'm happy to provide more details if needed.

@Veeraraghavans Veeraraghavans added the bug Something isn't working label Nov 3, 2023
@github-actions github-actions bot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 3, 2023
@lalitb
Member

lalitb commented Nov 3, 2023

@Veeraraghavans Which compiler? Also, do you have the sample code which is failing?

@Veeraraghavans
Author

Veeraraghavans commented Nov 6, 2023

Hi @lalitb

I use gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0. Here is the snippet of code I use to create the Meter:

nostd::shared_ptr<metrics_api::Meter> MetricAgent::GetMeter()
{
    auto provider = metrics_api::Provider::GetMeterProvider();
    return provider->GetMeter(this->serviceName, OPENTELEMETRY_SDK_VERSION);
}

More information:

When I call GetMeter, it calls GetMeter on the MeterProvider. While creating the Meter, OpenTelemetry calls InstrumentDataValidator, where the regex error is thrown.

std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >::basic_regex<std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::regex_constants::syntax_option_type)

image

Please let me know whether this information is enough or whether you need more.

@lalitb
Member

lalitb commented Nov 7, 2023

@Veeraraghavans - Do you get a similar crash while running - https://github.com/open-telemetry/opentelemetry-cpp/tree/main/examples/metrics_simple? Also, what is the otel-cpp version you are using? If it is from the main branch, do you also see the crash with v1.12.0?

@Veeraraghavans
Author

Veeraraghavans commented Nov 8, 2023

No, I am not getting a crash there. I could run the metrics_simple example you shared. The OpenTelemetry version I use is 1.8.1, built from the 1.8.1 branch.

@lalitb
Member

lalitb commented Nov 9, 2023

Sorry, the stack trace is not enough for me to debug further. I can't see why allocation should fail in regex init. Leaving this here in case someone wants to comment or debug. Otherwise, it would be helpful if you could provide sample code (not just the snippet) that fails consistently.

@Veeraraghavans
Author

@lalitb Thanks for your reply. Do you have any idea about common reasons for an allocation failure at regex init? I can share only the part of the code that fails, as it is proprietary code. I will check on giving access.
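
One way to narrow this down is to construct the regex in isolation, inside the same plugin binary, and see whether std::regex itself fails there. The following is a minimal sketch, not SDK code: TryBuildRegex is a hypothetical helper, and the pattern passed to it in the usage note is an assumption about the kind of name pattern being validated, not something confirmed in this thread.

```cpp
#include <new>
#include <regex>
#include <string>

// Hypothetical helper (not part of opentelemetry-cpp): builds a std::regex
// from `pattern` and reports the outcome as text instead of letting an
// exception terminate the process.
std::string TryBuildRegex(const std::string &pattern)
{
  try
  {
    std::regex re(pattern);
    // Use the compiled regex once so the construction is actually exercised.
    return std::regex_match("abc_plugin", re) ? "ok: matched" : "ok: no match";
  }
  catch (const std::bad_alloc &e)
  {
    // Same exception type as in the report above.
    return std::string("bad_alloc: ") + e.what();
  }
  catch (const std::regex_error &e)
  {
    // The pattern itself was rejected by the regex parser.
    return std::string("regex_error: ") + e.what();
  }
}
```

Calling TryBuildRegex("[a-zA-Z][-_.a-zA-Z0-9]{0,62}") early in the plugin's entry point and logging the result would show whether regex construction already fails outside the SDK. If it does, the problem is more likely in how the plugin is built or loaded than in the metrics code.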

@lalitb
Member

lalitb commented Nov 13, 2023

@Veeraraghavans It would be more helpful if you could share an example (along the lines of https://github.com/open-telemetry/opentelemetry-cpp/tree/main/examples/metrics_simple ) that crashes on regex init, something that is easy to compile and reproduces the problem so it can be debugged further.

@lalitb lalitb added the triage/needs-information Indicates an issue needs more information in order to work on it. label Nov 13, 2023
@Veeraraghavans
Author

Veeraraghavans commented Nov 14, 2023

@lalitb Please find below the code that crashes during execution. It has two parts.

plugin.cpp is the main code, which creates the resource and the MetricAgent.

#include "agents/MetricAgent.h"

int GetProcessID();  // defined below

int main()
{
    int processID = GetProcessID();
    // Create an opentelemetry-cpp Resource to attach to the telemetry data
    resource::ResourceAttributes attributes = {{"service.name", "ABC_PLUGIN"}, {"version", "latest"}, {"process_id", processID}};
    auto resource = resource::Resource::Create(attributes);
    std::string endpoint = "localhost:4317/v1/metrics";
    static ObservabilityPlugin::MetricAgent metricAgent(ABC_PLUGIN, GRPC, endpoint, resource);
    metricAgent.ActivateMetricType(ObservabilityPlugin::DefaultMetrics::All);  // calls ActivateMetricType in MetricAgent.cpp
    return 0;
}

// Get process Id:

int GetProcessID()
{
    C_Communicator* com = C_Communicator::Instance();
    // Return the CPU number only when running with more than one process
    if (com != nullptr && com->size() > 1)
        return com->cpuNum();
    return 0;
}

agents/MetricAgent.cpp creates the metric exporter and the provider.

MetricAgent::MetricAgent(const std::string& serviceName, const std::string& protocol, const std::string& endpoint, resource::Resource resource, unsigned int frequency)
{
    this->serviceName = serviceName;
    auto attr = resource.GetAttributes();
    auto it = attr.find("process_id");
    if(it != attr.end()){
        this->processID = nostd::get<int>(it->second);
    }
    std::unique_ptr<metric_sdk::PushMetricExporter> exporter;
    this->metricGRPCExporterOptions.aggregation_temporality = metric_sdk::AggregationTemporality::kCumulative;
    this->metricGRPCExporterOptions.endpoint = endpoint;
    exporter = otlp::OtlpGrpcMetricExporterFactory::Create(metricGRPCExporterOptions);
    metric_sdk::PeriodicExportingMetricReaderOptions metricReaderOptions;
    metricReaderOptions.export_interval_millis = std::chrono::milliseconds(frequency);
    metricReaderOptions.export_timeout_millis  = std::chrono::milliseconds(frequency/2);
    std::unique_ptr<metric_sdk::MetricReader> reader{new metric_sdk::PeriodicExportingMetricReader(std::move(exporter), metricReaderOptions)};
    auto provider = std::shared_ptr<metrics_api::MeterProvider>(new metric_sdk::MeterProvider(std::unique_ptr<metric_sdk::ViewRegistry>(new metric_sdk::ViewRegistry()), resource));
    auto p = std::static_pointer_cast<metric_sdk::MeterProvider>(provider);
    p->AddMetricReader(std::move(reader));
    metrics_api::Provider::SetMeterProvider(provider);
}

// Asks the metrics MeterProvider for a Meter and adds the metric counters
void MetricAgent::ActivateMetricType(DefaultMetrics type)
{
    auto meter = this->GetMeter();
    // The error is thrown here, inside GetMeter(), when this function is called from plugin.cpp
    switch (type)
    {
        //......
    }
}

nostd::shared_ptr<metrics_api::Meter> MetricAgent::GetMeter()
{
    auto provider = metrics_api::Provider::GetMeterProvider();
    return provider->GetMeter(this->serviceName, OPENTELEMETRY_SDK_VERSION);
}

@marcalff
Member

Given how the regexp crashes on the name given to GetMeter(), what is the actual value of serviceName?

Does it look properly initialized?

@marcalff marcalff removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 15, 2023
@Veeraraghavans
Author

Veeraraghavans commented Nov 15, 2023

It has the following values: serviceName="abc_plugin" in the example, and OPENTELEMETRY_SDK_VERSION=1.8.1. I think it is initialized correctly, since MetricAgent::MetricAgent(const std::string& serviceName, const std::string& protocol, const std::string& endpoint, resource::Resource resource, unsigned int frequency) executed fine, but the problem appears when I call GetMeter.

Is there a way to enable logs, or some other way to check what happens?

@lalitb lalitb removed the triage/needs-information Indicates an issue needs more information in order to work on it. label Nov 15, 2023
@Veeraraghavans
Author

Veeraraghavans commented Nov 20, 2023

Hey @lalitb @marcalff,

Do you think using the -D_GLIBCXX_USE_CXX11_ABI flag could cause this issue? Or have you found any other possible reason? Any input will be helpful.
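
To compare ABI settings, each binary (the plugin, and a small test program built with the same flags as the opentelemetry-cpp libraries) can report the value of _GLIBCXX_USE_CXX11_ABI it was compiled with. The helper below is a hypothetical sketch, and a mismatch is only a suspicion here, not a confirmed cause. As a hint, the `std::__cxx11::basic_string` symbol in the stack trace above is the libstdc++ name for the new-ABI string, which suggests that frame was built with the macro set to 1.

```cpp
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Hypothetical helper: returns 1 or 0 for the libstdc++ dual-ABI setting this
// translation unit was compiled with, or -1 when the macro is not defined
// (for example when a different standard library is used).
int Cxx11AbiSetting()
{
#if defined(_GLIBCXX_USE_CXX11_ABI)
  return _GLIBCXX_USE_CXX11_ABI;
#else
  return -1;
#endif
}
```

If the plugin reports 0 while the SDK libraries were built with 1 (or the other way round), std::string objects passed across that boundary have different layouts, which is a plausible source of memory errors and worth ruling out.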

@Veeraraghavans
Author

Hi @marcalff @lalitb

Do you have any ideas on this? I tried debugging inside the SDK. The error occurs at the regex validation, and the values passed are exactly "mapdl_plugin" and "1.8.1". When I disable the validation, the code proceeds but then fails when creating the counter on the Meter.

[Error] File: /home/vsekar/observability-plugins/source/opentelemetry-cpp-v1.8/sdk/src/metrics/meter.cc:46Meter::CreateUInt64Counter - failed. Invalid parameters.mapdl_plugin_counter_nb_of_processes Number of processes . Measurements won't be recorded.

The same code works fine in the other example but fails when I call it from my plugin code.

This issue was marked as stale due to lack of activity.

@github-actions github-actions bot added the Stale label Jan 30, 2024
@ashley-b

ashley-b commented Dec 3, 2024

@Veeraraghavans I just came across this error in my own project:

Invalid parameters.mapdl_plugin_counter_nb_of_processes Number of processes . Measurements won't be recorded.

I found the cause was a space character in the metric name.
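
A pre-check like the following can catch such names before they reach the SDK. This is a hypothetical sketch, not the SDK's own validator. The rule it encodes (a leading ASCII letter, then letters, digits, '_', '.', or '-', at most 63 characters in total) is an assumption based on the OpenTelemetry instrument naming rule of that period, and newer versions may allow more.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers, not part of opentelemetry-cpp.
static bool IsAsciiLetter(char c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool IsAsciiDigit(char c)
{
  return c >= '0' && c <= '9';
}

// Assumed rule: first character is an ASCII letter, every character is a
// letter, digit, '_', '.', or '-', and the length is between 1 and
// `max_length` (63 is an assumption for this SDK version).
bool LooksLikeValidInstrumentName(const std::string &name, std::size_t max_length = 63)
{
  if (name.empty() || name.size() > max_length)
    return false;
  if (!IsAsciiLetter(name.front()))
    return false;
  for (char c : name)
  {
    const bool allowed = IsAsciiLetter(c) || IsAsciiDigit(c) || c == '_' || c == '.' || c == '-';
    if (!allowed)
      return false;  // rejects spaces, among other characters
  }
  return true;
}
```

With this rule, a name such as "nb of processes" is rejected because of the spaces, while "mapdl_plugin_counter_nb_of_processes" passes. Logging the exact name, with delimiters around it, before creating the instrument also makes stray leading or trailing spaces visible.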

@github-actions github-actions bot removed the Stale label Dec 6, 2024