Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warn logs for receiving null properties #1004

Merged
merged 2 commits into from
Jun 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@ protected void handlePut(final String listenTo, final ClusterProperties discover
}
else
{
_log.warn("Received a null cluster properties for {}", listenTo);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last time we added logs around, they ended up being very noisy. Can you double check this isn't the case for this logger?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this log should not be logged unless there is a bug. And if there is a bug, we would want to see this log and take actions. So I think this log should be a must-show. I can make it an error instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the Uri one could use a rate limiter in case of huge clusters having many uris. The service and cluster data won't change frequently anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A rate limit seems like a good idea. I agree we want to fail loudly, but we don't want to fail so loudly it breaks people's logs to the point where we're scrambling to put in a fix! This has happened a few times already

// still insert the ClusterInfoItem when discoveryProperties is null, but don't create accessor
_simpleLoadBalancerState.getClusterInfo().put(listenTo,
new ClusterInfoItem(_simpleLoadBalancerState, null, null, null));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ else if (oldServicePropertiesItem != null)
// we'll just ignore the event and move on.
// we could receive a null if the file store properties cannot read/write a file.
// in this case it's better to leave the state intact and not do anything
_log.warn("We receive a null service properties for {}. ", listenTo);
_log.warn("Received a null service properties for {}", listenTo);
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,12 +22,15 @@
import com.linkedin.d2.balancer.properties.PartitionData;
import com.linkedin.d2.balancer.properties.UriProperties;
import com.linkedin.d2.discovery.event.PropertyEventBus;
import com.linkedin.util.RateLimitedLogger;
import com.linkedin.util.clock.SystemClock;
import java.net.URI;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

Expand All @@ -40,6 +43,8 @@
class UriLoadBalancerSubscriber extends AbstractLoadBalancerSubscriber<UriProperties>
{
private static final Logger _log = LoggerFactory.getLogger(UriLoadBalancerSubscriber.class);
private static final RateLimitedLogger RATE_LIMITED_LOGGER =
new RateLimitedLogger(_log, TimeUnit.MINUTES.toMillis(10), SystemClock.instance());

private SimpleLoadBalancerState _simpleLoadBalancerState;

Expand Down Expand Up @@ -164,15 +169,15 @@ protected void handlePut(final String cluster, final UriProperties uriProperties
// uri properties was null, we'll just log the event and continues.
// The reasoning is we might receive a null event when there's a problem writing/reading
// cache file, or we just started listening to a cluster without any uris yet.
warn(_log, "received a null uri properties for cluster: ", cluster);
RATE_LIMITED_LOGGER.warn("Received a null uri properties for cluster: {}", cluster);
}
}

@Override
protected void handleRemove(final String cluster)
{
_simpleLoadBalancerState.getUriProperties().remove(cluster);
warn(_log, "received a uri properties event remove() for cluster: ", cluster);
warn(RATE_LIMITED_LOGGER, "received a uri properties event remove() for cluster: ", cluster);
_simpleLoadBalancerState.removeTrackerClients(cluster);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -212,7 +212,7 @@ public void put(String listenTo, T discoveryProperties)
{
if (discoveryProperties == null)
{
warn(_log, "received a null property for resource ", listenTo, " received a null property");
_log.warn("Received and ignored a null property for resource: {}", listenTo);
}
else
{
Expand Down
Loading