-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ConfigurationCallbackStorage
Crash
#1385
Changes from 14 commits
43de2b6
8f51234
d546f2e
8f8b5ab
3b68a60
97d769f
5c7ae5c
2951226
ecf9759
cf9990c
cb3bf33
ad83d5c
da5da37
498b348
20d7a46
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -93,19 +93,16 @@ import Foundation | |
/// cached on subsequent calls for better performance. | ||
@_documentation(visibility: private) | ||
public func fetchOrReturnRemoteConfiguration(_ completion: @escaping (BTConfiguration?, Error?) -> Void) { | ||
configurationLoader.getConfig { [weak self] configuration, error in | ||
guard let self else { | ||
completion(nil, BTAPIClientError.deallocated) | ||
return | ||
} | ||
|
||
if let error { | ||
// TODO: - Consider updating all feature clients to use async version of this method? | ||
|
||
Task { @MainActor in | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just making sure that these don't cause issues with nested calls to fetchOrReturnConfig. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think main difference for our purposes is that @mainactor can have suspension points so we wouldn't have deadlocks with nested completion handlers that call fetchOrReturnRemoteConfiguration There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm just puzzled why there are differences in analytics output with this commit as opposed to what's on main or previous commits on this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess we are just doing same on two call sites, so coordinating the two configurationLoader.getConfig calls. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That test you quoted above, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I haven't personally seen weirdness or difference in the FPTI between what was on main before, before this crash fix was implemented, after adding it, with or without MainActor annotations for fetchConfig functions. But I trust that you saw something, there is a lot going on with sending analytics. One small issue I can see right now is if network request fails, events will be dropped. My main concern is potential for more latency for fetching configuration for Let's keep an eye on latency metrics and then we can revisit. What are your thoughts on that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
imo, adding
If you look at ticket DTMOBILES-998 you will see this same behavior where we make multiple network called and the config isn't cached properly. There are screenshots in that ticket with additional details from FPTI.
There is clearly a need to annotate as is, removing it causes the issues we can clearly see in FPTI noted above. Removing the duplicate config call and just relying on the async method should resolve any remaining confusion. There is not additional latency added that did not exist previously. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thank you for the screenshots. Do these extra calls happen in 6.23.2? I can check. Looks like the a lot of extra v1/config are coming from analytics sent from fetchAPITiming function. This is strange, I got different results from FPTI. I will post on your PR after doing a few more runs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I made comments on your jira ticket with screenshots. I tested 6.23.3 with @mainactor annotation as it is currently on main wit no changes, 6.23.3 without @mainactor annotation but with await MainActor.run on setHTTPCredentials and completion calls and 6.23.2 with no changes. I didn't incorporate your changes on this branch, I am going to try that as well. I found that there are no duplicate calls to v1/configuration in any of them, they are all fetching from cache but 6.23.3 without @mainactor annotation results in analytic events being listed more out of order, this was also true for 6.23.2, the analytics were out of order. |
||
do { | ||
let configuration = try await configurationLoader.getConfig() | ||
setupHTTPCredentials(configuration) | ||
completion(configuration, nil) | ||
} catch { | ||
completion(nil, error) | ||
return | ||
} | ||
|
||
setupHTTPCredentials(configuration) | ||
completion(configuration, nil) | ||
} | ||
} | ||
|
||
|
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,15 +1,20 @@ | ||
import Foundation | ||
|
||
@globalActor actor ConfigurationActor { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is neat! Can we add a docstring for what purpose this serves? |
||
static let shared = ConfigurationActor() | ||
} | ||
|
||
class ConfigurationLoader { | ||
|
||
// MARK: - Private Properties | ||
|
||
private let configPath = "v1/configuration" | ||
private let configurationCache = ConfigurationCache.shared | ||
private let http: BTHTTP | ||
private let pendingCompletions = ConfigurationCallbackStorage() | ||
|
||
// MARK: - Intitializer | ||
|
||
private var existingTask: Task<BTConfiguration, Error>? | ||
|
||
// MARK: - Initializer | ||
|
||
init(http: BTHTTP) { | ||
self.http = http | ||
|
@@ -35,56 +40,38 @@ class ConfigurationLoader { | |
/// - `BTConfiguration?`: The configuration object if it is successfully fetched or retrieved from the cache. | ||
/// - `Error?`: An error object if fetching the configuration fails or if the instance is deallocated. | ||
@_documentation(visibility: private) | ||
func getConfig(completion: @escaping (BTConfiguration?, Error?) -> Void) { | ||
@ConfigurationActor | ||
func getConfig() async throws -> BTConfiguration { | ||
if let existingTask { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't we check the cache before bothering checking if there are any pending tasks? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Jax had the same question. I was thinking of scenario where there may be read/write cache race condition if there is a pending task. The task includes writing into the cache. |
||
return try await existingTask.value | ||
} | ||
|
||
if let cachedConfig = try? configurationCache.getFromCache(authorization: http.authorization.bearer) { | ||
completion(cachedConfig, nil) | ||
return | ||
return cachedConfig | ||
} | ||
|
||
pendingCompletions.add(completion) | ||
|
||
// If this is the 1st `v1/config` GET attempt, proceed with firing the network request. | ||
// Otherwise, there is already a pending network request. | ||
if pendingCompletions.count == 1 { | ||
http.get(configPath, parameters: BTConfigurationRequest()) { [weak self] body, response, error in | ||
guard let self else { | ||
self?.notifyCompletions(nil, BTAPIClientError.deallocated) | ||
return | ||
} | ||
|
||
if let error { | ||
notifyCompletions(nil, error) | ||
return | ||
} else if response?.statusCode != 200 || body == nil { | ||
notifyCompletions(nil, BTAPIClientError.configurationUnavailable) | ||
return | ||
|
||
let task = Task { [weak self] in | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we just do Line 75 is a little confusing why we're doing the await on the |
||
guard let self else { | ||
throw BTAPIClientError.deallocated | ||
} | ||
|
||
defer { self.existingTask = nil } | ||
do { | ||
let (body, response) = try await http.get(configPath, parameters: BTConfigurationRequest()) | ||
|
||
if response?.statusCode != 200 || body == nil { | ||
throw BTAPIClientError.configurationUnavailable | ||
} else { | ||
let configuration = BTConfiguration(json: body) | ||
|
||
try? configurationCache.putInCache(authorization: http.authorization.bearer, configuration: configuration) | ||
|
||
notifyCompletions(configuration, nil) | ||
return | ||
} | ||
} | ||
} | ||
} | ||
|
||
func getConfig() async throws -> BTConfiguration { | ||
try await withCheckedThrowingContinuation { continuation in | ||
getConfig { configuration, error in | ||
if let error { | ||
continuation.resume(throwing: error) | ||
} else if let configuration { | ||
continuation.resume(returning: configuration) | ||
return configuration | ||
} | ||
} catch { | ||
throw error | ||
} | ||
} | ||
} | ||
|
||
// MARK: - Private Methods | ||
|
||
func notifyCompletions(_ configuration: BTConfiguration?, _ error: Error?) { | ||
pendingCompletions.invoke(configuration, error) | ||
|
||
existingTask = task | ||
return try await task.value | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created DTMOBILES-980 for this TODO. Want to focus on tackling the crash first, then we can follow up with this change