Not grouping by the correct information #402

Open
quantfreedom opened this issue Sep 18, 2024 · 8 comments

@quantfreedom

I have noticed that sometimes your agg trader will catch some trades that my websocket stream isn't catching, and vice versa. Your agg trader also catches way more million-dollar buys than mine does.

I took a look at your GitHub and noticed you don't seem to be doing anything different from me, and I have the same exchanges running and everything. I also noticed that when you get a list of trades that aren't at the same price, you squish a lot of different prices together, which is really not good.

Sometimes there will be something like a 5 dollar difference between prices on bitcoin, but you squish them all into one price. That is how you sometimes get 7 or 8 mil buys on Bybit at a specific price when that isn't really the case.

I would love to talk to you more about this because I also made my own big trade market scanner. So if you're interested in talking, let me know.

So you are basically just grouping by side and time and merging all the prices together.

@Tucsky
Owner

Tucsky commented Sep 19, 2024

By default, aggregation occurs when an exchange reports multiple trades for the exact same market, on the same side and at the exact same time (same ms).

Below is an example of a large aggregation showing first price / weighted avg price / last price
image
As you can see, it's not uncommon to have a $15 difference for the larger aggregations.

We display only the last price to maximize performance. However, I am willing to address concerns regarding data accuracy if necessary.
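A minimal sketch of the rule described above, not Aggr's actual code. It assumes trades arrive in order and that only consecutive trades are merged; the `Trade` fields and the `aggregate_same_ms` helper are hypothetical names chosen for illustration. It reports the three candidate prices so they can be compared side by side.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Trade:
    market: str   # hypothetical identifier, e.g. "BYBIT:BTCUSDT"
    side: str     # "buy" or "sell"
    ts_ms: int    # exchange timestamp in milliseconds
    price: float
    size: float   # base-asset quantity


def aggregate_same_ms(trades):
    """Merge consecutive trades that share market, side and millisecond."""
    merged = []
    same_key = lambda t: (t.market, t.side, t.ts_ms)
    for (market, side, ts_ms), group in groupby(trades, key=same_key):
        fills = list(group)
        total_size = sum(t.size for t in fills)
        merged.append({
            "market": market,
            "side": side,
            "ts_ms": ts_ms,
            "size": total_size,
            "first_price": fills[0].price,
            # size-weighted average price of the merged fills
            "avg_price": sum(t.price * t.size for t in fills) / total_size,
            "last_price": fills[-1].price,
        })
    return merged


example_trades = [
    Trade("BYBIT:BTCUSDT", "buy", 1000, 60000.0, 1.0),
    Trade("BYBIT:BTCUSDT", "buy", 1000, 60005.0, 2.0),
    Trade("BYBIT:BTCUSDT", "buy", 1000, 60015.0, 1.0),
    Trade("BYBIT:BTCUSDT", "sell", 1001, 60010.0, 0.5),
]
merged_trades = aggregate_same_ms(example_trades)
```

In this made-up sample the three buys collapse into one entry whose first and last prices are $15 apart, while the weighted average sits between them.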

Which price do you think makes more sense?

@quantfreedom
Author

quantfreedom commented Sep 19, 2024

So I am not sure how people do trade stream aggregation. I don't know how Binance or OKX does it, and I have always wondered what Binance's aggregation algorithm is, because things can get really, really weird.

If we aggregate by time and side only, then what is the cutoff for price difference when aggregating trades, if any?

If we aggregate by price and side, then how much time do we allow to pass before we aggregate?

The way we are doing it is grouping by timestamp, price, and side, and if all of those match we aggregate the USDT size.
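To make the difference concrete, here is a small sketch comparing the two groupings on the same trades. The dictionary fields and the `aggregate_usdt` helper are hypothetical, and the numbers are invented for illustration only.

```python
from collections import defaultdict


def aggregate_usdt(trades, key):
    """Sum USDT size for every group of trades that share the same key."""
    totals = defaultdict(float)
    for trade in trades:
        totals[key(trade)] += trade["usdt"]
    return dict(totals)


sample = [
    {"ts_ms": 1000, "price": 60000.0, "side": "buy", "usdt": 250_000.0},
    {"ts_ms": 1000, "price": 60000.0, "side": "buy", "usdt": 150_000.0},
    {"ts_ms": 1000, "price": 60005.0, "side": "buy", "usdt": 300_000.0},
    {"ts_ms": 1000, "price": 60000.0, "side": "sell", "usdt": 50_000.0},
]

# timestamp + price + side: every distinct price stays separate
strict = aggregate_usdt(sample, lambda t: (t["ts_ms"], t["price"], t["side"]))

# timestamp + side only: all buy prices in the same ms are merged
loose = aggregate_usdt(sample, lambda t: (t["ts_ms"], t["side"]))
```

With this sample, the strict grouping yields buys of 400k and 300k at two prices, while the looser grouping yields a single 700k buy.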

But i am not saying we are doing it right, I just thought that is the way it was supposed to be done.

So i think we can all agree that the absolute most accurate is to group by all three but then it's like how much does that matter?

For me personally, I think the order of importance is side, price, timestamp, with the timestamp allowing for around 200ms of distance between the times. The problem we ran into is that sometimes Bybit, for instance, would send 3 responses where the first two have a couple of timestamps that overlap and the last two have times that overlap a little, and the max time distance between the lowest and highest out of the three is 250ms. So then how do we decide how many responses to wait for before grouping everything, and where does the time grouping start and stop?

The same situation applies to pricing, so it becomes a logical nightmare.
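The ambiguity can be shown with a tiny sketch. It assumes two possible readings of "200ms of distance": measured from the previous trade (chained) or from the first trade of the group (anchored). The `group_by_gap` helper and the timestamps are hypothetical.

```python
def group_by_gap(timestamps, max_gap_ms=200, anchored=False):
    """Group timestamps that lie within max_gap_ms of a reference time.

    anchored=False compares against the previous timestamp in the group,
    anchored=True compares against the first timestamp in the group.
    """
    groups = []
    for ts in sorted(timestamps):
        if groups:
            reference = groups[-1][0] if anchored else groups[-1][-1]
            if ts - reference <= max_gap_ms:
                groups[-1].append(ts)
                continue
        groups.append([ts])
    return groups


batch_times = [0, 120, 250]  # three responses spanning 250ms in total

chained = group_by_gap(batch_times)
anchored_groups = group_by_gap(batch_times, anchored=True)
```

The chained reading merges all three responses into one 250ms group, while the anchored reading splits off the third, so the two rules disagree on exactly the case described above.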

I would love to hear your input on any of this, especially if you know how Binance does their aggregation and have links to how they do it. Also, do you have a source for when you said "By default the aggregation occur when multiple trades are reported by an exchange for the same exact market, on the same side and same exact time (same ms)", or is this just how you do it?

Also, how do I get it to display all the prices that were aggregated, like you did in your screenshot?

@Tucsky
Owner

Tucsky commented Sep 29, 2024

> So i think we can all agree that the absolute most accurate is to group by all three but then it's like how much does that matter?

Aggregation with price + time + side on the left vs time + side on the right (what Aggr does)
image

You simply see more on the right because large participants often split their orders to control their market impact, stay under the radar, etc.
However, I understand that someone might prefer aggregating by price, which provides a different feed: it shows fewer large orders and a more continuous flow of small to medium orders. Large orders only stand out during very big moves. This is also how the aggTrade feed from Binance works, so I'll add the option.

There's no single correct way to do it, but I believe grouping more trades allows you to crank up the volume filter, eliminate noise, and see an accurate representation of significant trades.

@quantfreedom
Author

@Tucsky do you have a link to the documentation showing how Binance does their aggTrade, or a link to where you found this information?

@quantfreedom
Author

quantfreedom commented Sep 29, 2024

@Tucsky https://developers.binance.com/docs/derivatives/usds-margined-futures/websocket-market-streams/Aggregate-Trade-Streams

Yeah, you are doing time and side and they are doing price and side.

"The Aggregate Trade Streams push market trade information that is aggregated for fills with same price and taking side every 100 milliseconds."

So I guess what they're doing is every 100 milliseconds every trade that has the same price and side gets aggregated. But then the question becomes when does that 100 milliseconds start. This whole thing becomes an absolute mess

Again, I just want to be clear that I'm not here to point out whether you are doing anything right or wrong. I'm just trying to understand how you are doing it versus how I'm doing it versus how Binance is doing it, to get a clearer picture.

Ultimately the way you're doing it still helps me out in my trading enormously, but I think you're right: the only solution would be to give the user as much control as possible, all the way down to grouping by time, price, and side, because it doesn't look like there is a universal solution.

@Tucsky
Owner

Tucsky commented Sep 30, 2024

> But then the question becomes when does that 100 milliseconds start. This whole thing becomes an absolute mess

I don't think replicating the exact same groups matters too much, but if you take 2 independent apps doing the aggregation job on the same source:

  • for the first few ms it's going to depend on when you started each app / when the connection with Binance opened
  • but as soon as they send a different price, it will sync.
  • Same for time: as soon as the gap between two trades is greater than 100ms it should sync, because the ongoing aggregation will end on both apps regardless of other parameters (price, side).
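The points above can be sketched with a small streaming aggregator. This is an assumption-laden illustration, not Binance's or Aggr's implementation: it assumes a group ends when the price or side changes, or when a trade arrives more than 100ms after the group's first trade. The `aggregate_stream` helper and the sample feed are hypothetical.

```python
def aggregate_stream(trades, window_ms=100):
    """Aggregate (ts_ms, price, side, size) tuples received in order."""
    groups = []
    current = None
    for ts, price, side, size in trades:
        same_group = (
            current is not None
            and price == current["price"]
            and side == current["side"]
            and ts - current["start_ts"] <= window_ms
        )
        if same_group:
            current["end_ts"] = ts
            current["size"] += size
        else:
            if current is not None:
                groups.append(current)  # close the ongoing aggregation
            current = {"start_ts": ts, "end_ts": ts,
                       "price": price, "side": side, "size": size}
    if current is not None:
        groups.append(current)
    return groups


feed = [
    (0, 100.0, "buy", 1.0),
    (60, 100.0, "buy", 1.0),
    (120, 100.0, "buy", 1.0),
    (150, 101.0, "buy", 1.0),  # price change ends the ongoing group
    (160, 101.0, "buy", 1.0),
    (400, 101.0, "buy", 1.0),  # gap above 100ms ends the ongoing group
]

app_a = aggregate_stream(feed)      # connected before the first trade
app_b = aggregate_stream(feed[1:])  # connected one trade later
```

Under these assumptions the two apps disagree on the earliest groups but produce identical groups from the price change onward.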

That said, it's purely dependent on the quality of the source: if they don't send the trades in the same order to both receivers, they're not going to sync. I have had that issue with OKX, for example 🤔

@quantfreedom
Author

@Tucsky yeah, I think this is what I am going to do for any exchange that doesn't aggregate their trades like Binance does, meaning they send a list of all the trades. Sometimes I will get 30 or 40 trades from Bybit or OKX.

I think most of them are within 100ms of each other, so I will just group each list by price and side. But the problem I have noticed is that sometimes the next 2 lists of trades will still be within 100ms of the 1st list of trades. So it gets really messy.

I think what I am going to do is just keep accepting data and store the trades in lists, where each list covers 100ms.

So it will be all trades between 0000 and 0099 ms, then the next group will be 0100 to 0199, then 0200 to 0299, and so on.

All trades in 0000 to 0099 will be grouped by price and side and then stored, then all trades between 0100 and 0199 will be grouped by price and side, and so on.

This is what it seems Binance is doing.

The time for each price and side group will just be the last trade time of that group.
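A rough sketch of that plan, assuming the 100ms buckets are aligned to multiples of 100 on the exchange timestamp (which is a guess about the window boundaries, not confirmed Binance behavior). The `bucket_aggregate` helper and the sample trades are hypothetical.

```python
def bucket_aggregate(trades, bucket_ms=100):
    """Group (ts_ms, price, side, usdt) tuples by 100ms bucket, price and side."""
    groups = {}
    for ts, price, side, usdt in trades:
        key = (ts // bucket_ms, price, side)  # e.g. 1000..1099 -> bucket 10
        if key not in groups:
            groups[key] = {"ts_ms": ts, "price": price, "side": side, "usdt": 0.0}
        group = groups[key]
        group["usdt"] += usdt
        # the group's time is the last trade time seen for that price and side
        group["ts_ms"] = max(group["ts_ms"], ts)
    return list(groups.values())


incoming = [
    (1000, 60000.0, "buy", 100_000.0),
    (1040, 60000.0, "buy", 200_000.0),
    (1099, 60001.0, "buy", 50_000.0),
    (1100, 60000.0, "buy", 75_000.0),   # falls into the next 100ms bucket
    (1180, 60000.0, "sell", 20_000.0),
]
bucketed = bucket_aggregate(incoming)
```

Because the key uses the bucket index rather than the arrival batch, trades from several consecutive responses land in the same group as long as their timestamps share a bucket.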

This is what it seems like Binance is doing, and they are the leader, so I will just try to follow them because there seems to be no really good solution.

But I do know that what you are doing, which is grouping by time and side, is not good. Sometimes you are grouping prices that are 50 dollars apart.

I would say you should group by price and side by default and then let the user pick the tick size of the price grouping if they want, because even Binance doesn't group by time and side. Individual price differences are super important and shouldn't be grouped into 1 price unless the user wants something like a 10 tick or 20 tick price adjustment.
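One way such a user-selectable price grouping could look, as a sketch only. The `price_bucket` and `aggregate_by_price_bucket` helpers and the `bucket_size` parameter (expressed in price units) are hypothetical, and `Decimal` is used here just to avoid floating-point drift when rounding prices down.

```python
from collections import defaultdict
from decimal import Decimal


def price_bucket(price, bucket_size):
    """Round a positive price down to the nearest multiple of bucket_size."""
    p, step = Decimal(str(price)), Decimal(str(bucket_size))
    return (p // step) * step


def aggregate_by_price_bucket(trades, bucket_size):
    """Sum USDT size of (price, side, usdt) tuples per price bucket and side."""
    totals = defaultdict(float)
    for price, side, usdt in trades:
        totals[(price_bucket(price, bucket_size), side)] += usdt
    return dict(totals)


fills = [
    (60001.5, "buy", 100_000.0),
    (60004.9, "buy", 200_000.0),
    (60007.3, "buy", 50_000.0),
    (60002.0, "sell", 30_000.0),
]
grouped = aggregate_by_price_bucket(fills, bucket_size=5)
```

Setting `bucket_size` to the instrument's own tick size would keep every price separate, which matches the suggested default of grouping by exact price and side.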
