-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Results discrepancies #11
Comments
Hello @mg64ve ,
|
Thanks @Kostis-S-Z |
Your comment doesn't make sense. If you change positions from a long to a short or a short to a long, you'll have to exit the market to exit your position. Then, you'd buy the opposing position. This is when your PnL would be calculated.
Can you share your PnL calculation code? I wouldn't expect this to be proprietary, since all you're doing is calculating a PnL during a long to short or short to long position change. |
@mg64ve I just saw that you have edited your question to add one more point which I haven't addressed, sorry I missed that. I can get back to you on this when I find some time to go over the project again. @ScrapeWithYuri Unfortunately this part of the code is indeed proprietary and I am not allowed to publish it, this was due to a cooperation with a private company while working on this project. |
@Kostis-S-Z I believe you are busy again with job, since I see you had not time to reply. No worries. and the following is an example of it holds positions, not really bad at all: my pnl calculation function is the following:
I wonder to know what do you think about it? do you see any mistake @ScrapeWithYuri , @stevexxs ? |
Can you send the |
@ScrapeWithYuri you can find my code here: https://github.com/q-learning/trading-rl dataset contains EURSD OHCL+Volume years 2007-2015. |
@Kostis-S-Z @ScrapeWithYuri I found a bug in my code. That code was still incrementing the position after calculating the reward. However I have also added another profit calculation in the get_reward function.
I am plotting this profit as well, and there is a big discrepancy with the vectorized profit calculation: I tend to think the second one is more realistic. |
@mg64ve is your second profit only calculating the change between the current position and the day before? For instance, shouldn't pr_val be the value when you made the last buy / sell short decision, rather than the prior day?
Based on the way I read this, the change is between the current position and the prior day. This would be negative, since your FX dataset is declining in value during the test period. |
Very good point @ScrapeWithYuri . In general we can say that if a position is taken for T periods, than the profit is: profit = price( t+T) - price(t) and it is the same than: profit = price(t+T) - price(t+T-1) + price(t+T-1) - price(t+T-2)+ .... + price(t+1) -price(t) but this works if only have BUY and SELL actions. |
@mg64ve It looks like your pnl_of_trades has a similar logic. The code is looking at the difference in return between time t and t-1 and computing the return based on the action for that period. However, @Kostis-S-Z stated in issue #6 that the calculation should be for time t+1. The bot is deciding what action to make based on data as of time t and making a decision for time t+1. |
good point @ScrapeWithYuri x[0], x[1], x[2], .... , x[t] and you get : d[0], d[1], d[2], ..., d[t-1] you need to take in account that, at time t=0, if your action is a[0] (consider either SELL or BUY only), a[0] * d[0] so my calculation is wrong for 2 reason:
First is wrong because it should be:
[0.0] is need needed to keep the same dimension and goes at end. |
ok @ScrapeWithYuri it is now fixed. The problem is, we are not getting good results. and of course it has a lag, and I wrote earlier. |
Another interesting observation is the following. and this is "turn" parameter:
Now if this is fixed, why do we have a different degrees of gradient in the following: Gradient is almost the same apart some few cases. |
@mg64ve Take a look at this file. I've added three PnL scenarios. I think your result was good, because it was looking backwards. I've added a second scenario (the program buys / sells short at time t+1 and closes the position at time t+2) and a third scenario (more in line with the paper). These additional scenarios do not have strong results. Let me know your thoughts. |
Hello @ScrapeWithYuri and @mg64ve , I am very happy to see you sharing results and having fruitful discussions about it! I am sorry I haven't managed to respond as much as I wanted in the thread but as you correctly guessed, I am busy with work. I just wanted to point out that the reason sometime you might see a different angle could be due to the parameter RESET_FROM_MARGIN. If this is true then every time the agent is "out of the margin", its position will get reset to to the next value. By quickly looking at your plot, it looks like this could be the case, where in timestep 436 the agent's value seems to be "out of the margin" / higher than the 0.01 margin and thus ignoring the action and resetting its position to a closer value. A few words about this parameter: |
@mg64ve I made the below change to your pnl_of_trades code (not the most elegant approach). It's moving the trade returns forward by two time periods. The idea being you'd close out the position at t+2 for a trade you made at time t+1.
The resulting profit is positive, but not high enough to warrant a trading strategy and well below the paper's results. Granted, the paper is not taking this approach. It's making a trade at time t+1, then changing positions only when the bot thinks there will be a shift in the trend. In any event, I wouldn't expect the outcome to be materially different. |
Thanks @Kostis-S-Z for your clarification, it makes sense.
I still have two doubts/questions.
I would really appreciate a feedback from you on these points.
Thanks.
The text was updated successfully, but these errors were encountered: