Handling links with version query strings #409

Open
Axenu opened this issue Oct 22, 2019 · 4 comments
Comments

@Axenu

Axenu commented Oct 22, 2019

We have an HTML file with the following link:

<script type="text/javascript" src="app/main-config.js?v=1.17.0"></script>

When the HTML file loads in OpenWayback, the JavaScript file is not found, and the page does not render because it depends on that file.

Trying to manually load the same file
http://192.168.10.210:8080/wayback/20191021115240/https://domain.com/app/main-config.js?v=1.17.0
fails as well, which confirms that the file cannot be loaded. The problem, however, is that the file:

http://192.168.10.210:8080/wayback/20191021115240/https://domain.com/app/main-config.js

can be loaded, so it does exist.

When verifying the contents of the WARC files we can see that the file is named main-config.js?v=1.17.0
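
One way to check which URIs a WARC actually contains is to list the `WARC-Target-URI` header of every record. The following is only a rough sketch using the Python standard library, not a real WARC parser: it assumes the record headers appear as plain text lines, and it will also report matching lines that happen to sit inside a record payload. The function name `list_target_uris` is hypothetical.

```python
import gzip


def list_target_uris(path):
    """Return the WARC-Target-URI values found in a WARC file.

    Rough line scan, not a real WARC parser: any line that starts with
    the header name is reported, including ones inside record payloads.
    """
    # .warc.gz files are (multi-member) gzip; plain .warc files are not.
    opener = gzip.open if path.endswith(".gz") else open
    prefix = b"warc-target-uri:"
    uris = []
    with opener(path, "rb") as fh:
        for line in fh:
            # Header names are case-insensitive, so compare in lower case.
            if line[:len(prefix)].lower() == prefix:
                value = line[len(prefix):].strip().decode("utf-8", "replace")
                # Some writers wrap the URI in angle brackets.
                uris.append(value.strip("<>"))
    return uris
```

If the output lists `https://domain.com/app/main-config.js` but not the `?v=1.17.0` variant, the capture only holds the URL without the query string.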

The expected behaviour is that OpenWayback lets me open the file with the query string, since it exists under that name in the WARC.

version: Apache Tomcat/8.5.20
OpenWayback: 2.3.2

@anjackson
Member

How are you indexing the WARCs?

@Axenu
Author

Axenu commented Nov 25, 2019

We are using the default configuration that comes with OpenWayback, which uses a Berkeley DB (BDB) database to store information about where to find WARC files and an index of their content.

@ldko
Member

ldko commented Nov 25, 2019

Hi @Axenu
I am not sure I follow the logic in the statement:

The problem is however that the file:
http://192.168.10.210:8080/wayback/20191021115240/https://domain.com/app/main-config.js
can be loaded, so it does exist.

I believe https://domain.com/app/main-config.js and https://domain.com/app/main-config.js?v=1.17.0 are treated as completely separate URIs in OpenWayback and the index, even though to us they look like the same URL with a different query string.
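
To illustrate the point, here is a toy sketch of an exact-match lookup. The `index` dict and the `record-1` value are made up for the example and only stand in for a real index; this is not how OpenWayback stores or canonicalizes URLs internally.

```python
from urllib.parse import urlsplit

# Hypothetical index: only the URL without a query string was captured.
index = {"https://domain.com/app/main-config.js": "record-1"}

captured = "https://domain.com/app/main-config.js"
requested = "https://domain.com/app/main-config.js?v=1.17.0"

# Same path, different query component, so the full URLs differ.
same_path = urlsplit(captured).path == urlsplit(requested).path
same_url = captured == requested

# An exact-match lookup finds one URL and misses the other.
hit = index.get(captured)
miss = index.get(requested)

print(same_path, same_url, hit, miss)
```

With an exact-match key, the request carrying `?v=1.17.0` finds nothing even though the path is identical.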

Generally, loading URLs with version query strings should work in OpenWayback. Are you able to share the WARC file that includes https://domain.com/app/main-config.js?v=1.17.0?

@Axenu
Author

Axenu commented Dec 4, 2019

@ldko Sorry for not being so clear.

What I mean is that a file named main-config.js exists in the WARC. The file that the browser is looking for, and thereby the file that OWM is looking for, is main-config.js?v=1.17.0, which does not exist in the WARC.

What we would want to happen is that OWM serves the file main-config.js for requests of main-config.js?v=1.17.0, ignoring the query param.
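
A minimal sketch of the behaviour described above, assuming a plain dict as a stand-in for the index. The helpers `strip_query` and `lookup_with_fallback` are hypothetical and are not an existing OpenWayback feature or setting:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return url without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def lookup_with_fallback(index, url):
    """Try the exact URL first, then the same URL without its query."""
    if url in index:
        return index[url]
    return index.get(strip_query(url))
```

Trying the exact URL first means that a capture which really was made with a given query string still takes precedence over the bare path.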

The WARC containing the file main-config.js, for clarity: https://drive.google.com/open?id=1isYQpszliKxRorPTgVuGzepBKr8XLU0V
