Feature - accept content to render pdf by url #5

Open: wants to merge 3 commits into base: main
9 changes: 9 additions & 0 deletions .gitignore
@@ -60,3 +60,12 @@ target/

 #Ipython Notebook
 .ipynb_checkpoints
+
+# Editor directories and files
+.idea
+.vscode
+*.iml
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
11 changes: 11 additions & 0 deletions README.md
@@ -39,4 +39,15 @@ curl -v -X POST -d @test.html -JLO http://127.0.0.1:5001/pdf?filename=result.pdf

 This example will use the file `test.html` and return a response with `Content-Type: application/pdf` and `Content-Disposition: inline; filename=result.pdf` headers. The body of the response will be the PDF rendering of the html document.

+`POST` to `/pdf` can also generate a PDF from a URL. Set the `type=url` query parameter and send the URL as a `text/plain` body:
+
+```
+curl --location --request POST 'http://127.0.0.1:5001/pdf?filename=result_from_url.pdf&type=url' -JLO --header 'Content-Type: text/plain' --data-raw 'https://www.google.ca/?client=safari&channel=iphone_bm'
+```
+
+This example will fetch the URL `https://www.google.ca/?client=safari&channel=iphone_bm` and return a response with `Content-Type: application/pdf` and `Content-Disposition: inline; filename=result_from_url.pdf` headers. The body of the response will be the PDF rendering of that URL. Only static HTML is rendered; pages that depend on JavaScript, such as Single Page Applications, are not supported.
+
+When `type=url`, the additional query parameters `encoding`, `media_type`, and `base_url` can be passed in. See the [WeasyPrint HTML API](https://weasyprint.readthedocs.io/en/stable/api.html#weasyprint.HTML).
+
+
 In addition `/health` is a health check endpoint and a `GET` returns 'ok'.
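For reference, the `type=url` request described in the README change can also be built from Python. This is a minimal sketch, assuming the service listens on `127.0.0.1:5001` as in the curl example; `build_url_pdf_request` is a hypothetical helper for illustration and is not part of this PR.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_url_pdf_request(target_url, filename='result_from_url.pdf',
                          base='http://127.0.0.1:5001', **extra):
    """Build (but do not send) a POST /pdf request asking for a URL render.

    `extra` may carry the optional query parameters named in the README
    (encoding, media_type, base_url). Hypothetical helper, not in the PR.
    """
    params = {'filename': filename, 'type': 'url'}
    # Skip parameters that were not supplied so the service defaults apply.
    params.update({k: v for k, v in extra.items() if v is not None})
    endpoint = '%s/pdf?%s' % (base, urlencode(params))
    return Request(endpoint,
                   data=target_url.encode('utf-8'),
                   headers={'Content-Type': 'text/plain'},
                   method='POST')


if __name__ == '__main__':
    req = build_url_pdf_request('https://www.google.ca/', media_type='screen')
    print(req.get_method(), req.full_url)
    # Sending the request requires the service to be running, e.g.:
    # from urllib.request import urlopen
    # with urlopen(req) as resp, open('result_from_url.pdf', 'wb') as out:
    #     out.write(resp.read())
```

Using `urlencode` for the query string avoids hand-escaping values such as a `base_url` that itself contains `&` or `?`.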
15 changes: 12 additions & 3 deletions app.py
@@ -40,16 +40,25 @@ def home():
             <li>POST to <code>/multiple?filename=myfile.pdf</code>. The body
                 should contain a JSON list of html strings. They will each
                 be rendered and combined into a single pdf</li>
+            <li>POST to <code>/pdf?filename=myfile.pdf&type=url</code>. The body
+                should be plain text and must contain a url</li>
         </ul>
     '''


 @app.route('/pdf', methods=['POST'])
 def generate():
     name = request.args.get('filename', 'unnamed.pdf')
-    app.logger.info('POST /pdf?filename=%s' % name)
-    #print ( request.get_data() )
-    html = HTML(string=request.get_data())
+    type = request.args.get('type', 'string')
+    app.logger.info('POST /pdf?filename=%s&type=%s' % (name, type))
+    html = None
+    if type == 'string':
+        html = HTML(string=request.get_data())
+    elif type == 'url':
+        url_args = {'encoding': request.args.get('encoding', None),
+                    'media_type': request.args.get('media_type', 'print'),
+                    'base_url': request.args.get('base_url', None)}
+        html = HTML(url=request.get_data().decode('utf-8'), **url_args)
     pdf = html.write_pdf()
     response = make_response(pdf)
     response.headers['Content-Type'] = 'application/pdf'
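One thing the `generate()` change leaves open is a `type` value that is neither `string` nor `url`: `html` stays `None` and `html.write_pdf()` raises. Below is a minimal sketch of one way to validate the input before constructing `HTML`. It assumes a hypothetical helper, `html_kwargs_for`, that is not part of this PR and whose result would be passed as `HTML(**kwargs)`. Restricting URLs to `http`/`https` is likewise an assumption made here, not something the PR does.

```python
from urllib.parse import urlsplit


def html_kwargs_for(render_type, body, args):
    """Return keyword arguments for weasyprint.HTML, or raise ValueError.

    render_type: value of the `type` query parameter
    body:        raw request body (bytes)
    args:        mapping of query parameters (e.g. request.args)
    """
    if render_type == 'string':
        # Same as the existing behaviour: render the body as HTML.
        return {'string': body}
    if render_type == 'url':
        url = body.decode('utf-8').strip()
        # Assumption: only plain web URLs should be fetched by the service.
        if urlsplit(url).scheme not in ('http', 'https'):
            raise ValueError('unsupported URL scheme: %r' % url)
        return {
            'url': url,
            'encoding': args.get('encoding'),
            'media_type': args.get('media_type', 'print'),
            'base_url': args.get('base_url'),
        }
    # Anything else is rejected instead of falling through with html = None.
    raise ValueError('unknown type: %r' % render_type)
```

In the Flask view, a `ValueError` raised here could be turned into a client error, for example with `abort(400)`, so that a bad `type` no longer surfaces as a server error.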
29 changes: 29 additions & 0 deletions test.py
@@ -76,5 +76,34 @@ def test_body(self):
         self.assertEqual(len(pages), 2)


+def request_factory_url(path='/'):
+    url = 'http://127.0.0.1:5001%s' % path
+    headers = {
+        'Content-Type': 'text/plain'
+    }
+    return Request(url, data='https://www.google.ca/?client=safari&channel=iphone_bm'.encode('utf-8'), headers=headers, method='POST')
+
+
+class TestUrlPdf(unittest.TestCase):
+
+    def setUp(self):
+        request = request_factory_url('/pdf?filename=sample_url.pdf&type=url')
+        self.response = urlopen(request)
+
+    def tearDown(self):
+        self.response.close()
+
+    def test_response_code(self):
+        self.assertEqual(self.response.getcode(), 200)
+
+    def test_headers(self):
+        headers = dict(self.response.info())
+        self.assertEqual(headers['Content-Type'], 'application/pdf')
+        self.assertEqual(headers['Content-Disposition'], 'inline;filename=sample_url.pdf')
+
+    def test_body(self):
+        self.assertEqual(self.response.read()[:4], b'%PDF')
+
+
 if __name__ == '__main__':
     unittest.main()