-
Notifications
You must be signed in to change notification settings - Fork 679
bug 1431820: use Django middleware for handling ETag/Last-Modified #4647
Conversation
@@ -273,4 +273,3 @@ def test_raw_file_if_modified_since(client, settings, file_attachment): | |||
assert 'Cache-Control' in response | |||
assert 'public' in response['Cache-Control'] | |||
assert 'max-age=900' in response['Cache-Control'] | |||
assert 'Vary' not in response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that the Vary
header was now in the response with a value of Accept-Encoding
. I'm not sure if it's worth checking any of the headers when the response is a 304.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I'm OK if this now varies by encoding, not worth worrying about.
kuma/settings/common.py
Outdated
# Last-Modified header). Django's ConditionalGetMiddleware, uses both the ETag | ||
# and Last-Modified headers to handle conditional GET requests. | ||
# | ||
# IMPORTANT NOTE: When we move to Django 1.11, the USE_ETAGS setting is no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thing can be in TODO:
Codecov Report
@@ Coverage Diff @@
## master #4647 +/- ##
==========================================
+ Coverage 95.32% 95.32% +<.01%
==========================================
Files 260 260
Lines 23534 23538 +4
Branches 1688 1692 +4
==========================================
+ Hits 22434 22438 +4
Misses 888 888
Partials 212 212
Continue to review full report at Codecov.
|
I feel like I stepped into a Django 1.8 minefield with this PR, but I think I've survived. The second commit of this PR does several things:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @escattone! I like getting back on the standard Django code, and I appreciate your research into the Django 1.8 → 1.11 transition.
I've noted a few nits, and some stuff that would be appropriate as follow-on PRs or commits, your choice.
@@ -273,4 +273,3 @@ def test_raw_file_if_modified_since(client, settings, file_attachment): | |||
assert 'Cache-Control' in response | |||
assert 'public' in response['Cache-Control'] | |||
assert 'max-age=900' in response['Cache-Control'] | |||
assert 'Vary' not in response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I'm OK if this now varies by encoding, not worth worrying about.
kuma/core/middleware.py
Outdated
HttpResponseRedirect, | ||
HttpResponseForbidden, | ||
HttpResponsePermanentRedirect, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer hanging indents, but aligned indents are more common in the Kuma code. So, this is a nit out of responsibility. I look forward to the day when style linting is closer to the top of the priority list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had always used hanging indents until my last job, where we started using aligned indents because they enabled future edits to only affect one line (but I don't always like the look of it). I'm not consistent though, since I personally didn't do it long enough to make it habit. With MDN I'm happy to settle on either style, and if you prefer the hanging indent style, let's go with that. It's a greater part of my muscle and visual memory anyway!
@@ -59,6 +54,7 @@ def stream_raw_file(*args): | |||
response['Content-Length'] = rev.file.size | |||
except OSError: | |||
pass | |||
response['Last-Modified'] = convert_to_http_date(rev.created) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 conversion to convert_to_http_date
. I'm guessing that the ETag
skips streaming responses, so this is still needed.
convert_to_utc
can't handle ambiguous dates:
import datetime
from kuma.attachments.utils import convert_to_utc
convert_to_utc(datetime.datetime(2017, 11, 5, 1, 8, 42))
That feels like follow-on work on bug 1173189, and moving date manipulations into kuma/core/utils.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about streaming responses (we could have removed this code that sets the Last-Modified
header if it wasn't a streaming response), and good catch about content_to_utc
not handling ambiguous dates! I would never have thought of that until it failed. I guess the solution is to follow what you did for the publication date for DocumentsFeed
, and assume DST if the timezone is naive? I'll do that in a follow-on PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was on my mind because of PR #4621. I'd use the same strategy of assuming DST and being half-wrong rather than raise an ISE.
The function may be slightly different because that one converts to server timezone, while this one wants UTC. Maybe! I'd have to look at get_current_timezone
vs get_default_timezone
.
Getting to USE_TZ=1
is on my list, and I think centralizing the timezone stuff in kuma.core.utils
would be a good step. This stuff may be a shared function, or two close functions in the same file
@@ -93,14 +102,6 @@ def test_code_sample(code_sample_doc, constance_config, client, settings): | |||
HTTP_IF_NONE_MATCH=current_etag | |||
) | |||
assert response.status_code == 304 | |||
assert response['Access-Control-Allow-Origin'] == '*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: test Access-Control-Allow-Origin
in a headless test (different PR)
The content in this document's current revision contains multiple HTML | ||
elements with an "id" attribute (or "sections"), and also has a length | ||
greater than or equal to 200, which meets the compression threshold of | ||
the GZipMiddleware. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the 200-character minimum why SECTION5
was added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I added SECTION4
solely for that reason.
return response | ||
|
||
return render_code_sample(request) | ||
return render(request, 'wiki/code_sample.html', data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😍 so much cleaner
* use new kuma.core.middleware.GZipMiddleware until Django 1.11 * add tests for kuma.core.middleware.GZipMiddleware * improve tests of etag-based conditional GET requests to ensure that middleware play well with each other (tests correctly fail when using django.middleware.gzip.GZipMiddleware instead of kuma.core.middleware.GZipMiddleware)
This PR does the following:
Sets the
USE_ETAGS
setting toTrue
. This enables theETag
functionality within Django'sCommonMiddleware
, which includes both computing/adding theETag
header to all responses (even responses to non-GET
requests) as well as handling conditional GET requests made using theIF-NONE-MATCH
header (theIF-UNMODIFIED-SINCE
header is not handled here). The fact that theETag
header is added to all responses is a shortcoming of Django 1.8. Another shortcoming is that 304 ("Not Modified") responses due to anETag
match clear most of the headers, so headers likeCache-Control
,ETag
,Last-Modified
, andVary
are lost. I don't think this will cause any issues, but I'm not certain. When we move to Django 1.11, theUSE_ETAGS
setting should be deleted, and this deprecated functionality withinCommonMiddleware
will no longer be used.Adds Django's
ConditionalGetMiddleware
to the list ofMIDDLEWARE_CLASSES
. In Django 1.8, this adds handling of theIF-NONE-MATCH
andIF-UNMODIFIED-SINCE
headers, but does not actually generate and add theETag
header to the response (that's done byCommonMiddleware
as explained above). When we move to Django 1.11, this middleware will assume full responsibility for generating/handling theETag
, and will be free of the shortcomings mentioned in the first point above (properly preserving some headers for 304 responses, and only generating/handlingETag
headers for GET requests).Removes the few
last_modified
andetag
view decorators that were used. Theetag
decorator is fully replaced by the middleware above (but of course still valid and available for those endpoints that have a faster way to generate theETag
value), while theConditionalGetMiddleware
only replaces the conditional handling portion of thelast_modified
decorator (theLast-Modified
header must still be generated and added to the response by an endpoint).Removes header checks in tests for 304 responses due to an
ETag
match (see first point above).