-
Notifications
You must be signed in to change notification settings - Fork 294
WIP More general fix for #127 with addinfourl #136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Critic review: https://critic.hoppipolla.co.uk/r/498 This is an external review system which you may optionally use for the code review of your pull request. In order to help critic track your changes, please do not make in-place history rewrites (e.g. via |
Urgh, looking at this again I must've been thinking of my somewhat-working-written-a-long-time-ago Python 3-only rewrite of the input stream. Dealing with Alternative idea for fixing this (comments welcome, don't take this as me saying what the right approach is — I don't know what the right approach is!): make BufferedStream generic and work for both unicode and bytes file-like objects (literally the only thing specific is the |
The other option is to try and finish my long-abandoned Python 3-only rewrite of inputstream, create something with the same API (but slower when changing character encoding) for Python 2. https://github.com/gsnedders/html5lib-python/tree/new-inputstream is what there is. Hopefully shows the vague idea. |
At the moment |
Hmm, looking at |
At one point I had |
Is there any reason not to just go for the equality check? |
Do you mean |
So, what, |
We can, it’s just one more corner case piling up… |
How about this?
It should never raise, and cover every case where we’ve seen 20007 so far. |
I guess it is as good as anything. As ugly as it is. |
I'm so confused as to what's happened here. Why did this get dropped? Was I expecting you to update PR with that? I don't know. Looking over this again, |
Oh, no, I'm being stupid. I'm thinking about JS semantics and not Python ones! |
Do not merge yet, this is incomplete.
As discussed in #134 I changed this to avoid
.read(0)
entirely and pass a first chunk toHTMLUnicodeInputStream
andHTMLBinaryInputStream
, but then I get lost in their implementation and I don’t know what to do with that first chunk.Surprisingly, most tests still passed because
.seek(0)
is used in some places. I added a failing test with a non-seekable input.