[RFC] Add properties to List for byte-based encodings/hashes
#924
+98
−3
The primary motivation here is to allow Pkl to properly work with arbitrary binary data. `String` can be abused to some extent, but because text encoding (and Unicode normalization) is involved, what you see isn't what you get:
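To illustrate the underlying problem outside Pkl, here is a minimal Python sketch (not code from this PR). It assumes the string is serialized as UTF-8, and the byte values are made up for the example. It shows how routing raw bytes through a string changes both the bytes and anything computed from them.

```python
import base64
import unicodedata

# Arbitrary binary data: four bytes, two of them above 0x7F.
raw = bytes([0x00, 0x7F, 0x80, 0xFF])

# "Abusing" a string: treat each byte as a code point, then let the
# text encoding (UTF-8, assumed here) turn the string back into bytes.
as_text = "".join(chr(b) for b in raw)
via_string = as_text.encode("utf-8")

print(list(raw))         # [0, 127, 128, 255]
print(list(via_string))  # [0, 127, 194, 128, 195, 191]

# The base64 of the two byte sequences therefore differs as well.
direct_b64 = base64.b64encode(raw).decode("ascii")
string_b64 = base64.b64encode(via_string).decode("ascii")
print(direct_b64 != string_b64)  # True

# Normalization can also rewrite the data: "e" + combining acute accent
# (two code points) becomes a single precomposed code point under NFC.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed))  # 2 1
```

Every byte above 0x7F expands to two bytes under UTF-8, so a hash or encoding computed over the string no longer describes the original data.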
I'm most focused on `base64` here, but I don't see any reason to omit the hashes either.

This can actually be implemented in-language, but it's far from performant:
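For a sense of what an implementation over a plain list of byte values involves, here is a rough Python sketch of standard base64 (RFC 4648). It is not the Pkl code from this PR, and the names `ALPHABET` and `base64_encode` are hypothetical, chosen only for this example.

```python
import base64
from typing import Sequence

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def base64_encode(data: Sequence[int]) -> str:
    """Encode a sequence of byte values (0-255) as standard base64."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack up to three bytes into one 24-bit group, zero-padded on the right.
        group = 0
        for b in chunk:
            group = (group << 8) | b
        group <<= 8 * (3 - len(chunk))
        # Split the group into four 6-bit alphabet indices.
        chars = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # n input bytes yield n + 1 significant characters; the rest is '=' padding.
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


# Cross-check against the standard library on a small sample.
sample = [0x00, 0x7F, 0x80, 0xFF]
assert base64_encode(sample) == base64.b64encode(bytes(sample)).decode("ascii")
```

Each three-byte chunk costs several shifts, masks, and lookups. When that work runs in an interpreted configuration language instead of a native routine, it adds up quickly on larger buffers.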
Here's a test encoding a 16kB buffer:
And here's the exact same evaluation switched to use the `List.base64` property introduced in this PR:

This is roughly a 12x speedup!
Methods on generic types that only work for some type arguments don't really exist in Pkl yet. I considered an entirely separate stdlib class (possibly even a `List` subclass?) that forces the element type to `UInt8`. I'm interested in hearing thoughts on how to best approach this, but this was the most expedient path to prove the concept.

As for why I'm doing this at all, here's a teaser:
