03 BCT Bitcoin Cryptographic Concepts
Basic Cryptographic Concepts in Blockchain
Properties of:
Hash Functions
Digital Signatures
… and applications
Build a basic cryptocurrency using these properties
Cryptography
Cryptography provides a mechanism for securely encoding the rules of
a cryptocurrency system within the system itself.
Prevent tampering and equivocation
Equivocation: the use of ambiguity to conceal or hide facts
Encode rules for creation of new units of the currency in a mathematical
protocol
Bitcoin relies on a handful of relatively well-known cryptographic
constructions
Cryptographic Hashes
Digital Signatures
Zero-knowledge Proofs
Used in proposed extensions and modifications to Bitcoin
Display only required information
Hash Function
▪ A hash function is a mathematical function with
the following three properties:
▪ Its input can be any string of any size.
▪ Fixed size output (assume a 256-bit output.)
▪ Efficiently computable
▪ Output of the hash function can be computed in a
reasonable amount of time
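The first two properties can be observed directly. The following is a minimal sketch using SHA-256 from Python's standard `hashlib` module; the helper name `sha256_hex` is an assumption made for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()

short_digest = sha256_hex(b"a")               # 1-byte input
long_digest = sha256_hex(b"a" * 1_000_000)    # 1,000,000-byte input

# Each hex character encodes 4 bits, so both digests are 256 bits long,
# regardless of the size of the input.
print(len(short_digest) * 4, len(long_digest) * 4)
```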
Cryptographic Hash Function
Collision Resistance (1)
▪ A collision occurs when two distinct inputs produce the
same output.
Is there a faster way to find collisions?
Consider the following Hash function:
H(x) = x mod 2^256
Accepts input of any size and returns a fixed-sized (256 bits) output.
Efficiently computable
But, this function just returns the last 256 bits of the input
One collision: 3 and 3 + 2^256
This function is not usable in practice
For other hash functions, we do not know of any such shortcut.
No hash function has been proven collision resistant.
SHA-256 (Secure Hash Algorithm): no collision has been found so far.
MD5 algorithm: collisions were found after years of work; the function was
deprecated and phased out of practical use.
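The collision in the toy function H(x) = x mod 2^256 can be checked directly. This is a small sketch; the names `toy_hash` and `MODULUS` are chosen here for illustration.

```python
MODULUS = 2 ** 256

def toy_hash(x: int) -> int:
    """H(x) = x mod 2^256: keeps only the low 256 bits of x."""
    return x % MODULUS

a = 3
b = 3 + MODULUS

# Two distinct inputs, one output: a collision found with no search at all.
print(a != b, toy_hash(a) == toy_hash(b))
```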
Application: Message Digests
Use Hash outputs as a message digest.
Message Digests help us conclude whether two messages contain
the same content, without actually examining the messages.
Scenario:
How do we check that a file downloaded from cloud storage is the same as the
file we uploaded earlier?
Keep a local copy and compare it to the downloaded file.
Inefficient and wastes storage space
Collision-resistant hashes provide an elegant and efficient solution to this
problem.
Store the hash of the file locally
Download the original file
Compare its hash with the stored hash.
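The three steps can be sketched as follows, assuming SHA-256 as the collision-resistant hash. The function names, the file name, and the temporary file standing in for the cloud copy are all assumptions made for this illustration.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_unchanged(path: str, stored_digest: str) -> bool:
    """Compare a freshly computed digest with the one stored before upload."""
    return file_digest(path) == stored_digest

# Demo: a temporary file stands in for the uploaded/downloaded file.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "report.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"original contents")
    stored = file_digest(path)  # only this digest is kept locally
    same_before = is_unchanged(path, stored)

    with open(path, "wb") as f:
        f.write(b"original contentz")  # simulate a corrupted download
    same_after = is_unchanged(path, stored)

print(same_before, same_after)
```

Only the short digest has to be stored locally, not a second copy of the file.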
Property 2: Hiding[1]
Informally this property states that, if we are given the output of a Hash
function H(x) = y, there is no feasible way to figure out the input x.
This will not hold if the number of possible inputs x is limited, because it is
then easy to pre-calculate the y’s for all possible x’s.
For example, consider a coin flip game
Heads → H(“heads”), Tails → H(“tails”)
Seeing the announced hash, an adversary can easily tell which of heads/tails
was flipped by pre-computing the hashes of “heads” and “tails.”
To achieve the hiding property, there must be no value of x that is
particularly likely.
x has to be chosen from a set that is very spread out
What if x ∈ {“heads”, “tails”}?
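The pre-computation attack on the coin flip game takes only a few lines. This sketch assumes SHA-256 as H; the names `lookup` and `guess` are illustrative.

```python
import hashlib

def H(s: str) -> str:
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

# The adversary pre-computes the hash of every possible input.
lookup = {H(outcome): outcome for outcome in ("heads", "tails")}

def guess(announced_hash: str) -> str:
    """Recover the input from its hash by a simple table lookup."""
    return lookup[announced_hash]

print(guess(H("tails")))
```

With only two candidate inputs the table has two entries, so the hash hides nothing.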
Property 2: Hiding[2]
We can hide an input that is not spread out by concatenating (symbol
ǁ denotes concatenation) with another input that is spread out.
Hiding: A hash function H is said to be hiding if, when a secret value
r is chosen from a probability distribution that has high min-entropy,
then given H(rǁx), it is infeasible to find x.
In information theory, min-entropy is a measure of the predictability
of an outcome.
High min-entropy captures the intuitive idea that the distribution (of
a random variable) is very spread out.
If r is chosen uniformly from among all strings that are 256 bits
long, then any particular string is chosen with a probability of 1/2^256,
an infinitesimally small value.
Applications: Commitments[1]
A commitment is the digital analog of taking a value, sealing it in an
envelope, and putting that envelope on the table where everyone can
see it.
You have committed yourself to the value inside the envelope.
The value remains a secret from everyone else.
Later, you can open the envelope and reveal the value that was committed to
earlier.
Putting the sealed envelope on the table:
Apply the commit function to a random nonce plus msg, the value being
committed to, and publish the resulting commitment com
Opening the envelope:
Publish the random nonce and the message msg; anyone can then verify that
they match com.
Two security properties: hiding keeps the value secret until it is revealed,
and binding ensures that we cannot commit to one value and then later claim
that we committed to another value.
Applications: Commitments[2]
Applications: Commitments[3]
Commitment schemes can be implemented using a cryptographic
hash function.
Commit(msg, nonce) := H(nonce ǁ msg), where nonce is a random
256-bit value.
The properties required for commitment now become:
Hiding: Given H(nonce ǁ msg), it is infeasible to find msg.
Binding: It is infeasible to find two pairs (nonce, msg) and (nonce’,
msg’) such that msg ≠ msg’ and H(nonce ǁ msg) == H(nonce’ ǁ
msg’)
The binding property is implied by the collision resistant property of
the underlying hash function.
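A minimal sketch of this hash-based commitment scheme, assuming SHA-256 as H. The function names `commit` and `verify` and their exact signatures are assumptions made here for illustration.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

def commit(msg: bytes) -> Tuple[bytes, bytes]:
    """Return (com, nonce), where com = H(nonce || msg).

    The nonce is a fresh random 256-bit value and must stay secret
    until the commitment is opened.
    """
    nonce = secrets.token_bytes(32)
    com = hashlib.sha256(nonce + msg).digest()
    return com, nonce

def verify(com: bytes, nonce: bytes, msg: bytes) -> bool:
    """Check that (nonce, msg) opens the commitment com."""
    return hmac.compare_digest(com, hashlib.sha256(nonce + msg).digest())

com, nonce = commit(b"heads")         # publish com, keep nonce and msg secret
print(verify(com, nonce, b"heads"))   # opening with the committed value
print(verify(com, nonce, b"tails"))   # trying to claim a different value
```

Because the nonce is always exactly 32 bytes, the split between nonce and msg inside the hashed string is unambiguous.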
Property 3: Puzzle Friendliness
Puzzle friendliness: A hash function H is said to be puzzle friendly
if for every possible n-bit output value y, if k is chosen from a
distribution with high min-entropy, then it is infeasible to find x
such that H(k ǁ x) = y in time significantly less than 2^n.
Intuitively, suppose someone wants the hash function to produce some
particular output value y.
If part of the input (k) has been chosen in a suitably randomized way,
then it is very difficult to find another value x that hits exactly that target.
Search puzzle
A mathematical problem that requires searching a very large space to find a
solution
A search puzzle has no shortcuts
No way to find a solution other than searching that large space.
A search puzzle consists of a hash function H, a value id (the puzzle-ID) chosen
from a high min-entropy distribution, and a target set Y.
If H has an n-bit output, then it can take any of 2^n values.
Solving the puzzle requires finding an input (id ǁ x) such that the output falls
within the set Y, which is typically much smaller than the set of all outputs.
‘id’ is fixed, so we have to work with ‘x’
Difficulty of the puzzle is determined by the size of Y.
If Y is the set of all n-bit strings, then the puzzle is trivial, whereas if Y has
only one element, then the puzzle is maximally hard.
For a puzzle friendly hash function, there’s no solving strategy
that is much better than just trying random values of x.
So, to pose a puzzle that is difficult to solve, we can use this construction,
as long as the puzzle-IDs are generated in a suitably random way.
This idea is used in Bitcoin mining.
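A toy search puzzle can be sketched as follows. It assumes SHA-256 as H and takes the target set Y to be all outputs whose first `difficulty_bits` bits are zero; the 8-byte encoding of x and all names are illustrative choices, and this is not Bitcoin's actual block format.

```python
import hashlib
import secrets

def puzzle_hash(puzzle_id: bytes, x: int) -> int:
    """Compute H(id || x) and interpret the 256-bit output as an integer."""
    digest = hashlib.sha256(puzzle_id + x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")

def check_solution(puzzle_id: bytes, x: int, difficulty_bits: int) -> bool:
    """x is a solution if H(id || x) lies in Y, i.e. is below the target."""
    target = 1 << (256 - difficulty_bits)
    return puzzle_hash(puzzle_id, x) < target

def solve_puzzle(puzzle_id: bytes, difficulty_bits: int) -> int:
    """No shortcut is known: try x = 0, 1, 2, ... until one lands in Y."""
    x = 0
    while not check_solution(puzzle_id, x, difficulty_bits):
        x += 1
    return x

puzzle_id = secrets.token_bytes(32)   # high min-entropy puzzle-ID
solution = solve_puzzle(puzzle_id, 12)
print(solution, check_solution(puzzle_id, solution, 12))
```

With 12 difficulty bits, Y contains one output in 2^12, so the search needs about 4096 attempts on average. Each extra difficulty bit halves Y and doubles the expected work, while checking a claimed solution always costs a single hash.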
The SHA-256 Hash Function
The SHA-256 Hash function is used in Bitcoin.
Secure Hash Algorithm
A Hash function should work on inputs of arbitrary lengths and yield
a fixed length output.
Merkle-Damgard Transform: A generic method that converts a
Hash function that works on a fixed length input to a Hash function
that works on inputs of arbitrary lengths.
In common terminology, the underlying fixed-length collision
resistant hash function is called a compression function.
It can be shown that, if the underlying compression function is
collision resistant, then the overall hash function is collision
resistant as well.
The Merkle Damgard Transform [1]
Input to the compression function: length m
Output of the compression function: length n, with m > n
The input of the hash function (any size) is divided into blocks of
length (m-n)
Pass each block together with the output of the previous block into
the compression function
Input length is (m – n) + n = m, which is the input length to the
compression function.
The Merkle Damgard Transform [2]
For the first block, where there is no previous block, an initialization
vector (IV) is used.
The IV is a fixed, publicly known constant defined in the standard.
For the last block, the input length may be less than (m-n).
The input is padded so that its length is a multiple of (m-n) bits.
For SHA-256, m = 768 bits, n = 256 bits, and m-n = 512 bits
The result (hash) is the output of the last block.
The Merkle Damgard Transform [3]
To summarize:
SHA-256 uses a compression function that takes 768-bit input
and produces 256-bit outputs.
The block size is 512 bits.
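The chaining structure can be sketched with the same sizes (768-bit compression input, 256-bit output, 512-bit blocks). This is not SHA-256 itself: `toy_compress` and `TOY_IV` are stand-ins assumed for illustration, whereas the real compression function and IV are fixed by the standard. The padding follows the usual pattern of a 1 bit, zero bits, and the message length in the final 64 bits.

```python
import hashlib

BLOCK_BYTES = 64              # m - n = 512 bits of message per step
STATE_BYTES = 32              # n = 256 bits of chaining value
TOY_IV = bytes(STATE_BYTES)   # hypothetical IV (all zeros), not the SHA-256 IV

def toy_compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function: 768 bits in, 256 bits out."""
    if len(state) != STATE_BYTES or len(block) != BLOCK_BYTES:
        raise ValueError("compression function needs exactly 768 input bits")
    return hashlib.sha256(state + block).digest()

def pad(message: bytes) -> bytes:
    """Pad to a multiple of 512 bits: 0x80, zero bytes, 64-bit bit length."""
    bit_length = len(message) * 8
    padded = message + b"\x80"
    padded += b"\x00" * ((-len(padded) - 8) % BLOCK_BYTES)
    return padded + bit_length.to_bytes(8, "big")

def md_hash(message: bytes) -> bytes:
    """Feed each block plus the previous output into the compression function."""
    state = TOY_IV
    padded = pad(message)
    for i in range(0, len(padded), BLOCK_BYTES):
        state = toy_compress(state, padded[i:i + BLOCK_BYTES])
    return state  # the output of the last block is the hash

print(md_hash(b"abc").hex())
```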
Hash Pointers and Data Structures [1]
Hash pointer: A hash pointer is a pointer
to where data is stored together with a
cryptographic hash of the value of this data
at some fixed point in time.
A familiar data structure that uses
pointers, such as a linked list or a binary
search tree, can be implemented with hash
pointers instead of ordinary pointers
• The adversary can continue to try and cover up this change by changing the next block’s hash
as well.
• The adversary can continue doing this for other blocks
• This strategy will fail when the adversary reaches the head of the list.
• Specifically, as long as the hash pointer at the head of the list is stored in a place where
the adversary cannot change it, the adversary will be unable to change any block
without being detected
• The hash pointer at the head of the list is a tamper-evident hash of the entire list.
• First block is called the genesis block.
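The tamper-evidence argument can be tried out on a small linked list with hash pointers. This sketch assumes SHA-256; the `Block` layout, the all-zero placeholder for the genesis block's missing predecessor, and the function names are illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple

GENESIS_PREV = bytes(32)  # placeholder: the genesis block has no predecessor

@dataclass
class Block:
    data: bytes
    prev_hash: bytes  # hash pointer to the previous block

def block_hash(block: Block) -> bytes:
    """Hash the whole block, including its pointer to the previous block."""
    return hashlib.sha256(block.prev_hash + block.data).digest()

def build_chain(items: List[bytes]) -> Tuple[List[Block], bytes]:
    """Return the chain and the hash pointer to its head (the newest block)."""
    chain: List[Block] = []
    prev = GENESIS_PREV
    for data in items:
        block = Block(data, prev)
        chain.append(block)
        prev = block_hash(block)
    return chain, prev

def verify_chain(chain: List[Block], head_hash: bytes) -> bool:
    """Walk back from the trusted head hash, checking every hash pointer."""
    expected = head_hash
    for block in reversed(chain):
        if block_hash(block) != expected:
            return False
        expected = block.prev_hash
    return expected == GENESIS_PREV

chain, head = build_chain([b"tx1", b"tx2", b"tx3"])
print(verify_chain(chain, head))              # untouched chain
chain[1].data = b"evil"                       # adversary edits a middle block
chain[2].prev_hash = block_hash(chain[1])     # ...and patches the next pointer
print(verify_chain(chain, head))              # the stored head hash no longer matches
```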
Merkle Trees [1]
• A binary tree with hash pointers is known as a Merkle tree after its inventor, Ralph
Merkle.
• The blocks of data make up the leaves of the tree.
• The data blocks (say, transactions) are grouped into pairs.
• For each pair, we build a data structure that has two hash pointers, one to each of the
blocks. These data structures make up the next level of the tree.
Merkle Trees [2]
• These, in turn, are grouped into pairs.
• For each pair, a new data structure that contains the hash of each is created.
• This process continues until we reach a single block, the root of the tree.
• The pointer at the root of the tree is remembered
Merkle Trees [3]
• If an adversary tampers with some data block at the bottom of the tree, this change will
cause the hash pointer one level up to not match
• Even if he continues to tamper with other blocks farther up the tree, the change will
eventually propagate to the top.
• Because the root hash pointer is stored safely, the tampering will be detected there.
Merkle Trees: Proof of Membership
To confirm Transaction D, one only needs to know H(AB), H(C), H(D), and H(EFGH).
• Concise Proof of Membership: Prove that a certain data block is a member of the
Merkle tree. The root is known.
• Required: The data block, and the blocks on the path from the data block to the root.
• The rest of the tree can be ignored as blocks on this path are enough to allow us to
verify the hashes all the way up to the root of the tree.
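A proof of membership for transaction D in an eight-leaf tree can be sketched as below. The sketch assumes SHA-256, a power-of-two number of leaves, and parent hashes computed as H(left ǁ right); the function names and the (sibling, side) proof format are illustrative and do not reproduce Bitcoin's exact Merkle tree encoding.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, True if sibling is on the left)

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves: List[bytes], index: int) -> Tuple[bytes, Proof]:
    """Return the root and the sibling hashes on the path from leaves[index]."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("this sketch needs a power-of-two number of leaves")
    level = [H(leaf) for leaf in leaves]
    proof: Proof = []
    while len(level) > 1:
        sibling = index ^ 1  # the other node of the same pair
        proof.append((level[sibling], sibling < index))
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify_membership(leaf: bytes, proof: Proof, root: bytes) -> bool:
    """Recompute the hashes up the path and compare with the known root."""
    current = H(leaf)
    for sibling, sibling_is_left in proof:
        current = H(sibling + current) if sibling_is_left else H(current + sibling)
    return current == root

txs = [b"A", b"B", b"C", b"D", b"E", b"F", b"G", b"H"]
root, proof = merkle_root_and_proof(txs, 3)   # prove membership of D
print(len(proof), verify_membership(b"D", proof, root))
```

For eight leaves the proof holds three hashes, H(C), H(AB), and H(EFGH), so its size grows with the logarithm of the number of data blocks.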
Digital Signatures [1]
A digital signature is supposed to be the digital analog
(equivalent) of a handwritten signature on paper.
Two desirable properties of digital signatures:
Only the concerned person can make his/her signature, but
anyone who sees it can verify that it’s valid.
The signature should be tied to a particular document, so that
the signature cannot be used to indicate the signer’s agreement
or endorsement of a different document.
For handwritten signatures, this latter property is analogous to
ensuring that somebody can’t take your signature and snip it off
one document and glue it to the bottom of another one.
Digital Signatures [2]
• generateKeys and sign are
randomized algorithms
• Generates different keys for
different people
• Verify is always deterministic
• Valid signatures must be
verifiable – basic requirement
• Sign a message with secret key,
sk.
• Later, validate that signature
over that same message using the
public key, pk.
• The signature must validate
correctly.
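The three-operation interface (generate keys, sign, verify) can be illustrated using only a hash function. The sketch below is a Lamport one-time signature, a different and much simpler scheme than the ECDSA used in Bitcoin; it is swapped in here only because it needs nothing beyond SHA-256. Unlike the randomized sign described above, signing in this scheme is deterministic, and each key pair may safely sign only one message. All names are illustrative.

```python
import hashlib
import secrets
from typing import List, Tuple

SecretKey = List[Tuple[bytes, bytes]]
PublicKey = List[Tuple[bytes, bytes]]

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def message_bits(msg: bytes) -> List[int]:
    """The 256 bits of H(msg); the signature covers this hash of the message."""
    digest = H(msg)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def generate_keys() -> Tuple[SecretKey, PublicKey]:
    """Randomized: 256 pairs of secret values; pk holds the hash of each one."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(zero), H(one)) for zero, one in sk]
    return sk, pk

def sign(sk: SecretKey, msg: bytes) -> List[bytes]:
    """Reveal, for each bit of H(msg), the secret value matching that bit."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(msg))]

def verify(pk: PublicKey, msg: bytes, sig: List[bytes]) -> bool:
    """Deterministic: each revealed value must hash to the matching pk entry."""
    if len(sig) != 256:
        return False
    bits = message_bits(msg)
    return all(H(sig[i]) == pk[i][bits[i]] for i in range(256))

sk, pk = generate_keys()
sig = sign(sk, b"pay Bob 1 coin")
print(verify(pk, b"pay Bob 1 coin", sig))     # same message, matching key
print(verify(pk, b"pay Bob 100 coins", sig))  # signature reused on another message
```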
Digital Signatures [3]
Unforgeability – It is computationally infeasible to forge signatures.
An adversary who knows your public key (pk) and has seen your
signatures on some messages (m1, ..., mn) cannot forge your signature
on a message that he has not seen signed (m_unseen).
Formalized in terms of a game that is played with an adversary.
The adversary claims that he can forge signatures
The challenger tests this claim
Step 1: Use generateKeys to generate sk and pk.
Step 2: sk is given to the challenger, and pk is given to both the challenger
and the adversary
Step 3: The adversary knows only information that is public, and
his task is to forge a signature
Step 4: The challenger can make signatures since he knows sk.
Digital Signatures [4]
The setup of this game matches real-world conditions
A real-world attacker would be able to see valid signatures from his
would-be victim on different documents
He might also manipulate the victim into signing innocuous-looking documents
Game: Allow the adversary to get signatures on documents of his choice, for as
long as he wants, as long as the number of guesses is plausible
Try 1 million guesses but not 2^80 guesses
• After the adversary has seen enough signatures, he will pick some message
M that he will attempt to forge a signature on.
• M should not have been signed before.
Digital Signatures [5]
The challenger runs the verify
algorithm on the signature
produced by the adversary.
Is the signature produced by the
adversary on M a valid one under
the public verification key?
If the signature successfully
verifies, the adversary wins the
game.
The signature scheme is unforgeable
if and only if the adversary's chance of
successfully forging a signature is
extremely small, so small that it
will never happen in practice.
Digital Signatures [6]
Practical concerns:
Source of Randomness
Many signature algorithms are randomized
Good source of randomness is important
Bad randomness makes an otherwise secure algorithm insecure.
Message Size:
In practice, there is a limit to the length of the message that you can sign.
Getting around the limitation
Sign the hash of the message rather than the message itself.
Sign “Hash Pointer”
Here, the signature covers or protects the whole structure: not just the hash
pointer itself but everything the chain of hash pointers points to.
Digitally sign the entire Blockchain
Sign the hash pointer located at the end of the Blockchain.
Digital Signatures [7]
ECDSA (Elliptic Curve Digital Signature Algorithm)
Digital signature scheme used in Bitcoin
US Government standard
An update of the earlier DSA algorithm adapted to use elliptic curves
These algorithms are generally believed to be secure
Bitcoin uses ECDSA over the standard elliptic curve secp256k1, which
is estimated to provide 128 bits of security
That is, it is as difficult to break this algorithm as it is to perform 2^128 symmetric-key
cryptographic operations, such as invoking a hash function
Other applications using ECDSA (such as key exchange in the TLS
protocol for secure web browsing)
use the more common secp256r1 curve
secp256k1 was chosen by Satoshi in the early specification of the
system and is now difficult to change.
Digital Signatures [8]
Scroogecoin [1]
Solving the double-spending problem
A designated entity called Scrooge publishes an append-only ledger containing the history
of all transactions
The append-only ledger protects against double-spending by requiring all transactions to be
written in the ledger before they are accepted.
The append-only functionality can be implemented via a Blockchain that Scrooge digitally
signs.
Each block has the ID of a transaction, the transaction’s contents, and a hash pointer to the previous block.
Scrooge digitally signs the final hash pointer, which binds all the data in this entire structure, and he
publishes the signature along with the block chain
Scroogecoin [2]
A transaction is valid only if it is in the block chain
signed by Scrooge.
Verification: Anybody can verify that a transaction was
endorsed by Scrooge by checking Scrooge’s signature on
the block that records the transaction.
A transaction that attempts to double spend an already spent coin is not
endorsed by Scrooge.
To ensure the append-only property, we need both a block chain with hash
pointers and Scrooge's signature on each block.
Any change by Scrooge, i.e. addition, modification, or removal of a transaction,
will affect all following blocks because of the hash pointers.
If someone monitors the latest hash pointer published by
Scrooge, the change will be obvious and easy to catch.
• On the other hand, in a system where Scrooge signed blocks individually, one
would have to keep track of every single signature Scrooge ever issued.
• A block chain makes it easy for any two individuals to verify that they have
observed the same history of transactions signed by Scrooge
Transactions in Scroogecoin [1]
CreateCoins creates multiple new coins with different values and assigns
them to people as initial owners.
Multiple coins are allowed to be created in one transaction
Each coin has a serial number in the transaction.
Each coin also has a value; it’s worth a certain number of scroogecoins.
Each coin has a recipient - a public key that gets the coin when it’s created.
CoinIDs: A CoinID is a combination of a transaction ID and the coin’s serial
number in that transaction.
Transactions in Scroogecoin [2]
PayCoins transaction consumes some coins (i.e., destroys them) and creates
new coins of the same total value.
The new coins might belong to different people (public keys).
This transaction has to be signed by everyone who’s paying in a coin.
That is, the owner of each coin that is going to be consumed in this transaction
has to digitally sign the transaction to say that he/she is OK with spending this
coin.
Transactions in Scroogecoin [3]
A PayCoins transaction is valid if it satisfies four conditions:
Consumed coins are valid, i.e. they were created in previous transactions.
The consumed coins have not already been consumed in some previous transaction.
That is, this is not a double-spend transaction.
The total value of the coins that come out of this transaction is equal to the total value of
the coins that went in. Only Scrooge can create new value.
The transaction is validly signed by the owners of all coins consumed in the transaction.
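The four conditions can be sketched as a validity check. This is an illustration under simplifying assumptions: owners are plain strings standing in for public keys, and signature checking is abstracted into a hypothetical `verified_signers` set holding the owners whose signatures on the transaction have already been verified. The class and function names are not part of Scroogecoin's description.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

CoinID = Tuple[int, int]  # (transaction ID, serial number within that transaction)

@dataclass(frozen=True)
class Coin:
    value: int
    owner: str  # stand-in for the owner's public key

@dataclass
class PayCoins:
    consumed: List[CoinID]
    created: List[Coin]
    verified_signers: Set[str]  # owners whose signatures were checked (assumed)

def is_valid(tx: PayCoins, coins: Dict[CoinID, Coin], spent: Set[CoinID]) -> bool:
    # 1. Every consumed coin was created in a previous transaction.
    if any(cid not in coins for cid in tx.consumed):
        return False
    # 2. No consumed coin was already spent, or listed twice in this transaction.
    if len(set(tx.consumed)) != len(tx.consumed):
        return False
    if any(cid in spent for cid in tx.consumed):
        return False
    # 3. Total value out equals total value in.
    value_in = sum(coins[cid].value for cid in tx.consumed)
    value_out = sum(coin.value for coin in tx.created)
    if value_in != value_out:
        return False
    # 4. The owner of every consumed coin has signed the transaction.
    owners = {coins[cid].owner for cid in tx.consumed}
    return owners <= tx.verified_signers

coins = {(1, 0): Coin(3, "alice_pk"), (1, 1): Coin(2, "bob_pk")}
split = PayCoins([(1, 0)], [Coin(1, "bob_pk"), Coin(2, "alice_pk")], {"alice_pk"})
print(is_valid(split, coins, spent=set()))      # coin (1, 0) is still unspent
print(is_valid(split, coins, spent={(1, 0)}))   # same coin spent a second time
```

The example transaction also shows how Alice pays part of a coin: she consumes her 3-unit coin and creates a 1-unit coin for Bob and a 2-unit coin for herself.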
Transactions in Scroogecoin [4]
Coins in this system are immutable—they are never changed, subdivided, or
combined.
Each coin is created, once, in one transaction and then later consumed in
another transaction.
We can get the same effect as being able to subdivide or combine coins by
using transactions.
For example, to subdivide a coin, Alice creates a new transaction that
consumes that one coin and then produces two new coins of the same total
value.
Those two new coins could be assigned back to her. So although coins are
immutable in this system, it has all the flexibility of a system that doesn’t have
immutable coins.
Transactions in Scroogecoin [5]
Core problem with Scroogecoin.
People can see which coins are valid and it prevents double spending.
Everyone can look into the blockchain and see that all transactions are valid
and that every coin is consumed only once.
The central problem here is Scrooge—he has too much influence.
Scrooge can’t create fake transactions, because he can’t forge other people’s signatures.
Scrooge could stop endorsing transactions from some users, denying them service and
making their coins unspendable.
If Scrooge is greedy, he could refuse to publish transactions unless they transfer some
mandated transaction fee to him.
Scrooge can also of course create as many new coins for himself as he wants.
Finally, Scrooge could get bored of the whole system and stop updating the block chain
completely.
Videos
Inside a Bit Coin Mining Farm (Longer Video)
https://www.youtube.com/watch?v=82vMOVREXzM
Inside the Largest Bitcoin Mine in The U.S. | WIRED
https://www.youtube.com/watch?v=x9J0NdV0u9k