
2010-02-12

A tour of the open standards used by Google Buzz

The thing I find most attractive about Google Buzz is its stated commitment to open standards:

We believe that the social web works best when it works like the rest of the web — many sites linked together by simple open standards.

So I took a bit of time to look over the standards involved. I’ll focus here on the standards that are new to me.

One key design decision in Google Buzz is that individuals in the social web should be identifiable by email addresses (or at least strings that look like email addresses).  On balance I agree with this decision: although it is perhaps better from a purist Web architecture perspective to use URIs for this, I think email addresses work much better from a UI perspective.

Google Buzz therefore has some standards to address the resulting discovery problem: how to associate metadata with something that looks like an email address. There are two key standards here:

  • XRD. This is a simple XML format developed by the OASIS XRI TC for representing metadata about a resource in a generic way. This looks very reasonable and I am happy to see that it is free of any XRI cruft. It seems quite similar to RDDL.
  • WebFinger. This provides a mechanism for getting from an email address to an XRD file.  It’s a two-step process based on HTTP (a rough sketch of the two steps follows this list).  First, you do an HTTP GET of an XRD file from a well-known URI constructed using the domain part of the email address (the well-known URI follows the Defining Well-Known URIs and host-meta Internet Drafts). This per-domain XRD file provides (amongst other things) a URI template that tells you how to construct a URI for an email address in that domain; dereferencing this URI will give you an XRD representation of metadata related to that email address.  There have also been some noises about a JSON serialization, which makes sense: JSON seems like a good fit for this problem.
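
Here is a minimal sketch of that two-step lookup in Python. It assumes the Link/rel="lrdd"/template conventions and the acct: prefix as I understand the current drafts; the namespace URI, URLs and code are illustrative rather than normative:

import urllib.request
import xml.etree.ElementTree as ET

XRD_NS = "{http://docs.oasis-open.org/ns/xri/xrd-1.0}"

def discover(address):
    """Resolve an email-like identifier (user@domain) to its per-user XRD."""
    domain = address.split("@", 1)[1]
    # Step 1: fetch the per-domain XRD from the well-known host-meta URI.
    with urllib.request.urlopen(f"https://{domain}/.well-known/host-meta") as r:
        host_meta = ET.fromstring(r.read())
    # Step 2: find the URI template, fill in the identifier, and dereference
    # it to get the XRD describing that address.
    for link in host_meta.findall(XRD_NS + "Link"):
        if link.get("rel") == "lrdd" and link.get("template") is not None:
            uri = link.get("template").replace("{uri}", "acct:" + address)
            with urllib.request.urlopen(uri) as r:
                return ET.fromstring(r.read())
    raise LookupError("no URI template found in host-meta")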

One of the many interesting things you can do with such a discovery mechanism is to associate a public key with an individual.  There’s a spec called Magic Signatures that defines this.  Magic Signatures correctly eschews all the usual X.509 cruft, which is completely unnecessary here; all you need is a simple RSA public key.  My one quibble would be that it invents its own format for public keys, when there is already a perfectly good standard format for this: the DER encoding of the RSAPublicKey ASN.1 structure (defined by RFC 3447/PKCS#1), as used by e.g. OpenSSL.
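
For illustration, this is the encoding I mean, produced here with the third-party Python cryptography package (the library choice is mine; any ASN.1 tool would do):

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
der = key.public_key().public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.PKCS1,  # the PKCS#1 RSAPublicKey structure
)
print(len(der), "bytes of DER-encoded RSAPublicKey")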

Note that for this to be secure, WebFinger needs to fetch the XRD files in a secure way, which means either using SSL or signing the XRD file using XML-DSig; in both these cases it is leveraging the existing X.509 infrastructure. The key architectural decision here is to use the X.509 infrastructure to establish trust at the domain level, and then to use Web technologies to extend that chain of trust from the domain to the individual.

From a deployment perspective, I think this will work well for things like Gmail and Facebook, where you have many users per domain.  The challenge will be to make it work well for things like Google Apps for your Domain, where the number of users per domain may be few.  At the moment, Google Apps requires the domain administrator only to set up some DNS records.  The problem is that DNS isn’t secure (at least until DNSSEC is widely deployed).

Here’s one possible solution: the user’s domain (e.g. jclark.com) would have an SRV record pointing to a host in the provider’s domain (e.g. foo.google.com); the XRD is fetched using HTTP, but is signed using XML-DSig and an X.509 certificate for the user’s domain.  The WebFinger service provider (e.g. Google) would take care of issuing these certificates, perhaps with flags to limit their usage to WebFinger (Google already verifies domain control as part of the Google Apps setup process). The trusted roots here might be different from the normal browser-vendor-determined HTTPS roots.

The other part of Magic Signatures is billed as a simpler alternative to XML-DSig that also works for JSON. The key idea here is to avoid the whole concept of signing an XML information item and thus avoid the need for canonicalization.  Instead you sign a byte sequence, which is encoded in base64 as the content of an XML element (or as a JSON string).  I don’t agree with the idea of always requiring base64 encoding of the content to be signed: that seems to throw away many of the benefits of a textual format unnecessarily.  Instead, when the byte sequence you are signing represents a Unicode string, you should be able to represent that Unicode string directly as the content of an XML element or as a JSON string, using the built-in quoting mechanisms of XML (character references/entities and CDATA sections) or JSON. The Unicode string that results from XML or JSON parsing would then be UTF-8 encoded before the standard signature algorithm is applied.

A more fundamental problem with Magic Signatures is that it loses the key feature of XML-DSig (particularly with enveloped signatures) that applications that don’t know or care about signing can still understand the signed data, simply by ignoring the signature.  I completely sympathize with the desire to avoid the complexity of XML-DSig, but I’m unconvinced that Magic Signatures is the right way to do so. Note that XRD has a dependency on XML-DSig, but it specifies a very limited profile of XML-DSig, which radically reduces the complexity of XML-DSig processing.
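
To make the alternative concrete, here is a tiny sketch of my own (not from any of these specs): the payload stays ordinary JSON text, and the signature is computed over its UTF-8 encoding; sign_bytes stands in for whatever RSA signing primitive is in use.

import json

def signed_envelope(text, sign_bytes):
    # The payload is carried as an ordinary JSON string, readable as-is,
    # rather than base64-encoded; the signature covers its UTF-8 encoding.
    signature = sign_bytes(text.encode("utf-8"))
    return json.dumps({"data": text, "sig": signature.hex()})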

There are also standards that extend Atom. The simplest are just content extensions:

  • Atom Activity Extensions provides semantic markup for social networking activities (such as "liking" something or posting something). This makes good sense to me.
  • Media RSS Module provides extensions for dealing with multimedia content. These were originally designed by Yahoo for RSS. I don't yet understand how these interact with existing Atom/AtomPub mechanisms for multimedia (content/@src, link).

There are also protocol extensions:

  • PubSubHubbub provides a scalable way of getting near-realtime updates from an Atom feed. The Atom feed includes a link to a “hub”.  An aggregator can then register with the hub to be notified when the feed is updated. When a publisher updates a feed, it pings the hub, and the hub then updates all the aggregators that have registered with it.  This is intended for server-based aggregators, since the hub uses HTTP POST to notify them. (A sketch of the subscription request follows this list.)
  • Salmon makes feed aggregation two-way.  Suppose user A uses only social networking site X and user B uses only social networking site Y. If user A wants to network with B, then typically either A has to join Y or B has to join X.  This pushes the world in the direction of having one dominant social network (i.e. Facebook). In the long-term I don’t think this is a good thing.  The above extensions solve part of the problem. X can expose a profile for A that links to an Atom feed, and Y can use this to provide B with information about A. But there’s a problem.  Suppose B wants to comment on one of A’s entries.  How can Y ensure that B’s comment flows back to X, where A can see it?  Note that there may be another user C on another social networking site Z that may want to see B’s comment on A’s entry. The basic idea is simple: the Atom feed for A exposed by X links to a URI to which comments can be posted.  The heavy lifting of Salmon is done by Magic Signatures.  Signing the Atom entries is the key to allowing sites to determine whether to accept comments.
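
Here is the PubSubHubbub subscription step sketched in Python; the hub.* parameter names follow the PubSubHubbub spec, while the function and URL arguments are placeholders of my own:

import urllib.parse
import urllib.request

def subscribe(hub_url, topic_url, callback_url):
    """Ask the hub to POST future updates of topic_url to callback_url."""
    body = urllib.parse.urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the Atom feed being followed
        "hub.callback": callback_url,  # where the hub delivers new entries
        "hub.verify": "async",
    }).encode("ascii")
    with urllib.request.urlopen(hub_url, data=body) as response:
        return response.status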

Google seems to be planning to use the Open Web Foundation (OWF) for some of these standards.  Although the OWF’s list of members includes many names that I recognize and respect, I don’t really understand why we need the OWF. It seems very similar to the IETF in its emphasis on individual participation.  What was the perceived deficiency in the IETF that motivated the formation of the OWF?

2007-10-16

HTTP response signing strawman

If we revise the abstract model for generating a Signature header along the lines suggested in my previous post, we get this:

  1. Choose which key (security token) to use and create one or more identifiers for it.  One possible kind of key would be an X.509 certificate.
  2. Choose which response headers to sign.  This would include at least Content-Type and probably Date and Expires.  It would not include hop-by-hop headers.
  3. Compute the digest (cryptographic hash) of the full entity body of the requested URI. Base64-encode the digest.
  4. Create a Signature header template; this differs from the final Signature header only in that it has a blank string at the point where the final Signature header will have the base64-encoded signature value. It can specify the following information:
    • the type of key;
    • one or more identifiers for the key;
    • an identifier for the suite of cryptographic algorithms to be used;
    • an identifier for the header canonicalization algorithm to be used;
    • a list of the names of the response headers to be signed;
    • the request URI;
    • the base64 encoded digest (from step 3).
  5. Combine the response headers that are to be signed with the Signature header template.
  6. Canonicalize the headers from the previous step.  This ensures that the canonicalization of the headers as seen by the origin server is the same as the canonicalization of the headers as seen by the client, even if there are one or more HTTP/1.1 conforming proxies between the client and the origin server.
  7. Compute the cryptographic hash of the canonicalized headers.
  8. Sign the cryptographic hash created in the previous step.  Base64-encode this to create the signature value.
  9. Create the final Signature header by inserting the base64-encoded signature value from the previous step into the Signature header template from step 4.

Note that when verifying the signature, as well as checking the signature value, you have to compute the digest of the entity body and check that it matches the digest specified in the Signature header.
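
Purely for illustration, here is a rough sketch of steps 3 through 9 in Python (my choice, not part of the proposal). It assumes an opaque sign(bytes) RSA primitive, SHA-1 as the digest algorithm, and a made-up “basic” canonicalization that lower-cases header names, trims values and joins the pairs with newlines; none of these details are meant to be normative.

import base64
import hashlib

def canonicalize(headers, names):
    # Stand-in "basic" canonicalization: lower-cased name, trimmed value.
    return "\n".join(
        f"{name.lower()}:{headers[name].strip()}" for name in names
    ).encode("utf-8")

def make_signature_header(headers, body, request_uri, signed_names, sign):
    digest = base64.b64encode(hashlib.sha1(body).digest()).decode()   # step 3
    template = ('x509;value="";canon=basic;'                          # step 4
                f'headers="{" ".join(signed_names)}";'
                f'request-uri="{request_uri}";digest="{digest}";crypt=rsa-sha1')
    combined = dict(headers, Signature=template)                      # step 5
    canonical = canonicalize(combined, signed_names + ["Signature"])  # step 6
    hashed = hashlib.sha1(canonical).digest()                         # step 7
    value = base64.b64encode(sign(hashed)).decode()                   # step 8
    return template.replace('value=""', f'value="{value}"')           # step 9

Verification would run the same canonicalization on the received headers (with the value parameter blanked out), recompute the digest of the entity body, and check both against the Signature header, as noted above.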

The syntax could be something like this:

Signature = "Signature" ":" #signature-spec
signature-spec = key-type 1*( ";" signature-param )
key-type = "x509" | key-type-extension
signature-param =
   "value" "=" <"> <Base64 encoded signature> <">
   | "canon" "=" ( "basic" | canon-extension )
   | "headers" "=" <"> 1#field-name <">
   | "request-uri" "=" quoted-string
   | "digest" "=" <"> <Base64 encoded digest> <">
   | "crypt" "=" ( "rsa-sha1" | crypt-extension )
   | "key-uri" "=" quoted-string
   | "key-uid" "=" ( sha1-fingerprint | uid-extension )
   | signature-param-extension
sha1-fingerprint = <"> "sha1" 20(":" 2UHEX) <">
UHEX = DIGIT | "A" | "B" | "C" | "D" | "E" | "F"
uid-extension = <"> uid-type ":" 1*uid-char <">
uid-type = token
uid-char = <any CHAR except CTLs, <\> and <">>
key-type-extension = token
canon-extension = token
crypt-extension = token
hash-func-extension = token
signature-param-extension =
   token "=" (token | quoted-string)

There are several issues I'm not sure about.

  • Should this be generalized to support signing of (some kinds of) HTTP request?
  • What is the right way to canonicalize HTTP headers?
  • Rather than having a digest parameter, would it be better to use the Digest header from RFC 3230 and then include that in the list of headers to be signed?
  • Should the time period during which the signature is valid be specified explicitly by parameters in the Signature header rather than being inferred from other headers, such as Date and Expires (which would of course need to be included in the list of headers to sign)?
  • Should support for security tokens other than X.509 certificates be specified?

2007-10-15

HTTP: what to sign?

There's been quite a number of useful comments on my previous post, and even an implementation.  The main area where there seems to be disagreement is on the issue of what exactly to sign.

It seems to me that you can look at an HTTP interaction at two different levels:

  • at a low level, it consists of request and response messages;
  • at a slightly higher level, it consists of the transfer of the representations of resources.

With a simple GET, there's a one-to-one correspondence between a response message and a representation transfer.  But with fancier HTTP features, like HEAD or conditional GET or ranges or the proposed PATCH method, these two levels start to diverge: the messages aren't independent entities in themselves; they are artifacts of the client attempting to efficiently synchronize the representation of the resource that it has with the current representation defined by the origin server.

The question then arises of whether, at an abstract level, the right thing to sign is messages or resource representations.  I think the right answer is resource representations: those are the things whose integrity is important to applications.  For example, the signature in a response to a HEAD request wouldn't simply sign that response; rather, it would cover the entity that would have been returned by a GET. The Signature header would thus be allowed in similar situations to the ETag header and would correspond to the same thing that a strong entity tag corresponds to.

It's important to remember that the representation of the resource doesn't consist of just the data in the entity body.  It also includes the metadata in the entity headers.  At the very least, I think you would want to sign the Content-Type header. Note that there are some headers that you definitely wouldn't want to sign, in particular hop-to-hop headers.  I don't think there's a single right answer as to which headers to sign, which means that the Signature header will need to explicitly identify which headers it is signing.

With this approach the signature doesn't need to cover the request.  However, it does need to relate the representation to a particular resource. Otherwise there's a nasty attack possible: the bad guy can replace the response to a request for one resource with the response to a request for another resource. (Suppose http://www.example.com/products/x/price returns the price of product x; an attacker could completely switch around the price list.)  I think the simplest way to solve this is for the Signature header in the response to include a uri="request_uri" parameter, where request_uri is the URI of the resource whose representation is being signed. This allows the signature verification process to work with just the response headers and body as input, which should simplify plugging this feature into implementations.
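
As a rough sketch of that verification (the parameter names are illustrative, and parsing of the Signature header and the check of the signature value itself are elided):

import base64
import hashlib

def check_signed_response(sig_params, body, requested_uri):
    # The signed URI must be the resource we actually asked for; otherwise a
    # validly signed response for one resource could be swapped in for another.
    if sig_params["uri"] != requested_uri:
        return False
    # Check that the signed digest matches the entity body we received; the
    # signature value would then be verified against the signed headers.
    digest = base64.b64encode(hashlib.sha1(body).digest()).decode()
    return sig_params.get("digest") == digest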

Although not including the request headers in the signature simplifies things, it must be recognized that it does lose some functionality. When there are multiple variants, the signature can't prove that you've got the right variant. However, I think that's a reasonable tradeoff.  Even if the request headers were signed, sometimes the response depends on things that aren't in the request, like the client's IP address (as indicated by Vary: *). The response can at least indicate that the response is one of several possible variants, by including Content-Location, Content-Language and/or Vary headers amongst the signed response headers.

The signature will also need to include information about the time during which the relationship between the representation and the resource applies.  I haven't figured out exactly how this should work.  It might be a matter of signing some combination of the Date, Last-Modified, Expires and Cache-Control (specifically the s-maxage and max-age directives) headers, or it might involve adding timestamp parameters to the Signature header.

To summarize, the signature in the response should assert that a particular entity is a representation of a particular resource at a particular time.