The Programmer's Hindsight - Caching with HttpClientFactory and Polly Part 2
In my last blog post, "Adding Cross-Cutting Memory Caching to an HttpClientFactory in ASP.NET Core with Polly," I actually failed to complete my mission. I talked to a few people (thanks Dylan and Damian and friends) and I think my initial goal may have been wrong.
I thought I wanted "magic add this policy and get free caching" for HttpClients that come out of the new .NET Core 2.1 HttpClientFactory, but first, nothing is free, and second, everything (in computer science) is layered. Am I caching the right thing at the right layer?
The good thing that comes out of explorations and discussions like this is Better Software. Given that I'm running Previews/Daily Builds of both .NET Core 2.1 (in preview as of the time of this writing) and Polly (always under active development), I realize I'm on some kind of cutting edge. The bad news (and it's not really bad) is that everything I want to do is possible; it's just not always easy. For example, a lot of "hooking up" happens when one makes a C# Extension Method and adds it into the ASP.NET Middleware Pipeline with "services.AddSomeStuffThatIsTediousButUseful()."
Polly and ASP.NET Core are insanely configurable, but I'm personally interested in the 80% or even the 90% case. The other 10% will definitely require you/me to learn more about the internals of the system, while the 90% will ideally be abstracted away from the average developer (me).
I've had a Skype with Dylan from Polly and he's been updating the excellent Polly docs as we walk around how caching should work in an HttpClientFactory world. Fantastic stuff, go read it. I'll steal some here:
ASPNET Core 2.1 - What is HttpClient factory?
From ASPNET Core 2.1, Polly integrates with IHttpClientFactory. HttpClient factory is a factory that simplifies the management and usage of HttpClient in four ways. It:

- allows you to name and configure logical HttpClients. For instance, you may configure a client that is pre-configured to access the github API;
- manages the lifetime of HttpClientMessageHandlers to avoid some of the pitfalls associated with managing HttpClient yourself (the don't-dispose-it-too-often but also don't-use-only-a-singleton aspects);
- provides configurable logging (via ILogger) for all requests and responses performed by clients created with the factory;
- provides a simple API for adding middleware to outgoing calls, be that for logging, authorisation, service discovery, or resilience with Polly.
The Microsoft early announcement speaks more to these topics, and Steve Gordon's pair of blog posts (1; 2) are also an excellent read for deeper background and some great worked examples.
Polly and Polly policies work great with ASP.NET Core 2.1 and integrate nicely. I'm sure it will integrate even more conveniently with a few smart Extension Methods to abstract away the hard parts so we can fall into the "pit of success."
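As a sketch of what that wiring looks like today (the client type, base address, and retry policy here are illustrative assumptions, not my actual code), a typed client gets its Polly policies attached at registration time in ConfigureServices:

```csharp
// In Startup.ConfigureServices - a hedged sketch, not a definitive setup.
// SimpleCastClient and the base address are assumptions for illustration.
services.AddHttpClient<SimpleCastClient>(client =>
{
    client.BaseAddress = new Uri("https://api.simplecast.com/");
})
.AddTransientHttpErrorPolicy(policyBuilder =>
    // Retry transient HTTP errors (5xx and 408) with exponential backoff
    policyBuilder.WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt))));
```

The nice part is that every HttpClient the factory hands out for that typed client comes with the policy already in its handler pipeline.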
Caching with Polly and HttpClient
Here's where it gets interesting. To me. Or, you, I suppose, Dear Reader, if you made it this far into a blog post (and sentence) with too many commas.
This is a salient and important point:
Polly is generic (not tied to Http requests)
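Because Polly is generic, a CachePolicy can cache any result type, not just HTTP responses. A rough sketch, assuming the Polly.Caching.Memory provider package (the types and key name here are illustrative):

```csharp
// Polly's CachePolicy works against any TResult, not just HTTP things.
// MemoryCacheProvider comes from the Polly.Caching.Memory package.
var cacheProvider = new MemoryCacheProvider(memoryCache);
var cachePolicy = Policy.CacheAsync<List<Show>>(cacheProvider, TimeSpan.FromHours(4));

// The Context's operation key doubles as the cache key.
var shows = await cachePolicy.ExecuteAsync(
    ctx => _client.GetShows(),
    new Context("showsList"));
```

Note that here the policy caches the already-deserialized List&lt;Show&gt;, which is the distinction that bit me below.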
Now, this is where I got in trouble:
Caching with Polly CachePolicy in a DelegatingHandler caches at the HttpResponseMessage level
I ended up caching an HttpResponseMessage...but it has a "stream" inside it at HttpResponseMessage.Content. It's meant to be read once. Not cached. I wasn't caching a string, or some JSON, or some deserialized JSON objects, I ended up caching what's (effectively) an ephemeral one-time object and then de-serializing it every time. I mean, it's cached, but why am I paying the deserialization cost on every Page View?
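To make the pitfall concrete (a sketch; the endpoint path is made up and I'm assuming Newtonsoft.Json for deserialization):

```csharp
// Even if this HttpResponseMessage came out of a cache, its Content is
// (effectively) a one-time payload: every consumer still has to read the
// body and deserialize it. The expensive part isn't being cached at all.
HttpResponseMessage response = await client.GetAsync("/v1/shows.json");
string json = await response.Content.ReadAsStringAsync();
List<Show> shows = JsonConvert.DeserializeObject<List<Show>>(json);
```

Caching above the deserialization step - the List&lt;Show&gt; itself - is what actually saves the work.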
The Programmer's Hindsight: This is such a classic programming/coding experience. Yesterday this was opaque and confusing. I didn't understand what was happening or why it was happening. Today - with The Programmer's Hindsight - I know exactly where I went wrong and why. Like, how did I ever think this was gonna work? ;)
As Dylan from Polly so wisely points out:
It may be more appropriate to cache at a level higher up. For example, cache the results of stream-reading and deserializing to the local type your app uses. Which, ironically, I was already doing in my original code. It just felt heavy. Too much caching and too little business. I am trying to refactor it away and make it more generic!
This is my "ShowDatabase" (it's just a JSON file behind an API) that wraps my HttpClient:
```csharp
public class ShowDatabase : IShowDatabase
{
    private readonly IMemoryCache _cache;
    private readonly ILogger _logger;
    private SimpleCastClient _client;

    public ShowDatabase(IMemoryCache memoryCache, ILogger<ShowDatabase> logger, SimpleCastClient client)
    {
        _client = client;
        _logger = logger;
        _cache = memoryCache;
    }

    static SemaphoreSlim semaphoreSlim = new SemaphoreSlim(1);

    public async Task<List<Show>> GetShows()
    {
        Func<Show, bool> whereClause = c => c.PublishedAt < DateTime.UtcNow;

        var cacheKey = "showsList";
        List<Show> shows = null;

        //CHECK and BAIL - optimistic
        if (_cache.TryGetValue(cacheKey, out shows))
        {
            _logger.LogDebug($"Cache HIT: Found {cacheKey}");
            return shows.Where(whereClause).ToList();
        }

        await semaphoreSlim.WaitAsync();
        try
        {
            //RARE BUT NEEDED DOUBLE PARANOID CHECK - pessimistic
            if (_cache.TryGetValue(cacheKey, out shows))
            {
                _logger.LogDebug($"Amazing Speed Cache HIT: Found {cacheKey}");
                return shows.Where(whereClause).ToList();
            }

            _logger.LogWarning($"Cache MISS: Loading new shows");
            shows = await _client.GetShows();
            _logger.LogWarning($"Cache MISS: Loaded {shows.Count} shows");
            _logger.LogWarning($"Cache MISS: Loaded {shows.Where(whereClause).ToList().Count} PUBLISHED shows");

            var cacheExpirationOptions = new MemoryCacheEntryOptions();
            cacheExpirationOptions.AbsoluteExpiration = DateTime.Now.AddHours(4);
            cacheExpirationOptions.Priority = CacheItemPriority.Normal;

            _cache.Set(cacheKey, shows, cacheExpirationOptions);
            return shows.Where(whereClause).ToList();
        }
        catch (Exception e)
        {
            _logger.LogCritical("Error getting episodes!");
            _logger.LogCritical(e.ToString());
            _logger.LogCritical(e?.InnerException?.ToString());
            throw;
        }
        finally
        {
            semaphoreSlim.Release();
        }
    }
}

public interface IShowDatabase
{
    Task<List<Show>> GetShows();
}
```
I'll move a bunch of this into some generic helpers for myself, or I'll use Akavache, or I'll try another Polly Cache Policy implemented farther up the call stack! Thanks for reading my ramblings!
UPDATE: Be sure to read the comments below AND my response in Part 2.
Sponsor: SparkPost’s cloud email APIs and C# library make it easy for you to add email messaging to your .NET applications and help ensure your messages reach your user’s inbox on time. Get a free developer account and a head start on your integration today!
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
I believe Scott is using SemaphoreSlim to lock asynchronously.
Currently await cannot be used in the body of a lock statement and there are similar synchronization concerns with Mutex.
The SemaphoreSlim is static because only one thread should be updating the cache at a time, while multiple threads can read. Another thread could still try to update the cache right after the first one finishes, which is mitigated by the double-checked locking.
I've also now requested that a TryGetOrCreate() method gets added to provide a nice way to abort without throwing exceptions.
As suggested, GetOrCreate (or more appropriate for this use case, GetOrCreateAsync) should handle the synchronization for you.
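A sketch of that suggestion, with one caveat worth calling out: GetOrCreateAsync doesn't actually lock, so two simultaneous cache misses can each run the factory - which is exactly the race the SemaphoreSlim above guards against:

```csharp
// IMemoryCache.GetOrCreateAsync collapses the check/lock/set dance into
// one call. Note: it does NOT synchronize callers - two concurrent
// misses may both invoke the factory, so a semaphore still has a job
// if the fetch is expensive enough to protect.
public Task<List<Show>> GetShows()
{
    return _cache.GetOrCreateAsync("showsList", async entry =>
    {
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(4);
        return await _client.GetShows();
    });
}
```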
Let's go back to the main reason that I'm here: code, clean code. Is this a good example for young (and old) developers? It will take me ~1 week to forgive you.