Berlin Buzzwords 2017 talk: A look at what our storage models, metaphors and APIs are, showing how we need to rethink the POSIX APIs to work with object stores, while looking at different alternatives for local NVM.
This is the unabridged talk; the BBuzz talk was 20 minutes including demo and questions, so it had about half as many slides.
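import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Bind to the filesystem client for the destination (here the S3A connector).
val work = new Path("s3a://stevel-frankfurt/work")
val fs = work.getFileSystem(new Configuration())
// The task writes its output under its own subdirectory.
val task00 = new Path(work, "task00")
fs.mkdirs(task00)
val out = fs.create(new Path(task00, "part-00"), false) // false: do not overwrite
out.writeChars("hello")
out.close()
// Commit: rename each task output file into the destination directory.
fs.listStatus(task00).foreach(stat =>
  fs.rename(stat.getPath, work)
)
// The committed file should now be visible in the destination.
val statuses = fs.listStatus(work).filter(_.isFile)
require("part-00" == statuses(0).getPath.getName)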
Facebook's Prineville photo store is probably at the far end of the spectrum.
Zermelo–Fraenkel set theory (with axiom of choice) and a relational algebra. Or, as it is known: SQL
Relational Set theory
Everything uses the Hadoop FS API to talk to HDFS, Hadoop Compatible Filesystems and object stores alike. There are actually two APIs: the one with a clean split between client side and "driver side", and the older one, which is a direct connect. Most code uses the latter, and in terms of opportunities for tweaking object store integration, it is the one where we can innovate most easily. That is: there's nothing in the way.
Under the FS API go filesystems and object stores.
HDFS is a "real" filesystem; WASB/Azure is close enough. What is "real"? Best test: it can support HBase.
This is how we commit work in Hadoop FileOutputFormat, and so, transitively, how Spark does it too. (Hive does some other things, which I'm ignoring, but they are more manifest-file driven.)
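Rename-based commit relies on rename being a cheap metadata operation. Object stores such as S3 have no rename call, so a connector has to emulate it by copying and deleting objects one by one. The following is a minimal sketch of that cost using a toy in-memory store; `ToyObjectStore`, `renamePrefix` and the counters are hypothetical names for illustration and are not part of the Hadoop or S3A APIs.

```scala
import scala.collection.mutable

// Toy in-memory object store: a flat map of key -> bytes, no real directories.
// Hypothetical model for illustration only; this is not the S3A implementation.
final class ToyObjectStore {
  private val objects = mutable.TreeMap.empty[String, Array[Byte]]
  var copies = 0  // per-object copy operations issued
  var deletes = 0 // per-object delete operations issued

  def put(key: String, data: Array[Byte]): Unit = objects.update(key, data)

  // Materialise the matching keys before any mutation happens.
  def list(prefix: String): List[String] =
    objects.keysIterator.filter(_.startsWith(prefix)).toList

  // Emulated "directory rename": copy each object under the prefix, then delete
  // the source. A failure part-way through leaves a mix of old and new keys.
  def renamePrefix(src: String, dest: String): Unit =
    list(src).foreach { key =>
      val target = dest + key.stripPrefix(src)
      objects.update(target, objects(key).clone())
      copies += 1
      objects.remove(key)
      deletes += 1
    }
}

object CommitByRenameDemo {
  // Returns (keys under work/ after the commit, copies issued, deletes issued).
  def run(): (List[String], Int, Int) = {
    val store = new ToyObjectStore
    store.put("work/task00/part-00", "hello".getBytes("UTF-8"))
    store.put("work/task00/part-01", "world".getBytes("UTF-8"))
    store.renamePrefix("work/task00/", "work/")
    (store.list("work/"), store.copies, store.deletes)
  }
}
```

In this model the work done by the commit grows with the number (and, in a real store, the size) of the files being renamed, and the intermediate states are visible to other readers.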
This is my rough guess at a C-level operation against mmapped data today. Ignoring the open/sync stuff, the writes in the middle are the operations we need to worry about, as they update the data structures in situ. Non-atomically.
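As a JVM analogue of that C-level sequence, here is a minimal sketch using `java.nio` memory mapping in place of `mmap`/`msync`. The 12-byte record layout (an `Int` count followed by a `Long` total) is an assumption made up for the example.

```scala
import java.nio.channels.FileChannel
import java.nio.file.{Files, StandardOpenOption}

object MmapUpdateSketch {
  // Assumed (hypothetical) record layout: bytes 0..3 = count: Int, bytes 4..11 = total: Long.
  def run(): (Int, Long) = {
    val path = Files.createTempFile("mmap-sketch", ".bin")
    path.toFile.deleteOnExit()
    Files.write(path, new Array[Byte](12)) // start from a zeroed record
    val channel = FileChannel.open(path, StandardOpenOption.READ, StandardOpenOption.WRITE)
    try {
      // "open" phase: map the record into memory.
      val buf = channel.map(FileChannel.MapMode.READ_WRITE, 0, 12)
      // The writes in the middle: two separate in-place updates.
      buf.putInt(0, buf.getInt(0) + 1)
      // A crash at this point would leave count updated and total stale.
      buf.putLong(4, buf.getLong(4) + 42L)
      // "sync" phase: ask the OS to flush the mapped region to storage.
      buf.force()
      (buf.getInt(0), buf.getLong(4))
    } finally {
      channel.close()
    }
  }
}
```

Nothing in the sequence groups the two updates together, so the stored record can be observed, or recovered after a failure, with only one of them applied.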
Work like RAMCloud has led the way here; go look at the papers. But that targeted RAM, not NVM, where we have the persistence problem.
Some form of log-structured filesystem (LFS) model tends to be used, which ties in well with raw SSD (I don't know about other technologies, but I do know that non-raw SSD doesn't really suit LFS).
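To make the log-structured idea concrete, here is a toy sketch: updates are only ever appended, and an index points at the latest record for each key. `ToyLog` and its methods are hypothetical names, and the log lives in an in-memory buffer rather than on SSD or NVM.

```scala
import scala.collection.mutable
import scala.collection.mutable.ArrayBuffer

// Toy log-structured key/value store (illustrative only).
final class ToyLog {
  private val log = ArrayBuffer.empty[(String, String)] // append-only record log
  private val index = mutable.Map.empty[String, Int]    // key -> position of latest record

  // An update never overwrites in place: it appends, then repoints the index.
  def put(key: String, value: String): Unit = {
    log += ((key, value))
    index.update(key, log.length - 1)
  }

  def get(key: String): Option[String] = index.get(key).map(i => log(i)._2)

  def logLength: Int = log.length

  // Recovery: the index is derived state and can be rebuilt by replaying the log.
  def rebuildIndex(): Unit = {
    index.clear()
    log.zipWithIndex.foreach { case ((k, _), i) => index.update(k, i) }
  }
}
```

Because old records are never modified, a partially completed write can only affect the tail of the log; stale records accumulate and would need a cleaner, which this sketch omits.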
If you look at code, we generally mix persisted (and read back) data with transient data; load/save isn't so much serializing out data structures as marshalling the persistent parts to a form we think will be robust over time, then unmarshalling them later (exception: Java Serialization, which is notoriously brittle and insecure).
What apps work best here? Could you extend Spark to make RDDs persistent and shared, so you can persist them simply by copying to part of cluster memory, e.g. with an "NVMOutputFormat"?
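A small sketch of that split between persistent and transient state, assuming a hypothetical `Session` class and a hand-rolled, versioned binary format (the field names and the version number are made up for the example):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

// Hypothetical class mixing persistent state (name, visits) with transient state (greeting).
final class Session(val name: String, var visits: Int) {
  // Transient: derived on demand, never written out.
  lazy val greeting: String = s"hello, $name"
}

object SessionCodec {
  private val Version = 1 // assumed format version, checked on load

  // Marshal only the persistent fields, in an explicit order.
  def marshal(s: Session): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    out.writeInt(Version)
    out.writeUTF(s.name)
    out.writeInt(s.visits)
    out.flush()
    bytes.toByteArray
  }

  // Unmarshal into a fresh object; transient state is rebuilt lazily.
  def unmarshal(data: Array[Byte]): Session = {
    val in = new DataInputStream(new ByteArrayInputStream(data))
    val version = in.readInt()
    require(version == Version, s"unsupported format version $version")
    val name = in.readUTF()
    val visits = in.readInt()
    new Session(name, visits)
  }
}
```

The stored form is decoupled from the in-memory layout, which is the property that gets lost if data structures are simply left in place in persistent memory.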