twitter: All content tagged as twitter in NoSQL databases and polyglot persistence
Thursday, 27 June 2013
Hadoop Map Reduce jobs run time data and stats with Twitter's hRaven
Announced on the first day of the Hadoop Summit, hRaven is a new open source tool from the data team at Twitter that collects run time data and helps analyse the usage of a Hadoop cluster:
hRaven collects run time data and statistics from map reduce jobs running on Hadoop clusters and stores the collected job history in an easily queryable format. For the jobs that are run through frameworks (Pig or Scalding/Cascading) that decompose a script or application into a DAG of map reduce jobs for actual execution, hRaven groups job history data together by an application construct. This allows for easier visualization of all of the component jobs’ execution for an application and more comprehensive trending and analysis over time.
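To make the grouping idea concrete, here is a minimal sketch of collecting per-job records under the application that spawned them. It does not use hRaven's API or storage schema: the record fields (`job_id`, `app_id`, `run_time_ms`), the sample values, and the helper names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical job history records. Field names and values are illustrative
# assumptions, not hRaven's actual schema.
JOBS = [
    {"job_id": "job_001", "app_id": "daily_report.pig", "run_time_ms": 120_000},
    {"job_id": "job_002", "app_id": "daily_report.pig", "run_time_ms": 45_000},
    {"job_id": "job_003", "app_id": "user_counts.scala", "run_time_ms": 30_000},
]


def group_by_application(jobs):
    """Group job records by the application (script) that produced them."""
    grouped = defaultdict(list)
    for job in jobs:
        grouped[job["app_id"]].append(job)
    return dict(grouped)


def total_run_time_ms(jobs):
    """Sum the run time of all component jobs of one application."""
    return sum(job["run_time_ms"] for job in jobs)


if __name__ == "__main__":
    # Print one summary line per application: name, job count, total run time.
    for app_id, app_jobs in group_by_application(JOBS).items():
        print(app_id, len(app_jobs), total_run_time_ms(app_jobs))
```

Once jobs are keyed by application, per-application totals such as the summed run time above can be tracked from one run to the next, which is the kind of trending the project description refers to.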