The document discusses how genomics and DNA sequencing have been transformed by exponential growth in data and computing power, similar to Moore's law. It describes how Hadoop and related technologies make it possible to scale out the storage and analysis of genomic data across clusters, so that secondary analysis of DNA sequencing data, such as variant calling, can be performed more efficiently.
Hadoop and Genomics - What you need to know - Cambridge - Sanger Center and EBI
Why is MapR the best Hadoop platform for data warehouse optimization? Business-critical applications require data protection and security (availability, data protection, and recovery), high performance (with a random read-write system), multi-tenancy (to support multiple business units, isolate applications or user data, …), good resource and workload management to support multiple applications, and open standards to integrate with the rest of the IT ecosystem.
You also need a platform capable of very fast data ingestion from multiple sources that can drive critical analytics and decisions at speed (in milliseconds) and at scale. Examples include breach detection based on information from multiple sources, fraud detection across millions of transactions based on individual patterns, and fleet management and routing that takes current conditions into account. This requires a Hadoop platform that goes beyond batch and supports streaming writes, so data can be written to the system continuously while analysis is under way. It also requires high performance to meet business needs and support real-time operations: the ability to perform online database operations so you can react to a business situation and influence it as it happens, rather than report on it a week, month, or quarter later.
Data agility is needed for business agility. Drill provides instant ANSI SQL for Hadoop and NoSQL. You can explore data in its native format without expensive and time-consuming transformation. You can analyze evolving and semi-structured or nested data from NoSQL databases, find what is of value, and only then model it in your data warehouse schema for downstream ad hoc reporting by hundreds or thousands of concurrent users.
MapR Quick Start Solutions are a set of purpose-built solutions for the most critical and valuable Hadoop use cases. These solutions, which include pre-built templates for each of the areas listed below, let you get started with Hadoop quickly and achieve faster time-to-value. We currently have offers around data warehouse optimization (DWO), security log analytics, and recommendation engines, with more planned for 2015.