Introduction to Basic Haskell Components (In Chinese)
In 2012 we held the first Chinese functional programming meetup in Taipei, covering general functional programming techniques. I gave this talk to introduce several type classes from the well-known Typeclassopedia article.
Here are the answers to your questions:
1. The main differences between a trait and an abstract class in Scala are:
- Traits can be mixed into classes using with, while abstract classes can only be extended.
- A class can mix in any number of traits but extend only one class, so traits are Scala's route to multiple inheritance.
- Abstract classes can take constructor parameters; traits cannot (in Scala 2). Both can declare fields and concrete members.
- When combining parents, every parent after the first must be a trait; the single class parent (abstract or not) must come first.
2. abstract class Animal {
     def isMammal: Boolean                // abstract member: subclasses must implement
     def isFriendly: Boolean = true       // concrete default, can be overridden
     def summarize: Unit = {
       println("Characteristics of animal:")
     }
   }
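To make the first two points concrete, here is a minimal sketch of mixing a trait into a subclass of the abstract class above (the Swimmer and Dolphin names are our own illustration, not from the original answer):

trait Swimmer {
  def swim(): Unit = println("Swimming")   // concrete default implementation
}

class Dolphin extends Animal with Swimmer {
  def isMammal: Boolean = true             // implement the abstract member
}

new Dolphin().swim()                       // mixed-in behavior is available directly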
This document provides an introduction to Apache Spark, including its core components, architecture, and programming model. Some key points:
- Spark uses Resilient Distributed Datasets (RDDs) as its fundamental data structure, which are immutable distributed collections that allow in-memory computing across a cluster.
- RDDs support transformations like map, filter, reduce, and actions like collect that return results. Transformations are lazy while actions trigger computation.
- Spark's execution model involves a driver program that coordinates tasks on worker nodes using an optimized scheduler.
- Spark SQL, MLlib, GraphX, and Spark Streaming extend the core Spark API for structured data, machine learning, graph processing, and stream processing, respectively.
I used these slides for a Scala workshop that I gave. They are based on these: http://www.scala-lang.org/node/4454. Thanks to Alf Kristian Støyle and Fredrik Vraalsen for sharing!
- The document discusses a presentation given by Jongwook Woo on introducing Spark and its uses for big data analysis. It includes information on Woo's background and experience with big data, an overview of Spark and its components like RDDs and task scheduling, and examples of using Spark for different types of data analysis and use cases.
This document provides an overview of a machine learning workshop including tutorials on decision tree classification for flight delays, clustering news articles with k-means clustering, and collaborative filtering for movie recommendations using Spark. The tutorials demonstrate loading and preparing data, training models, evaluating performance, and making predictions or recommendations. They use Spark MLlib and are run in Apache Zeppelin notebooks.
This Hadoop HDFS tutorial unravels the Hadoop Distributed File System: HDFS internals, architecture, commands, and components (NameNode and Secondary NameNode). MapReduce and practical examples of HDFS applications are showcased as well. By the end, you will have a solid grasp of HDFS basics.
Session Agenda:
✓ Introduction to BIG Data & Hadoop
✓ HDFS Internals - Name Node & Secondary Node
✓ MapReduce Architecture & Components
✓ MapReduce Dataflows
----------
What is HDFS? - Introduction to HDFS
The Hadoop Distributed File System provides high-performance access to data across Hadoop clusters. It forms the crux of the entire Hadoop framework.
----------
What are HDFS Internals?
HDFS Internals are:
1. NameNode – the master node through which all data access goes; it holds the directory tree and file metadata. When a data file has to be located and manipulated, it is resolved via the NameNode.
2. Secondary NameNode – despite the name, not a slave that stores data; it periodically checkpoints the NameNode's metadata. The actual data blocks are stored on the DataNodes (the slave nodes).
----------
What is MapReduce? - Introduction to MapReduce
MapReduce is a programming framework for distributed processing of large data sets on commodity computing clusters. It is based on the principle of parallel data processing: data is broken into smaller blocks rather than processed as a single block, which makes the solution faster and more scalable. MapReduce jobs are typically written in Java.
----------
What are HDFS Applications?
1. Data Mining
2. Document Indexing
3. Business Intelligence
4. Predictive Modelling
5. Hypothesis Testing
----------
Skillspeed is a live e-learning company focusing on high-technology courses. We provide live, instructor-led training in Big Data & Hadoop featuring real-time projects, 24/7 lifetime support, and 100% placement assistance.
Email: sales@skillspeed.com
Website: https://www.skillspeed.com
Functional Programming for OO Programmers (part 2)
Code examples demonstrating Functional Programming concepts, with JavaScript and Haskell.
Part 1 can be found here - http://www.slideshare.net/calvinchengx/functional-programming-part01
Source code can be found here - http://github.com/calvinchengx/learnhaskell
Let me know if you spot any errors! Thank you! :-)
The document provides an agenda for a DevOps advanced class on Spark being held in June 2015. The class will cover topics such as RDD fundamentals, Spark runtime architecture, memory and persistence, Spark SQL, PySpark, and Spark Streaming. It will include labs on DevOps 101 and 102. The instructor has over 5 years of experience providing Big Data consulting and training, including over 100 classes taught.
This document provides an overview of Hadoop and related big data technologies. It begins with defining big data and discussing why traditional systems are inadequate. It then introduces Hadoop as a framework for distributed storage and processing of large datasets. The key components of Hadoop - HDFS for storage and MapReduce for processing - are described at a high level. HDFS architecture and read/write operations are outlined. MapReduce paradigm and an example word count job are also summarized. Finally, Hive is introduced as a data warehouse tool built on Hadoop that provides SQL-like queries for large datasets.
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
The document outlines an agenda for a conference on Apache Spark and data science, including sessions on Spark's capabilities and direction, using DataFrames in PySpark, linear regression, text analysis, classification, clustering, and recommendation engines using Spark MLlib. Breakout sessions are scheduled between many of the technical sessions to allow for hands-on work and discussion.
This document outlines steps for developing analytic applications using Apache Spark and Python. It covers prerequisites for accessing flight and weather data, deploying a simple data pipe tool to build training, test, and blind datasets, and using an IPython notebook to train predictive models on flight delay data. The agenda includes accessing necessary services on Bluemix, preparing the data, training models in the notebook, evaluating model accuracy, and deploying models.
Advanced Data Science on Spark (Reza Zadeh, Stanford)
The document provides an overview of Spark and its machine learning library MLlib. It discusses how Spark uses resilient distributed datasets (RDDs) to perform distributed computing tasks across clusters in a fault-tolerant manner. It summarizes the key capabilities of MLlib, including its support for common machine learning algorithms and how MLlib can be used together with other Spark components like Spark Streaming, GraphX, and SQL. The document also briefly discusses future directions for MLlib, such as tighter integration with DataFrames and new optimization methods.
Functional Programming for OO Programmers (part 1)
The Why and Benefits of Functional Programming paradigm. Part 2 with source code can be found here: http://www.slideshare.net/calvinchengx/functional-programming-for-oo-programmers-part-2
Related source code https://github.com/calvinchengx/learnhaskell
22. Quick Sort
Implicit conversion

type Segment = (List[Int], List[Int], List[Int])

implicit class ListWithPartition(list: List[Int]) {
  def partitionBy(p: Int): Segment = {
    val idenElem = (List[Int](), List[Int](), List[Int]())
    def partition(result: Segment, x: Int): Segment = {
      val (left, mid, right) = result
      if (x < p) (x :: left, mid, right)
      else if (x == p) (left, x :: mid, right)
      else (left, mid, x :: right)
    }
    list.foldLeft(idenElem)(partition)
  }
}
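The slide shows only the partitioning step; a minimal quicksort sketch on top of it (our addition, not from the original deck) would be:

def quickSort(xs: List[Int]): List[Int] = xs match {
  case Nil => Nil
  case pivot :: _ =>
    val (left, mid, right) = xs.partitionBy(pivot)   // uses the implicit class above
    quickSort(left) ::: mid ::: quickSort(right)
}

quickSort(List(3, 1, 4, 1, 5, 9, 2, 6))   // List(1, 1, 2, 3, 4, 5, 6, 9)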
23. Side Effects
Value vs. reference

class Pair[A](var x: A, var y: A) {
  def modifyX(x: A) = this.x = x
  def modifyY(y: A) = this.y = y
}

var pair = new Pair(1, 2)
var pair1 = new Pair(pair, pair)              // both fields alias the same object
var pair2 = new Pair(pair, new Pair(1, 2))    // second field is an independent object
pair.modifyX(3)
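A quick check of what the mutation did (our addition): the aliased pair sees the change, the independent copy does not.

println(pair1.x.x)   // 3 — pair1.x aliases pair, so the mutation is visible
println(pair2.y.x)   // 1 — pair2.y was a fresh Pair, unaffected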
34. Lazy Evaluation

lazy val x = 3 + 3
def number = { println("OK"); 3 + 3 }

class LazyValue(expr: => Int) {        // call-by-name parameter: not evaluated at construction
  var evaluated: Boolean = false
  var value: Int = -1
  def get: Int = {
    if (!evaluated) {                  // evaluate at most once, then cache
      value = expr
      evaluated = true
    }
    value
  }
}

Call by name

val lazyValue = new LazyValue(number)
println(lazyValue.get)   // prints "OK", then 6
println(lazyValue.get)   // prints only 6 — the result is cached
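For contrast (our addition): a plain def re-evaluates on every call, while lazy val and LazyValue evaluate at most once.

println(number)   // prints "OK", then 6
println(number)   // prints "OK" again — a def re-evaluates every call
println(x)        // 6 — the lazy val is evaluated once, on first access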
Thinking in Java: a Map can be implemented with the decorator pattern.
35. Higher-Order Functions

Transformations:
map(f: T => U): A[U]
filter(f: T => Boolean): A[T]
flatMap(f: T => A[T]): A[T]
groupBy(f: T => K): A[(K, List[T])]
sortBy(f: T => K): A[T]

Actions:
count: Int
force: A[T]
reduce(f: (T, T) => T): T
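On ordinary Scala collections (a sketch of ours, reading A[T] as List[T]), the same split applies: transformations return another collection, actions force a single result.

val xs = List(3, 1, 4, 1, 5)
val grouped = xs.groupBy(_ % 2)   // transformation: Map(1 -> List(3, 1, 1, 5), 0 -> List(4))
val total   = xs.reduce(_ + _)    // action: 14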
37. Some Syntactic Sugar

class Sugar(i: Int) {
  def unary_- = -i                                    // prefix: -sugar
  def apply(expr: => Unit) = for (j <- 1 to i) expr   // sugar(...) — method name omitted
  def +(that: Int) = i + that                         // infix
  def +:(that: Int) = i + that                        // ends in ':' — right-associative
}

The goal is to support DSLs and to stay in the functional style. Use with care.

val sugar = new Sugar(2)
-sugar                    // prefix
sugar(println("aha"))     // apply: prints "aha" twice
sugar + 5                 // infix: 7
5 +: sugar                // right-associative: calls sugar.+:(5)

Operator precedence, lowest to highest:
  all letters
  |
  ^
  &
  < >
  = !
  :   (note: operators ending in ':' are right-associative)
  + -
  * / %
  all other special characters
39. Trait & Mix-in

Mix-ins are a means of multiple inheritance. Like interfaces, they keep the complexity of multiple inheritance in check by restricting what the second and later parents may be, but unlike interfaces they can carry default implementations.
1. Ordinary inheritance is single inheritance.
2. The second and any further parents must be traits.
3. A trait cannot be instantiated on its own.
In Scala, traits can be mixed in at compile time (when a class is declared) or at runtime (when an instance is created).
Suppose we want to describe a bird that can sing and can run; being a bird, it can of course fly.
abstract class Bird(kind: String) {
  val name: String
  def singMyName = println(s"$name is singing")
  val capability: Int
  def run = println(s"I can run $capability meters!!!")
  def fly = println(s"flying of kind: $kind")
}

But obviously a person can also run and sing… and he can program, too.
(Not that I have anything against birds, but if you ever meet one that can program, please let me know.)
Inheritance
40. trait Runnable {
  val capability: Int
  def run = println(s"I can run $capability meters!!!")
}

trait Singer {
  val name: String
  def singMyName = println(s"$name is singing")
}

abstract class Bird(kind: String) {
  def fly = println(s"flying of kind: $kind")
}

Inheritance
41. class Nightingale extends Bird("Nightingale") with Singer with Runnable {
  val capability = 20
  val name = "poly"
}

val myTinyBird = new Nightingale
myTinyBird.fly
myTinyBird.singMyName
myTinyBird.run

class Coder(language: String) {
  val capability = 10
  val name = "Handemelindo"
  def code = println(s"coding in $language")
}

val me = new Coder("Scala") with Runnable with Singer   // mixing in at instance creation
me.code
me.singMyName
me.run

Inheritance
43. Some Companions
Case classes and ADTs

abstract class Tree
case class Leaf(info: String) extends Tree
case class Node(left: Tree, right: Tree) extends Tree

def traverse(tree: Tree): Unit = {
  tree match {
    case Leaf(info) => println(info)
    case Node(left, right) =>
      traverse(left)
      traverse(right)
  }
}

val tree: Tree = Node(Node(Leaf("1"), Leaf("2")), Leaf("3"))
traverse(tree)
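One refinement worth knowing (our addition, not on the slide): marking the root of the ADT sealed makes the compiler check match exhaustiveness.

sealed abstract class Tree   // all subclasses must live in this file;
                             // a non-exhaustive match now triggers a compiler warning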
45. Kind, Type, Value

[Diagram: three levels connected by arrows. Values: 1, (1, 2), [1, 2, 3]. Types: Int, Pair[Int, Int], and List[Int] are proper types of kind *; Any sits at the top of the subtype hierarchy; List and Pair are type constructors of kind * => * and * => * => *. Kinds classify types the way types classify values. Reference: "Generics of a Higher Kind", Martin Odersky.]
46. Let's do some abstraction exercises

type Int :: *
type String :: *
type (Int => String) :: *
type List[Int] :: *
type List :: ?
type Function1 :: ??

Answers:
type List :: * => *
type Function1 :: * => * => *          // Function1[-T, +R] takes two proper types

The value level and the type level mirror each other:
def id(x: Int) = x                     // value-level identity
type Id[A] = A                         // type-level identity
def id(f: Int => Int, x: Int) = f(x)   // value-level application
type id[A[_], B] = A[B]                // type-level application
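Kinds like (* => *) => * appear as soon as we abstract over containers; a minimal sketch (our addition) is the classic Functor:

trait Functor[F[_]] {                          // F :: * => *, so Functor :: (* => *) => *
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

object ListFunctor extends Functor[List] {
  def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
}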
47. Let's do some abstraction exercises

Suppose our program needs to return a result of the shape:
(Set(x, x, x, x, x), List(x, x, x, x, x, x, x, x, x, x))

(* -> *) -> (* -> *) -> *

type Pair[K[_], V[_]] = (K[A], V[A]) forSome { type A }

val pair: Pair[Set, List] = (Set("42"), List(52))   // does not type-check: A must be the same on both sides
val pair: Pair[Set, List] = (Set(42), List(52))     // OK: A = Int
49. trait Monoid[A] {
  val zero: A
  def append(x: A, y: A): A
}

object IntNum extends Monoid[Int] {
  val zero = 0
  def append(x: Int, y: Int) = x + y
}

object DoubleNum extends Monoid[Double] {
  val zero = 0d
  def append(x: Double, y: Double) = x + y
}

def sum[A](nums: List[A])(tc: Monoid[A]) =
  nums.foldLeft(tc.zero)(tc.append)

sum(List(1, 2, 3, 5, 8, 13))(IntNum)
sum(List(3.14, 1.68, 2.72))(DoubleNum)

Abstracting over morphisms
50. trait Monoid[A] {
  val zero: A
  def append(x: A, y: A): A
}

implicit object IntNum extends Monoid[Int] {
  val zero = 0
  def append(x: Int, y: Int) = x + y
}

implicit object DoubleNum extends Monoid[Double] {
  val zero = 0d
  def append(x: Double, y: Double) = x + y
}

def sum[A](nums: List[A])(implicit tc: Monoid[A]) =
  nums.foldLeft(tc.zero)(tc.append)

sum(List(1, 2, 3, 5, 8, 13))    // IntNum found implicitly
sum(List(3.14, 1.68, 2.72))     // DoubleNum found implicitly

Type Class
1. Separates the abstraction from the data
2. Composable
3. Overridable
4. Type-safe

val list = List(1, 3, 234, 56, 5346, 34)
list.sorted   // def sorted[B >: A](implicit ord: math.Ordering[B]) — Ordering is itself a type class
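To see the "composable / overridable" points concretely, a minimal sketch (our addition, assuming the Monoid trait above) adds an instance for a new type without touching sum:

implicit object StringConcat extends Monoid[String] {
  val zero = ""
  def append(x: String, y: String) = x + y
}

sum(List("type", " ", "class"))   // "type class"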
51. Contravariance and Covariance
List[+T]

class Person(name: String) {
  def shut = println(s"I am $name")
}

class Coder(language: String, name: String) extends Person(name) {
  def code = println(s"Coding in $language")
}

val persons: List[Coder] = List(new Coder("Java", "Jeff"),
                                new Coder("Haskell", "Harry"))

def traverse(persons: List[Person]) = persons.foreach(_.shut)

traverse(persons)   // OK: List is covariant, so List[Coder] <: List[Person]
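The slide demonstrates covariance only; a minimal contravariance sketch (our addition) uses Function1[-T, +R]:

val greetPerson: Person => Unit = p => p.shut
val greetCoder: Coder => Unit = greetPerson   // OK: functions are contravariant in their argument type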
59. Monad
A monoid in the category of endofunctors.

Recall the identity element of a monoid.
Recall the fold function.
What is the identity on endofunctors?
What is the associative operation on endofunctors?

unit x >>= f  ≡  f x
m >>= unit    ≡  m
(m >>= f) >>= g  ≡  m >>= (λx. f x >>= g)

Identity: lifts a value into the computational context.
Associativity: composes simple computations into complex ones.
60. Some Common Monads
Option

Option, also called Maybe, represents a computation that may fail.
A value is either Some(value) or None.

Some(x) flatMap (f: A => Option[B]) = f(x)
None    flatMap (f: A => Option[B]) = None
unit = Some

val maybe: Option[Int] = Some(4)
val none: Option[Int] = None

def calculate(maybe: Option[Int]): Option[Int] = for {
  value <- maybe
} yield value + 5

calculate(maybe)   // Some(9)
calculate(none)    // None
61. Some Common Monads
List

A concrete list type is itself a proper type; as a monad it represents nondeterminism.
unit = List

val list1 = List(2, 4, 6, 8)
val list2 = List(1, 3, 5, 7)

for {
  value1 <- list1
  value2 <- list2
} yield value1 + value2
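The for-comprehension above desugars to nested flatMap/map calls (our addition):

list1.flatMap(value1 => list2.map(value2 => value1 + value2))
// List(3, 5, 7, 9, 5, 7, 9, 11, 7, 9, 11, 13, 9, 11, 13, 15)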
69. RDD Operations

Transformations:
map(f: T => U)
filter(f: T => Boolean)
flatMap(f: T => Seq[U])
sample(fraction: Float)
groupByKey()
reduceByKey(f: (V, V) => V)
mapValues(f: V => W)
union()
join()
cogroup()
crossProduct()
sort(c: Comparator[K])
partitionBy(p: Partitioner[K])

Actions:
count()
collect()
reduce(f: (T, T) => T)
lookup(k: K)
save(path: String)
take(n: Int)
70. Word Count

[(K1, V1)] -> [(K2, [V2])] -> [(K2, V3)]

val lines = spark.textFile("hdfs://...")
val words = lines.flatMap(_.split("\\s+"))
val wordCounts = words.map((_, 1))
val result = wordCounts.reduceByKey(_ + _)
result.save("hdfs://...")
79. MLlib: Classification

SVM with SGD, Naive Bayes, and various decision trees.
Input: LabeledPoint(label: Double, features: Vector)

val data = sc.textFile("...")
val parsedData = data.map { line =>
  val parts = line.split(' ')
  LabeledPoint(parts(0).toDouble, parts.tail.map(x => x.toDouble).toArray)
}
val numIterations = 20
val model = SVMWithSGD.train(parsedData, numIterations)

val labelAndPreds = parsedData.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}
val trainErr = labelAndPreds.filter(r => r._1 != r._2).count.toDouble / parsedData.count
80. MLlib: Regression

Logistic regression, ridge regression, and lasso regression.
Input: LabeledPoint(label: Double, features: Vector)

val data = sc.textFile("...")
val parsedData = data.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, parts(1).split(' ').map(x => x.toDouble).toArray)
}
val numIterations = 20
val model = LinearRegressionWithSGD.train(parsedData, numIterations)

val valuesAndPreds = parsedData.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}
val MSE = valuesAndPreds.map { case (v, p) =>
  math.pow((v - p), 2) }.reduce(_ + _) / valuesAndPreds.count
81. MLlib: Clustering

k-means and its variant k-means++.
Input: Vector

val data = sc.textFile("...")
val parsedData = data.map(_.split(' ').map(_.toDouble))
val numIterations = 20
val numClusters = 2
val clusters = KMeans.train(parsedData, numClusters, numIterations)
val WSSSE = clusters.computeCost(parsedData)   // within-set sum of squared errors
82. MLlib: Collaborative Filtering

ALS, supporting both explicit and implicit feedback.
Input: Rating(user: Int, product: Int, rating: Double)

val data = sc.textFile("...")
val ratings = data.map(_.split(',') match {
  case Array(user, item, rate) => Rating(user.toInt, item.toInt, rate.toDouble)
})
val numIterations = 20
val model = ALS.train(ratings, 1, numIterations, 0.01)   // rank = 1, lambda = 0.01

val usersProducts = ratings.map { case Rating(user, product, rate) => (user, product) }
val predictions = model.predict(usersProducts).map {
  case Rating(user, product, rate) => ((user, product), rate)
}
val ratesAndPreds = ratings.map {
  case Rating(user, product, rate) => ((user, product), rate)
}.join(predictions)
val MSE = ratesAndPreds.map {
  case ((user, product), (r1, r2)) => math.pow((r1 - r2), 2)
}.reduce(_ + _) / ratesAndPreds.count