3. Preliminaries
• Formal definition of Dominates (≺)
Given a set of d-dimensional points $T$,
we say that a point $t_1 \in T$ DOMINATES another point $t_2 \in T$
if and only if
$\forall i \in \{1, 2, \dots, d\}:\ t_1[i] \le t_2[i]$ and
$\exists j \in \{1, 2, \dots, d\}:\ t_1[j] < t_2[j]$,
denoted by $t_1 \prec t_2$
(simply put, $t_1$ is trivially preferred)
Definition from http://www.comp.nus.edu.sg/~atung/publication/k_dominant.pdf
Note that
the meaning of ‘dominates’ may differ
according to the type of application
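The dominance test can be sketched in Python as follows. This is a minimal illustration, assuming points are equal-length tuples of numbers and that smaller values are preferred in every dimension; the function name `dominates` is chosen here for illustration and does not come from the cited paper.

```python
from typing import Sequence


def dominates(t1: Sequence[float], t2: Sequence[float]) -> bool:
    """Return True if t1 dominates t2 (assumption: smaller is better)."""
    if len(t1) != len(t2):
        raise ValueError("points must have the same dimensionality")
    # t1 must be no worse than t2 in every dimension ...
    no_worse = all(a <= b for a, b in zip(t1, t2))
    # ... and strictly better in at least one dimension.
    strictly_better = any(a < b for a, b in zip(t1, t2))
    return no_worse and strictly_better


print(dominates((1, 2), (2, 2)))  # True
print(dominates((1, 3), (2, 2)))  # False: worse in the second dimension
print(dominates((2, 2), (2, 2)))  # False: a point never dominates itself
```

If an application prefers larger values in some dimension, one option is to negate that coordinate before calling the function.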
4. Formal Definition (Skyline)
• The Skyline operator
Input: a set of objects $P = \{p_1, p_2, \dots, p_N\}$
$\mathrm{SKYLINE}(P) = \{\, p_i \mid p_i \in P \text{ and } \nexists\, p^* \in P \text{ s.t. } p^* \prec p_i \,\}$
[Figure: points A to G plotted on the x axis and y axis, with Dominating Area(B) marked]
Common misconceptions
“B ∈ Output since B ≺ C, D, F”: wrong
“B ∈ Output since no other point p ∈ P satisfies p ≺ B”: correct
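The difference between the two statements can be checked with a naive skyline computation. The sketch below assumes smaller values are better on both axes; the coordinates for A to G are hypothetical values chosen for illustration and are not read from the slide's figure.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def dominates(t1: Point, t2: Point) -> bool:
    """t1 dominates t2: no worse everywhere, strictly better somewhere."""
    return all(a <= b for a, b in zip(t1, t2)) and any(a < b for a, b in zip(t1, t2))


def skyline(points: Dict[str, Point]) -> List[str]:
    """Names of all points that no other point dominates (naive pairwise scan)."""
    return [
        name
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    ]


# Hypothetical coordinates (assumption: smaller is better on both axes).
points = {
    "A": (1, 9), "B": (3, 5), "C": (5, 7), "D": (6, 6),
    "E": (7, 2), "F": (8, 8), "G": (9, 1),
}

print(skyline(points))  # ['A', 'B', 'E', 'G']
# C dominates F, yet C is not in the output, because B dominates C.
print(dominates(points["C"], points["F"]))  # True
print(dominates(points["B"], points["C"]))  # True
```

With these values, C dominates another point but is still excluded, which is why "dominates something" is not the membership criterion.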
8. Related Work: Summary
• Worst-case Analysis (2.1)
Worst-case complexity on arbitrary data distributions
$\Omega(n \log n)$ [16], $O\!\left(\frac{N}{B} \log_{M/B}^{d-2} \frac{N}{B}\right)$ [12]
• Elimination Category (2.2)
Average Complexity with dimensional independence
Idea: Eliminate non-skyline objects quickly!
BNL[7], SFS[9], LESS[12], …
O(dnm)[20], where 𝑚 is the skyline cardinality
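The elimination idea can be sketched as a sort-then-filter pass. This is a simplified in-memory illustration in the spirit of sort-based approaches, not a reproduction of BNL, SFS, or LESS as published; presorting by coordinate sum and the name `skyline_sorted_filter` are assumptions made for this sketch, and smaller values are assumed to be better.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(t1: Point, t2: Point) -> bool:
    """t1 dominates t2: no worse everywhere, strictly better somewhere."""
    return all(a <= b for a, b in zip(t1, t2)) and any(a < b for a, b in zip(t1, t2))


def skyline_sorted_filter(points: Sequence[Point]) -> List[Point]:
    """Skyline via presorting, then filtering against the skyline found so far."""
    window: List[Point] = []
    # A dominating point always has a strictly smaller coordinate sum, so after
    # sorting by sum no later point can dominate an earlier one.
    for p in sorted(points, key=sum):
        # Each point is compared with at most m skyline points, d values each.
        if not any(dominates(s, p) for s in window):
            window.append(p)
    return window


data = [(5, 7), (3, 5), (8, 8), (1, 9), (7, 2), (6, 6), (9, 1)]
print(skyline_sorted_filter(data))  # [(3, 5), (7, 2), (1, 9), (9, 1)]
```

After the sort, each of the n points is tested against a window of at most m skyline points with d comparisons per test, which is where a bound of the form O(dnm) comes from.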
10. Anti-Correlated (2)
• A relationship in which
the value in one dimension increases as the values in the other
dimensions decrease
• Skyline queries
are used to find a set of non-dominated data points
for Multi-Criteria Decision Making
• Data in the real world
is more likely to be anti-correlated
11. Anti-Correlated (3)
• Anti-correlation significantly limits the practical
usage of the existing algorithms
• and creates demand for effective mathematical
models and efficient algorithms on anti-correlated data
O(dnm)[20], where 𝑚 is the skyline cardinality
𝑚 tends to increase on anti-correlated distributions
As 𝑚 approaches 𝑛, these existing algorithms fall back to O(dn²)
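The effect on the skyline cardinality 𝑚 can be seen in two extreme, deterministic 2-D cases. The sketch below is only an illustration under stated assumptions (smaller is better, integer points placed exactly on the anti-diagonal or the diagonal); it is not the distribution model proposed in the paper.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def dominates(t1: Point, t2: Point) -> bool:
    """t1 dominates t2: no worse everywhere, strictly better somewhere."""
    return all(a <= b for a, b in zip(t1, t2)) and any(a < b for a, b in zip(t1, t2))


def skyline_size(points: List[Point]) -> int:
    """Skyline cardinality m, computed by a naive pairwise scan."""
    return sum(1 for p in points if not any(dominates(q, p) for q in points))


n = 100
# Perfectly anti-correlated: x increases exactly as y decreases.
anti_correlated = [(i, n - 1 - i) for i in range(n)]
# Perfectly correlated: x and y increase together.
correlated = [(i, i) for i in range(n)]

print(skyline_size(anti_correlated))  # 100: every point is in the skyline, m = n
print(skyline_size(correlated))       # 1: (0, 0) dominates all other points
```

When m equals n, as in the first case, a cost of the form O(dnm) becomes O(dn²).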
13. Contribution
• 1) A general model for the anti-correlated distribution
• 2) Polynomial estimation of the lower bound of the
expected value of the skyline cardinality
• 3) A “Determination and Elimination Framework” for
efficient computation of the skyline on anti-correlated
distributions
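20. Covariance
• In probability theory and statistics, covariance
is a value that indicates how strongly two random variables are related
• If one of the two variables tends to increase
when the other increases, the covariance
is positive
• Conversely, if one of the two variables tends to decrease
when the other increases, the covariance
is negative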
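The sign behaviour described above can be checked with a small computation. This sketch assumes the population form of covariance (dividing by the number of samples); the function name `covariance` and the sample values are chosen here for illustration.

```python
from typing import Sequence


def covariance(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Population covariance: mean product of deviations from the two means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


xs = [1, 2, 3, 4]
print(covariance(xs, [2, 4, 6, 8]))  # 2.5: both variables rise together
print(covariance(xs, [8, 6, 4, 2]))  # -2.5: one rises while the other falls
```

The second call is the anti-correlated case: the covariance is negative because one variable falls as the other rises.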