Abstract
In many application areas, complex data sets are often represented by some metric space, and metric embedding is used to provide a more structured representation of the data. In many of these applications much greater emphasis is put on preserving the local structure of the original space than on maintaining its complete structure. This is also the case in some networking applications where “small world” phenomena in communication patterns have been observed. Practical study of embedding has indeed focused on finding embeddings with this property. In this paper we initiate the study of local embeddings of metric spaces and provide embeddings with distortion depending solely on the local structure of the space.
Notes
Consider an equilateral metric (where all distances are 1); a strong local embedding imposes constraints on all pairs, thus a standard volume argument suggests that the required dimension is \(\varOmega(\log_{\alpha} n)\) for distortion α.
This function ϑ indeed satisfies (1) because \(\int_{1}^{\infty}\vartheta(x)dx = -\epsilon\hat {c}\cdot [(\log^{(t)}x)^{-\epsilon}]_{1}^{\infty}= \epsilon \hat{c}\).
A metric space (X,d) is λ-doubling if for any x∈X and r>0, B(x,2r) can be covered by λ balls of radius r.
Unlike the previous embedding, this embedding may have arbitrary expansion.
Properly colored means that the end points of every directed edge are colored by different colors.
The total degree of a vertex in a directed graph is the number of edges touching the vertex.
The actual bound on the dimension is in fact \(O((\log k)\cdot(\ln(2\min\{p,1/\epsilon\})/\epsilon)^{2}\cdot\max\{1,1/(\epsilon p)\})\), which is \(O(\epsilon^{-3}\log k\cdot p^{-1}\log p)\) for p=O(1/ϵ), and \(O(\epsilon^{-2}\log(1/\epsilon)\log k)\) for p=Ω(1/ϵ).
The actual bound on the dimension is in fact \(O((\log\lambda)\cdot(\ln(2\min\{p,1/\epsilon\}))^{2}\cdot 1/\epsilon\cdot\max\{1,1/(\epsilon p)\})\), which is \(O(\epsilon^{-2}\log\lambda\cdot p^{-1}\log^{2} p)\) for p=O(1/ϵ), and \(O(\epsilon^{-1}\log^{2}(1/\epsilon)\log\lambda)\) for p=Ω(1/ϵ).
Since d and \(d_i\) differ by at most \(\varDelta_i/n\), in fact the padding parameter will be η−1/n. We shall ignore this minor change in what follows.
Recall that this is a collection of \(2^{i}\)-bounded partitions \(P_i\), for all integers i, such that if i<j then \(P_i\) is a refinement of \(P_j\).
If \(y\notin B(x,r_k(x))\), then the query will return value at least \(r_k(x)/O(t)\).
References
Abraham, I., Bartal, Y., Chan, T.-H.H., Dhamdhere, K., Gupta, A., Kleinberg, J.M., Neiman, O., Slivkins, A.: Metric embeddings with relaxed guarantees. In: Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, FOCS’05, Washington, DC, USA, pp. 83–100. IEEE Comput. Soc., Los Alamitos (2005)
Abraham, I., Bartal, Y., Neiman, O.: Advances in metric embedding theory. In: STOC’06: Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing, New York, NY, USA, pp. 271–286. ACM, New York (2006)
Abraham, I., Bartal, Y., Neiman, O.: Embedding metrics into ultrametrics and graphs into spanning trees with constant average distortion. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA’07, Philadelphia, PA, USA, pp. 502–511. Society for Industrial and Applied Mathematics, Philadelphia (2007)
Abraham, I., Bartal, Y., Neiman, O.: Local embeddings of metric spaces. In: Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, STOC’07, New York, NY, USA, pp. 631–640. ACM, New York (2007)
Abraham, I., Bartal, Y., Neiman, O.: Embedding metric spaces in their intrinsic dimension. In: Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA’08, Philadelphia, PA, USA, pp. 363–372. Society for Industrial and Applied Mathematics, Philadelphia (2008)
Abraham, I., Bartal, Y., Neiman, O.: Nearly tight low stretch spanning trees. In: FOCS’08: Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science, Washington, DC, USA, pp. 781–790. IEEE Comput. Soc., Los Alamitos (2008)
Abraham, I., Bartal, Y., Neiman, O.: On low dimensional local embeddings. In: SODA’09: Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, Philadelphia, PA, USA, pp. 875–884. Society for Industrial and Applied Mathematics, Philadelphia (2009)
Alon, N.: Problems and results in extremal combinatorics. I. Discrete Math. 273, 31–53 (2003)
Alon, N., Karp, R.M., Peleg, D., West, D.: A graph-theoretic game and its application to the k-server problem. SIAM J. Comput. 24(1), 78–100 (1995)
Awerbuch, B., Peleg, D.: Sparse partitions. In: Proceedings of the 31st IEEE Symposium on Foundations of Computer Science (FOCS), pp. 503–513 (1990)
Bansal, N., Buchbinder, N., Madry, A., Naor, J.: A polylogarithmic-competitive algorithm for the k-server problem. In: 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS), pp. 267–276 (2011)
Bartal, Y.: Probabilistic approximation of metric spaces and its algorithmic applications. In: 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), pp. 184–193. IEEE Comput. Soc., Los Alamitos (1996)
Bartal, Y.: On approximating arbitrary metrics by tree metrics. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pp. 183–193 (1998)
Bartal, Y.: Graph decomposition lemmas and their role in metric embedding methods. In: 12th Annual European Symposium on Algorithms, pp. 89–97 (2004)
Bartal, Y.: Metric Ramsey decompositions and their applications. Manuscript (2007)
Bartal, Y., Mendel, M.: Dimension reduction for ultrametrics. In: SODA’04: Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 664–665. Society for Industrial and Applied Mathematics, Philadelphia (2004)
Bartal, Y., Blum, A., Burch, C., Tomkins, A.: A polylog(n)-competitive algorithm for metrical task systems. In: STOC’97: Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, pp. 711–719. ACM, New York (1997)
Bartal, Y., Linial, N., Mendel, M., Naor, A.: Low dimensional embeddings of ultrametrics. Eur. J. Comb. 25, 87–92 (2004)
Bartal, Y., Linial, N., Mendel, M., Naor, A.: On metric Ramsey-type phenomena. Ann. Math. 162(2), 643–709 (2005)
Bartal, Y., Recht, B., Schulman, L.J.: Dimensionality reduction: beyond the Johnson-Lindenstrauss bound. In: SODA’11: Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (2011)
Beck, J.: An algorithmic approach to the Lovász local lemma. I. Random Struct. Algorithms 2, 343–366 (1991)
Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
Bollobás, B.: Extremal Graph Theory. Academic Press/Harcourt Brace Jovanovich, London (1978)
Bourgain, J.: On Lipschitz embedding of finite metric spaces in Hilbert space. Isr. J. Math. 52(1–2), 46–52 (1985)
Calinescu, G., Karloff, H., Rabani, Y.: Approximation algorithms for the 0-extension problem. In: Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA’01, Philadelphia, PA, USA, pp. 8–16. Society for Industrial and Applied Mathematics, Philadelphia (2001)
Charikar, M., Makarychev, K., Makarychev, Y.: Local global tradeoffs in metric embeddings. SIAM J. Comput. 39, 2487–2512 (2010)
Costa, M., Castro, M., Rowstron, A.I.T., Key, P.B.: PIC: practical Internet coordinates for distance estimation. In: 24th International Conference on Distributed Computing Systems, pp. 178–187 (2004)
Elkin, M., Emek, Y., Spielman, D.A., Teng, S.-H.: Lower-stretch spanning trees. In: STOC’05: Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, New York, NY, USA, pp. 494–503. ACM, New York (2005)
Fakcharoenphol, J., Talwar, K.: An improved decomposition theorem for graphs excluding a fixed minor. In: RANDOM-APPROX, pp. 36–46 (2003)
Fakcharoenphol, J., Rao, S., Talwar, K.: A tight bound on approximating arbitrary metrics by tree metrics. In: STOC’03: Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing, pp. 448–455. ACM, New York (2003)
Gilbert, E.N.: A comparison of signalling alphabets. Bell Syst. Tech. J. 31, 504–522 (1952)
Gottlieb, L.-A., Krauthgamer, R.: A nonlinear approach to dimension reduction. In: Proceedings of the Twenty Second Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 888–899 (2011)
Gupta, A., Krauthgamer, R., Lee, J.R.: Bounded geometries, fractals, and low-distortion embeddings. In: FOCS’03: Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, Washington, DC, USA, p. 534. IEEE Comput. Soc., Los Alamitos (2003)
Har-Peled, S., Mendel, M.: Fast construction of nets in low-dimensional metrics and their applications. SIAM J. Comput. 35(5), 1148–1184 (2006)
Indyk, P.: Algorithmic applications of low-distortion geometric embeddings. In: Proceedings of the 42nd Annual Symposium on Foundations of Computer Science, pp. 10–33 (2001)
Indyk, P., Naor, A.: Nearest-neighbor-preserving embeddings. ACM Trans. Algorithms 3(3), 31 (2007)
Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. In: Conference in Modern Analysis and Probability (New Haven, Conn., 1982), pp. 189–206. Amer. Math. Soc., Providence (1984)
Klein, P., Plotkin, S.A., Rao, S.: Excluded minors, network decomposition, and multicommodity flow. In: Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, STOC’93, New York, NY, USA, pp. 682–690. ACM, New York (1993)
Kleinberg, J.: The small-world phenomenon: an algorithm perspective. In: STOC’00: Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, New York, NY, USA, pp. 163–170. ACM, New York (2000)
Kleinberg, J.M., Slivkins, A., Wexler, T.: Triangulation and embedding using small sets of beacons. In: FOCS, pp. 444–453 (2004)
Krauthgamer, R., Lee, J.R., Mendel, M., Naor, A.: Measured descent: a new embedding method for finite metrics. In: 45th Annual IEEE Symposium on Foundations of Computer Science, October 2004, pp. 434–443. IEEE Press, New York (2004)
Liben-Nowell, D., Novak, J., Kumar, R., Raghavan, P., Tomkins, A.: Geographic routing in social networks. Proc. Natl. Acad. Sci. USA 102, 11623–11628 (2005)
Matoušek, J.: Note on bi-Lipschitz embeddings into low-dimensional Euclidean spaces. Comment. Math. Univ. Carol. 31, 589–600 (1990)
Matoušek, J.: On embedding expanders into \(\ell_p\) spaces. Isr. J. Math. 102, 189–197 (1997)
Mendel, M., Naor, A.: Euclidean quotients of finite metric spaces. Adv. Math. 189(2), 451–494 (2004)
Mendel, M., Naor, A.: Ramsey partitions and proximity data structures. In: Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, Washington, DC, USA, pp. 109–118. IEEE Comput. Soc., Los Alamitos (2006)
Moser, R.A.: A constructive proof of the Lovász local lemma. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC’09, New York, NY, USA, pp. 343–350. ACM, New York (2009)
Moser, R.A., Tardos, G.: A constructive proof of the general Lovász local lemma. J. ACM 57(2), 1–15 (2010)
Peleg, D.: Distributed Computing: A Locality-Sensitive Approach. Monographs on Discrete Mathematics and Applications. SIAM, Philadelphia (2000)
Rao, S.: Small distortion and volume preserving embeddings for planar and Euclidean metrics. In: Proceedings of the Fifteenth Annual Symposium on Computational Geometry, New York, pp. 300–306. ACM, New York (1999)
Schechtman, G., Shraibman, A.: Lower bounds for local versions of dimension reductions. Discrete Comput. Geom. 41, 273–283 (2009)
Xiao, L., Sun, J., Boyd, S.: A duality view of spectral methods for dimensionality reduction. In: ICML’06: Proceedings of the 23rd International Conference on Machine Learning, New York, NY, USA, pp. 1041–1048. ACM, New York (2006)
Additional information
Y. Bartal is supported in part by a grant from the Israeli Science Foundation (1609/11).
O. Neiman is supported by the Israeli Science Foundation grant number (523/12) and by the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 303809. Most of the research on this paper was carried out while the author was a student at the Hebrew University and was supported in part by a grant from the Israeli Science Foundation (195/02).
This paper is a full version based on the conference papers [4, 7].
Appendix: Proof of Lemma 5
The following decomposition lemma was shown in [2].
Lemma 49
(Probabilistic Decomposition)
For any metric space (X,d), a subset Z⊆X, a point v∈X, real parameters χ≥2,Δ>0, let r be a random variable sampled from a truncated exponential density function with parameter λ=8ln(χ)/Δ
If S=B(v,r)∩Z and \({\bar{S}}= Z \setminus S\) then for any \(\theta\in[\chi^{-1},1)\) and any x∈Z:
where \(\eta=2^{-4}\ln(1/\theta)/\ln\chi\).
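As a rough illustration of the random choice in Lemma 49, the following minimal sketch samples a radius by inverse-CDF sampling and cuts Z by the resulting ball. It assumes the density is proportional to \(e^{-\lambda r}\) and truncated to the interval [Δ/4,Δ/2] (the interval named later in the proof); the function names `truncated_exp_radius` and `ball_cut`, and the representation of the metric as a Python callable `dist`, are hypothetical choices made only for this example.

```python
import math
import random


def truncated_exp_radius(delta, chi, rng=random):
    """Sample r in [delta/4, delta/2] with density proportional to exp(-lam * r).

    lam = 8 ln(chi) / delta is the parameter stated in the lemma; the
    truncation interval is an assumption of this sketch.
    """
    if chi < 2 or delta <= 0:
        raise ValueError("the lemma assumes chi >= 2 and delta > 0")
    lam = 8.0 * math.log(chi) / delta
    lo, hi = delta / 4.0, delta / 2.0
    # Mass of the untruncated tail kept by the interval, relative to its start:
    # 1 - exp(-lam * (hi - lo)) = 1 - chi**(-2).
    span = 1.0 - math.exp(-lam * (hi - lo))
    u = rng.random()  # uniform in [0, 1)
    # Invert F(r) = (1 - exp(-lam * (r - lo))) / span.
    return lo - math.log(1.0 - u * span) / lam


def ball_cut(points, dist, v, r):
    """Split the point set Z into S = B(v, r) ∩ Z and its complement in Z."""
    inside = [z for z in points if dist(v, z) <= r]
    outside = [z for z in points if dist(v, z) > r]
    return inside, outside
```

Smaller radii are more likely than larger ones under this density, which is what makes the probability of cutting a small ball around x decay with the padding parameter.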
We are now ready to prove Lemma 5. By sub-partition we mean a partition \(\{C_i\}_i\) lacking the requirement that \(\bigcup_i C_i=X\). The intuition behind the construction is that we perform the partition of [2] as long as the local growth rate is small enough. Once the growth rate is large with respect to the decomposability parameter, we assign all the points that were not covered by the first partition to clusters generated by the probabilistic partition known to exist from Definition 10. This is done in two phases:
- Phase 1:
Define the sub-partition \(P_1\) of X into clusters by generating a sequence of clusters: \(C_1,C_2,\ldots,C_s\), for some s∈[n]. Notice that we are generating a distribution over sub-partitions and therefore the generated clusters are random variables. First we deterministically assign centers \(v_1,v_2,\ldots,v_s\) and parameters \(\chi_1,\chi_2,\ldots,\chi_s\). Let \(W_1=X\) and j=1. Conduct the following iterative process:
1. Let \(v_j\in W_j\) be the point minimizing \(\hat{\chi}_{j}=\rho(x,2\varDelta ,\gamma)\) over all \(x\in W_j\).
2. If \(2^{6}\ln(\hat{\chi}_{j})>\tau^{-1}\) set s=j−1 and stop.
3. Set \(\chi_{j} = \max\{2/\hat{\delta}^{1/4},\hat{\chi}_{j}\}\).
4. Let \(W_{j+1}=W_j\setminus B(v_j,\varDelta/4)\).
5. Set j=j+1. If \(W_j\neq\emptyset\) return to (1).
Now the algorithm for the partition and functions ξ,η is as follows: Let \(Z_1=X\). For j=1,2,3,…,s:
1. Let \((S_{v_{j}},{\bar{S}}_{v_{j}})\) be the partition created by invoking Lemma 49 on \(Z_j\) with center \(v=v_j\) and parameter \(\chi=\chi_j\).
2. Set \(C_{j}=S_{v_{j}}\), \(Z_{j+1}={\bar{S}}_{v_{j}}\).
3. For all \(x\in C_j\) let \(\eta_{P}(x)=2^{-7}/\max \{\ln\hat{\chi}_{j},\ln(1/\hat{\delta})\}\). If \(\hat{\chi}_{j}\ge 1/\hat{\delta}\) set \(\xi_P(x)=1\), otherwise set \(\xi_P(x)=0\).
Fix some \(\hat{\delta}\le\delta\le1\). Let \(\theta=\delta^{1/4}\). Note that \(\theta\geq2\chi_{j}^{-1}\) for all j∈[s] as required. Recall that \(\eta_j=2^{-4}\ln(1/\theta)/\ln\chi_j=2^{-6}\ln(1/\delta)/\ln\chi_j\), where the second equality uses \(\ln(1/\theta)=\tfrac{1}{4}\ln(1/\delta)\) (it is easy to verify that \(\eta_P(x)\cdot\ln(1/\delta)\le\eta_j\)). Observe that some clusters may be empty and that it is not necessarily the case that \(v_m\in C_m\).
- Phase 2:
In this phase we assign any points left unassigned from phase 1. Let \(P'_{2}=\{D_{1},D_{2},\ldots,D_{t}\}\) be a Δ-bounded probabilistic partition of X, such that for all δ≤1 satisfying \(\ln(1/\delta)\le2^{6}\tau^{-1}\), \(P'_{2}\) is (τ⋅ln(1/δ),δ)-padded; this probabilistic partition exists by Definition 10. Let \(Z=\bigcup_{i=1}^{s}C_{i}\) and \(\bar{Z}=X\setminus Z\) (the unassigned points), then let \(P_{2}=\{D_{1}\cap\bar{Z},D_{2}\cap\bar{Z},\ldots,D_{t}\cap\bar{Z}\}\). For all \(x\in\bar{Z}\) let \(\eta_P(x)=\tau/2\) and \(\xi_P(x)=1\). It can be checked that \(\eta_{P}^{(\delta)}(x)\le\eta_{j}\) for all j∈[s]. Notice that by the stop condition of phase 1, \(\tau\le2^{-6}/\ln\hat{\chi}_{j}\) for all j∈[s]. Since by definition \(\tau\le 2^{-6}/\ln(1/\hat{\delta})\) as well, it follows that for all \(x\in\bar{Z}\) and j∈[s], \(\eta_P(x)\cdot\ln(1/\delta)\le\eta_j\).
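The deterministic part of Phase 1 (choosing the centers and the parameters χ) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the construction as implemented by the authors: it takes the local growth rate to be \(\rho(x,r,\gamma)=|B(x,r\gamma)|/|B(x,r/\gamma)|\) (its definition is not reproduced in this appendix), uses closed balls, and represents the finite metric by a list `points` and a callable `dist`. The names `ball_size`, `growth_rate`, `select_centers` and `scale` (standing for Δ) are hypothetical.

```python
import math


def ball_size(points, dist, x, r):
    """Number of points of X in the closed ball B(x, r)."""
    return sum(1 for y in points if dist(x, y) <= r)


def growth_rate(points, dist, x, r, gamma):
    """Assumed local growth rate rho(x, r, gamma) = |B(x, r*gamma)| / |B(x, r/gamma)|."""
    return ball_size(points, dist, x, r * gamma) / ball_size(points, dist, x, r / gamma)


def select_centers(points, dist, scale, gamma, tau, delta_hat):
    """Return the list [(v_j, chi_j)] produced by steps 1-5 of Phase 1."""
    remaining = list(points)  # W_1 = X
    centers = []
    while remaining:  # step 5: loop while W_j is non-empty
        # Step 1: pick the point of W_j with the smallest growth rate.
        v = min(remaining, key=lambda x: growth_rate(points, dist, x, 2 * scale, gamma))
        chi_hat = growth_rate(points, dist, v, 2 * scale, gamma)
        # Step 2: stop once the growth rate is large relative to tau.
        if 2 ** 6 * math.log(chi_hat) > 1.0 / tau:
            break
        # Step 3: chi_j = max{2 / delta_hat^(1/4), chi_hat_j}.
        chi = max(2.0 / delta_hat ** 0.25, chi_hat)
        centers.append((v, chi))
        # Step 4: W_{j+1} = W_j \ B(v_j, scale / 4).
        remaining = [x for x in remaining if dist(v, x) > scale / 4.0]
    return centers
```

Because each chosen center removes its closed ball of radius Δ/4 from the candidate set, any two centers returned by this loop are at distance greater than Δ/4, which is the separation used later in the proof.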
Define \(P=P_1\cup P_2\). We now prove the properties in the lemma for some x∈X. First consider the sub-partition \(P_1\), and the distribution over the clusters \(C_1,C_2,\ldots,C_s\) as defined above. For 1≤m≤s, define the events:
Also let \(T=T_x=B(x,\varDelta)\). We prove the following inductive claim: For every 1≤m≤s:
Note that \(\Pr[ \mathcal{E}_{s} ]=0\). Assume the claim holds for m+1; we prove it for m. Define the events:
First we bound \(\Pr[ \mathcal{F}_{m} ]\). Recall that the center \(v_m\) of \(C_m\) and the value of \(\chi_m\) are determined deterministically. The radius \(r_m\) is chosen from the interval [Δ/4,Δ/2]. Since \(\eta_m\le1/2\), if \(B(x,\eta_{m} \varDelta ) \bowtie (S_{v_{m}},{\bar{S}}_{v_{m}})\) then \(d(v_m,x)\le\varDelta\), and thus \(v_m\in T\). Therefore if \(v_m\notin T\) then \(\Pr[ \mathcal{F}_{m} ] = 0\). Otherwise, we shall use Lemma 49 (note that conditioning on \(\mathcal{Z}_{m}\) only determines \(Z_m\); all the other parameters are determined deterministically),
Since the choice of radius is the only randomness in the process of creating \(P_1\), the event of padding for z∈Z, and the event \(B(z,\eta_P(z)\varDelta)\cap Z=\emptyset\) for \(z\in\bar{Z}\), are independent of all choices of radii for centers \(v_j\notin T_z\). That is, for any assignment to clusters of points outside B(z,2Δ) (which may determine radius choices for points in X∖B(x,Δ)), the padding probability will not be affected. Using the induction hypothesis we prove the inductive claim:
The second inequality follows from (24) and the induction hypothesis. Fix some x∈X, \(T=T_x\). Observe that for all \(v_j\in T\), \(d(v_j,x)\le\varDelta\), and so we get \(B(v_j,2\varDelta/\gamma)\subseteq B(x,2\varDelta)\). On the other hand \(B(v_j,2\gamma\varDelta)\supseteq B(x,2\varDelta)\). Note that the definition of \(W_j\) implies that if \(v_j\) is a center then no other point in \(B(v_j,\varDelta/4)\) can be a center as well; therefore for any j≠j′, \(d(v_j,v_{j'})>\varDelta/4\ge4\varDelta/\gamma\), so that \(B(v_j,2\varDelta/\gamma)\cap B(v_{j'},2\varDelta/\gamma)=\emptyset\). Hence, we get:
We conclude from (23) with m=1 that
Hence there is probability at least \(\delta^{1/2}\) that event \(\mathcal{E}_{1}\) does not occur. Given that this happens, we will show that there is probability at least \(\delta^{1/2}\) that x is padded. If x∈Z, then let j∈[s] be such that \(P(x)=C_j\); then \(\eta_P(x)\cdot\ln(1/\delta)\le\eta_j\) and so \(B(x,\eta_P(x)\cdot\ln(1/\delta)\varDelta)\subseteq B(x,\eta_j\varDelta)\). Note that if x∈Z is padded in \(P_1\) it will be padded in P. If x∉Z: since for any j∈[s], \(\eta_P(x)\cdot\ln(1/\delta)\le\eta_j\), we have that \(\neg\mathcal{E}_{1}\) implies that \(B(x,\eta_P(x)\cdot\ln(1/\delta)\varDelta)\cap Z=\emptyset\). As \(P_2\) is performed independently of \(P_1\) we have \(\Pr[B(x,(\tau/2)\ln(1/\delta))\subseteq P_2(x)]\ge\delta^{1/2}\), hence
It follows that \(\hat{\mathcal{P}}\) is uniformly padded. Finally, we show the properties stated in the lemma. The first property follows from the stop condition in phase 1 and from the definition of \(\eta_P(x)\). The second property holds: first take x∈Z and let j be such that \(x\in C_j\); then \(\xi_P(x)=1\) implies that \(\hat{\chi}_{j}\ge 1/\hat{\delta}\), hence \(\eta_{P}(x)=2^{-7}/\ln\hat{\chi}_{j}= 2^{-7}/\ln\rho(v_{j},2\varDelta ,\gamma)\), and by the minimality of \(v_j\), \(\eta_P(x)\ge2^{-7}/\ln\rho(x,2\varDelta,\gamma)\). By definition \(\eta_{P}(x)\le2^{-7}/\ln(1/\hat{\delta})\). If x∉Z then \(\eta_P(x)=\tau/2\), and by the stop condition of phase 1 \(\tau/2\ge2^{-7}/\ln\hat{\chi}_{j}\). Again by definition of \(\hat{\delta}\) it follows that \(\tau/2\le2^{-7}/\ln(1/\hat{\delta})\). As for the third property, which is meaningful only for x∈Z, let j be such that \(x\in C_j\); then \(\xi_P(x)=0\) implies that \(\hat{\chi}_{j}< 1/\hat{\delta}\), hence \(\eta_{P}(x)=2^{-7}/\ln(1/\hat{\delta})\), and since \(d(x,v_j)\le\varDelta\) also \(\bar{\rho}(x,2\varDelta ,\gamma)\le \rho(v_{j},2\varDelta ,\gamma) < 1/\hat{\delta}\).
Cite this article
Abraham, I., Bartal, Y. & Neiman, O. Local Embeddings of Metric Spaces. Algorithmica 72, 539–606 (2015). https://doi.org/10.1007/s00453-013-9864-2
DOI: https://doi.org/10.1007/s00453-013-9864-2