LARGE-SCALE ENTITY RESOLUTION FOR SEMANTIC WEB DATA INTEGRATION

Gustavo de Assis Costa 1,2, José Maria Parente de Oliveira 1
1 Divisão de Ciência da Computação, Instituto Tecnológico de Aeronáutica, Brazil
2 Departamento de Informática, Instituto Federal de Educação, Ciência e Tecnologia de Goiás - Campus Jataí, Brazil

ABSTRACT

Despite all the advances, one of the main challenges for the consolidation of the Web of Data is data integration, a key aspect of semantic web data management. Most solutions rely on entity resolution, the task of identifying and linking different manifestations of the same real-world object in one or more datasets. However, data are usually incomplete, inconsistent and contain outliers; to overcome these limitations it is necessary to exploit the patterns that exist in the data as much as possible. One way to go beyond the commonly used technique of pair-wise matching is to explore the relationship structure between entities. Moreover, with billions of RDF triples being published on the Web, scale has become a problem, posing new challenges. Only recently have some works started to consider strategies that can deal with entity resolution in large-scale datasets. In this paper we describe a Map-Reduce strategy for a relational learning approach that addresses the problem through a statistical approximation method based on a linear algebra technique. We parallelized all steps of the approach. Preliminary experiments show that our strategy scales well on real-world semantic datasets, maintaining the effectiveness of the results even as the amount of processed data increases.

KEYWORDS

Entity resolution, Semantic Web, Linked Data, Semantic Web Data Integration, Map-Reduce, Relational Learning.

1. INTRODUCTION

The amount of data published in the Web of Data has grown considerably, as has its diversity, creating a graph of global dimensions formed by billions of RDF triples that represent information from different areas of knowledge. RDF datasets usually exhibit the same issues found in relational databases, such as outliers, duplication, inconsistency and schema heterogeneity. These restrictions are a great hindrance to the effective integration and sharing of Linked Data and, consequently, to Semantic Web data management. Yet the integration of datasets is one of the key points of LOD (Linked Open Data, http://linkeddata.org/). Furthermore, different and isolated descriptions of the same real-world entity exist in different datasets, and to obtain the full semantics of all this information it is necessary to provide ways of predicting new links between them.

The task of entity relationship prediction, referred to in this work as Entity Resolution (ER), has been the focus of many works, since entities are key elements of data representation. Also known as Record Linkage (Elmagarmid et al., 2007), De-duplication, Co-reference Resolution and Instance Matching, among others, it has been recognized as an important issue in the Semantic Web and Linked Data research community (Getoor and Machanavajjhala, 2012; Nickel et al., 2012; Yilmaz et al., 2011). Our main objective is to identify sameAs links between datasets, e.g., connecting data about the Brazilian poet Cora Coralina in DBpedia, Freebase, etc. In this way, the larger amount of information that can be obtained about the poet can jointly compose any query result. In this context, we can highlight some important challenges.
The first challenge is how to deal with semi-structured data. Different semantic description structures can be employed to refer to the same element, for example in the description of entities of the same type. Beyond naturally different namespaces, each vocabulary can employ different descriptions to refer to the same structural type. This reflects the classical problem of ontology alignment, which aims to find mappings between different schemas.

A second challenge is noise in the data. There are various problems related to the literal descriptions of data, and many different string-matching metrics have been proposed to overcome them. Even taking into account situations such as character suppression and variation in word radicals, among others, existing metrics still cannot resolve problems such as attributes without a value.

In this paper we focus on what we consider the third challenge: scale, more specifically the large scale of the Web of Data. The LOD cloud nowadays contains about 60 billion RDF triples, and this number is still growing rapidly. Handling this amount of information requires techniques that can distribute or parallelize the computation, e.g., the Map-Reduce paradigm. However, only a few recent works have started to consider strategies that can deal with entity resolution on large-scale RDF datasets, and as far as we know no specific work has dealt with large-scale semantic data in a collective entity resolution framework.

In (Costa and Oliveira, 2014) we treated these issues with a relational learning approach that addresses the problem through a statistical approximation method based on a linear algebra technique. Machine learning algorithms are typically robust to both noise and data inconsistencies and are able to efficiently exploit nondeterministic dependencies in the data. Furthermore, a relational machine learning framework allows exploiting contextual information that might be distant in the relational graph, enabling the joint use of relationship patterns between entities and evidence of similarity between their descriptions, which can improve the effectiveness of the results. However, despite good results on some large datasets, that approach still cannot handle real-world large-scale datasets because of evident memory and processing limitations. Thus, in this paper we perform a preliminary analysis of (Costa and Oliveira, 2014) in a large-scale context of real-world RDF datasets, using data from BTC 2012 (Billion Triples Challenge) (Harth, 2012).

To perform the distributed computation of the proposed approach we consider each of its steps separately. The first step consists of pre-processing the RDF datasets to extract potential matching pairs, in order to avoid a large increase in the number of comparisons. We do this through a Map-Reduce (Dean and Ghemawat, 2008) version of an inverted index. From the index generated in the distributed file system we perform a chained pair-wise string comparison using metrics such as TF-IDF and Jaro. After that, continuing the chain, we generate an entity-attribute matrix whose entries hold the similarity results and indicate whether an entity has a certain attribute or an infix in its URI description. The tensor generation step is performed jointly with the first step, because both are conceived directly from the RDF data dumps.
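To make the pre-processing concrete, the following Python sketch shows a Hadoop Streaming style mapper and reducer for the inverted index. It assumes N-Triples input with one triple per line; the regular expression, the word-level tokenization and the function names are illustrative simplifications rather than the exact implementation, and stop-word or language-tag filtering is omitted.

import re
import sys

# Pattern for one N-Triples line: <subject> <predicate> object .
TRIPLE_RE = re.compile(r'^(<[^>]+>)\s+(<[^>]+>)\s+(.+?)\s*\.\s*$')

def mapper(stream=sys.stdin):
    # Emit one <token, subject> pair per word of a literal object.
    for line in stream:
        m = TRIPLE_RE.match(line)
        if not m:
            continue
        subject, _predicate, obj = m.groups()
        if obj.startswith('"'):                      # keep literal objects only
            for token in re.findall(r'\w+', obj.lower()):
                print('%s\t%s' % (token, subject))

def reducer(stream=sys.stdin):
    # Input is sorted by token; group the entities that share each key word.
    current, entities = None, set()
    def flush():
        if current is not None and len(entities) > 1:
            print('%s\t%s' % (current, ','.join(sorted(entities))))
    for line in stream:
        token, entity = line.rstrip('\n').split('\t', 1)
        if token != current:
            flush()
            current, entities = token, set()
        entities.add(entity)
    flush()

Each reducer output line lists the entities that share a key word, i.e., the candidate matching instances that are handed to the string-similarity step.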
In this step the relations between entities are modeled as a tensor, and the whole problem is formulated as a coupled matrix and tensor factorization. Finally, the factorization is computed with a Map-Reduce version of the extended RESCAL (Nickel et al., 2011) model, a tensor factorization model for relational learning.

The main contribution of our work is to propose a large-scale solution for semantic web data integration based on a collective entity resolution task supported by a relational learning framework. For machine learning algorithms, more data does mean better results: the more data we have, the better the predictions. Furthermore, the more relationship patterns between entities that can be exploited, the greater the effectiveness of the results.

The remainder of this paper is structured as follows: Section 2 presents the preliminaries on the relational learning approach and the tensor factorization model. Section 3 describes our Map-Reduce strategy for large-scale entity resolution. In Section 4 experimental results are reported. Section 5 summarizes related works and Section 6 concludes.

2. PRELIMINARIES

2.1 Relational Learning Approach

The approach is composed of four steps. The first step consists of extracting literal information for each entity in all datasets. Some literal descriptions are more discriminative than others because, given the triple structure, they can represent the most significant information about each element: 1) Attribute values, corresponding to features of an entity (e.g. name/label, birth date, profession); most approaches exploit these values because of their precision when identifying an entity; 2) URI infix: the URIs of a dataset follow a common pattern, the Prefix-Infix(-Suffix) scheme; 3) Predicate: we use the last token (normalized) of the URI, e.g., "has spouse" for "fb:has_spouse".

Considering two datasets A and B with n and m entities respectively, a brute-force algorithm would perform at least n × m comparisons between instance pairs. This is impractical, especially when dealing with real-world datasets with millions of triples or more. To overcome this problem, a pre-processing step was proposed to obtain the possible matching pairs: an inverted index is built over key words in the instance descriptions to efficiently determine potential candidates, and entities sharing the same keys in the index are considered candidate matching instances.

After literal information extraction, similarity metrics are applied to the candidates to generate the entity-attribute matrix, whose entries hold the similarity results and indicate whether an entity has a certain attribute or an infix in its URI description. In the same way, entity-entity relations from the datasets are mapped to a tensor, with which the coupled factorization is performed.

The factor matrix A computed in this process can be interpreted as an embedding of the entities into a latent-component space that reflects their similarity over all relations in the domain of discourse. In order to retrieve entities that are similar to a particular entity e with respect to all relations in the data, a clustering is computed in the latent-component space. Initially, however, the rows of A are normalized, such that each row represents the normalized participation of the corresponding entity in the latent components; a sketch of this retrieval step follows.
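The sketch below illustrates this retrieval step, assuming the factor matrix A is available as a NumPy array on a single node; the use of k-means, the number of clusters and the function names are illustrative choices rather than values prescribed by the approach.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def cluster_latent_entities(A, n_clusters=100, random_state=0):
    # Row-normalize A so that each row is the entity's normalized participation
    # in the latent components, then group entities by that latent profile.
    A_norm = normalize(A, norm='l2', axis=1)
    labels = KMeans(n_clusters=n_clusters, random_state=random_state).fit_predict(A_norm)
    return A_norm, labels

def most_similar(A_norm, e, top_n=10):
    # Rank all entities by cosine similarity to entity index e in the latent space.
    scores = A_norm @ A_norm[e]
    ranking = np.argsort(-scores)
    return [int(i) for i in ranking if i != e][:top_n]

Since the rows are unit-length after normalization, the dot products in most_similar are cosine similarities, so the ranking directly reflects how close the entities' participation profiles are in the latent components.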
From the feature vectors corresponding to each entity (the matrix rows) it is possible to create clusters of similar entities, since matrix A represents entities by their participation in the latent components. The clustering is thus determined by the entities' similarity evidence in the relational domain.

2.2 Tensor Factorization Model

The key elements of the model are entities and their relations. The entities are given by the set of all resources, classes and blank nodes in the data, while the set of relations consists of all predicates that express relationships between entities. Once these elements have been extracted from the datasets, they are transformed into a tensor representation. A tensor is a multidimensional array. More formally, an N-way or Nth-order tensor is an element of the tensor product of N vector spaces, each of which has its own coordinate system. A third-order tensor has three indices, as shown in Figure 1a. A first-order tensor is a vector, a second-order tensor is a matrix, and tensors of order three or higher are called higher-order tensors (Kolda and Bader, 2009).

Assuming that the relational domain consists of n entities and m relation types, the data is modeled as a three-way tensor X of size n × n × m, where the entries on two modes (dimensions) of the tensor correspond to the combined entities of the domain of discourse and the third mode holds the m different types of relations. A tensor entry X_ijk = 1 denotes the fact that the k-th relation (i-th entity, j-th entity) exists. Otherwise, for non-existing or unknown relations, X_ijk is set to zero.

In RESCAL, learning is performed using the latent components of the model (Fig. 1a). The approach employs a rank-r factorization in which each frontal slice X_k is factored as

    X_k ≈ A R_k A^T,  for k = 1, ..., m    (1)

where A is an n × r matrix containing the latent-component representation of the entities in the domain and R_k is an asymmetric r × r matrix modeling the interactions of the components for the k-th predicate. The rows of the factor matrices A and R_k can be considered latent-variable representations of the entities that explain the observed variables X_ij, the columns can be considered the invented latent features, and the entries of the factor matrices specify how much an entity participates in a latent feature. The factor matrices A and R_k are computed by solving a regularized minimization problem (Nickel et al., 2011) with an alternating least squares algorithm (RESCAL-ALS), which updates A and R_k iteratively until a convergence criterion is met (each update is a regularized least-squares problem). In order to retrieve entities that are similar to a particular entity e with respect to all relations in the data, it is sufficient to compute a ranking of entities by their similarity to e in A.

Fig. 1. a) Illustration of data representation and factorization in the model; b) a tensor coupled with a matrix of attributes.

Since this model assumes that two of the three modes are defined by entities, the process is limited to RDF resources. We therefore use an extension of the model, coupling the entity-attribute matrix with the tensor (Fig. 1b) in order to perform the factorization (Nickel et al., 2012; Yilmaz et al., 2011). If all the literal evidence were included in the tensor, a huge number of entries would be wasted, which would also increase the runtime, since a significantly larger tensor would have to be factorized. The idea is therefore to add the predicate-value pairs to a separate entity-attributes matrix D rather than to the tensor X.
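As an illustration of this construction, the frontal slices X_k can be assembled from the extracted entity-entity triples with SciPy sparse matrices as in the minimal sketch below; the function and the index dictionaries are illustrative helpers, not the exact implementation, and triples with literal objects are assumed to be routed to the matrix D instead.

from scipy.sparse import lil_matrix

def build_tensor_slices(triples, entities, relations):
    # triples: iterable of (subject, predicate, object) restricted to URI objects.
    # Returns one sparse n x n frontal slice X_k per relation type, with
    # X_ijk = 1 for every known fact and 0 elsewhere.
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: k for k, r in enumerate(relations)}
    n = len(entities)
    slices = [lil_matrix((n, n)) for _ in relations]
    for s, p, o in triples:
        if s in e_idx and o in e_idx and p in r_idx:
            slices[r_idx[p]][e_idx[s], e_idx[o]] = 1
    return [X.tocsr() for X in slices]   # CSR is convenient for products such as X_k A R_k^T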
The entity-attributes matrix D is then factorized as

    D ≈ A V    (2)

where A is the entities' latent-component representation from the model and V is an r × l matrix that provides a latent-component representation of the literals. To include this matrix factorization as an additional constraint on A in the tensor factorization of X, the minimization problem has to be adjusted.

Figure 2 depicts an example. The latent-component representations of entities A and B will be similar to each other in this example, as both representations reflect that their corresponding entities are related to the same object (a Wikipedia page) and attribute value. Because of this, and because of their own similarity evidence, C and D will also have similar latent-component representations. Consequently, the latent feature vector of A will yield values similar to those of the latent feature vector of B, and thus the likelihood of a match can be predicted correctly. The attribute values are only taken into account here thanks to the extension of the model. Considering that a_i and a_j denote the i-th and j-th rows of A and thus are the latent-component representations of the i-th and j-th entities, the products

1) a_fb:m.05mwy8^T R_{spouse_s, isMarriedTo} a_fb:m.0pc9q
2) a_fb:m.05mwy8^T R_{spouse_s, isMarriedTo} a_yago:Luiz_Inácio_Lula_da_Silva
3) a_yago:Marisa_Leticia_Lula_da_Silva^T R_{spouse_s, isMarriedTo} a_fb:m.0pc9q
4) a_yago:Marisa_Leticia_Lula_da_Silva^T R_{spouse_s, isMarriedTo} a_yago:Luiz_Inácio_Lula_da_Silva

together with the similarity evidence obtained, will contribute to the likelihood of A and B representing the same real-world entity.

Fig. 2. Illustration with representations of the same real-world entities in Freebase and YAGO. The red line indicates the desired matching.

3. DISTRIBUTED IMPLEMENTATION

Map-Reduce (Dean and Ghemawat, 2008) is a programming model and associated infrastructure that provide automatic and reliable parallelization once a computation task is expressed as a series of Map and Reduce operations. Specifically, the Map function reads a <key, value> pair and emits one or many intermediate <key, value> pairs. The Map-Reduce infrastructure then groups together all values with the same intermediate key and constructs a <key, ValueList> pair, with ValueList containing all values associated with that key. The Reduce function takes a <key, ValueList> pair and emits one or many new <key, value> pairs. Open-source implementations of the Map-Reduce infrastructure are readily available, such as the Apache Hadoop project.

As can be seen in the algorithm below, we represent the map and reduce execution in loops of size P, where P is the number of processing units in the cluster. The distributed model for (Costa and Oliveira, 2014) comprises four steps, depicted in the algorithm:
a) Pre-processing - corresponds to the full inverted-index generation;
b) String similarity - takes the RDF dump and the inverted index as input (see the sketch after this list);
c) Matrix and tensor modeling - collects all the literal evidence and models it as a matrix, jointly with the modeling of a tensor from the RDF datasets;
d) Tensor and matrix coupled factorization - performs all the linear algebra operations of the machine learning algorithm, i.e., among other operations, updating the matrices A and R iteratively until a convergence criterion is met.
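As referenced in step b) above, the following sketch illustrates the string-similarity computation over the candidate pairs that share an inverted-index key. TF-IDF with cosine similarity follows the metrics mentioned in Section 1; difflib.SequenceMatcher stands in here for the Jaro metric, which in practice would come from a dedicated string-matching library, and the threshold and function name are illustrative.

from difflib import SequenceMatcher
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_candidate_pairs(candidate_pairs, literals, threshold=0.8):
    # candidate_pairs: iterable of (entity_a, entity_b) produced by the inverted index;
    # literals: dict mapping each entity URI to its concatenated literal description.
    pairs = list(candidate_pairs)
    entities = sorted({e for pair in pairs for e in pair})
    idx = {e: i for i, e in enumerate(entities)}
    tfidf = TfidfVectorizer().fit_transform([literals[e] for e in entities])
    scores = {}
    for a, b in pairs:
        cos = cosine_similarity(tfidf[idx[a]], tfidf[idx[b]])[0, 0]
        seq = SequenceMatcher(None, literals[a], literals[b]).ratio()
        score = max(cos, seq)
        if score >= threshold:            # only confident pairs are kept
            scores[(a, b)] = score
    return scores

The retained pairs and their scores feed the construction of the entity-attribute matrix D in step c).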
Algorithm 1: Map-Reduce implementation of the approach
Input: datasets composed of RDF triples <subject, predicate, object>
Output: rank-r reconstruction of the slices X_k and the factor matrices A and R. The entries of matrix A are then ranked according to the likelihood that the corresponding entities are similar to some particular entity e in A.

1:  for p = 1, ..., P:
2:      for z = 1, ..., Z/P triples in the datasets:
3:          parse triple
4:          inverted index ← literals in the <object> component
5:      end for
6:  end for
7:  for p = 1, ..., P:
8:      for w = 1, ..., W/P index entries:
9:          if index entry w has sharing entities:
10:             D[rows] ← entities (subjects or objects)
11:             D[columns] ← attribute, URI or predicate
                [Matrix D is built with similarity metrics applied to three groups of values (attributes, URIs and predicates); it is a sparse matrix whose entries are 0 or 1, i.e., the entity does not have or has a specific attribute, URI or predicate.]
12:         for k = 1, ..., K relationships between entities:
13:             X[rows] ← entities (subjects or objects)
14:             X[columns] ← entities (subjects or objects)
15:             X[slices] ← relationship k
                [Tensor X is built from the entity-entity relationships, forming K sparse matrices whose entries are 0 or 1, i.e., the entity does not have or has a relationship with another entity.]
16:         end for
17:     end for
18: for p = 1, ..., P:
19:     for iter = 1, ..., maxIter or until convergence:
20:         for k = 1, ..., K/P:
21:             A ← [D V^T + Σ_{k=1..m} (X_k A R_k^T + X_k^T A R_k)] [V V^T + Σ_{k=1..m} (B_k + C_k) + λ_A I]^{-1},
                where B_k = R_k A^T A R_k^T and C_k = R_k^T A^T A R_k
22:             R_k ← (Z^T Z + λ_R I)^{-1} Z^T vec(X_k), where Z = A ⊗ A
23:             V ← (A^T A + λ_V I)^{-1} A^T D
24:         end for
25:     end for
26: end for

To be able to analyze scalability to large knowledge bases, a learning algorithm must have low computational complexity and low memory usage. The time complexity of building the inverted index is linear in the number of dataset dump files; for querying it, we consider the length of the query q times the average length L of a posting list. Relational data is usually very sparse, so we can assume that X_k is a sparse matrix, while A and R_k are dense matrices. With the RESCAL model we have linear computational complexity with regard to the number of entities or predicates in the dataset, as well as with regard to the number of known facts. The update steps for A and R are linear in the number of predicates m, regardless of the sparsity of X.

Table 1 summarizes the computational complexity of the inverted index and of the update steps of A, R and D. Its rows cover the inverted-index construction (lines 8-17), each product X_k A R_k^T and X_k^T A R_k (line 21), each B_k and C_k (line 21), the matrix inversion in the A update (line 21), the QR decomposition of A (line 22), the projection (line 22), the matrix inversion in the R update (line 22), the remaining R-update operations (line 22), and the V update (line 23).

We denote by O(unl) the runtime complexity of the product of a sparse matrix by a dense n × l matrix, where u is the number of non-zeros in the sparse matrix. The operations listed above are iterated only a small number of times, until the algorithm converges or a maximum number of iterations is reached. The computational complexity of a sparse implementation of the algorithm is linear in n and m and superlinear only in the model complexity r, and the O(n^5) operations in the update step for R_k can be reduced to O(n^3) in the non-regularized case (λ = 0). We must point out that computing the update steps of R_k in this form would be intractable for large-scale data, since it involves the n^2 × r^2 matrix Z = A ⊗ A.
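A serial NumPy sketch of the update equations in lines 21-23, together with the map/reduce split of the A update, is given below. The function names and regularization weights are illustrative, the slices are assumed dense for readability, and the R_k update is deliberately written in the naive form with the explicit Kronecker product whose n^2 × r^2 size is exactly what makes it intractable at scale, as discussed next.

import numpy as np

def map_A_contrib(X_assigned, R_assigned, A):
    # Worker side: partial sums of the A update for the slices assigned to this node.
    n, r = A.shape
    num, den = np.zeros((n, r)), np.zeros((r, r))
    AtA = A.T @ A
    for Xk, Rk in zip(X_assigned, R_assigned):
        num += Xk @ A @ Rk.T + Xk.T @ A @ Rk
        den += Rk @ AtA @ Rk.T + Rk.T @ AtA @ Rk        # B_k + C_k
    return num, den

def reduce_A(partials, D, V, lam_A=0.1):
    # Master side: combine the partial sums and finish the A update (line 21).
    r = V.shape[0]
    num, den = D @ V.T, V @ V.T + lam_A * np.eye(r)
    for p_num, p_den in partials:
        num, den = num + p_num, den + p_den
    return num @ np.linalg.inv(den)                      # only an r x r inversion here

def update_R(X, A, lam_R=0.1):
    # Naive R_k update (line 22): vec(R_k) = (Z^T Z + lam I)^{-1} Z^T vec(X_k), Z = A kron A.
    r = A.shape[1]
    Z = np.kron(A, A)
    lhs = Z.T @ Z + lam_R * np.eye(r * r)
    return [np.linalg.solve(lhs, Z.T @ Xk.reshape(-1, order='F')).reshape(r, r, order='F')
            for Xk in X]

def update_V(A, D, lam_V=0.1):
    # V update (line 23): V = (A^T A + lam I)^{-1} A^T D.
    return np.linalg.solve(A.T @ A + lam_V * np.eye(A.shape[1]), A.T @ D)

Here map_A_contrib corresponds to the work a node performs for its assigned slices, and reduce_A to the aggregation on the master node, which only has to invert an r × r matrix.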
In this case, the RESCAL algorithm uses the QR decomposition of A to simplify the update steps for R_k significantly. Another important aspect is that adding attributes to the factorization obviously introduces additional operations that increase the runtime of the algorithm; however, they do not alter the linear scalability, as can be seen in Table 1.

Due to the sums in the update of A, this step can be computed in a distributed way using a map-reduce approach, as in the sketch above. First, the current state of A and the R_k is distributed to the P available computing nodes. Then, these nodes locally compute the partial sums X_k A R_k^T + X_k^T A R_k and B_k + C_k for those k that have been assigned to them. Given the results of these computations, the master node reduces the results and computes the matrix inversion, which only involves r × r matrices, and the final matrix product. Since the updates of the R_k are independent of each other, these steps can be computed in a similar way.

Considering the memory complexity of the model, in each iteration only one frontal slice X_k has to be kept in memory. Since sparse matrices usually have linear memory complexity O(u) with regard to the number of non-zero elements, the approach can scale up to billions of known facts. However, the factor matrices are more demanding in terms of memory, especially A, a dense n × r matrix, so if the domain contains a very large number of entities some additional technique is necessary.

4. EXPERIMENTAL RESULTS

All algorithms have been implemented in Python, using the NumPy, RDFLib, NLTK and Scikit-learn APIs together with Apache Hadoop. In order to collect detailed execution statistics and prevent interference from other jobs on a shared computer cluster, we built a dedicated Hadoop cluster hosting up to 24 node machines. All machines have the same configuration: a 3.8 GHz Intel(R) Core i7 CPU with 4 cores and 64 GB of RAM.

We performed our preliminary experiments with large-scale datasets, more specifically the Datahub, DBpedia, Freebase, Rest and Timbl datasets from BTC 2012 (Billion Triples Challenge) (Harth, 2012). Together they contain approximately 1.4 billion RDF n-quads. The first step was to remove provenance information, duplicate triples, RDF blank nodes and reification statements. We used the same experimental framework for large datasets as in (Costa and Oliveira, 2014), except that much more data was processed.

In the first round, a total of approximately 1.5 million sameAs links was generated. The precision achieved was 85% and the recall was 72%; the total execution time was almost one day. In the second round approximately 2.3 million links were generated, but the precision dropped to 71% and the recall was 58%; the execution time was about 30 hours. We believe that most of this time was spent computing similarity metrics, due to the volume of comparisons, even with the inverted index used in pre-processing. An important aspect of our strategy is the memory used by the factor matrices, which are demanding in terms of memory, especially A, a dense n × r matrix. Even so, in each iteration of the factorization process only one frontal slice X_k has to be kept in memory.
Since sparse matrices usually have linear memory complexity with regard to the number of non-zero elements, the proposed approach scales up to billions of known facts, mainly thanks to the pre-processing step that prunes non-matching pairs. Although we obtained reasonably good results, it is important to note that we need to plan more detailed experiments and define a baseline in order to perform an accurate analysis.

5. RELATED WORKS

The problem of entity resolution has emerged as an important task for the Web of Data. The same task has been exploited in many different research arenas; today it is a very active research topic, and many approaches have been proposed and evaluated (Köpcke and Rahm, 2010). Only recently have a few approaches started to use parallelization for entity resolution. Some works address blocking with a sorted neighborhood approach (Kolb et al., 2012a), while (Kolb et al., 2012b, 2012c) address load-balanced entity resolution. The basis of these works is a map function that emits <key, value> pairs; pairs with the same key are processed by the same reducer, where the matching function is applied to all pairs sharing that key. To avoid redundant comparisons, (Kolb et al., 2013) proposes a comparison propagation scheme. Like most of the existing works in the literature, none of these approaches addresses graph data such as RDF, which is essentially semi-structured and naturally dirty, incomplete and inconsistent. Although the works in (Papadakis et al., 2014, 2012) take into account the intrinsic issues of semantic web data, they do not employ parallelization techniques such as Map-Reduce. The Linda system (Böhm et al., 2012) addresses RDF data; its algorithm is based on maintaining X and Y matrices as data structures, where X contains the matching or non-matching results of the comparisons temporarily maintained in Y, which holds real-valued similarity scores. The algorithm repeatedly dequeues the entity pairs with the highest similarity score from a priority queue and considers the nearest neighborhood of matching entities as a way to propagate similarities between linked entities.

6. CONCLUSION

We have presented a parallel strategy for collective entity resolution supported by a relational learning model. Performing entity resolution through tensor factorization is still rarely practiced, yet tensors are a prominent mathematical structure that fits nicely the dyadic structure of RDF triples. This is one of the key strengths of this work, as it allows the influence of all the relationship patterns in a dataset to be included. The experimental results showed that relationship patterns can improve the results, considering that the more evidence is available, the better the effectiveness. However, this comes at the cost that most solutions have to confront: the noisy nature of the data.

ACKNOWLEDGEMENT

We would like to thank CAPES, ITA and IFG for the support via the DINTER project.

REFERENCES

Böhm, C., de Melo, G., Naumann, F., Weikum, G., 2012. LINDA: Distributed Web-of-data-scale Entity Matching, in: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12. ACM, New York, NY, USA, pp. 2104–2108.

Costa, G. de A., Oliveira, J.M.P. de, 2014.
A Relational Learning Approach for Collective Entity Resolution in the Web of Data, to appear in: Proceedings of the Fifth International Workshop on Consuming Linked Data. CEUR-WS, Riva del Garda, Italy.

Dean, J., Ghemawat, S., 2008. MapReduce: Simplified Data Processing on Large Clusters. Commun. ACM 51, 107–113.

Elmagarmid, A.K., Ipeirotis, P.G., Verykios, V.S., 2007. Duplicate Record Detection: A Survey. IEEE Trans. Knowl. Data Eng. 19, 1–16.

Getoor, L., Machanavajjhala, A., 2012. Entity Resolution: Theory, Practice and Open Challenges. Proc. VLDB Endow. 5, 2018–2019.

Harth, A., 2012. Billion Triples Challenge data set.

Kolb, L., Thor, A., Rahm, E., 2012a. Multi-pass Sorted Neighborhood Blocking with MapReduce. Comput. Sci. - Res. Dev. 27, 45–63.

Kolb, L., Thor, A., Rahm, E., 2012b. Dedoop: Efficient Deduplication with Hadoop. Proc. VLDB Endow. 5, 1878–1881.

Kolb, L., Thor, A., Rahm, E., 2012c. Load Balancing for MapReduce-based Entity Resolution, in: Proceedings of the 2012 IEEE 28th International Conference on Data Engineering, ICDE '12. IEEE Computer Society, Washington, DC, USA, pp. 618–629.

Kolb, L., Thor, A., Rahm, E., 2013. Don't Match Twice: Redundancy-free Similarity Computation with MapReduce, in: Proceedings of the Second Workshop on Data Analytics in the Cloud, DanaC '13. ACM, New York, NY, USA, pp. 1–5.

Kolda, T.G., Bader, B.W., 2009. Tensor Decompositions and Applications. SIAM Rev. 51, 455–500.

Köpcke, H., Rahm, E., 2010. Frameworks for Entity Matching: A Comparison. Data Knowl. Eng. 69, 197–210.

Nickel, M., Tresp, V., Kriegel, H.-P., 2011. A Three-Way Model for Collective Learning on Multi-Relational Data, in: Getoor, L., Scheffer, T. (Eds.), Proceedings of the 28th International Conference on Machine Learning (ICML-11), ICML '11. ACM, New York, NY, USA, pp. 809–816.

Nickel, M., Tresp, V., Kriegel, H.-P., 2012. Factorizing YAGO: Scalable Machine Learning for Linked Data, in: Proceedings of the 21st International Conference on World Wide Web, WWW '12. ACM, New York, NY, USA, pp. 271–280.

Papadakis, G., Ioannou, E., Niederée, C., Palpanas, T., Nejdl, W., 2012. Beyond 100 Million Entities: Large-scale Blocking-based Resolution for Heterogeneous Data, in: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, WSDM '12. ACM, New York, NY, USA, pp. 53–62.

Papadakis, G., Koutrika, G., Palpanas, T., Nejdl, W., 2014. Meta-Blocking: Taking Entity Resolution to the Next Level. IEEE Trans. Knowl. Data Eng. 26, 1946–1960. doi:10.1109/TKDE.2013.54

Yilmaz, Y.K., Cemgil, A.-T., Simsekli, U., 2011. Generalised Coupled Tensor Factorisation, in: Shawe-Taylor, J., Zemel, R.S., Bartlett, P.L., Pereira, F.C.N., Weinberger, K.Q. (Eds.), Proceedings of Neural Information Processing Systems (NIPS). Presented at the Annual Conference on Neural Information Processing Systems, Granada, Spain, pp. 2151–2159.