Showing 1–3 of 3 results for author: Niitsuma, H
-
Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing
Authors:
Hirotaka Niitsuma,
Minho Lee
Abstract:
We show that correspondence analysis (CA) is equivalent to defining a Gini index with appropriately scaled one-hot encoding. Using this relation, we introduce a nonlinear kernel extension to CA. This extended CA gives a known analysis for natural language via specialized kernels that use an appropriate contingency table. We propose a semi-supervised CA, which is a special case of the kernel extension to CA. Because CA requires excessive memory if applied to numerous categories, CA has not been used for natural language processing. We address this problem by introducing delayed evaluation to randomized singular value decomposition. The memory-efficient CA is then applied to a word-vector representation task. We propose a tail-cut kernel, which is an extension to the skip-gram within the kernel extension to CA. Our tail-cut kernel outperforms existing word-vector representation methods.
Submitted 25 November, 2018; v1 submitted 17 May, 2016;
originally announced May 2016.
-
Image processing using miniKanren
Authors:
Hirotaka Niitsuma
Abstract:
An integral image is one of the most efficient optimization techniques for image processing. However, an integral image is only a special case of a delayed stream or memoization. This research discusses generalizing the integral image optimization technique, and how to automatically generate integral-image-optimized program code from an abstracted image processing algorithm. In order to abstract algorithms, we focus on miniKanren.
Submitted 16 March, 2014;
originally announced March 2014.
-
Covariance and PCA for Categorical Variables
Authors:
Hirotaka Niitsuma,
Takashi Okada
Abstract:
Covariances from categorical variables are defined using a regular simplex expression for categories. The method follows the variance definition by Gini, and it gives the covariance as a solution of simultaneous equations. The calculated results give reasonable values for test data. A method of principal component analysis (RS-PCA) is also proposed using regular simplex expressions, which allows easy interpretation of the principal components. The proposed methods are applied to the variable selection problem for the categorical USCensus1990 data. The proposed methods give an appropriate criterion for the variable selection problem of categorical data.
Submitted 28 November, 2007;
originally announced November 2007.