The Store-n-Learn plane accelerator encodes an entire page of raw data to generate a D-dimensional hypervector. Assume an SSD page size of 4 KB (\(p_s\)), with each data point occupying 4 bytes (\(d_s\)). Each page then holds 1K data points (\(p_s/d_s\)). Let the feature vector contain 1K features. Assuming that the feature vectors are page aligned, each page stores exactly one feature vector. HDC encoding multiplies an
n-element feature vector by a projection matrix of
\(D\times n\) 1-bit elements. Our accelerator calculates the dot product between two page-long vectors, one read from the flash array and the other a row of the projection matrix. This involves multiplying the two vectors element-wise and summing all elements of the product. Since the weights in the projection matrix
\(\in \lbrace 1, -1\rbrace\), we reduce the storage required for each weight to a single bit by mapping
\(1 \rightarrow 1\) and
\(-1 \rightarrow 0\). Multiplication by \(-1\) is a 2’s complement negation, that is, a bitwise inversion followed by adding 1. We therefore replace each multiplication with an XNOR between the input bits and the weight bit, which inverts exactly those inputs whose weight is \(-1\), and then add the total number of inverted inputs to the accumulated sum of the XNOR outputs. The accelerator is shown in Figure
2. It consists of an array of 32K XNOR gates (one per bit of the 1K 4-byte inputs) followed by a 1K-input tree adder (labeled CSA in Figure
2). The tree adder is a pruned version of the Wallace carry-save tree in which the operand size is fixed at 4 B throughout the tree. It reduces the 1,024 inputs to two operands, which are then summed by a carry look-ahead adder (labeled CLA in Figure
2). The result is the dot product of the two vectors, which is the value of one dimension of the encoded hypervector. The accelerator is run iteratively
D times to generate all
D dimensions. Depending on the power budget, Store-n-Learn may employ multiple parallel instances of this accelerator to reduce the total number of iterations. Since
D is generally large, the generated
D-dimensional hypervector spans multiple pages. Store-n-Learn writes the output of the accelerator to the page buffer of the plane, and this output serves as the response to the original SSD read request.
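
The XNOR-and-add scheme described above can be illustrated with a short software sketch. It is not the accelerator's implementation: it assumes that data points are signed 32-bit integers, that every dot product fits in 32 bits, and it accumulates sequentially instead of modeling the CSA tree and CLA. The function names (`to_signed`, `xnor_word`, `encode_dimension`, `encode`) are hypothetical and chosen only for this example.

```python
import random

WORD_BITS = 32                   # assumption: each data point is a 4-byte signed integer
MASK = (1 << WORD_BITS) - 1      # keeps every value within one 32-bit word


def to_signed(word):
    """Interpret a 32-bit word as a 2's complement signed integer."""
    return word - (1 << WORD_BITS) if word & (1 << (WORD_BITS - 1)) else word


def xnor_word(word, weight_bit):
    """XNOR every bit of `word` with a single weight bit.

    A weight bit of 1 (weight +1) passes the word through unchanged;
    a weight bit of 0 (weight -1) inverts all 32 bits.
    """
    replicated = MASK if weight_bit else 0
    return ~(word ^ replicated) & MASK


def encode_dimension(features, weight_bits):
    """Compute one hypervector dimension: dot(features, weights).

    Weights in {+1, -1} are stored as bits {1, 0}. Since -x == ~x + 1 in
    2's complement, the sum of the XNOR outputs plus the number of
    inverted inputs equals the signed dot product (modulo 2**32).
    """
    acc = 0
    inverted = 0
    for x, bit in zip(features, weight_bits):
        acc = (acc + xnor_word(x & MASK, bit)) & MASK
        inverted += 1 - bit      # count inputs whose weight is -1
    return to_signed((acc + inverted) & MASK)


def encode(features, projection_bits):
    """Run the single-dimension computation once per projection-matrix row."""
    return [encode_dimension(features, row) for row in projection_bits]


if __name__ == "__main__":
    rng = random.Random(0)
    n, D = 1024, 8               # small D for the demo; D is much larger in practice
    features = [rng.randint(-1000, 1000) for _ in range(n)]
    projection_bits = [[rng.randint(0, 1) for _ in range(n)] for _ in range(D)]

    hypervector = encode(features, projection_bits)

    # Reference: explicit multiplication by +1 / -1.
    reference = [
        sum(x * (1 if bit else -1) for x, bit in zip(features, row))
        for row in projection_bits
    ]
    assert hypervector == reference
```

The loop over projection-matrix rows in `encode` corresponds to the D iterations of the accelerator. Running several rows concurrently would correspond to employing multiple parallel accelerator instances.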