# CALCULATION DEVICE, CALCULATION METHOD, AND CALCULATION PROGRAM

- Foreign patent code: F210010353
- Reference number: 08963-WO
- Date posted: April 7, 2021
- Country/office of filing: World Intellectual Property Organization (WIPO)
- International application number: 2020JP022377
- International publication number: WO 2020246598
- International filing date: June 5, 2020
- International publication date: December 10, 2020
- Priority data: Japanese Patent Application No. 2019-107283 (June 7, 2019), JP

## Abstract

Provided is a calculation device comprising: a vector storage unit which stores, among a plurality of first partial vectors obtained by dividing a first vector, at least a first partial vector; a matrix storage unit which stores, among a plurality of first submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector, at least a first submatrix to be multiplied by the first partial vector; a pipeline calculation unit which, through pipeline calculation, executes a calculation for adding an intermediate vector to the matrix-vector product of the submatrix stored in the matrix storage unit and the partial vector stored in the vector storage unit; and a calculation control unit which, while the pipeline calculation unit executes the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector, instructs the pipeline calculation unit to execute the calculation of another matrix-vector product using the first partial vector or the first submatrix.

## Background Art

In various applications such as numerical calculation and deep learning, products of matrices (hereinafter referred to as "matrix products") and matrix-vector products account for most of the computation. Arithmetic devices and arithmetic methods that execute such matrix operations efficiently have therefore been developed (see, for example, Patent Documents 1 to 3). Processors capable of performing matrix operations have also been developed.
[Prior Art Documents]

[Patent Documents]

- [Patent Document 1] WO 2018/207926
- [Patent Document 2] JP 2018-139045 A
- [Patent Document 3] JP 2018-197906 A

## Problem to be Solved

The matrix-vector product of an $n$-dimensional square matrix and an $n$-dimensional vector involves $n^2$ multiplications and approximately $n^2$ additions, for a computational cost of approximately $2n^2$ operations. Thus, when the $n$-dimensional square matrix is fixed, the amount of computation for the matrix-vector product is on the order of $n^2$ for an input of one $n$-dimensional vector. Accordingly, increasing the matrix size, and enlarging the matrix arithmetic unit with it, reduces the ratio of the amount of data loaded to the amount of computation. However, when the matrix arithmetic unit is enlarged, the load/store capability of the register file and the like becomes relatively low, and the processing performance for small matrices and for operations other than matrix operations becomes relatively low.

## General Disclosure

In a first aspect of the present invention, a calculation device is provided. The calculation device may include a vector storage unit that stores at least a first partial vector among a plurality of first partial vectors obtained by dividing a first vector. The calculation device may include a matrix storage unit that stores, among a plurality of first submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector, at least a first submatrix to be multiplied by the first partial vector. The calculation device may include a pipeline calculation unit capable of executing, through pipeline calculation, an operation of adding an intermediate vector to the matrix-vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit.
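As an illustration only, the components described so far can be sketched in Python: the first matrix is divided into square submatrices, the first vector into partial vectors, and a single operation adds an intermediate vector to the product of one submatrix and one partial vector. The function names `fused_block_op` and `blocked_matvec`, the block size `b`, and the use of nested lists are assumptions made for this sketch and do not appear in the disclosure. The pipelining itself is not modeled here.

```python
def fused_block_op(sub, part, acc):
    """Return sub @ part + acc: the matrix-vector product of one submatrix
    and one partial vector, added to an intermediate vector."""
    return [a + sum(m * x for m, x in zip(row, part))
            for row, a in zip(sub, acc)]


def blocked_matvec(A, x, b):
    """Compute A @ x by accumulating b-by-b submatrix products.

    A is an n-by-n list of lists, x has length n, and b must divide n
    (a simplifying assumption of this sketch)."""
    n = len(A)
    if n % b != 0 or len(x) != n:
        raise ValueError("b must divide n and x must have length n")
    y = []
    for i in range(0, n, b):                # one row of submatrices at a time
        acc = [0] * b                       # intermediate vector starts at zero
        for j in range(0, n, b):            # walk the submatrices in that row
            sub = [row[j:j + b] for row in A[i:i + b]]
            part = x[j:j + b]               # matching partial vector
            acc = fused_block_op(sub, part, acc)
        y.extend(acc)
    return y


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
x = [1, 0, -1, 2]
print(blocked_matvec(A, x, 2))  # same result as the unblocked product A @ x
```

Each call to `fused_block_op` performs about $b^2$ multiplications and $b^2$ additions, and $(n/b)^2$ such calls cover the whole matrix, which is consistent with the total of approximately $2n^2$ operations stated above.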
The calculation device may include a calculation control unit that, while the pipeline calculation unit is executing the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector, instructs the pipeline calculation unit to execute the calculation of another matrix-vector product that uses the first partial vector or the first submatrix.

The vector storage unit may further store a second partial vector among the plurality of first partial vectors. The matrix storage unit may further store, among the plurality of first submatrices, a second submatrix to be multiplied by the second partial vector. The calculation control unit may instruct the pipeline calculation unit to execute, at or after the cycle in which the result of the matrix-vector product of the first submatrix and the first partial vector becomes available without delay, an operation of adding the matrix-vector product of the second submatrix and the second partial vector to that result.

The vector storage unit may further store a third partial vector to be multiplied by the first submatrix, among a plurality of second partial vectors obtained by dividing a second vector to be multiplied by the first matrix. While the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector is in progress, the calculation control unit may instruct the pipeline calculation unit to execute the calculation of the matrix-vector product of the first submatrix and the third partial vector as the other matrix-vector product.

The first vector and the second vector may be column vectors included in a second matrix by which the first matrix is to be multiplied.

The vector storage unit may store a plurality of second vectors included in the second matrix.
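The effect of issuing another matrix-vector product while an earlier one is still in the pipeline can be illustrated with a toy cycle model. The sketch below is a simplification under stated assumptions, not the disclosed hardware: it assumes that at most one operation is issued per cycle, that a result becomes available a fixed `latency` cycles after issue, and that an accumulation step cannot be issued before the result it adds to is available. The functions `schedule` and `chain_ops`, the latency of 4 cycles, and the sizes (4 column blocks, 4 column vectors) are hypothetical.

```python
def schedule(ops, latency):
    """Issue ops in order, at most one per cycle.

    ops is a list of (name, dependency) pairs, where dependency is the
    name of an earlier op whose result is needed, or None.
    Returns (issue_cycle_by_name, total_cycles)."""
    issue = {}
    cycle = 0
    for name, dep in ops:
        if dep is not None:
            # Stall until the intermediate vector produced by `dep` is available.
            cycle = max(cycle, issue[dep] + latency)
        issue[name] = cycle
        cycle += 1
    total = max(issue.values()) + latency   # cycle at which the last result is available
    return issue, total


def chain_ops(num_vectors, num_blocks, interleave):
    """Build accumulation chains: op (v, j) adds the product of submatrix j
    and partial vector j of vector v to the result of op (v, j - 1)."""
    if interleave:
        # Reuse one submatrix for every vector before moving to the next submatrix.
        order = [(v, j) for j in range(num_blocks) for v in range(num_vectors)]
    else:
        # Finish the whole dependent chain of one vector before starting the next.
        order = [(v, j) for v in range(num_vectors) for j in range(num_blocks)]
    return [((v, j), (v, j - 1) if j > 0 else None) for v, j in order]


LATENCY = 4  # hypothetical pipeline depth in cycles
_, one_at_a_time = schedule(chain_ops(4, 4, interleave=False), LATENCY)
_, interleaved = schedule(chain_ops(4, 4, interleave=True), LATENCY)
print(one_at_a_time, interleaved)
```

In this model the interleaved order finishes in 19 cycles and the one-vector-at-a-time order in 55, because the independent products for the other column vectors occupy the cycles in which a dependent accumulation would otherwise wait.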
The calculation control unit may fill each cycle, from the start of the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector until just before the cycle in which its result becomes available without delay, with the calculation of the matrix-vector product of the first submatrix and the third partial vector from each of the plurality of second vectors.

The matrix storage unit may further store, among the plurality of first submatrices, a third submatrix to be multiplied by the first partial vector. While the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector is in progress, the calculation control unit may instruct the pipeline calculation unit to execute the calculation of the matrix-vector product of the third submatrix and the first partial vector as the other matrix-vector product.

The matrix storage unit may store a plurality of third submatrices. The calculation control unit may fill each cycle, from the start of the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector until just before the cycle in which its result becomes available without delay, with the calculation of the matrix-vector product of each of the plurality of third submatrices and the first partial vector.

In a second aspect of the present invention, a calculation method is provided. The calculation method may include a vector storage unit storing at least a first partial vector among a plurality of first partial vectors obtained by dividing a first vector. The calculation method may include a matrix storage unit storing, among a plurality of first submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector, at least a first submatrix to be multiplied by the first partial vector.
The calculation method may include starting, in a pipeline calculation unit capable of executing, through pipeline calculation, an operation of adding an intermediate vector to the matrix-vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit, and during the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector, the calculation of another matrix-vector product that uses the first partial vector or the first submatrix.

In a third aspect of the present invention, a calculation program to be executed by a calculation device is provided. The calculation device may include a vector storage unit that stores at least a first partial vector among a plurality of first partial vectors obtained by dividing a first vector. The calculation device may include a matrix storage unit that stores, among a plurality of first submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector, at least a first submatrix to be multiplied by the first partial vector. The calculation device may include a pipeline calculation unit capable of executing, through pipeline calculation, an operation of adding an intermediate vector to the matrix-vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit. The calculation program may cause the calculation device to start, during the pipeline calculation of the matrix-vector product of the first submatrix and the first partial vector, the calculation of another matrix-vector product that uses the first partial vector or the first submatrix.

Note that the above summary does not list all of the necessary features of the present invention. Sub-combinations of these features may also constitute inventions.

- Applicant: RIKEN
- Inventors: MAKINO Junichiro; EBISUZAKI Toshikazu
- International Patent Classification (IPC): G06F 17/16 (Matrix or vector computation)