CALCULATION DEVICE, CALCULATION METHOD, AND CALCULATION PROGRAM

Foreign code F210010353
File No. 08963-WO
Posted date Apr 7, 2021
Country WIPO
International application number 2020JP022377
International publication number WO 2020246598
Date of international filing Jun 5, 2020
Date of international publication Dec 10, 2020
Priority data
  • P2019-107283 (Jun 7, 2019) JP
Title CALCULATION DEVICE, CALCULATION METHOD, AND CALCULATION PROGRAM
Abstract Provided is a calculation device comprising: a vector storage unit which stores, among a plurality of first partial vectors obtained by dividing a first vector, at least a first partial vector; a matrix storage unit which stores, among a plurality of first submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector, at least a first submatrix to be multiplied by the first partial vector; a pipeline calculation unit which, through pipeline calculation, executes a calculation of adding an intermediate vector to the matrix vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit; and a calculation control unit which, while the pipeline calculation unit executes the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, instructs the pipeline calculation unit to execute the calculation of another matrix vector product using the first partial vector or the first submatrix.
Outline of related art and contending technology BACKGROUND ART
For example, in various applications such as numerical calculation and deep learning, products of matrices (hereinafter referred to as "matrix products") and matrix vector products account for most of the computational load. Calculation devices and calculation methods that execute such matrix operations efficiently have therefore been developed (see, for example, Patent Documents 1 to 3). Processors capable of performing matrix operations have also been developed.
[Prior Art Documents]
[Patent Documents]
[Patent Document 1] WO 2018/207926
[Patent Document 2] JP 2018-139045 A
[Patent Document 3] JP 2018-197906 A
Problem to be Solved
The matrix vector product of an n-dimensional square matrix and an n-dimensional vector involves n^2 multiplications and approximately n^2 additions, for a computation amount of approximately 2n^2. Thus, when the n-dimensional square matrix is fixed, the computation amount of the matrix vector product is on the order of n^2 for the input of one n-dimensional vector. Accordingly, increasing the matrix size, and with it the size of the matrix arithmetic unit, reduces the ratio of the amount of data loaded to the amount of computation. However, when the matrix arithmetic unit is enlarged, the load/store capability of the register file and the like becomes relatively low, and the processing performance for operations on small matrices and for operations other than matrix operations becomes relatively low.
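As a rough illustration of these counts, the following Python sketch tallies the multiplications and additions of an n x n matrix vector product, and the number of input words loaded per arithmetic operation when the matrix is held fixed. The function names are hypothetical, and the exact addition count n(n - 1) is an assumption used here in place of the approximation n^2 given in the text.

```python
def matvec_op_counts(n):
    """Return (multiplications, additions) for an n x n matrix times an n-vector."""
    multiplications = n * n      # one per matrix element
    additions = n * (n - 1)      # n - 1 per output element, roughly n^2 for large n
    return multiplications, additions


def loaded_words_per_operation(n):
    """Input-vector words loaded per arithmetic operation, matrix held fixed."""
    multiplications, additions = matvec_op_counts(n)
    return n / (multiplications + additions)


for n in (4, 16, 64):
    print(n, matvec_op_counts(n), loaded_words_per_operation(n))
```

The ratio falls roughly as 1/(2n), which is the sense in which a larger matrix lowers the load requirement per unit of computation.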
General Disclosure
In a first aspect of the present invention, a calculation device is provided. The calculation device may include a vector storage unit that stores at least a first partial vector among a first plurality of partial vectors obtained by dividing a first vector. The calculation device may include a matrix storage unit that stores at least a first submatrix to be multiplied by the first partial vector, among a first plurality of submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector. The calculation device may include a pipeline calculation unit capable of executing, through pipeline calculation, a calculation of adding an intermediate vector to the matrix vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit. The calculation device may include a calculation control unit that, during the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, instructs the pipeline calculation unit to execute a calculation of another matrix vector product using the first partial vector or the first submatrix.
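The division into submatrices and partial vectors can be sketched in Python as follows. This is a minimal illustration, not the device's implementation: the names block_step and blocked_matvec and the block size parameter b are hypothetical, and block_step stands in for the single operation attributed to the pipeline unit (intermediate vector plus submatrix times partial vector).

```python
def block_step(submatrix, partial_vector, intermediate):
    """Return intermediate + submatrix * partial_vector (one unit operation)."""
    return [
        acc + sum(m * v for m, v in zip(row, partial_vector))
        for row, acc in zip(submatrix, intermediate)
    ]


def blocked_matvec(matrix, vector, b):
    """Matrix vector product built only from block_step calls on b x b blocks."""
    result = []
    for i in range(0, len(matrix), b):            # row blocks of the matrix
        rows = matrix[i:i + b]
        intermediate = [0] * len(rows)
        for j in range(0, len(vector), b):        # column blocks / partial vectors
            submatrix = [row[j:j + b] for row in rows]
            intermediate = block_step(submatrix, vector[j:j + b], intermediate)
        result.extend(intermediate)
    return result


matrix = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(blocked_matvec(matrix, [1, 1, 1, 1], 2))
```

Within one row block each step consumes the previous step's result as its intermediate vector, which is the dependency the calculation control unit has to schedule around.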
The vector storage unit may further store a second partial vector among the first plurality of partial vectors. The matrix storage unit may further store, among the first plurality of submatrices, a second submatrix to be multiplied by the second partial vector. The calculation control unit may instruct the pipeline calculation unit to execute, in or after the cycle in which the calculation result of the matrix vector product of the first submatrix and the first partial vector becomes available without delay, a calculation of adding the matrix vector product of the second submatrix and the second partial vector to that calculation result.
The vector storage unit may further store a third partial vector to be multiplied by the first submatrix, among a second plurality of partial vectors obtained by dividing a second vector to be multiplied by the first matrix. During the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, the calculation control unit may instruct the pipeline calculation unit to execute, as the calculation of another matrix vector product, a calculation of the matrix vector product of the first submatrix and the third partial vector.
The first vector and the second vector may be column vectors included in a second matrix by which the first matrix is to be multiplied.
The vector storage unit may store a plurality of second vectors included in the second matrix. The calculation control unit may fill each cycle, from after the start of the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector until before its calculation result becomes available without delay, with a calculation of the matrix vector product of the first submatrix and the third partial vector taken from each of the plurality of second vectors.
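The effect of filling the waiting cycles can be estimated with a toy model. The Python sketch below assumes a pipeline that accepts one operation per cycle and delivers each result a fixed number of cycles (latency) after issue; each "chain" is the sequence of dependent accumulations for one vector. The model, the function name total_cycles, and the parameter values are assumptions for illustration and are not taken from the patent.

```python
def total_cycles(num_chains, chain_length, latency):
    """Cycles until the last result is available, issuing at most one op per cycle.

    Each chain is a series of dependent accumulations: its next operation may
    issue only once the previous result of the same chain is available.
    """
    total_ops = num_chains * chain_length
    if total_ops == 0:
        return 0
    ready = [0] * num_chains                  # earliest issue cycle per chain
    remaining = [chain_length] * num_chains
    cycle = 0
    issued = 0
    while issued < total_ops:
        for c in range(num_chains):
            if remaining[c] > 0 and ready[c] <= cycle:
                remaining[c] -= 1
                ready[c] = cycle + latency    # result usable `latency` cycles later
                issued += 1
                break                         # one issue slot per cycle
        cycle += 1
    return (cycle - 1) + latency


print(total_cycles(1, 4, 4))   # a single vector: the pipeline idles between steps
print(total_cycles(4, 4, 4))   # four vectors interleaved: every cycle issues an op
```

Under these assumptions a single vector needs 16 cycles for 4 operations, while four interleaved vectors complete 16 operations in 19 cycles, because operations on the other vectors occupy the cycles that would otherwise be spent waiting.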
The matrix storage unit may further store, among the first plurality of submatrices, a third submatrix to be multiplied by the first partial vector. During the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, the calculation control unit may instruct the pipeline calculation unit to execute, as the calculation of another matrix vector product, a calculation of the matrix vector product of the third submatrix and the first partial vector.
The matrix storage unit may store a plurality of third submatrices. The calculation control unit may fill each cycle, from after the start of the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector until before its calculation result becomes available without delay, with a calculation of the matrix vector product of each of the plurality of third submatrices and the first partial vector.
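Submatrices that are multiplied by the same partial vector lie in different row blocks, so their products contribute to different parts of the result and do not depend on one another. A hypothetical issue order that exploits this is sketched below in Python; the names issue_order and min_dependent_gap are illustrative, and the one-operation-per-cycle model is an assumption.

```python
def issue_order(row_blocks, col_blocks):
    """Issue all row-block products for one partial vector before the next one."""
    return [(i, j) for j in range(col_blocks) for i in range(row_blocks)]


def min_dependent_gap(order):
    """Smallest cycle distance between two successive ops on the same row block."""
    last_seen = {}
    gaps = []
    for cycle, (i, _j) in enumerate(order):
        if i in last_seen:
            gaps.append(cycle - last_seen[i])
        last_seen[i] = cycle
    return min(gaps) if gaps else None


print(min_dependent_gap(issue_order(4, 3)))   # 4 row blocks: dependent ops 4 cycles apart
print(min_dependent_gap(issue_order(1, 3)))   # 1 row block: dependent ops back to back
```

In this model, dependent accumulations never have to wait as long as the number of row blocks sharing a partial vector is at least the pipeline latency in cycles.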
In a second aspect of the present invention, a calculation method is provided. The calculation method may include a vector storage unit storing at least a first partial vector among a first plurality of partial vectors obtained by dividing a first vector. The calculation method may include a matrix storage unit storing at least a first submatrix to be multiplied by the first partial vector, among a first plurality of submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector. The calculation method may include a pipeline calculation unit, which is capable of executing, through pipeline calculation, a calculation of adding an intermediate vector to the matrix vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit, starting, during the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, execution of a calculation of another matrix vector product using the first partial vector or the first submatrix.
In a third aspect of the present invention, a calculation program to be executed by a calculation device is provided. The calculation device may include a vector storage unit that stores at least a first partial vector among a first plurality of partial vectors obtained by dividing a first vector. The calculation device may include a matrix storage unit that stores at least a first submatrix to be multiplied by the first partial vector, among a first plurality of submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector. The calculation device may include a pipeline calculation unit capable of executing, through pipeline calculation, a calculation of adding an intermediate vector to the matrix vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit. The calculation program may cause the calculation device to start, during the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, execution of a calculation of another matrix vector product using the first partial vector or the first submatrix.
Note that the above summary of the present invention does not list all of the necessary features of the present invention. A sub-combination of these features may also be the invention.
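Scope of claims
[Claim 1]
 A calculation device comprising:
 a vector storage unit which stores at least a first partial vector among a first plurality of partial vectors obtained by dividing a first vector;
 a matrix storage unit which stores at least a first submatrix to be multiplied by the first partial vector, among a first plurality of submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector;
 a pipeline calculation unit capable of executing, through pipeline calculation, a calculation of adding an intermediate vector to the matrix vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit; and
 a calculation control unit which, while the pipeline calculation unit is executing the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, instructs the pipeline calculation unit to execute a calculation of another matrix vector product using the first partial vector or the first submatrix.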

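[Claim 2]
 The calculation device according to claim 1, wherein
 the vector storage unit further stores a second partial vector among the first plurality of partial vectors,
 the matrix storage unit further stores, among the first plurality of submatrices, a second submatrix to be multiplied by the second partial vector, and
 the calculation control unit instructs the pipeline calculation unit to execute, in or after the cycle in which the calculation result of the matrix vector product of the first submatrix and the first partial vector becomes available without delay, a calculation of adding the matrix vector product of the second submatrix and the second partial vector to the calculation result of the matrix vector product of the first submatrix and the first partial vector.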

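[Claim 3]
 The calculation device according to claim 1 or 2, wherein
 the vector storage unit further stores, among a second plurality of partial vectors obtained by dividing a second vector to be multiplied by the first matrix, a third partial vector to be multiplied by the first submatrix, and
 the calculation control unit instructs the pipeline calculation unit to execute, during the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, a calculation of the matrix vector product of the first submatrix and the third partial vector as the calculation of the other matrix vector product.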

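[Claim 4]
 The calculation device according to claim 3, wherein the first vector and the second vector are column vectors included in a second matrix by which the first matrix is to be multiplied.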

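[Claim 5]
 The calculation device according to claim 4, wherein
 the vector storage unit stores a plurality of the second vectors included in the second matrix, and
 the calculation control unit fills each cycle, from after the start of the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector until before its calculation result becomes available without delay, with a calculation of the matrix vector product of the first submatrix and the third partial vector from each of the plurality of second vectors.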

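[Claim 6]
 The calculation device according to claim 1 or 2, wherein
 the matrix storage unit further stores, among the first plurality of submatrices, a third submatrix to be multiplied by the first partial vector, and
 the calculation control unit instructs the pipeline calculation unit to execute, during the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, a calculation of the matrix vector product of the third submatrix and the first partial vector as the calculation of the other matrix vector product.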

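[Claim 7]
 The calculation device according to claim 6, wherein
 the matrix storage unit stores a plurality of the third submatrices, and
 the calculation control unit fills each cycle, from after the start of the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector until before its calculation result becomes available without delay, with a calculation of the matrix vector product of each of the plurality of third submatrices and the first partial vector.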

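[Claim 8]
 A calculation method in which
 a vector storage unit stores at least a first partial vector among a first plurality of partial vectors obtained by dividing a first vector,
 a matrix storage unit stores at least a first submatrix to be multiplied by the first partial vector, among a first plurality of submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector, and
 a pipeline calculation unit, which is capable of executing, through pipeline calculation, a calculation of adding an intermediate vector to the matrix vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit, starts, during the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, execution of a calculation of another matrix vector product using the first partial vector or the first submatrix.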

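[Claim 9]
 A calculation program to be executed by a calculation device, wherein
 the calculation device comprises:
 a vector storage unit which stores at least a first partial vector among a first plurality of partial vectors obtained by dividing a first vector;
 a matrix storage unit which stores at least a first submatrix to be multiplied by the first partial vector, among a first plurality of submatrices obtained by dividing, in the row direction and the column direction, a first matrix to be multiplied by the first vector; and
 a pipeline calculation unit capable of executing, through pipeline calculation, a calculation of adding an intermediate vector to the matrix vector product of a submatrix stored in the matrix storage unit and a partial vector stored in the vector storage unit, and
 the calculation program causes the calculation device to start, during the pipeline calculation of the matrix vector product of the first submatrix and the first partial vector, execution of a calculation of another matrix vector product using the first partial vector or the first submatrix.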
  • Applicant
  • ※All designated countries except for US in the data before July 2012
  • RIKEN
  • Inventor
  • MAKINO Junichiro
  • EBISUZAKI Toshikazu
IPC(International Patent Classification)