CUDA
Last updated: 2008-10-25 (Sat) 10:36:24
CUDA Programming Guide Version 2.0
Everything below is a summary of the original document (CUDA Programming Guide Version 2.0), written as study material for students of the Yamauchi laboratory.
The copyright of the original document belongs to NVIDIA.
Chapter 2 Programming Model
- CUDA is an extension of the C language.
- kernel: a C function that, when called, is executed N times in parallel by N different threads (CUDA threads).
- A kernel is declared with the __global__ specifier, and the number of parallel threads to launch is given at the call site with the <<< ... >>> syntax.
__global__ void vecAdd(float* A, float* B, float* C)
{
    int i = threadIdx.x;
    C[i] = A[i] + B[i];
}
int main()
{
    // Kernel invocation
    vecAdd<<<1, N>>>(A, B, C);
}
Here, threadIdx is a built-in variable that holds the unique thread ID assigned to each thread.
The arguments A, B, and C of vecAdd are all pointers, presumably just because arrays and pointers are interchangeable in C.
- Thread hierarchy
threadIdx is a 3-component vector, so threads can be indexed in one, two, or three dimensions.
For example, with a two-dimensional array:
__global__ void matAdd(float A[N][N], float B[N][N], float C[N][N])
{
    int i = threadIdx.x;
    int j = threadIdx.y;
    C[i][j] = A[i][j] + B[i][j];
}
int main()
{
    // Kernel invocation
    dim3 dimBlock(N, N);
    matAdd<<<1, dimBlock>>>(A, B, C);
}
First, the arguments A, B, and C are all two-dimensional N x N arrays. i and j are the x and y components of the threadIdx vector.
On the main side, dimBlock is declared with type dim3, and the number of threads is written as <<<1, dimBlock>>>.
dim3 is an integer vector type used to specify dimensions like this; any component left unspecified is initialized to 1 (4.3.1.12, page 34).
- The thread ID of a multi-dimensional index is computed simply as (x + y*Dx) for a two-dimensional index (x, y), and as (x + y*Dx + z*Dx*Dy) for a three-dimensional index (x, y, z), where Dx, Dy, Dz are the sizes in each dimension.
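The linearization rule above can be sketched as a small helper. This is only an illustration: the names linearThreadId and recordIds are hypothetical and do not appear in the guide, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: flattens the index (x, y, z) inside a block of
// size (Dx, Dy, Dz) into the thread ID x + y*Dx + z*Dx*Dy.
// Dz is not needed by the formula.
__host__ __device__ inline unsigned int linearThreadId(
    unsigned int x, unsigned int y, unsigned int z,
    unsigned int Dx, unsigned int Dy)
{
    return x + y * Dx + z * Dx * Dy;
}

// Each thread stores its own linear ID at the position given by that ID.
__global__ void recordIds(unsigned int* out)
{
    unsigned int id = linearThreadId(threadIdx.x, threadIdx.y, threadIdx.z,
                                     blockDim.x, blockDim.y);
    out[id] = id;
}

int main()
{
    const dim3 dimBlock(4, 3, 2);   // Dx = 4, Dy = 3, Dz = 2, i.e. 24 threads
    const unsigned int n = dimBlock.x * dimBlock.y * dimBlock.z;

    unsigned int* dOut = nullptr;
    cudaMalloc(&dOut, n * sizeof(unsigned int));
    recordIds<<<1, dimBlock>>>(dOut);

    unsigned int hOut[24];
    cudaMemcpy(hOut, dOut, n * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(dOut);

    // Thread (1, 2, 1): 1 + 2*4 + 1*4*3 = 21
    printf("ID of thread (1,2,1): %u, stored value: %u\n",
           linearThreadId(1, 2, 1, dimBlock.x, dimBlock.y), hOut[21]);
    return 0;
}
```

Because the helper is marked __host__ __device__, the same arithmetic can be checked on the host without launching a kernel.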
- Threads within a block cooperate by sharing data through shared memory and by synchronizing with __syncthreads().
__syncthreads() acts as a barrier at which all threads in the block wait for each other.
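As a minimal sketch of this cooperation, the kernel below reverses an array inside one block: every thread first writes one element to shared memory, waits at the barrier, and then reads an element written by a different thread. The names reverseInBlock, reverseDemo, and BLOCK_SIZE are illustrative assumptions, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 64   // assumed block size for this sketch

// Reverses BLOCK_SIZE integers using per-block shared memory.
__global__ void reverseInBlock(int* data)
{
    __shared__ int tile[BLOCK_SIZE];
    int i = threadIdx.x;

    tile[i] = data[i];
    __syncthreads();   // barrier: every thread in the block has written its element

    data[i] = tile[BLOCK_SIZE - 1 - i];   // read an element written by another thread
}

// Host wrapper: returns true if the device result is the reversed input.
bool reverseDemo()
{
    int h[BLOCK_SIZE];
    for (int i = 0; i < BLOCK_SIZE; ++i) h[i] = i;

    int* d = nullptr;
    cudaMalloc(&d, sizeof(h));
    cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    reverseInBlock<<<1, BLOCK_SIZE>>>(d);
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d);

    for (int i = 0; i < BLOCK_SIZE; ++i)
        if (h[i] != BLOCK_SIZE - 1 - i) return false;
    return true;
}

int main()
{
    printf("%s\n", reverseDemo() ? "reversed" : "mismatch");
    return 0;
}
```

Without the barrier, a thread could read tile[BLOCK_SIZE - 1 - i] before the thread responsible for that element has written it.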
- For efficient execution, shared memory should have low latency, so it is expected to behave like an L1 cache placed close to the processor core, and __syncthreads() should be lightweight, so all threads of a block are expected to reside on the same processor core. As a consequence, the number of threads per block is limited by the memory resources of a processor core. On the NVIDIA Tesla architecture the upper limit is 512 threads.
- However, a kernel can be executed by multiple (thread) blocks of the same shape, so the total number of threads can be (threads per block) x (number of blocks). These blocks can be arranged in a one-dimensional or two-dimensional "grid".

The size of the grid (the number of blocks) is specified as the first argument of <<< ... >>>. (It was 1 in the examples so far.)
Each block within the grid is identified by an index (one-dimensional or two-dimensional) available through the built-in variable blockIdx.
__global__ void matAdd(float A[N][N], float B[N][N], float C[N][N])
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < N && j < N)
        C[i][j] = A[i][j] + B[i][j];
}
int main()
{
    // Kernel invocation
    dim3 dimBlock(16, 16);
    dim3 dimGrid((N + dimBlock.x - 1) / dimBlock.x,
                 (N + dimBlock.y - 1) / dimBlock.y);
    matAdd<<<dimGrid, dimBlock>>>(A, B, C);
}
Here, the built-in variable blockDim gives the dimensions of a block.
The block size of 16x16 = 256 threads is an arbitrary choice, and the number of blocks in the grid is computed so that every element of the whole N x N matrix is assigned to a thread.
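The grid-size computation is a ceiling division, which the following sketch isolates. The helper name blocksNeeded and the value N = 1000 are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: smallest number of blocks of size blockSize
// that covers n elements, i.e. ceil(n / blockSize) in integer arithmetic.
__host__ __device__ inline unsigned int blocksNeeded(unsigned int n,
                                                     unsigned int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

int main()
{
    const unsigned int N = 1000;   // assumed matrix dimension
    dim3 dimBlock(16, 16);
    dim3 dimGrid(blocksNeeded(N, dimBlock.x), blocksNeeded(N, dimBlock.y));

    // 1000 / 16 = 62.5, so 63 blocks are needed per dimension.
    // 63 * 16 = 1008 threads per dimension; the 8 surplus threads are
    // the ones filtered out by the (i < N && j < N) test in the kernel.
    printf("grid: %u x %u blocks, %u threads per dimension\n",
           dimGrid.x, dimGrid.y, dimGrid.x * dimBlock.x);
    return 0;
}
```

When N is not a multiple of the block size, the last block in each dimension is only partly used, which is why the kernel needs the bounds check.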
- Memory hierarchy
- The CUDA memory structure consists of three kinds of spaces: per-thread local memory, per-block shared memory, and global memory shared by everything.
The shared memory of a thread block is visible to all threads in the block and has the same lifetime as the block. Global memory is accessible from all existing threads.
- In addition, there are two read-only memory spaces accessible from all threads: the constant memory space and the texture memory space. Each is optimized for a different usage. Texture memory also offers data filtering and several different addressing modes for certain specific data formats.
- The global, constant, and texture memory spaces are all persistent across kernel launches within the same application (when one kernel finishes and another kernel is launched, the same unchanged contents can still be accessed).
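The persistence of global memory and the read-only role of constant memory can be sketched with two kernels launched one after the other. The names scale, fill, applyScale, and persistenceDemo are hypothetical, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__constant__ float scale;   // constant memory: read-only inside kernels

// First kernel: writes values into global memory.
__global__ void fill(float* data)
{
    data[threadIdx.x] = (float)threadIdx.x;
}

// Second kernel: reads what the first kernel left in global memory.
__global__ void applyScale(float* data)
{
    data[threadIdx.x] *= scale;
}

// Host wrapper: returns element 3 after both kernels have run.
float persistenceDemo()
{
    const int n = 8;
    const float hScale = 2.0f;

    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpyToSymbol(scale, &hScale, sizeof(float));   // host -> constant memory

    fill<<<1, n>>>(d);         // kernel 1 writes 0, 1, 2, ...
    applyScale<<<1, n>>>(d);   // kernel 2 still sees those values

    float h[n];
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h[3];               // 3 * 2 = 6
}

int main()
{
    printf("h[3] = %f\n", persistenceDemo());
    return 0;
}
```

Both launches go to the default stream, so the second kernel starts only after the first has finished.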
- Host (in short, the CPU) and device (in short, the GPU)
- A CUDA program executes in two different places, the host (CPU) and the device (GPU). The figure below shows an example of how execution proceeds.

- Accordingly, memory also lives in two places: host memory and device memory.
The program calls the CUDA runtime to make global, constant, and texture memory visible to kernels. The runtime handles allocation and deallocation of device memory as well as transfers between host memory and device memory.
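The runtime calls described above (allocation, release, and host/device transfer) can be sketched around the vecAdd kernel from Chapter 2. The wrapper name vecAddOnDevice, the input values, and N = 256 are assumptions for illustration, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 256   // assumed vector length; fits in one block (at most 512 threads)

__global__ void vecAdd(const float* A, const float* B, float* C)
{
    int i = threadIdx.x;
    C[i] = A[i] + B[i];
}

// Host wrapper: runs vecAdd on the device and returns C[probe].
float vecAddOnDevice(int probe)
{
    float hA[N], hB[N], hC[N];
    for (int i = 0; i < N; ++i) {
        hA[i] = (float)i;
        hB[i] = 2.0f * (float)i;
    }

    const size_t bytes = N * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;

    // Allocate device (global) memory.
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);

    // Host -> device transfer of the inputs.
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    vecAdd<<<1, N>>>(dA, dB, dC);

    // Device -> host transfer of the result.
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    // Release device memory.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);

    return hC[probe];
}

int main()
{
    printf("C[10] = %f\n", vecAddOnDevice(10));   // 10 + 20
    return 0;
}
```

The kernel only ever sees the device pointers dA, dB, and dC; the host arrays are never passed to it directly.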
- Software stack
- As shown in the figure, the CUDA software stack consists of the device driver, the API and its runtime, and two widely used higher-level mathematical libraries (CUFFT and CUBLAS, both described in separate manuals).
- (Device) Compute Capability: omitted
Chapter 3 GPU Implementation
- A set of SIMT (Single-Instruction, Multiple-Thread) multiprocessors with on-chip shared memory
- On the Tesla architecture, when a CUDA program on the host CPU launches a kernel, the blocks of the grid are enumerated and distributed to the multiprocessors. The threads of a block execute on a single multiprocessor. When a block terminates, a new block is launched on the vacated multiprocessor.
- A multiprocessor consists of eight scalar processor (SP) cores, two special function units (for transcendental functions), a multithreaded instruction unit, and on-chip shared memory. The multiprocessor creates, manages, and executes parallel threads in hardware with zero scheduling overhead. It also implements the __syncthreads() barrier synchronization intrinsically, as a single instruction. Fast barrier synchronization, lightweight thread creation, and zero-overhead thread scheduling together support very fine-grained parallelism, allowing a problem to be decomposed at very fine granularity by assigning one thread to each data element, such as a pixel in a 2-D image, a voxel in a 3-D volume, or a cell in a grid-based computation.
- To manage hundreds of threads running several different programs, the multiprocessor employs a new architecture called SIMT (single-instruction, multiple-thread). The multiprocessor maps each thread to one scalar processor core, and each scalar thread executes independently with its own instruction address and register state. The SIMT unit of the multiprocessor creates, schedules, and executes threads in groups of 32 parallel threads (called warps). The threads that make up a SIMT warp start together at the same program address, but are otherwise free to branch and execute independently.
- When a multiprocessor is given one or more blocks to execute, it splits them into warps that are scheduled by the SIMT unit. The way a block is split into warps is always the same: each warp contains threads with consecutive thread IDs (the first warp contains thread 0).
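Under the partitioning rule above, the warp a thread belongs to follows directly from its linear thread ID. A minimal sketch, assuming the warp size of 32 stated in the text (the names WARP_SIZE, warpIndex, and laneIndex are illustrative; device code can read the built-in variable warpSize instead of hard-coding 32):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define WARP_SIZE 32   // warp size as stated in the text

// Hypothetical helpers: which warp a linear thread ID falls into,
// and its position inside that warp.
__host__ __device__ inline unsigned int warpIndex(unsigned int linearTid)
{
    return linearTid / WARP_SIZE;
}

__host__ __device__ inline unsigned int laneIndex(unsigned int linearTid)
{
    return linearTid % WARP_SIZE;
}

int main()
{
    // Thread (5, 3) in a 16x16 block has linear ID 5 + 3*16 = 53.
    const unsigned int tid = 5 + 3 * 16;

    // 53 / 32 = 1 and 53 % 32 = 21: second warp, position 21.
    printf("thread %u: warp %u, position %u\n",
           tid, warpIndex(tid), laneIndex(tid));

    // A 16x16 block holds 256 threads, i.e. 256 / 32 = 8 warps.
    printf("warps in a 16x16 block: %u\n", (16 * 16) / WARP_SIZE);
    return 0;
}
```

Threads 0 to 31 form the first warp, threads 32 to 63 the second, and so on.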
- At each instruction issue, the SIMT unit selects a warp that is ready to execute and issues the next instruction to the active threads of that warp. A warp executes one common instruction at a time, so full efficiency is obtained when all 32 threads of a warp follow the same execution path. If threads of a warp take different execution paths because of a data-dependent conditional branch, the warp executes each branch path serially, one after another, disabling the threads that are not on that path. When all paths have completed, the threads converge back onto the same execution path. Such branch divergence occurs only within a warp. Different warps execute independently, regardless of whether they follow common or different code paths.
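The difference between a branch that splits a warp and one that does not can be sketched with two kernels. The kernel and wrapper names are hypothetical, and for branches this small a compiler may well avoid real divergence (for example through predication), so the sketch illustrates the access pattern only and makes no timing claim.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Even and odd threads of the SAME warp take different paths,
// so the warp has to execute both paths one after the other.
__global__ void divergentBranch(int* out)
{
    int i = threadIdx.x;
    if (i % 2 == 0) {
        out[i] = 1;
    } else {
        out[i] = 2;
    }
}

// The condition is identical for all threads of a warp,
// so each warp follows a single path.
__global__ void warpUniformBranch(int* out)
{
    int i = threadIdx.x;
    if ((i / warpSize) % 2 == 0) {
        out[i] = 1;
    } else {
        out[i] = 2;
    }
}

// Host wrapper: runs one of the kernels with 64 threads (two warps)
// and returns the value written by thread `index`.
int runBranchDemo(bool uniform, int index)
{
    const int n = 64;
    int h[n];
    int* d = nullptr;
    cudaMalloc(&d, sizeof(h));
    if (uniform) {
        warpUniformBranch<<<1, n>>>(d);
    } else {
        divergentBranch<<<1, n>>>(d);
    }
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h[index];
}

int main()
{
    printf("divergent:    thread 1 wrote %d\n", runBranchDemo(false, 1));
    printf("warp-uniform: thread 33 wrote %d\n", runBranchDemo(true, 33));
    return 0;
}
```

Both kernels are logically correct; only the second keeps every warp on a single execution path.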
- The SIMT architecture is similar to SIMD (Single-Instruction, Multiple-Data) vector organizations. The key difference is that SIMD exposes the SIMD width to software, whereas a SIMT instruction merely specifies the execution and branching behavior of a single thread. With SIMT, programmers can write thread-parallel code made of independent scalar threads as well as data-parallel code made of coordinated threads. For the logical correctness of a program the programmer can ignore the SIMT behavior, but for performance, writing code whose threads seldom diverge can improve performance substantially. This is analogous to the role of caches: a cache can be ignored as far as logical correctness is concerned, but the code structure has to take it into account when designing for peak performance. Vector architectures, by contrast, require the software itself to gather loads into vectors and to manage divergence manually.

- As shown in the figure, each multiprocessor has four kinds of on-chip memory:
- One set of local 32-bit registers per processor.
- A parallel data cache (shared memory), shared by all scalar processor cores. This is where the shared memory space resides.
- A read-only constant cache, shared by all scalar processor cores. It speeds up reads from the constant memory space (a read-only region of device memory).
- A read-only texture cache, shared by all scalar processor cores. It speeds up reads from the texture memory space (a read-only region of device memory). Each multiprocessor accesses the texture cache through a "texture unit", which implements the various addressing modes and data filtering.
- The local and global memory spaces are read-write regions and are not cached.
- How many blocks a multiprocessor can process depends on the number of registers required per thread and the amount of shared memory required per block, because the registers and shared memory of the multiprocessor are divided among all threads of the batch of blocks. If there are not enough registers or shared memory to run even one block, the kernel cannot be launched. A multiprocessor can execute at most eight blocks concurrently.
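The dependence on registers and shared memory can be made concrete with a rough estimate. This is a simplified sketch, not the hardware's actual allocation rule: it ignores allocation granularity and any limit on threads or warps per multiprocessor. The struct, the function name, and in particular the capacities used in main (8192 registers, 16384 bytes of shared memory) are assumptions for illustration; only the limit of eight blocks comes from the text.

```cuda
#include <cstdio>

// Assumed per-multiprocessor capacities (illustrative, not from the text).
struct MultiprocessorLimits {
    unsigned int registers;        // total registers
    unsigned int sharedMemBytes;   // total shared memory in bytes
    unsigned int maxBlocks;        // maximum concurrent blocks (8 in the text)
};

// Hypothetical estimate of how many blocks fit on one multiprocessor.
// A result of 0 means that not even one block fits, so the kernel cannot run.
unsigned int residentBlocks(const MultiprocessorLimits& lim,
                            unsigned int regsPerThread,
                            unsigned int threadsPerBlock,
                            unsigned int sharedBytesPerBlock)
{
    unsigned int blocks = lim.maxBlocks;

    const unsigned int regsPerBlock = regsPerThread * threadsPerBlock;
    if (regsPerBlock > 0 && lim.registers / regsPerBlock < blocks)
        blocks = lim.registers / regsPerBlock;

    if (sharedBytesPerBlock > 0 &&
        lim.sharedMemBytes / sharedBytesPerBlock < blocks)
        blocks = lim.sharedMemBytes / sharedBytesPerBlock;

    return blocks;
}

int main()
{
    const MultiprocessorLimits lim = {8192, 16384, 8};   // assumed values

    // 10 registers x 256 threads = 2560 registers per block -> 3 blocks;
    // 4096 bytes of shared memory per block -> 4 blocks; minimum is 3.
    printf("resident blocks: %u\n", residentBlocks(lim, 10, 256, 4096));

    // 40 registers x 256 threads = 10240 > 8192 -> 0 blocks.
    printf("resident blocks: %u\n", residentBlocks(lim, 40, 256, 4096));
    return 0;
}
```

The smallest of the three limits decides, so a kernel that is light on registers can still be held back by its shared memory use, and vice versa.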
- If non-atomic instructions executed by one warp write to the same location in global or shared memory, the number of serialized writes that occur and the order in which they occur are undefined, but one of the writes is guaranteed to succeed. If atomic instructions read, modify, or write the same location in global memory at the same time, each operation does take place and they are all serialized, but the order in which they occur is undefined.
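The two cases can be contrasted in a short sketch, assuming a device that supports atomic functions on global memory. The kernel and wrapper names are hypothetical, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Atomic case: every increment takes place; only the order is undefined.
__global__ void countAtomic(int* counter)
{
    atomicAdd(counter, 1);
}

// Non-atomic case: all threads of one warp write to the same location.
// One of the writes succeeds, but which one is undefined.
__global__ void racyWrite(int* slot)
{
    *slot = (int)threadIdx.x;
}

// Returns the counter after 4 blocks of 64 threads have incremented it.
int atomicCountDemo()
{
    int h = -1;
    int* d = nullptr;
    cudaMalloc(&d, sizeof(int));
    cudaMemset(d, 0, sizeof(int));
    countAtomic<<<4, 64>>>(d);
    cudaMemcpy(&h, d, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;   // 4 * 64 = 256
}

// Returns whatever value survived the conflicting writes of one warp.
int racyWriteDemo()
{
    int h = -1;
    int* d = nullptr;
    cudaMalloc(&d, sizeof(int));
    cudaMemset(d, 0xFF, sizeof(int));   // initialise to -1
    racyWrite<<<1, 32>>>(d);
    cudaMemcpy(&h, d, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h;   // some thread index in [0, 31]
}

int main()
{
    printf("atomic counter: %d\n", atomicCountDemo());
    printf("surviving write: %d\n", racyWriteDemo());
    return 0;
}
```

The atomic result is deterministic even though the order of the increments is not, while the non-atomic result is only guaranteed to be one of the written values.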
Chapter 4 API