Contents
1 Instruction set
1.1 General Info
1.2 Predicates
1.3 Operands
1.4 Registers
1.5 Immediates
1.6 Indexes
1.7 Constants
1.8 Swizzles
1.9 Modifier and dest data format
1.10 0x00000000 - 0x08000000
1.11 0x08000000 - 0x10000000
1.12 0x10000000 - 0x18000000
1.13 0x18000000 - 0x20000000
1.14 0x20000000 - 0x28000000
1.14.1 Instructions
1.14.2 Encoding
1.14.3 Notes
1.14.4 mad.f32 i0, r0, i0, i0
1.14.5 mad.f32 r0.xy, r0.xx, i0.xx, i0.xy
1.14.6 mad.f32 i0, r0.x, i0.x, i0.x
1.14.7 mad.f32 r0.x, r0.x, i0.x, i0.x
1.14.8 dot.f32 i0, r0.xxx, i0.xxx
1.14.9 dot.f32 r0.x, r0.xxx, i0.xxx
1.14.10 dot.f32 i0, r0.xxx, r0.xxx
1.14.11 dot.f32 r0.x, r0.xxx, r0.xxx
1.14.12 add.f32 i0, r0, i0
1.14.13 add.f32 r0.xy, r0.xx, i0.xx
1.14.14 add.f32 i0, r0.x, i0.x
1.14.15 add.f32 r0.x, r0.x, i0.x
1.14.16 mul.f32 i0, r0, i0
1.14.17 mul.f32 r0.xy, r0.xx, i0.xx
1.14.18 mul.f32 i0, r0.x, i0.x
1.14.19 mul.f32 r0.x, r0.x, i0.x
1.14.20 subflr.f32 i0, r0.x, i0.x
1.14.21 subflr.f32 r0.x, r0.x, i0.x
1.14.22 exp.f32 i0, r0.x
1.14.23 exp.f32 r0.x, r0.x
1.14.24 mov.f32 i0, r0
1.14.25 mov.f32 r0.xy, r0.xx
1.14.26 log.f32 i0, r0.x
1.14.27 log.f32 r0.x, r0.x
1.14.28 rsq.f32 i0, r0.x
1.14.29 rsq.f32 r0.x, r0.x
1.14.30 rcp.f32 i0, r0.x
1.14.31 rcp.f32 r0.x, r0.x
1.14.32 group 0
1.14.33 group 1
1.15 0x28000000 - 0x30000000
1.16 0x30000000 - 0x38000000
1.17 0x38000000 - 0x40000000
1.18 0x40000000 - 0x48000000
1.19 0x48000000 - 0x50000000
1.20 0x50000000 - 0x58000000
1.21 0x58000000 - 0x60000000
1.22 0x60000000 - 0x68000000
1.23 0x68000000 - 0x70000000
1.24 0x70000000 - 0x78000000
1.25 0x78000000 - 0x80000000
1.26 0x80000000 - 0x88000000
1.27 0x88000000 - 0x90000000
1.28 0x90000000 - 0x98000000
1.29 0x98000000 - 0xA0000000
1.30 0xA0000000 - 0xA8000000
1.31 0xA8000000 - 0xB0000000
1.32 0xB0000000 - 0xB8000000
1.33 0xB8000000 - 0xC0000000
1.34 0xC0000000 - 0xC8000000
1.35 0xC8000000 - 0xD0000000
1.36 0xD0000000 - 0xD8000000
1.37 0xD8000000 - 0xE0000000
1.38 0xE0000000 - 0xE8000000
1.39 0xE8000000 - 0xF0000000
1.40 0xF0000000 - 0xF8000000
1.41 0xF8000000 - 0xFF000000
Instruction set
General Info
Instructions appear to be 8 bytes long. Roughly speaking, the first 4 bytes contain the opcode and addressing mode, and the second 4 bytes contain the operand encoding.
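The group ranges used as section titles in this reference (0x00000000 - 0x08000000, 0x08000000 - 0x10000000, and so on) step by 0x08000000, which suggests that the top 5 bits of the higher 4 bytes (opcode1) select the instruction group. A minimal sketch of that split, assuming the instruction is held as a 64-bit integer whose upper 32 bits are the higher 4 bytes (the byte order in memory is not confirmed here, and the names split_instruction and group_range are illustrative only):

```python
def split_instruction(instr):
    """Split a 64-bit instruction into (higher 4 bytes, lower 4 bytes)."""
    if not 0 <= instr < (1 << 64):
        raise ValueError("instruction must fit in 64 bits")
    return instr >> 32, instr & 0xFFFFFFFF


def group_range(high):
    """Return the (start, end) range of the group that `high` falls into.

    Assumption: the group is selected by the top 5 bits (opcode1).
    """
    opcode1 = (high >> 27) & 0x1F
    start = opcode1 << 27
    return start, start + 0x08000000


high, low = split_instruction(0x2000000012345678)
print(hex(high), hex(low))
print([hex(v) for v in group_range(high)])
```

The example value 0x2000000012345678 is made up for illustration and is not a known valid instruction.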
Bit encoding used in this reference:
value   meaning
0       bit clear
1       bit set
x       don't care
?       unknown
see reference
Predicates
Predicates are not fully understood yet, but they are used to mask execution of certain instructions.
Notation is the following:
<predicate> <instruction>
For example:
!p0 mad.f32
To keep the number of examples small, examples are listed without predicates.
All predicates are assumed to apply to every instruction in a group unless stated otherwise.
Operands
Different types of operands exist. They are described in the following sections.
Instructions may have up to four operands specified.
In this documentation they will be encoded as:
<op0> <op1> <op2> <op3>
Operand 0
Destination operand 0 can be encoded in different ways.
Usually the following fields are used to encode it:
alt_opt0 - alters opt0. This bit can be combined with opt0 to produce the following modes for op0:
alt_opt0   opt0   value         details
1          0 0    sa
1          0 1    {}            op0 encodes CNST6. Applicable only with swizzles.
1          1 0    index<N>      op0 encodes IDX6.
1          1 1    index2 mode   op0 encodes RIO6.
or selects other modes for encoding op0 if specified in alt_opt0.
or with Register Index Offset RIO6 using index1 mode if specified in opt0.
or with Constant CNST6 if specified in alt_opt0.
or with Index IDX6 if specified in alt_opt0.
or with Register Index Offset RIO6 using index2 mode if specified in alt_opt0.
Operand N
Source operand <N> can be encoded in different ways.
Usually the following fields are used to encode it:
alt_opt<N> - alters opt<N>. This bit can be combined with opt<N> to produce the following modes for op<N>:
alt_opt<N>   opt<N>   value         details
1            0 0      index1 mode   op<N> encodes RIO6.
1            0 1      {}            op<N> encodes CNST6. Applicable only with swizzles.
1            1 0      immediate     op<N> encodes IMM6.
1            1 1      index2 mode   op<N> encodes RIO6.
or selects other modes for encoding op<N> if specified in alt_opt<N>.
or with Register Index Offset RIO6 using index1 mode if specified in alt_opt<N>.
or with CNST6 if specified in alt_opt<N>.
or IMM6 if specified in alt_opt<N>.
or with Register Index Offset RIO6 using index2 mode if specified in alt_opt<N>.
Registers
pa - primary attribute register. 32 bit long.
sa - secondary attribute register. 32 bit long.
o - output register. 32 bit long.
r - temporary register. 32 bit long.
i - internal register. 128 bit long.
Register Selector RS2
This encoding uses 2 bits to encode register type.
selector is encoded as:
1   0   meaning
0   0   r
0   1   o
1   0   pa
1   1   sa
Note that internal registers are not encoded here; they use the reserved values of Register R6.
Register Selector Indexable RSI2
This encoding uses 2 bits to encode register type.
selector is encoded as:
1   0   meaning
0   0   r
0   1   o
1   0   pa
1   1   index<N> mode
When index<N> mode is used, another field must encode the index expression as a Register Index Offset RIO6.
The index expression is built as:
<reg>[index1 * 2 + <offset>]
Example:
r[index1 * 2 + 8]
Register R6
This encoding uses 6 bits to encode register index.
register is encoded as:
5     4     3     2     1     0     index
0     0     0     0     0     0     0
0     0     0     0     0     1     2
...   ...   ...   ...   ...   ...   ...
1     1     1     0     1     0     116
1     1     1     0     1     1     118
1     1     1     1     0     0     i0 (reserved)
1     1     1     1     0     1     i1 (reserved)
1     1     1     1     1     0     i2 (reserved)
1     1     1     1     1     1     i3 (reserved)
index is calculated as: value * 2
Register expression is built as:
<reg><index>
Example:
sa68
Specific type of register can be selected with Register Selector RS2
For destination operand op0 specific type of register can be selected with Register Selector Indexable RSI2
Last 4 values are reserved for internal registers i0, i1, i2, i3
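A minimal sketch of how a register expression might be assembled from an RS2 selector and an R6 value, following the tables above. The function name format_register is illustrative, and it is an assumption that the selector is ignored for the four reserved values:

```python
RS2_NAMES = ["r", "o", "pa", "sa"]  # Register Selector RS2: 00, 01, 10, 11


def format_register(rs2, r6):
    """Build a register expression from an RS2 selector and an R6 value."""
    if not 0 <= rs2 <= 3:
        raise ValueError("RS2 selector is 2 bits")
    if not 0 <= r6 <= 63:
        raise ValueError("R6 value is 6 bits")
    if r6 >= 60:
        # The last four R6 values are reserved for internal registers i0..i3.
        # Assumption: the selector does not matter in this case.
        return "i%d" % (r6 - 60)
    # index is calculated as: value * 2
    return "%s%d" % (RS2_NAMES[rs2], r6 * 2)


print(format_register(3, 34))
print(format_register(0, 61))
```

With selector 11 and value 34 this produces sa68, matching the example above.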
Register RI2
This encoding uses 2 bits to encode internal register.
1   0   value
0   0   i0
0   1   i1
1   0   i2
1   1   i3
Register Index Offset RIO6
rt is encoded as Register Selector RS2
offset is calculated as: value * 2
offset is encoded as:
3     2     1     0     offset
0     0     0     0     0
0     0     0     1     2
...   ...   ...   ...   ...
1     1     1     0     28
1     1     1     1     30
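A sketch of decoding an RIO6 value into an index expression of the form <reg>[index1 * 2 + <offset>]. The offset table above uses bits 3 to 0, so this sketch assumes that rt occupies the remaining bits 5 and 4; that placement, and the name format_rio6, are assumptions rather than confirmed facts:

```python
RS2_NAMES = ["r", "o", "pa", "sa"]  # Register Selector RS2


def format_rio6(rio6, index_name="index1"):
    """Format a 6-bit RIO6 value as an index expression."""
    if not 0 <= rio6 <= 63:
        raise ValueError("RIO6 value is 6 bits")
    rt = (rio6 >> 4) & 0x3      # assumed position of the RS2 selector
    offset = (rio6 & 0xF) * 2   # offset is calculated as: value * 2
    return "%s[%s * 2 + %d]" % (RS2_NAMES[rt], index_name, offset)


print(format_rio6(0b000100))
print(format_rio6(0b111111, "index2"))
```

The first call reproduces the r[index1 * 2 + 8] example from the RSI2 section.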
Immediates
Immediate IMM6
Some operands may act as immediate values which are encoded using 6 bits.
Indexes
It is not yet clear what the index<N> expression means.
It can be used in 2 places:
Alternative mode for operand 0
Index IDX6
Uses 6 bits to encode the index<N> expression.
Index is calculated as value * 2. Max index is 126.
Example:
mad.f32 index24, r0, r0, r0
Constants
Constant CNST6
Some operands may act as constant values which are encoded using 6 bits.
Constants are taken from table below.
Constants differ in 32 and 16 bit mode.
32 bit mode
bank 1 is used for channel 1 of op0
for other operands - only bank 0 is used for each channel
f32 mode - bank0:
operand
hex
value
comment
0x00
0x00000000
0.0
0x01
0x00000000
0.0
0x02
0x3F800000
1.0
0x03
0x3F800000
1.0
0x04
0x40000000
2.0
2 ^ 1
0x05
0x41000000
8.0
2 ^ 3
0x06
0x42000000
32.0
2 ^ 5
0x07
0x43000000
128.0
2 ^ 7
0x08
0x44000000
512.0
2 ^ 9
0x09
0x45000000
2048.0
2 ^ 11
0x0A
0x46000000
8192.0
2 ^ 13
0x0B
0x47000000
32768.0
2 ^ 15
0x0C
0x3F000000
0.5
2 ^ -1
0x0D
0x3E000000
0.125
2 ^ -3
0x0E
0x3D000000
0.03125
2 ^ -5
0x0F
0x3C000000
0.0078125
2 ^ -7
0x10
0x3B000000
0.001953125
2 ^ -9
0x11
0x3A000000
0.00048828125
2 ^ -11
0x12
0x39000000
0.00012207031
2 ^ -13
0x13
0x38000000
0.000030517578
2 ^ -15
0x14
0x402DF854
2.7182817
e
0x15
0x3FB504F3
1.4142135
sqrt(2)
0x16
0x40490FDB
3.1415927
pi
0x17
0x3F490FDB
0.78539819
pi / 4
0x18
0x40C90FDB
6.2831855
pi * 2
0x19
0x41C90FDB
25.132742
pi * 8
0x1A
0x37800000
0.000015258789
1 / 2 ^ 16
0x1B
0x37800080
0.000015259022
1 / (2 ^ 16 - 1)
0x1C
0x35D00D01
0.0000015500992
sin(0.5) - taylor 4th term
0x1D
0x39888889
0.00026041668
sin(0.5) - taylor 3rd term
0x1E
0x3CAAAAAB
0.020833334
sin(0.5) - taylor 2nd term
0x1F
0x3F000000
0.5
sin(0.5) - taylor 1st term
0x20
0x00000000
0.0
0x21
0x00000000
0.0
0x22
0x3C003C00
0.0078268051
0x23
0x44004000
513.0
0x24
0x54005000
2.204392e12
0x25
0x64006000
9.4724031e21
0x26
0x74007000
4.0703468e31
0x27
0x34003800
1.1941302e-7
0x28
0x24002800
2.7789457e-17
0x29
0x14001800
6.4670817e-27
0x2A
0x04000800
1.5050001e-36
0x2B
0x35E2416F
0.0000016857356
0x2C
0x39A83DA8
0.00032089395
0x2D
0x3E484248
0.19556534
0x2E
0x4A484648
3281298.0
0x2F
0x00000000
0.0
0x30
0x00000000
0.0
0x31
0x30002555
4.6619181e-10
0x32
0x00000000
0.0
0x33
0x00000000
0.0
0x34
0x00000000
0.0
0x35
0x00000000
0.0
0x36
0x00000000
0.0
0x37
0x00000000
0.0
0x38
0xFFFFFFFF
-1.#QNB
-nan
0x39
0xFFFFFFFF
-1.#QNB
-nan
0x3A
0xFFFFFFFF
-1.#QNB
-nan
0x3B
0xFFFFFFFF
-1.#QNB
-nan
0x3C
0x7FFF7FFF
1.#QNB
+nan
0x3D
0x7FFF7FFF
1.#QNB
+nan
0x3E
0x7FFF7FFF
1.#QNB
+nan
0x3F
0x7FFF7FFF
1.#QNB
+nan
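The hex and value columns of the f32 tables can be cross-checked by reinterpreting each 32-bit word as an IEEE 754 single-precision float. A small sketch using only the Python standard library (the helper name f32_from_bits is illustrative):

```python
import math
import struct


def f32_from_bits(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 binary32 value."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]


# A few rows taken from the bank0 table above: (operand, hex).
rows = [(0x02, 0x3F800000), (0x05, 0x41000000), (0x16, 0x40490FDB)]
for operand, bits in rows:
    print("0x%02X 0x%08X %r" % (operand, bits, f32_from_bits(bits)))

# 0x16 is commented as pi in the table.
print(math.isclose(f32_from_bits(0x40490FDB), math.pi, rel_tol=1e-6))
```

Values print with full double precision here, so they carry more digits than the rounded values in the table.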
f32 mode - bank1:
operand
hex
value
comment
0x00
0x00000000
0.0
0x01
0x3F800000
1.0
0x02
0x00000000
0.0
0x03
0x3F800000
1.0
0x04
0x40800000
4.0
2 ^ 2
0x05
0x41800000
16.0
2 ^ 4
0x06
0x42800000
64.0
2 ^ 6
0x07
0x43800000
256.0
2 ^ 8
0x08
0x44800000
1024.0
2 ^ 10
0x09
0x45800000
4096.0
2 ^ 12
0x0A
0x46800000
16384.0
2 ^ 14
0x0B
0x47800000
65536.0
2 ^ 16
0x0C
0x3E800000
0.25
2 ^ -2
0x0D
0x3D800000
0.0625
2 ^ -4
0x0E
0x3C800000
0.015625
2 ^ -6
0x0F
0x3B800000
0.00390625
2 ^ -8
0x10
0x3A800000
0.0009765625
2 ^ -10
0x11
0x39800000
0.00024414062
2 ^ -12
0x12
0x38800000
0.000061035156
2 ^ -14
0x13
0x37800000
0.000015258789
2 ^ -16
0x14
0x3EBC5AB2
0.36787945
e ^ -1
0x15
0x3F3504F3
0.70710677
sqrt(2) ^ -1
0x16
0x3FC90FDB
1.5707964
pi / 2
0x17
0x3EC90FDB
0.39269909
pi / 8
0x18
0x41490FDB
12.566371
pi * 4
0x19
0x00000000
0.0
0x1A
0x38000000
0.000030517578
1 / 2 ^ 15
0x1B
0x38000100
0.000030518509
1 / (2 ^ 15 - 1)
0x1C
0x37B60B61
0.000021701389
cos(0.5) - taylor 4th term
0x1D
0x3B2AAAAB
0.0026041667
cos(0.5) - taylor 3rd term
0x1E
0x3E000000
0.125
cos(0.5) - taylor 2nd term
0x1F
0x3F800000
1.0
cos(0.5) - taylor 1st term
0x20
0x3C000000
0.0078125
1 / 2 ^ 7
0x21
0x00000000
0.0
0x22
0x3C003C00
0.0078268051
0x23
0x4C004800
3.362816e7
0x24
0x5C005800
1.4450222e17
0x25
0x6C006800
6.2093452e26
0x26
0x00007800
4.3048e-41
0x27
0x2C003000
1.8216539e-12
0x28
0x1C002000
4.2393006e-22
0x29
0x0C001000
9.8655761e-32
0x2A
0x00000000
0.0
0x2B
0x00000000
0.0
0x2C
0x00000000
0.0
0x2D
0x36483A48
0.0000029836247
0x2E
0x00004E48
2.8082e-41
0x2F
0x00000000
0.0
0x30
0x19550C44
1.1014319e-23
0x31
0x3C003800
0.0078258514
0x32
0x00000000
0.0
0x33
0x00000000
0.0
0x34
0x00000000
0.0
0x35
0x00000000
0.0
0x36
0x00000000
0.0
0x37
0x00000000
0.0
0x38
0x00000000
0.0
0x39
0x00000000
0.0
0x3A
0x00000000
0.0
0x3B
0x00000000
0.0
0x3C
0x00000000
0.0
0x3D
0x00000000
0.0
0x3E
0x00000000
0.0
0x3F
0x00000000
0.0
16 bit mode
Table for 16 bit mode does not have accurate values.
bank 1, bank 2 and bank 3 are used for channel 1, channel 2 and channel 3 of op0
for other operands - only bank 0 is used for each channel
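The f16 values in the tables of this section appear to match the 32 bit mode words reinterpreted as two packed half-precision values: for the entries checked, the low half of an f32 bank0 word gives the f16 bank 0 value and the high half gives the f16 bank 1 value, and f32 bank1 words give f16 banks 2 and 3 in the same way. This is an observation from the tables, not a confirmed description of the hardware. A sketch of the reinterpretation (the name f16_pair is illustrative):

```python
import struct


def f16_pair(bits):
    """Return (low half, high half) of a 32-bit word as binary16 values."""
    # Little-endian byte order puts the low 16 bits first.
    low, high = struct.unpack("<2e", bits.to_bytes(4, "little"))
    return low, high


# f32 mode bank0, operands 0x23 and 0x14.
print(f16_pair(0x44004000))
print(f16_pair(0x402DF854))
# f32 mode bank1, operand 0x23.
print(f16_pair(0x4C004800))
```

The struct format character e requires Python 3.6 or newer.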
f16 mode - bank 0:
operand
value
comment
0x00
0.0
0x01
0.0
0x02
0.0
0x03
0.0
0x04
0.0
0x05
0.0
0x06
0.0
0x07
0.0
0x08
0.0
0x09
0.0
0x0A
0.0
0x0B
0.0
0x0C
0.0
0x0D
0.0
0x0E
0.0
0x0F
0.0
0x10
0.0
0x11
0.0
0x12
0.0
0x13
0.0
0x14
-35456.0
0x15
7.5519e-05
0x16
0.00047946
0x17
0.00047946
0x18
0.00047946
0x19
0.00047946
0x1A
0.0
0x1B
7.6294e-06
0x1C
0.00030541
0x1D
-0.0001384
0x1E
-0.052094
0x1F
0.0
0x20
0.0
0x21
0.0
0x22
1.0
0x23
2.0
2 ^ 1
0x24
32.0
2 ^ 5
0x25
512.0
2 ^ 9
0x26
8192.0
2 ^ 13
0x27
0.5
2 ^ -1
0x28
0.03125
2 ^ -5
0x29
0.0019531
2 ^ -9
0x2A
0.00012207
2 ^ -13
0x2B
2.7168
e
0x2C
1.4141
sqrt(2)
0x2D
3.1406
pi
0x2E
6.2813
pi * 2
0x2F
0.0
0x30
0.0
0x31
0.020828
0x32
0.0
0x33
0.0
0x34
0.0
0x35
0.0
0x36
0.0
0x37
0.0
0x38
-1.#QNB
0xFFFFFFFF
0x39
-1.#QNB
0xFFFFFFFF
0x3A
-1.#QNB
0xFFFFFFFF
0x3B
-1.#QNB
0xFFFFFFFF
0x3C
1.#QNB
0x7FFF7FFF
0x3D
1.#QNB
0x7FFF7FFF
0x3E
1.#QNB
0x7FFF7FFF
0x3F
1.#QNB
0x7FFF7FFF
f16 mode - bank 1:
operand
value
comment
0x00
0.0
0x01
0.0
0x02
1.875
0x03
1.875
0x04
2.0
0x05
2.5
0x06
3.0
0x07
3.5
0x08
4.0
0x09
5.0
0x0A
6.0
0x0B
7.0
0x0C
1.75
0x0D
1.5
0x0E
1.25
0x0F
1.0
0x10
0.875
0x11
0.75
0x12
0.625
0x13
0.5
0x14
2.0879
0x15
1.9268
0x16
2.1426
0x17
1.8213
0x18
2.3926
0x19
2.8926
0x1A
0.46875
0x1B
0.46875
0x1C
0.36328
0x1D
0.69141
0x1E
1.166
0x1F
1.75
0x20
0.0
0x21
0.0
0x22
1.0
0x23
4.0
0x24
64.0
0x25
1024.0
0x26
16384.0
0x27
0.25
0x28
0.015625
0x29
0.00097656
0x2A
6.1035e-05
0x2B
0.36768
0x2C
0.70703
0x2D
1.5703
0x2E
12.563
0x2F
0.0
0x30
0.0
0x31
0.125
0x32
0.0
0x33
0.0
0x34
0.0
0x35
0.0
0x36
0.0
0x37
0.0
0x38
-1.#QNB
0xFFFFFFFF
0x39
-1.#QNB
0xFFFFFFFF
0x3A
-1.#QNB
0xFFFFFFFF
0x3B
-1.#QNB
0xFFFFFFFF
0x3C
1.#QNB
0x7FFF7FFF
0x3D
1.#QNB
0x7FFF7FFF
0x3E
1.#QNB
0x7FFF7FFF
0x3F
1.#QNB
0x7FFF7FFF
f16 mode - bank 2:
operand
value
comment
0x00
0.0
0x01
0.0
0x02
0.0
0x03
0.0
0x04
0.0
0x05
0.0
0x06
0.0
0x07
0.0
0x08
0.0
0x09
0.0
0x0A
0.0
0x0B
0.0
0x0C
0.0
0x0D
0.0
0x0E
0.0
0x0F
0.0
0x10
0.0
0x11
0.0
0x12
0.0
0x13
0.0
0x14
214.25
0x15
7.5519e-05
0x16
0.00047946
0x17
0.00047946
0x18
0.00047946
0x19
0.0
0x1A
0.0
0x1B
1.5259e-05
0x1C
0.00022519
0x1D
-0.052094
0x1E
0.0
0x1F
0.0
0x20
0.0
0x21
0.0
0x22
1.0
0x23
8.0
0x24
128.0
0x25
2048.0
0x26
32768.0
0x27
0.125
0x28
0.0078125
0x29
0.00048828
0x2A
0.0
0x2B
0.0
0x2C
0.0
0x2D
0.78516
0x2E
25.125
0x2F
0.0
0x30
0.00026035
0x31
0.5
0x32
0.0
0x33
0.0
0x34
0.0
0x35
0.0
0x36
0.0
0x37
0.0
0x38
0.0
0x39
0.0
0x3A
0.0
0x3B
0.0
0x3C
0.0
0x3D
0.0
0x3E
0.0
0x3F
0.0
f16 mode - bank 3:
operand
value
comment
0x00
0.0
0x01
1.875
0x02
0.0
0x03
1.875
0x04
2.25
0x05
2.75
0x06
3.25
0x07
3.75
0x08
4.5
0x09
5.5
0x0A
6.5
0x0B
7.5
0x0C
1.625
0x0D
1.375
0x0E
1.125
0x0F
0.9375
0x10
0.8125
0x11
0.6875
0x12
0.5625
0x13
0.46875
0x14
1.6836
0x15
1.8018
0x16
1.9463
0x17
1.6963
0x18
2.6426
0x19
0.0
0x1A
0.5
0x1B
0.5
0x1C
0.48193
0x1D
0.89551
0x1E
1.5
0x1F
1.875
0x20
1.0
0x21
0.0
0x22
1.0
0x23
16.0
0x24
256.0
0x25
4096.0
0x26
0.0
0x27
0.0625
0x28
0.0039063
0x29
0.00024414
0x2A
0.0
0x2B
0.0
0x2C
0.0
0x2D
0.39258
0x2E
0.0
0x2F
0.0
0x30
0.0026035
0x31
1.0
0x32
0.0
0x33
0.0
0x34
0.0
0x35
0.0
0x36
0.0
0x37
0.0
0x38
0.0
0x39
0.0
0x3A
0.0
0x3B
0.0
0x3C
0.0
0x3D
0.0
0x3E
0.0
0x3F
0.0
Swizzles
Swizzle notation
There are 2 notations:
When only some channels hold constants, text notation is used:
mul.f32 r0.xyzw, r0.h1xx, r0.xxxx
When all channels hold constants, constant notation is used:
mul.f32 r0.xyzw, {0.5, 1, 1, 0.5}, r0.xxxx
A channel that is masked in text notation is marked as -
mul.f32 r0.-y-w, r0.-x-x, r0.-x-x
A channel that is masked in constant notation is replaced with zero:
mul.f32 r0.-y-w, {0, 1, 0, 0.5}, r0.-x-x
Register Swizzle RSWZ2
This encoding uses 2 bits to encode the swizzle.
The combinations are usually additionally controlled by 1 or more bits called swz_alt_op.
Unlike RSWZ3, this type of swizzling does not allow precise control of each channel.
Usually there is a predefined table of swizzles.
Swizzle expression is built as:
<reg><index>.<swizzle>
Example:
r22.x
Register Swizzle RSWZ3
This encoding uses 3 bits to encode the mask.
channel is encoded as:
2   1   0   text notation   constant notation
0   0   0   x               x
0   0   1   y               y
0   1   0   z               z
0   1   1   w               w
1   0   0   0               0.0
1   0   1   1               1.0
1   1   0   2               2.0
1   1   1   h               0.5
swizzle expression is built as:
<reg><index>.<swizzle>
Example:
r22.x
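A sketch of turning per-channel RSWZ3 selectors into the two notations described under Swizzle notation: text notation when at least one channel is a register component, constant notation when every channel is a constant. The name format_swizzle and the channel order (channel 0 first) are assumptions made for illustration:

```python
RSWZ3_TEXT = ["x", "y", "z", "w", "0", "1", "2", "h"]
RSWZ3_CONST = {4: 0.0, 5: 1.0, 6: 2.0, 7: 0.5}


def format_swizzle(channels):
    """Format a sequence of 3-bit RSWZ3 selectors, channel 0 first."""
    if any(not 0 <= c <= 7 for c in channels):
        raise ValueError("each RSWZ3 selector is 3 bits")
    if channels and all(c in RSWZ3_CONST for c in channels):
        # All channels are constants: use constant notation.
        return "{" + ", ".join("%g" % RSWZ3_CONST[c] for c in channels) + "}"
    # Otherwise use text notation.
    return "".join(RSWZ3_TEXT[c] for c in channels)


print("r0." + format_swizzle([7, 5, 0, 0]))
print(format_swizzle([7, 5, 5, 7]))
```

The two calls reproduce the operands r0.h1xx and {0.5, 1, 1, 0.5} from the notation examples.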
Modifier and dest data format
At the moment it is not known which of the data format fields is the destination and which is the source.
This is why the term modifier is mixed with the term dest data format.
0x00000000 - 0x08000000
Instructions
mad
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
data_format
predicate
0
0
0
0
0
7
6
5
4
3
2
1
0
swz_alt_op1
alt_opt0
abs_op1
alt_opt2
alt_opt3
x
x
x
7
6
5
4
3
2
1
0
swz_alt_op3
op3_swz
swz_alt_op2
swz_mask16
swz_mask32
x
x
7
6
5
4
3
2
1
0
swz_en
abs_op2
neg_op2
abs_op3
neg_op3
opt1
opt0
Lower 4 bytes
7
6
5
4
3
2
1
0
opt2
opt3
op0
7
6
5
4
3
2
1
0
op0
op2_swz
op1_swz
op1
Notes
x bits do not affect the instruction or its operands. They might affect something else.
It is unclear what index<N> means: are these registers or something else?
There appears to be functionality to switch the sign of the index expression.
Swizzle masking could probably move to a generic section if other instructions use the same encodings.
Fields - instruction
data_format:
predicate:
1   0   value
0   0
0   1   p0
1   0   !p0
1   1   Pn
Fields - operands
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
alt_opt0, opt0, op0
alt_opt2, opt2, op2
alt_opt3, opt3, op3
Float constants can only be used when swizzling is enabled for the particular operand. See the sections Swizzles f32 and Swizzles f16.
Constants are taken from the tables in the Constants section.
Constants differ between 32 and 16 bit mode.
Swizzle masking
Masking is controlled by control bits:
control bits: swz_en, swz_mask32, swz_mask16
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
Encoding used in masking table:
value   meaning
0       channel not selected
1       channel selected
x       channel masked
Masking table 32 bit mode:
swz_mask32   swz_en   ch0   ch1
0            0        0     0
0            1        1     0
1            0        x     1
1            1        1     1
Masking table 16 bit mode:
swz_mask16   swz_en   ch0   ch1   ch2   ch3
0            0        0     0     0     0
0            1        1     1     0     0
1            0        x     x     1     1
1            1        1     1     1     1
Swizzles f32
Swizzles of operand 1, operand 2 and operand 3 can not be precisely controlled and have predefined combinations.
Swizzles are controlled with bits:
swizzle fields: op1_swz, op2_swz, op3_swz
control bits: swz_alt_op1, swz_alt_op2, swz_alt_op3
Swizzles of operand 0 can not be controlled.
operand 0
operand 1
operand 2
operand 3
swz_alt_op1   op1_swz   value
0             0 0       xx
0             0 1       yy
0             1 0       zz
0             1 1       ww
1             0 0       xy
1             0 1       yz
1             1 0       xy
1             1 1       zw
swz_alt_op2   op2_swz   value
0             0 0       xx
0             0 1       yy
0             1 0       zz
0             1 1       ww
1             0 0       xy
1             0 1       xy
1             1 0       yy
1             1 1       wy
swz_alt_op3   op3_swz   value
0             0 0       xx
0             0 1       yy
0             1 0       zz
0             1 1       ww
1             0 0       xy
1             0 1       xz
1             1 0       xx
1             1 1       xy
Swizzles f16
Swizzles of operand 1, operand 2 and operand 3 can not be precisely controlled and have predefined combinations.
Swizzles are controlled with bits:
swizzle fields: op1_swz, op2_swz, op3_swz
control bits: swz_alt_op1, swz_alt_op2, swz_alt_op3
Swizzles of operand 0 can not be controlled.
operand 0
operand 1
operand 2
operand 3
swz_alt_op1   op1_swz   value
0             0 0       xxxx
0             0 1       yyyy
0             1 0       zzzz
0             1 1       wwww
1             0 0       xyzw
1             0 1       yzxw
1             1 0       xyww
1             1 1       zwxy
swz_alt_op2   op2_swz   value
0             0 0       xxxx
0             0 1       yyyy
0             1 0       zzzz
0             1 1       wwww
1             0 0       xyzw
1             0 1       xyyz
1             1 0       yyww
1             1 1       wyzw
swz_alt_op3   op3_swz   value
0             0 0       xxxx
0             0 1       yyyy
0             1 0       zzzz
0             1 1       wwww
1             0 0       xyzw
1             0 1       xzww
1             1 0       xxyz
1             1 1       xyzz
Examples
mad.f32 r0, r0, r0, r0
mad.f16 r0, r0, r0, r0
0x08000000 - 0x10000000
Instructions
mul.f32, add.f32, frc.f32, dsx.f32, dsy.f32, min.f32, max.f32, dot.f32
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
0
0
1
7
6
5
4
3
2
1
0
op1_swz_c3x
alt_opt0
op1_swz_c30
alt_opt1
alt_opt2
x
x
7
6
5
4
3
2
1
0
swz_alt_op2
op2_swz
swz_mask3
swz_mask2
swz_mask1
x
7
6
5
4
3
2
1
0
swz_en
abs_op1
neg_op1
abs_op2
op1_swz_c2x
opt0
Lower 4 bytes
7
6
5
4
3
2
1
0
opt1
opt2
op0
7
6
5
4
3
2
1
0
op0
op1_swz_c20
op1_swz_c1
op1_swz_c0
7
6
5
4
3
2
1
0
op1_swz_c0
opcode2
op1
Notes
Fields - instruction
predicate:
2   1   0   value
0   0   0
0   0   1   p0
0   1   0   p1
0   1   1   p2
1   0   0   !p0
1   0   1   !p1
1   1   0   !p2
1   1   1   Pn
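A sketch of applying the 3-bit predicate table to an instruction string, using the <predicate> <instruction> notation from the Predicates section. The field value is taken as already extracted; the name with_predicate is illustrative, and treating value 000 as "no predicate" is an assumption based on its empty table entry:

```python
# Index is the 3-bit predicate field value (bits 2..0).
PREDICATES = ["", "p0", "p1", "p2", "!p0", "!p1", "!p2", "Pn"]


def with_predicate(predicate, instruction):
    """Prefix an instruction with its predicate, if any."""
    if not 0 <= predicate <= 7:
        raise ValueError("predicate field is 3 bits")
    prefix = PREDICATES[predicate]
    return prefix + " " + instruction if prefix else instruction


print(with_predicate(0b100, "mul.f32 r0, r0, r0"))
print(with_predicate(0b000, "mul.f32 r0, r0, r0"))
```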
opcode2:
6   5   4   value
0   0   0   mul.f32
0   0   1   add.f32
0   1   0   frc.f32
0   1   1   dsx.f32
1   0   0   dsy.f32
1   0   1   min.f32
1   1   0   max.f32
1   1   1   dot.f32
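The opcode2 table maps directly onto a lookup list. A sketch, assuming the 3-bit field value has already been extracted from the instruction (the name mnemonic and the fmt parameter are illustrative; the f16 suffix reflects the identically ordered table of the 0x10000000 - 0x18000000 group):

```python
# Index is the 3-bit opcode2 field value.
OPCODE2_NAMES = ["mul", "add", "frc", "dsx", "dsy", "min", "max", "dot"]


def mnemonic(opcode2, fmt="f32"):
    """Return the mnemonic for an opcode2 value, e.g. 'mul.f32'."""
    if not 0 <= opcode2 <= 7:
        raise ValueError("opcode2 field is 3 bits")
    if fmt not in ("f32", "f16"):
        raise ValueError("fmt must be 'f32' or 'f16'")
    return OPCODE2_NAMES[opcode2] + "." + fmt


for value in range(8):
    print(format(value, "03b"), mnemonic(value))
```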
Fields - operands
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
alt_opt0, opt0, op0
alt_opt1, opt1, op1
alt_opt2, opt2, op2
Float constants can only be used when swizzling is enabled for the particular operand. See the Swizzle masking section.
Constants are taken from the tables in the Constants section.
Constants correspond to table for 32 bit mode.
Swizzle masking
Masking is controlled by control bits:
control bits: swz_en, swz_mask1, swz_mask2, swz_mask3
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
dot.f32 instruction has explicit swizzling in operand 1 and operand 2 so masking does not apply to these operands. number of channels is always 4.
Encoding used in masking table:
value   meaning
0       channel not selected
1       channel selected
x       channel masked
Masking table:
swz_mask3   swz_mask2   swz_mask1   swz_en   ch0   ch1   ch2   ch3
0           0           0           0        0     0     0     0
0           0           0           1        1     0     0     0
0           0           1           0        x     1     0     0
0           0           1           1        1     1     0     0
0           1           0           0        x     x     1     0
0           1           0           1        1     x     1     0
0           1           1           0        x     1     1     0
0           1           1           1        1     1     1     0
1           0           0           0        x     x     x     1
1           0           0           1        1     x     x     1
1           0           1           0        x     1     x     1
1           0           1           1        1     1     x     1
1           1           0           0        x     x     1     1
1           1           0           1        1     x     1     1
1           1           1           0        x     1     1     1
1           1           1           1        1     1     1     1
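The 16 rows of the masking table follow a simple rule: a channel whose control bit is set is selected, a channel whose bit is clear but which lies below the highest set bit is masked, and channels above the highest set bit are not selected. A sketch that reproduces a table row from the four control bits (the name mask_row is illustrative; the rule is inferred from the table, not from hardware documentation):

```python
def mask_row(swz_en, swz_mask1, swz_mask2, swz_mask3):
    """Return the ch0..ch3 states as a string of '0', '1' and 'x'."""
    bits = [swz_en, swz_mask1, swz_mask2, swz_mask3]  # control bit per channel
    set_positions = [i for i, b in enumerate(bits) if b]
    highest = max(set_positions) if set_positions else -1
    row = []
    for i, b in enumerate(bits):
        if b:
            row.append("1")   # channel selected
        elif i < highest:
            row.append("x")   # channel masked
        else:
            row.append("0")   # channel not selected
    return "".join(row)


for value in range(16):
    m3, m2, m1, en = (value >> 3) & 1, (value >> 2) & 1, (value >> 1) & 1, value & 1
    print(m3, m2, m1, en, mask_row(en, m1, m2, m3))
```

The loop prints the rows in the same order as the table, which makes a line-by-line comparison straightforward.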
Swizzles - operand 0
Swizzles of operand 0 can not be controlled and have predefined combinations described below:
Each channel can be masked with control bits. Masking is described in Swizzle_masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
Swizzles - operand 1
Each channel of operand 1 can be precisely controlled with swizzle fields encoded as RSWZ3 .
op1_swz_c0, op1_swz_c1, op1_swz_c20, op1_swz_c2x, op1_swz_c30, op1_swz_c3x
Each channel can be masked with control bits. Masking is described in Swizzle_masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f32 instruction
Swizzles - operand 2
Swizzles of operand 2 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
Each channel can be masked with control bits. Masking is described in Swizzle_masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f32 instruction
swz_alt_op2   op2_swz   value
0             0 0 0     xxxx
0             0 0 1     yyyy
0             0 1 0     zzzz
0             0 1 1     wwww
0             1 0 0     xyzw
0             1 0 1     yzww
0             1 1 0     xyzz
0             1 1 1     xxyz
1             0 0 0     xyxy
1             0 0 1     xywz
1             0 1 0     zxyw
1             0 1 1     zwzw
1             1 0 0     yzxz
1             1 0 1     xxyy
1             1 1 0     xzww
1             1 1 1     xyz1
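Because operand 2 swizzles are predefined, decoding reduces to a table lookup keyed by swz_alt_op2 and the 3-bit op2_swz field. A sketch with the field values taken as already extracted (the name op2_swizzle is illustrative):

```python
# Index is (swz_alt_op2 << 3) | op2_swz, in the order of the table above.
OP2_SWIZZLES = [
    "xxxx", "yyyy", "zzzz", "wwww", "xyzw", "yzww", "xyzz", "xxyz",
    "xyxy", "xywz", "zxyw", "zwzw", "yzxz", "xxyy", "xzww", "xyz1",
]


def op2_swizzle(swz_alt_op2, op2_swz):
    """Look up the predefined operand 2 swizzle."""
    if swz_alt_op2 not in (0, 1):
        raise ValueError("swz_alt_op2 is 1 bit")
    if not 0 <= op2_swz <= 7:
        raise ValueError("op2_swz is 3 bits")
    return OP2_SWIZZLES[(swz_alt_op2 << 3) | op2_swz]


print("r0." + op2_swizzle(0, 0b100))
print("r0." + op2_swizzle(1, 0b111))
```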
Examples
mul.f32 r0, r0, r0
add.f32 r0, r0, r0
frc.f32 r0, r0, r0
dsx.f32 r0, r0, r0
dsy.f32 r0, r0, r0
min.f32 r0, r0, r0
max.f32 r0, r0, r0
dot.f32 r0, r0.xxxx, r0.xxxx
0x10000000 - 0x18000000
Instructions
mul.f16, add.f16, frc.f16, dsx.f16, dsy.f16, min.f16, max.f16, dot.f16
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
0
1
0
7
6
5
4
3
2
1
0
op1_swz_c3x
alt_opt0
op1_swz_c30
alt_opt1
alt_opt2
x
x
7
6
5
4
3
2
1
0
swz_alt_op2
op2_swz
swz_mask3
swz_mask2
swz_mask1
x
7
6
5
4
3
2
1
0
swz_en
abs_op1
neg_op1
abs_op2
op1_swz_c2x
opt0
Lower 4 bytes
7
6
5
4
3
2
1
0
opt1
opt2
op0
7
6
5
4
3
2
1
0
op0
op1_swz_c20
op1_swz_c1
op1_swz_c0
7
6
5
4
3
2
1
0
op1_swz_c0
opcode2
op1
Notes
Fields - instruction
predicate:
2   1   0   value
0   0   0
0   0   1   p0
0   1   0   p1
0   1   1   p2
1   0   0   !p0
1   0   1   !p1
1   1   0   !p2
1   1   1   Pn
opcode2:
6   5   4   value
0   0   0   mul.f16
0   0   1   add.f16
0   1   0   frc.f16
0   1   1   dsx.f16
1   0   0   dsy.f16
1   0   1   min.f16
1   1   0   max.f16
1   1   1   dot.f16
Fields - operands
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
alt_opt0, opt0, op0
alt_opt1, opt1, op1
alt_opt2, opt2, op2
Float constants can only be used when swizzling is enabled for the particular operand. See the Swizzle masking section.
Constants are taken from the tables in the Constants section.
Constants correspond to table for 16 bit mode.
Swizzle masking
Masking is controlled by control bits:
control bits: swz_en, swz_mask1, swz_mask2, swz_mask3
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
dot.f16 instruction has explicit swizzling in operand 1 and operand 2 so masking does not apply to these operands. number of channels is always 4.
Encoding used in masking table:
value   meaning
0       channel not selected
1       channel selected
x       channel masked
Masking table:
swz_mask3   swz_mask2   swz_mask1   swz_en   ch0   ch1   ch2   ch3
0           0           0           0        0     0     0     0
0           0           0           1        1     0     0     0
0           0           1           0        x     1     0     0
0           0           1           1        1     1     0     0
0           1           0           0        x     x     1     0
0           1           0           1        1     x     1     0
0           1           1           0        x     1     1     0
0           1           1           1        1     1     1     0
1           0           0           0        x     x     x     1
1           0           0           1        1     x     x     1
1           0           1           0        x     1     x     1
1           0           1           1        1     1     x     1
1           1           0           0        x     x     1     1
1           1           0           1        1     x     1     1
1           1           1           0        x     1     1     1
1           1           1           1        1     1     1     1
Swizzles - operand 0
Swizzles of operand 0 can not be controlled and have predefined combinations described below:
Each channel can be masked with control bits. Masking is described in Swizzle_masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
Swizzles - operand 1
Each channel of operand 1 can be precisely controlled with swizzle fields encoded as RSWZ3 .
op1_swz_c0, op1_swz_c1, op1_swz_c20, op1_swz_c2x, op1_swz_c30, op1_swz_c3x
Each channel can be masked with control bits. Masking is described in Swizzle_masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f16 instruction
Swizzles - operand 2
Swizzles of operand 2 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
Each channel can be masked with control bits. Masking is described in Swizzle_masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f16 instruction
swz_alt_op2   op2_swz   value
0             0 0 0     xxxx
0             0 0 1     yyyy
0             0 1 0     zzzz
0             0 1 1     wwww
0             1 0 0     xyzw
0             1 0 1     yzww
0             1 1 0     xyzz
0             1 1 1     xxyz
1             0 0 0     xyxy
1             0 0 1     xywz
1             0 1 0     zxyw
1             0 1 1     zwzw
1             1 0 0     yzxz
1             1 0 1     xxyy
1             1 1 0     xzww
1             1 1 1     xyz1
Examples
mul.f16 r0, r0, r0
add.f16 r0, r0, r0
frc.f16 r0, r0, r0
dsx.f16 r0, r0, r0
dsy.f16 r0, r0, r0
min.f16 r0, r0, r0
max.f16 r0, r0, r0
dot.f16 r0, r0.xxxx, r0.xxxx
0x18000000 - 0x20000000
Notes
Any way to use constants for op2, op3? Apart from swizzle constants.
There are two strange fields in the encodings of both dot.f32 and mad.f32.
dot.f32
Instructions
dot.f32
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
0
1
1
7
6
5
4
3
2
1
0
opcode2
c3_en
alt_opt0
alt_opt1
x
x
0
x
x
7
6
5
4
3
2
1
0
abs_op2
swz_en_strange1
swz_en_strange0
swz_mask3
swz_mask2
swz_mask1
x
x
7
6
5
4
3
2
1
0
swz_en
neg_op1
abs_op1
opt0
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
opt1
op2i
op0
7
6
5
4
3
2
1
0
op0
swz_alt_op2
op2_swz
op1_swz_c3
7
6
5
4
3
2
1
0
op1_swz_c3
op1_swz_c2
op1_swz_c1
op1_swz_c0
7
6
5
4
3
2
1
0
op1_swz_c0
op1
Fields - instruction
opcode2:
5
value
0
dot.f32
1
mad.f32
predicate:
2   1   0   value
0   0   0
0   0   1   p0
0   1   0   p1
0   1   1   p2
1   0   0   !p0
1   0   1   !p1
1   1   0   !p2
1   1   1   Pn
Fields - operands
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
alt_opt0, opt0, op0
alt_opt1, opt1, op1
Float constants can only be used when swizzling is enabled for the particular operand. See the Swizzle masking section.
Constants are taken from the tables in the Constants section.
Constants correspond to table for 32 bit mode.
Swizzle masking
Masking is controlled by control bits:
control bits: swz_en, swz_mask1, swz_mask2, swz_mask3
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
dot.f32 instruction has explicit swizzling in operand 1 and operand 2 so masking does not apply to these operands. number of channels is 3 or 4 depending on c3_en.
Encoding used in masking table:
value
meaning
0
channel not selected
1
channel selected
x
channel masked
Masking table operand 0 :
swz_mask3   swz_mask2   swz_mask1   swz_en   ch0   ch1   ch2   ch3
0           0           0           0        0     0     0     0
0           0           0           1        1     0     0     0
0           0           1           0        x     1     0     0
0           0           1           1        1     1     0     0
0           1           0           0        x     x     1     0
0           1           0           1        1     x     1     0
0           1           1           0        x     1     1     0
0           1           1           1        1     1     1     0
1           0           0           0        x     x     x     1
1           0           0           1        1     x     x     1
1           0           1           0        x     1     x     1
1           0           1           1        1     1     x     1
1           1           0           0        x     x     1     1
1           1           0           1        1     x     1     1
1           1           1           0        x     1     1     1
1           1           1           1        1     1     1     1
Swizzles - operand 0
Swizzles of operand 0 can not be controlled and have predefined combinations described below:
Each channel can be masked with control bits. Masking is described in Swizzle_masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
Swizzles - operand 1
Each channel of operand 1 can be precisely controlled with swizzle fields encoded as RSWZ3 .
op1_swz_c0, op1_swz_c1, op1_swz_c2, op1_swz_c3
Channel 3 can be enabled with bit: c3_en
Swizzles - operand 2
Swizzles of operand 2 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
Channel 3 can be enabled with bit: c3_en
3 channels:
swz_alt_op2   op2_swz   value
0             0 0 0     xxx
0             0 0 1     yyy
0             0 1 0     zzz
0             0 1 1     www
0             1 0 0     xyz
0             1 0 1     yzw
0             1 1 0     xxy
0             1 1 1     xyx
1             0 0 0     yyx
1             0 0 1     yyz
1             0 1 0     zxy
1             0 1 1     xzy
1             1 0 0     yzx
1             1 0 1     zyx
1             1 1 0     zzy
1             1 1 1     xy1
4 channels:
swz_alt_op2   op2_swz   value
0             0 0 0     xxxx
0             0 0 1     yyyy
0             0 1 0     zzzz
0             0 1 1     wwww
0             1 0 0     xyzw
0             1 0 1     yzww
0             1 1 0     xyzz
0             1 1 1     xxyz
1             0 0 0     xyxy
1             0 0 1     xywz
1             0 1 0     zxyw
1             0 1 1     zwzw
1             1 0 0     yzxz
1             1 0 1     xxyy
1             1 1 0     xzww
1             1 1 1     xyz1
Examples
dot.f32 r0, r0.xxx, i0.xxx
mad.f32
Instructions
mad.f32
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
0
1
1
7
6
5
4
3
2
1
0
swz_alt_op3_2
opcode2
alt_opt0
alt_opt1
x
1
x
x
x
7
6
5
4
3
2
1
0
abs_op2
op0_strange1
op0_strange0
swz_mask3
swz_mask2
swz_mask1
x
7
6
5
4
3
2
1
0
swz_en
neg_op1
abs_op1
neg_op3
abs_op3
swz_alt_op2_2
opt0
Lower 4 bytes
7
6
5
4
3
2
1
0
opt1
op2i
op0
7
6
5
4
3
2
1
0
op0
swz_alt_op2_x
op2_swz
swz_alt_op3_x
7
6
5
4
3
2
1
0
op3_swz
op3i
swz_alt_op1
x
7
6
5
4
3
2
1
0
op1_swz
op1
Fields - instruction
opcode2:
5
value
0
dot.f32
1
mad.f32
predicate:
2   1   0   value
0   0   0
0   0   1   p0
0   1   0   p1
0   1   1   p2
1   0   0   !p0
1   0   1   !p1
1   1   0   !p2
1   1   1   Pn
Fields - operands
Swizzle masking
Masking is controlled by control bits:
control bits: swz_en, swz_mask1, swz_mask2, swz_mask3
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
Encoding used in masking table:
value
meaning
0
channel not selected
1
channel selected
x
channel masked
Masking table:
swz_mask3   swz_mask2   swz_mask1   swz_en   ch0   ch1   ch2   ch3
0           0           0           0        0     0     0     0
0           0           0           1        1     0     0     0
0           0           1           0        x     1     0     0
0           0           1           1        1     1     0     0
0           1           0           0        x     x     1     0
0           1           0           1        1     x     1     0
0           1           1           0        x     1     1     0
0           1           1           1        1     1     1     0
1           0           0           0        x     x     x     1
1           0           0           1        1     x     x     1
1           0           1           0        x     1     x     1
1           0           1           1        1     1     x     1
1           1           0           0        x     x     1     1
1           1           0           1        1     x     1     1
1           1           1           0        x     1     1     1
1           1           1           1        1     1     1     1
Swizzles - operand 0
Swizzles of operand 0 can not be controlled and have predefined combinations described below:
Each channel can be masked with control bits. Masking is described in Swizzle masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
Swizzles - operand 1
Swizzles of operand 1 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
Each channel can be masked with control bits. Masking is described in Swizzle masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
swz_alt_op1
op1_swz
value
0
0
0
0
0
xxxx
0
0
0
0
1
yyyx
0
0
0
1
0
zzzx
0
0
0
1
1
wwwx
0
0
1
0
0
xyzx
0
0
1
0
1
yzwx
0
0
1
1
0
xxyx
0
0
1
1
1
xyxx
0
1
0
0
0
yyxx
0
1
0
0
1
yyzx
0
1
0
1
0
zxyx
0
1
0
1
1
xzyx
0
1
1
0
0
yzxx
0
1
1
0
1
zyxx
0
1
1
1
0
zzyx
0
1
1
1
1
xy1x
1
0
0
0
0
xyyx
1
0
0
0
1
yxyx
1
0
0
1
0
xxzx
1
0
0
1
1
yxxx
1
0
1
0
0
xy0x
1
0
1
0
1
x10x
1
0
1
1
0
000x
1
0
1
1
1
111x
1
1
0
0
0
hhhx
1
1
0
0
1
222x
1
1
0
1
0
x00x
1
1
0
1
1
{0.5, 0.5, 0.5, 0.5}
1
1
1
0
0
{0.5, 0.5, 0.5, 0.5}
1
1
1
0
1
{0.5, 0.5, 0.5, 0.5}
1
1
1
1
0
{0.5, 0.5, 0.5, 0.5}
1
1
1
1
1
{0.5, 0.5, 0.5, 0.5}
Swizzles - operand 2
Swizzles of operand 2 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
op2_swz, swz_alt_op2_x, swz_alt_op2_2
Each channel can be masked with control bits. Masking is described in Swizzle masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
swz_alt_op2_2
swz_alt_op2_x
op2_swz
value
0
0
0
0
0
xxxx
0
0
0
0
1
yyyx
0
0
0
1
0
zzzx
0
0
0
1
1
wwwx
0
0
1
0
0
xyzx
0
0
1
0
1
yzwx
0
0
1
1
0
xxyx
0
0
1
1
1
xyxx
0
1
0
0
0
yyxx
0
1
0
0
1
yyzx
0
1
0
1
0
zxyx
0
1
0
1
1
xzyx
0
1
1
0
0
yzxx
0
1
1
0
1
zyxx
0
1
1
1
0
zzyx
0
1
1
1
1
xy1x
1
0
0
0
0
xyyx
1
0
0
0
1
yxyx
1
0
0
1
0
xxzx
1
0
0
1
1
yxxx
1
0
1
0
0
xy0x
1
0
1
0
1
x10x
1
0
1
1
0
000x
1
0
1
1
1
111x
1
1
0
0
0
hhhx
1
1
0
0
1
222x
1
1
0
1
0
x00x
1
1
0
1
1
{0.5, 0.5, 0.5, 0.5}
1
1
1
0
0
{0.5, 0.5, 0.5, 0.5}
1
1
1
0
1
{0.5, 0.5, 0.5, 0.5}
1
1
1
1
0
{0.5, 0.5, 0.5, 0.5}
1
1
1
1
1
{0.5, 0.5, 0.5, 0.5}
Swizzles - operand 3
Swizzles of operand 3 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
op3_swz, swz_alt_op3_x, swz_alt_op3_2
Each channel can be masked with control bits. Masking is described in Swizzle masking .
swz_en, swz_mask1, swz_mask2, swz_mask3
swz_alt_op3_2, swz_alt_op3_x, op3_swz (5 bits) | value
0 0 0 0 0 | xxxx
0 0 0 0 1 | yyyx
0 0 0 1 0 | zzzx
0 0 0 1 1 | wwwx
0 0 1 0 0 | xyzx
0 0 1 0 1 | yzwx
0 0 1 1 0 | xxyx
0 0 1 1 1 | xyxx
0 1 0 0 0 | yyxx
0 1 0 0 1 | yyzx
0 1 0 1 0 | zxyx
0 1 0 1 1 | xzyx
0 1 1 0 0 | yzxx
0 1 1 0 1 | zyxx
0 1 1 1 0 | zzyx
0 1 1 1 1 | xy1x
1 0 0 0 0 | xyyx
1 0 0 0 1 | yxyx
1 0 0 1 0 | xxzx
1 0 0 1 1 | yxxx
1 0 1 0 0 | xy0x
1 0 1 0 1 | x10x
1 0 1 1 0 | 000x
1 0 1 1 1 | 111x
1 1 0 0 0 | hhhx
1 1 0 0 1 | 222x
1 1 0 1 0 | x00x
1 1 0 1 1 | {0.5, 0.5, 0.5, 0.5}
1 1 1 0 0 | {0.5, 0.5, 0.5, 0.5}
1 1 1 0 1 | {0.5, 0.5, 0.5, 0.5}
1 1 1 1 0 | {0.5, 0.5, 0.5, 0.5}
1 1 1 1 1 | {0.5, 0.5, 0.5, 0.5}
Examples
mad.f32 r0, r0, i0, i0
0x20000000 - 0x28000000
Instructions
mad, dot, add, mul, subflr, exp, mov, log, rsq, rcp
Encoding
There are 10 instructions with 28 variations in this group. The instruction encoding is fairly complex, however, and is controlled by the following fields:
op_sel2
opcode2
gr_sel
op_sel1
opcode3
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
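The selector fields can be pulled out of the higher 4 bytes with a few shifts. The sketch below is only an illustration: the bit positions are read off the per-instruction layouts in this section (op_sel2 at bit 6 of byte 1; opcode2, gr_sel, and op_sel1 at bits 6-4, 3, and 2 of byte 2), and it assumes the higher 4 bytes are given as one 32-bit integer whose most significant byte is the one holding opcode1. The function name `decode_selector_fields` is hypothetical.

```python
def decode_selector_fields(hi: int) -> dict:
    """Extract the selector fields from the higher 4 bytes of an instruction.

    Assumption: `hi` is a 32-bit integer whose most significant byte is
    the first byte shown in the layout (the one holding opcode1).
    """
    byte0 = (hi >> 24) & 0xFF
    byte1 = (hi >> 16) & 0xFF
    byte2 = (hi >> 8) & 0xFF
    return {
        "opcode1": byte0 >> 3,            # bits 7..3 of byte 0
        "op_sel2": (byte1 >> 6) & 1,      # bit 6 of byte 1
        "opcode2": (byte2 >> 4) & 0b111,  # bits 6..4 of byte 2
        "gr_sel": (byte2 >> 3) & 1,       # bit 3 of byte 2
        "op_sel1": (byte2 >> 2) & 1,      # bit 2 of byte 2
    }


fields = decode_selector_fields(0x20405400)
```

For the made-up word `0x20405400`, opcode1 comes out as `0b00100`, which is the value that places an instruction in the 0x20000000 - 0x28000000 range.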
Allowed Instruction Encodings
The allowed encodings are defined by the tables below.
opcode2, op_sel2, and op_sel1 form a composite key into the Allowed Instructions table.
Join against that table to fill in each row marked "instruction".
gr_sel = 0
opcode2 | op_sel2 | op_sel1 | gr_sel | opcode3 | instruction
0 | x | x | 0 | 0 0 x | invalid
0 | x | x | 0 | 0 1 0 | instruction
0 | x | x | 0 | 0 1 1 | invalid
0 | x | x | 0 | 1 0 0 | invalid
0 | x | x | 0 | 1 0 1 | instruction
0 | x | x | 0 | 1 1 x | instruction
1 | x | x | 0 | 0 0 0 | invalid
1 | x | x | 0 | 0 0 1 | instruction
1 | x | x | 0 | 0 1 x | instruction
1 | x | x | 0 | 1 x x | instruction
2 | x | x | 0 | 0 0 0 | invalid
2 | x | x | 0 | 0 0 1 | instruction
2 | x | x | 0 | 0 1 x | instruction
2 | x | x | 0 | 1 x x | instruction
3 | x | x | 0 | 0 0 0 | invalid
3 | x | x | 0 | 0 0 1 | instruction
3 | x | x | 0 | 0 1 x | instruction
3 | x | x | 0 | 1 x x | instruction
4 | x | x | 0 | 0 0 0 | invalid
4 | x | x | 0 | 0 0 1 | instruction
4 | x | x | 0 | 0 1 x | instruction
4 | x | x | 0 | 1 x x | instruction
5 | x | x | 0 | 0 0 0 | invalid
5 | x | x | 0 | 0 0 1 | instruction
5 | x | x | 0 | 0 1 x | instruction
5 | x | x | 0 | 1 x x | instruction
6 | x | x | 0 | 0 0 0 | invalid
6 | x | x | 0 | 0 0 1 | instruction
6 | x | x | 0 | 0 1 x | instruction
6 | x | x | 0 | 1 x x | instruction
7 | x | x | 0 | 0 0 0 | invalid
7 | x | x | 0 | 0 0 1 | instruction
7 | x | x | 0 | 0 1 x | instruction
7 | x | x | 0 | 1 x x | instruction
gr_sel = 1, op_sel2 = 0
opcode2 | op_sel2 | op_sel1 | gr_sel | opcode3 | instruction
0 | 0 | x | 1 | 0 x x | invalid
0 | 0 | x | 1 | 1 0 x | instruction
0 | 0 | x | 1 | 1 1 x | invalid
1 | 0 | x | 1 | 0 0 0 | invalid
1 | 0 | x | 1 | 0 0 1 | instruction
1 | 0 | x | 1 | 0 1 x | instruction
1 | 0 | x | 1 | 1 0 x | instruction
1 | 0 | x | 1 | 1 1 x | invalid
2 | 0 | x | 1 | 0 x x | instruction
2 | 0 | x | 1 | 1 0 x | instruction
2 | 0 | x | 1 | 1 1 x | invalid
3 | 0 | x | 1 | 0 0 0 | invalid
3 | 0 | x | 1 | 0 0 1 | instruction
3 | 0 | x | 1 | 0 1 x | instruction
3 | 0 | x | 1 | 1 0 x | instruction
3 | 0 | x | 1 | 1 1 x | invalid
4 | 0 | x | 1 | 0 0 0 | invalid
4 | 0 | x | 1 | 0 0 1 | instruction
4 | 0 | x | 1 | 0 1 x | instruction
4 | 0 | x | 1 | 1 0 x | instruction
4 | 0 | x | 1 | 1 1 x | invalid
5 | 0 | x | 1 | 0 x x | instruction
5 | 0 | x | 1 | 1 0 x | instruction
5 | 0 | x | 1 | 1 1 x | invalid
6 | 0 | x | 1 | 0 x x | instruction
6 | 0 | x | 1 | 1 0 x | instruction
6 | 0 | x | 1 | 1 1 x | invalid
7 | 0 | x | 1 | 0 x x | instruction
7 | 0 | x | 1 | 1 0 x | instruction
7 | 0 | x | 1 | 1 1 x | invalid
gr_sel = 1, op_sel2 = 1
opcode2 | op_sel2 | op_sel1 | gr_sel | opcode3 | instruction
0 | 1 | x | 1 | 0 x x | invalid
0 | 1 | x | 1 | 1 0 x | instruction
0 | 1 | x | 1 | 1 1 x | invalid
1 | 1 | x | 1 | 0 0 0 | invalid
1 | 1 | x | 1 | 0 0 1 | instruction
1 | 1 | x | 1 | 0 1 x | instruction
1 | 1 | x | 1 | 1 0 x | instruction
1 | 1 | x | 1 | 1 1 x | invalid
2 | 1 | x | 1 | 0 0 0 | invalid
2 | 1 | x | 1 | 0 0 1 | instruction
2 | 1 | x | 1 | 0 1 x | instruction
2 | 1 | x | 1 | 1 0 x | instruction
2 | 1 | x | 1 | 1 1 x | invalid
3 | 1 | x | 1 | 0 0 0 | invalid
3 | 1 | x | 1 | 0 0 1 | instruction
3 | 1 | x | 1 | 0 1 x | instruction
3 | 1 | x | 1 | 1 0 x | instruction
3 | 1 | x | 1 | 1 1 x | invalid
4 | 1 | x | 1 | 0 x x | instruction
4 | 1 | x | 1 | 1 0 x | instruction
4 | 1 | x | 1 | 1 1 x | invalid
5 | 1 | x | 1 | 0 x x | instruction
5 | 1 | x | 1 | 1 0 x | instruction
5 | 1 | x | 1 | 1 1 x | invalid
6 | 1 | x | 1 | 0 x x | invalid
6 | 1 | x | 1 | 1 0 x | invalid
6 | 1 | x | 1 | 1 1 x | invalid
7 | 1 | x | 1 | 0 x x | invalid
7 | 1 | x | 1 | 1 0 x | invalid
7 | 1 | x | 1 | 1 1 x | invalid
Allowed Instructions
This table describes the 10 instructions and their 28 variations.
opcode2, op_sel2, and op_sel1 form a composite key.
Note: the instructions given in this table serve only as examples. The data format, registers, and swizzling can all be changed.
Giving a concrete example for each variation is easier than inventing names that would distinguish mad.f32 i0, r0, i0, i0 from mad.f32 r0.x, r0.x, i0.x, i0.x.
opcode2 | op_sel2 | op_sel1 | instruction
0 | 0 | 0 | mad.f32 i0, r0, i0, i0
0 | 0 | 1 | mad.f32 r0.xy, r0.xx, i0.xx, i0.xy
0 | 1 | 0 | mad.f32 i0, r0.x, i0.x, i0.x
0 | 1 | 1 | mad.f32 r0.x, r0.x, i0.x, i0.x
1 | 0 | 0 | dot.f32 i0, r0.xxx, i0.xxx
1 | 0 | 1 | dot.f32 r0.x, r0.xxx, i0.xxx
1 | 1 | 0 | add.f32 i0, r0.x, i0.x
1 | 1 | 1 | add.f32 r0.x, r0.x, i0.x
2 | 0 | 0 | dot.f32 i0, r0.xxx, r0.xxx
2 | 0 | 1 | dot.f32 r0.x, r0.xxx, r0.xxx
2 | 1 | 0 | mul.f32 i0, r0.x, i0.x
2 | 1 | 1 | mul.f32 r0.x, r0.x, i0.x
3 | 0 | 0 | mul.f32 i0, r0, i0
3 | 0 | 1 | mul.f32 r0.xy, r0.xx, i0.xx
3 | 1 | 0 | subflr.f32 i0, r0.x, i0.x
3 | 1 | 1 | subflr.f32 r0.x, r0.x, i0.x
4 | 0 | 0 | add.f32 i0, r0, i0
4 | 0 | 1 | add.f32 r0.xy, r0.xx, i0.xx
4 | 1 | 0 | exp.f32 i0, r0.x
4 | 1 | 1 | exp.f32 r0.x, r0.x
5 | 0 | 0 | mov.f32 i0, r0
5 | 0 | 1 | mov.f32 r0.xy, r0.xx
5 | 1 | 0 | log.f32 i0, r0.x
5 | 1 | 1 | log.f32 r0.x, r0.x
6 | 0 | 0 | rsq.f32 i0, r0.x
6 | 0 | 1 | rsq.f32 r0.x, r0.x
6 | 1 | 0 | invalid
6 | 1 | 1 | invalid
7 | 0 | 0 | rcp.f32 i0, r0.x
7 | 0 | 1 | rcp.f32 r0.x, r0.x
7 | 1 | 0 | invalid
7 | 1 | 1 | invalid
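The composite key lends itself to a plain dictionary lookup. The following is a minimal sketch, not part of any existing tool: the names `ALLOWED_INSTRUCTIONS` and `lookup_instruction` are hypothetical, and the values are the example instructions from the Allowed Instructions table, with the "invalid" rows simply left out.

```python
# Keys are (opcode2, op_sel2, op_sel1). Values are the example instructions
# from the Allowed Instructions table; "invalid" rows are omitted.
ALLOWED_INSTRUCTIONS = {
    (0, 0, 0): "mad.f32 i0, r0, i0, i0",
    (0, 0, 1): "mad.f32 r0.xy, r0.xx, i0.xx, i0.xy",
    (0, 1, 0): "mad.f32 i0, r0.x, i0.x, i0.x",
    (0, 1, 1): "mad.f32 r0.x, r0.x, i0.x, i0.x",
    (1, 0, 0): "dot.f32 i0, r0.xxx, i0.xxx",
    (1, 0, 1): "dot.f32 r0.x, r0.xxx, i0.xxx",
    (1, 1, 0): "add.f32 i0, r0.x, i0.x",
    (1, 1, 1): "add.f32 r0.x, r0.x, i0.x",
    (2, 0, 0): "dot.f32 i0, r0.xxx, r0.xxx",
    (2, 0, 1): "dot.f32 r0.x, r0.xxx, r0.xxx",
    (2, 1, 0): "mul.f32 i0, r0.x, i0.x",
    (2, 1, 1): "mul.f32 r0.x, r0.x, i0.x",
    (3, 0, 0): "mul.f32 i0, r0, i0",
    (3, 0, 1): "mul.f32 r0.xy, r0.xx, i0.xx",
    (3, 1, 0): "subflr.f32 i0, r0.x, i0.x",
    (3, 1, 1): "subflr.f32 r0.x, r0.x, i0.x",
    (4, 0, 0): "add.f32 i0, r0, i0",
    (4, 0, 1): "add.f32 r0.xy, r0.xx, i0.xx",
    (4, 1, 0): "exp.f32 i0, r0.x",
    (4, 1, 1): "exp.f32 r0.x, r0.x",
    (5, 0, 0): "mov.f32 i0, r0",
    (5, 0, 1): "mov.f32 r0.xy, r0.xx",
    (5, 1, 0): "log.f32 i0, r0.x",
    (5, 1, 1): "log.f32 r0.x, r0.x",
    (6, 0, 0): "rsq.f32 i0, r0.x",
    (6, 0, 1): "rsq.f32 r0.x, r0.x",
    (7, 0, 0): "rcp.f32 i0, r0.x",
    (7, 0, 1): "rcp.f32 r0.x, r0.x",
}


def lookup_instruction(opcode2: int, op_sel2: int, op_sel1: int):
    """Return the example instruction for a key, or None if it is invalid."""
    return ALLOWED_INSTRUCTIONS.get((opcode2, op_sel2, op_sel1))
```

A key that passes this lookup still has to be checked against gr_sel and opcode3 in the Allowed Instruction Encodings tables before the encoding can be considered valid.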
Notes
mad.f32 i0, r0, i0, i0
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
0
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mad.f32 r0.xy, r0.xx, i0.xx, i0.xy
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
0
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mad.f32 i0, r0.x, i0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
0
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mad.f32 r0.x, r0.x, i0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
0
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
dot.f32 i0, r0.xxx, i0.xxx
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
1
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
dot.f32 r0.x, r0.xxx, i0.xxx
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
1
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
dot.f32 i0, r0.xxx, r0.xxx
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
0
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
dot.f32 r0.x, r0.xxx, r0.xxx
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
0
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
add.f32 i0, r0, i0
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
0
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
add.f32 r0.xy, r0.xx, i0.xx
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
0
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
add.f32 i0, r0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
1
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
add.f32 r0.x, r0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
0
1
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mul.f32 i0, r0, i0
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
1
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mul.f32 r0.xy, r0.xx, i0.xx
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
1
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mul.f32 i0, r0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
0
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mul.f32 r0.x, r0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
0
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
subflr.f32 i0, r0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
1
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
subflr.f32 r0.x, r0.x, i0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
0
1
1
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
exp.f32 i0, r0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
0
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
exp.f32 r0.x, r0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
0
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mov.f32 i0, r0
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
1
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
mov.f32 r0.xy, r0.xx
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
1
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
log.f32 i0, r0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
1
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
log.f32 r0.x, r0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
1
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
0
1
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
rsq.f32 i0, r0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
1
0
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
rsq.f32 r0.x, r0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
1
0
x
1
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
rcp.f32 i0, r0.x
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
0
0
1
0
0
x
x
x
7
6
5
4
3
2
1
0
op_sel2
x
0
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
op_sel1
x
1
1
1
x
0
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
Lower 4 bytes
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode3
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
rcp.f32 r0.x, r0.x
Encoding
group 0
Instructions
mad.f32
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
1
0
0
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
7
6
5
4
3
2
1
0
gr_sel
?
?
?
?
0
?
?
?
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Lower 4 bytes
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
7
6
5
4
3
2
1
0
opcode3
?
?
?
?
?
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields - instruction
gr_sel:
3 | value
0 | group 0
1 | group 1
predicate:
1 0 | value
0 0 | (none)
0 1 | p0
1 0 | !p0
1 1 | Pn
opcode3:
4 3 2 | value
0 0 x | invalid
0 1 0 | mad.f32
0 1 1 | invalid
1 0 0 | invalid
1 0 1 | mad.f32
1 1 x | mad.f32
Fields - operands
Examples
mad.f32 i0, r0, i0, i0
group 1
Instructions
dot, mov, rsq, rcp, exp, log
Encoding
Higher 4 bytes
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
1
0
0
x
7
6
5
4
3
2
1
0
op_sel
data_format
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
gr_sel
x
1
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Lower 4 bytes
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields - instruction
gr_sel:
3 | value
0 | group 0
1 | group 1
data_format:
predicate:
1 0 | value
0 0 | (none)
0 1 | p0
1 0 | !p0
1 1 | Pn
opcode2 (depends on op_sel):
op_sel | 6 5 4 | value
0 | 0 0 0 | invalid
0 | 0 0 1 | invalid
0 | 0 1 0 | dot
0 | 0 1 1 | invalid
0 | 1 0 0 | invalid
0 | 1 0 1 | mov
0 | 1 1 0 | rsq
0 | 1 1 1 | rcp
1 | 0 0 0 | invalid
1 | 0 0 1 | invalid
1 | 0 1 0 | invalid
1 | 0 1 1 | invalid
1 | 1 0 0 | exp
1 | 1 0 1 | log
1 | 1 1 0 | invalid
1 | 1 1 1 | invalid
Fields - operands
Examples
dot.f32
mov.f32
rsq.f32
rcp.f32
exp.f32
log.f32
dot.f16
mov.f16
rsq.f16
rcp.f16
exp.f16
log.f16
0x28000000 - 0x30000000
Instructions: dot, mov, rsq, rcp
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
1
0
1
x
7
6
5
4
3
2
1
0
data_format
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
x
1
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
Setting bit 3 of byte 2 to 0 produces an invalid instruction.
Fields:
data_format:
predicate:
1 0 | value
0 0 | (none)
0 1 | p0
1 0 | !p0
1 1 | Pn
opcode2:
6 5 4 | value
0 0 0 | invalid
0 0 1 | invalid
0 1 0 | dot
0 1 1 | invalid
1 0 0 | invalid
1 0 1 | mov
1 1 0 | rsq
1 1 1 | rcp
Examples:
dot.f32
mov.f32
rsq.f32
rcp.f32
dot.f16
mov.f16
rsq.f16
rcp.f16
0x30000000 - 0x38000000
Instructions: rcp, rsq, log, exp
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
1
1
0
7
6
5
4
3
2
1
0
data_format
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
modifier
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
modifier should be omitted if data_format matches modifier.
Fields:
opcode2:
2 1 | value
0 0 | rcp
0 1 | rsq
1 0 | log
1 1 | exp
data_format:
6 5 | value
0 0 | f32
0 1 | f16
1 0 | fx10
1 1 | invalid
modifier:
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
rcp.f32
rsq.f32
log.f32
exp.f32
rcp.f32.fx10
rsq.f32.fx10
log.f32.fx10
exp.f32.fx10
rcp.f16.f32
rsq.f16.f32
log.f16.f32
exp.f16.f32
rcp.f16.fx10
rsq.f16.fx10
log.f16.fx10
exp.f16.fx10
rcp.fx10.f32
rsq.fx10.f32
log.fx10.f32
exp.fx10.f32
rcp.fx10
rsq.fx10
log.fx10
exp.fx10
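The mnemonics above follow from the opcode2 and data_format tables plus the rule that a modifier equal to data_format is dropped. A minimal sketch of that rule is shown here; the function name `transcendental_mnemonic` is hypothetical, and because the modifier encoding is not listed in this section, the modifier is passed in as an already decoded name rather than as bits.

```python
OPCODE2_NAMES = {0b00: "rcp", 0b01: "rsq", 0b10: "log", 0b11: "exp"}
DATA_FORMAT_NAMES = {0b00: "f32", 0b01: "f16", 0b10: "fx10"}  # 0b11 is invalid


def transcendental_mnemonic(opcode2: int, data_format: int, modifier: str):
    """Build a mnemonic such as 'rcp.f16.fx10'; return None if invalid.

    Assumption: `modifier` is the decoded modifier name (e.g. 'f32'),
    since its bit encoding is not given in this section.
    """
    if data_format not in DATA_FORMAT_NAMES:
        return None
    fmt = DATA_FORMAT_NAMES[data_format]
    name = f"{OPCODE2_NAMES[opcode2]}.{fmt}"
    # The modifier is omitted when it matches data_format.
    return name if modifier == fmt else f"{name}.{modifier}"
```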
0x38000000 - 0x40000000
Instructions: mov, cmov, cmov8
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
0
1
1
1
7
6
5
4
3
2
1
0
cond
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
opcode2
data_format
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
cond is only applicable to cmov and cmov8, since these are the conditional moves.
Fields:
opcode2:
7 6 | value
0 0 | mov
0 1 | cmov
1 0 | cmov8
1 1 | invalid
cond:
6 | value
0 | eqzero
1 | ltzero
data_format:
2 1 0 | value
0 0 0 | i8
0 0 1 | i16
0 1 0 | i32
0 1 1 | fx10
1 0 0 | f16
1 0 1 | f32
1 1 0 | invalid
1 1 1 | invalid
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
mov.i8
mov.i16
mov.i32
mov.fx10
mov.f16
mov.f32
cmov.eqzero.i8
cmov.eqzero.i16
cmov.eqzero.i32
cmov.eqzero.fx10
cmov.eqzero.f16
cmov.eqzero.f32
cmov8.eqzero.i8
cmov8.eqzero.i16
cmov8.eqzero.i32
cmov8.eqzero.fx10
cmov8.eqzero.f16
cmov8.eqzero.f32
cmov.ltzero.i8
cmov.ltzero.i16
cmov.ltzero.i32
cmov.ltzero.fx10
cmov.ltzero.f16
cmov.ltzero.f32
cmov8.ltzero.i8
cmov8.ltzero.i16
cmov8.ltzero.i32
cmov8.ltzero.fx10
cmov8.ltzero.f16
cmov8.ltzero.f32
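Every mnemonic in this list can be produced from the three field tables. The sketch below, with the hypothetical name `move_mnemonic`, takes the decoded field values as integers and applies the note that cond only appears on cmov and cmov8.

```python
MOVE_OPCODE2 = {0b00: "mov", 0b01: "cmov", 0b10: "cmov8"}  # 0b11 is invalid
MOVE_COND = {0: "eqzero", 1: "ltzero"}
MOVE_DATA_FORMAT = {
    0b000: "i8", 0b001: "i16", 0b010: "i32",
    0b011: "fx10", 0b100: "f16", 0b101: "f32",
}  # 0b110 and 0b111 are invalid


def move_mnemonic(opcode2: int, cond: int, data_format: int):
    """Return e.g. 'cmov.ltzero.f16', or None for an invalid combination."""
    if opcode2 not in MOVE_OPCODE2 or data_format not in MOVE_DATA_FORMAT:
        return None
    parts = [MOVE_OPCODE2[opcode2]]
    if opcode2 != 0b00:
        # cond is only meaningful for the conditional moves.
        parts.append(MOVE_COND[cond])
    parts.append(MOVE_DATA_FORMAT[data_format])
    return ".".join(parts)
```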
0x40000000 - 0x48000000
Instructions: pack, (mov)
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
1
0
0
0
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
modifier
data_format
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
When modifier matches data_format, it is omitted, since it has no effect on packing.
In that case the instruction mnemonic is also replaced with mov.
Fields:
data_format:
modifier:
3 2 1 | value
0 0 0 | u8
0 0 1 | s8
0 1 0 | o8
0 1 1 | u16
1 0 0 | s16
1 0 1 | f16
1 1 0 | f32
1 1 1 | invalid
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
mov.u8
pack.s16.u8
pack.u8.s8
pack.s16.s8
pack.u8.o8
pack.s16.o8
pack.u8.u16
pack.s16.u16
pack.u8.s16
mov.s16
pack.u8.f16
pack.s16.f16
pack.u8.f32
pack.s16.f32
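The pack/mov naming rule from the notes can be sketched as a small function. This is an illustration only: `pack_mnemonic` is a hypothetical name, and since the data_format table is not filled in for this group, data_format is passed as an already decoded name while modifier is decoded from its 3-bit value.

```python
PACK_MODIFIER = {
    0b000: "u8", 0b001: "s8", 0b010: "o8", 0b011: "u16",
    0b100: "s16", 0b101: "f16", 0b110: "f32",
}  # 0b111 is invalid


def pack_mnemonic(data_format: str, modifier: int):
    """Return 'pack.<data_format>.<modifier>', or 'mov.<data_format>' when
    the two match. Returns None for the invalid modifier value.

    Assumption: `data_format` is given as a decoded name such as 'u8'.
    """
    if modifier not in PACK_MODIFIER:
        return None
    mod = PACK_MODIFIER[modifier]
    if mod == data_format:
        # A matching modifier has no effect on packing, so this is a mov.
        return f"mov.{data_format}"
    return f"pack.{data_format}.{mod}"
```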
0x48000000 - 0x50000000
Instructions: this group only contains illegal instructions
Encoding:
7
6
5
4
3
2
1
0
opcode1
0
1
0
0
1
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
0x50000000 - 0x58000000
Instructions: and.u32
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
1
0
1
0
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
and.u32
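The 3-bit predicate table that recurs in these groups can be decoded with a simple list lookup. The sketch below assumes the predicate sits in the low three bits of the first byte, directly below the five opcode1 bits, as the layout for this group suggests; `decode_predicate3` is a hypothetical name.

```python
# Index is the 3-bit predicate value; the empty string means "no predicate".
PREDICATE_3BIT = ["", "p0", "p1", "p2", "p3", "!p0", "!p1", "Pn"]


def decode_predicate3(first_byte: int) -> str:
    """Decode the predicate from the first instruction byte.

    Assumption: bits 7..3 hold opcode1 and bits 2..0 hold the predicate.
    """
    return PREDICATE_3BIT[first_byte & 0b111]
```

For example, a first byte of `0x55` (opcode1 `01010`, predicate `101`) decodes to `!p0`. Groups that use the 2-bit predicate table need a different lookup.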
0x58000000 - 0x60000000
Instructions: xor.u32
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
1
0
1
1
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
xor.u32
0x60000000 - 0x68000000
Instructions: shl.u32
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
1
1
0
0
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
shl.u32
0x68000000 - 0x70000000
Instructions: shr.u32
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
1
1
0
1
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
shr.u32
0x70000000 - 0x78000000
Instructions: rlp.u32
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
0
1
1
1
0
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
rlp.u32
0x78000000 - 0x80000000
Instructions: this group only contains illegal instructions
Encoding:
7
6
5
4
3
2
1
0
opcode1
0
1
1
1
1
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
0x80000000 - 0x88000000
Instructions: add.fx8
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
0
0
0
0
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
predicate:
2 1 | value
0 0 | (none)
0 1 | p0
1 0 | p1
1 1 | !p0
Examples:
add.fx8
0x88000000 - 0x90000000
Instructions: add.fx8, sub.fx8
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
0
0
0
1
x
7
6
5
4
3
2
1
0
opcode2
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
0
0
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
Setting bits 2, 3 of byte 2 to 1 produces an invalid instruction.
Fields:
opcode2:
5 4 | value
0 0 | add.fx8
0 1 | sub.fx8
1 0 | invalid
1 1 | invalid
predicate:
2 1 | value
0 0 | (none)
0 1 | p0
1 0 | p1
1 1 | !p0
Examples:
add.fx8
sub.fx8
0x90000000 - 0x98000000
Instructions: add.fx8, sub.fx8, min.fx8, max.fx8
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
0
0
1
0
x
7
6
5
4
3
2
1
0
opcode2
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
0
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
Setting bit 0 of byte 2 to 1 produces an invalid instruction.
Fields:
opcode2:
5 4 | value
0 0 | add.fx8
0 1 | sub.fx8
1 0 | min.fx8
1 1 | max.fx8
predicate:
2 1 | value
0 0 | (none)
0 1 | p0
1 0 | p1
1 1 | !p0
Examples:
add.fx8
sub.fx8
min.fx8
max.fx8
0x98000000 - 0xA0000000
Instructions: mad.u8
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
0
0
1
1
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
modifier
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
modifier:
predicate:
2 1 | value
0 0 | (none)
0 1 | p0
1 0 | p1
1 1 | !p0
Examples:
mad.u8
mad.sat.u8
0xA0000000 - 0xA8000000
Instructions: mad
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
0
1
0
0
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
data_format
modifier
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
data_format:
modifier:
predicate:
2 1 | value
0 0 | (none)
0 1 | p0
1 0 | p1
1 1 | !p0
Examples:
mad.u16
mad.u16.sat
mad.i16
mad.i16.sat
0xA8000000 - 0xB0000000
Instructions: mad
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
0
1
0
1
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
data_format
modifier
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
data_format:
modifier:
predicate:
2 1 | value
0 0 | (none)
0 1 | p0
1 0 | p1
1 1 | !p0
Examples:
mad.u32
mad.u32.sat
mad.i32
mad.i32.sat
0xB0000000 - 0xB8000000
Instructions: this group only contains illegal instructions
Encoding:
7
6
5
4
3
2
1
0
opcode1
1
0
1
1
0
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
0xB8000000 - 0xC0000000
Instructions: this group only contains illegal instructions
Encoding:
7
6
5
4
3
2
1
0
opcode1
1
0
1
1
1
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
0xC0000000 - 0xC8000000
Instructions: this group only contains illegal instructions
Encoding:
7
6
5
4
3
2
1
0
opcode1
1
1
0
0
0
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
0xC8000000 - 0xD0000000
Instructions: mad.u8
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
0
0
1
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
modifier
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
modifier:
predicate:
2 1 | value
0 0 | (none)
0 1 | p0
1 0 | p1
1 1 | !p0
Examples:
mad.u8
mad.sat.u8
0xD0000000 - 0xD8000000
Instructions: mad
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
0
1
0
7
6
5
4
3
2
1
0
modifier
x
x
0
x
x
x
x
7
6
5
4
3
2
1
0
data_format
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
Setting bit 5 of byte 1 to 1 produces an invalid instruction.
Fields:
modifier:
data_format:
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
mad.u32.s0
mad.i32.s0
mad.u32.s1
mad.i32.s1
0xD8000000 - 0xE0000000
Instructions: this group only contains illegal instructions
Encoding:
7
6
5
4
3
2
1
0
opcode1
1
1
0
1
1
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
x
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
0xE0000000 - 0xE8000000
Instructions: tex
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
1
0
0
0
0
0
7
6
5
4
3
2
1
0
modifier
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
data_format
dim
func
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
dim:
3 2 | value
0 0 | 1D
0 1 | 2D
1 0 | Cube
1 1 | invalid
func:
1 0 | value
0 0 | (none)
0 1 | Bias
1 0 | Replace
1 1 | Grad
modifier:
data_format:
7 6 | value
0 0 | (none)
0 1 | invalid
1 0 | f16
1 1 | f32
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
tex1D
tex1D.f16
tex1D.f32
tex1D.minp
tex1D.minp.f16
tex1D.minp.f32
tex1DBias
tex1DBias.f16
tex1DBias.f32
tex1DBias.minp
tex1DBias.minp.f16
tex1DBias.minp.f32
tex1DReplace
tex1DReplace.f16
tex1DReplace.f32
tex1DReplace.minp
tex1DReplace.minp.f16
tex1DReplace.minp.f32
tex1DGrad
tex1DGrad.f16
tex1DGrad.f32
tex1DGrad.minp
tex1DGrad.minp.f16
tex1DGrad.minp.f32
tex2D
tex2D.f16
tex2D.f32
tex2D.minp
tex2D.minp.f16
tex2D.minp.f32
tex2DBias
tex2DBias.f16
tex2DBias.f32
tex2DBias.minp
tex2DBias.minp.f16
tex2DBias.minp.f32
tex2DReplace
tex2DReplace.f16
tex2DReplace.f32
tex2DReplace.minp
tex2DReplace.minp.f16
tex2DReplace.minp.f32
tex2DGrad
tex2DGrad.f16
tex2DGrad.f32
tex2DGrad.minp
tex2DGrad.minp.f16
tex2DGrad.minp.f32
texCube
texCube.f16
texCube.f32
texCube.minp
texCube.minp.f16
texCube.minp.f32
texCubeBias
texCubeBias.f16
texCubeBias.f32
texCubeBias.minp
texCubeBias.minp.f16
texCubeBias.minp.f32
texCubeReplace
texCubeReplace.f16
texCubeReplace.f32
texCubeReplace.minp
texCubeReplace.minp.f16
texCubeReplace.minp.f32
texCubeGrad
texCubeGrad.f16
texCubeGrad.f32
texCubeGrad.minp
texCubeGrad.minp.f16
texCubeGrad.minp.f32
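The 72 mnemonics above are the product of the dim, func, and data_format tables and an optional .minp suffix. The sketch below reproduces that naming; `tex_mnemonic` is a hypothetical name, and treating .minp as a boolean flag is an assumption, because the modifier table is not filled in for this group.

```python
TEX_DIM = {0b00: "1D", 0b01: "2D", 0b10: "Cube"}  # 0b11 is invalid
TEX_FUNC = {0b00: "", 0b01: "Bias", 0b10: "Replace", 0b11: "Grad"}
TEX_DATA_FORMAT = {0b00: "", 0b10: ".f16", 0b11: ".f32"}  # 0b01 is invalid


def tex_mnemonic(dim: int, func: int, minp: bool, data_format: int):
    """Return e.g. 'tex2DBias.minp.f16', or None for an invalid combination.

    Assumption: `minp` stands in for the modifier field, whose encoding
    is not listed in this section.
    """
    if dim not in TEX_DIM or data_format not in TEX_DATA_FORMAT:
        return None
    suffix = ".minp" if minp else ""
    return "tex" + TEX_DIM[dim] + TEX_FUNC[func] + suffix + TEX_DATA_FORMAT[data_format]
```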
0xE8000000 - 0xF0000000
Instructions: lda32, ldl32, ldt32
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
1
0
1
7
6
5
4
3
2
1
0
modifier
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
index
opcode2
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
index is only applicable when the fetch modifier is specified.
Fields:
modifier:
index:
7 6 5 4 | value
0 0 0 0 | 1
0 0 0 1 | 2
0 0 1 0 | 3
0 0 1 1 | 4
0 1 0 0 | 5
0 1 0 1 | 6
0 1 1 0 | 7
0 1 1 1 | 8
1 0 0 0 | 9
1 0 0 1 | 10
1 0 1 0 | 11
1 0 1 1 | 12
1 1 0 0 | 13
1 1 0 1 | 14
1 1 1 0 | 15
1 1 1 1 | 16
opcode2:
3 2 | value
0 0 | lda32
0 1 | ldl32
1 0 | ldt32
1 1 | invalid
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
lda32
ldl32
ldt32
lda32.fetch1
lda32.fetch2
lda32.fetch3
lda32.fetch4
lda32.fetch5
lda32.fetch6
lda32.fetch7
lda32.fetch8
lda32.fetch9
lda32.fetch10
lda32.fetch11
lda32.fetch12
lda32.fetch13
lda32.fetch14
lda32.fetch15
lda32.fetch16
ldl32.fetch1
ldl32.fetch2
ldl32.fetch3
ldl32.fetch4
ldl32.fetch5
ldl32.fetch6
ldl32.fetch7
ldl32.fetch8
ldl32.fetch9
ldl32.fetch10
ldl32.fetch11
ldl32.fetch12
ldl32.fetch13
ldl32.fetch14
ldl32.fetch15
ldl32.fetch16
ldt32.fetch1
ldt32.fetch2
ldt32.fetch3
ldt32.fetch4
ldt32.fetch5
ldt32.fetch6
ldt32.fetch7
ldt32.fetch8
ldt32.fetch9
ldt32.fetch10
ldt32.fetch11
ldt32.fetch12
ldt32.fetch13
ldt32.fetch14
ldt32.fetch15
ldt32.fetch16
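The index table maps a 4-bit field value n to the suffix fetch(n + 1), and the suffix only appears when the fetch modifier is present. A minimal sketch of that naming follows; `load_mnemonic` is a hypothetical name, and passing the fetch modifier as a boolean is an assumption, since the modifier encoding is not listed in this section.

```python
LOAD_OPCODE2 = {0b00: "lda32", 0b01: "ldl32", 0b10: "ldt32"}  # 0b11 is invalid


def load_mnemonic(opcode2: int, fetch: bool, index: int):
    """Return e.g. 'ldl32.fetch4', or None for the invalid opcode2 value.

    `index` is the raw 4-bit field value (0..15); the table maps it to
    fetch counts 1..16. It is ignored when `fetch` is False.
    """
    if opcode2 not in LOAD_OPCODE2:
        return None
    name = LOAD_OPCODE2[opcode2]
    if not fetch:
        return name
    return f"{name}.fetch{(index & 0b1111) + 1}"
```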
0xF0000000 - 0xF8000000
Instructions: sta32, stl32, stt32
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
1
1
0
7
6
5
4
3
2
1
0
modifier
x
x
x
x
x
x
x
7
6
5
4
3
2
1
0
index
opcode2
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
index is only applicable when the fetch modifier is specified.
Fields:
modifier:
index:
7 6 5 4 | value
0 0 0 0 | 1
0 0 0 1 | 2
0 0 1 0 | 3
0 0 1 1 | 4
0 1 0 0 | 5
0 1 0 1 | 6
0 1 1 0 | 7
0 1 1 1 | 8
1 0 0 0 | 9
1 0 0 1 | 10
1 0 1 0 | 11
1 0 1 1 | 12
1 1 0 0 | 13
1 1 0 1 | 14
1 1 1 0 | 15
1 1 1 1 | 16
opcode2:
3 2 | value
0 0 | sta32
0 1 | stl32
1 0 | stt32
1 1 | invalid
predicate:
2 1 0 | value
0 0 0 | (none)
0 0 1 | p0
0 1 0 | p1
0 1 1 | p2
1 0 0 | p3
1 0 1 | !p0
1 1 0 | !p1
1 1 1 | Pn
Examples:
sta32
stl32
stt32
sta32.fetch1
sta32.fetch2
sta32.fetch3
sta32.fetch4
sta32.fetch5
sta32.fetch6
sta32.fetch7
sta32.fetch8
sta32.fetch9
sta32.fetch10
sta32.fetch11
sta32.fetch12
sta32.fetch13
sta32.fetch14
sta32.fetch15
sta32.fetch16
stl32.fetch1
stl32.fetch2
stl32.fetch3
stl32.fetch4
stl32.fetch5
stl32.fetch6
stl32.fetch7
stl32.fetch8
stl32.fetch9
stl32.fetch10
stl32.fetch11
stl32.fetch12
stl32.fetch13
stl32.fetch14
stl32.fetch15
stl32.fetch16
stt32.fetch1
stt32.fetch2
stt32.fetch3
stt32.fetch4
stt32.fetch5
stt32.fetch6
stt32.fetch7
stt32.fetch8
stt32.fetch9
stt32.fetch10
stt32.fetch11
stt32.fetch12
stt32.fetch13
stt32.fetch14
stt32.fetch15
stt32.fetch16
0xF8000000 - 0xFF000000
Notes:
This instruction group is much more complex than the others, so it is described with "glued" truth tables instead of independent truth tables.
predicate 000
Instructions:
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
1
1
1
0
0
0
7
6
5
4
3
2
1
0
modifier1
opcode2
x
x
x
x
7
6
5
4
3
2
1
0
opcode4
modifier2
opcode3
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Fields:
opcode2:
predicate
modifier1
opcode3
opcode4
modifier2
5
4
3
value
0
0
0
0
0
x
0
0
0
0
ba
0
0
0
0
1
x
0
0
0
0
mov
0
0
0
0
0
x
1
0
0
0
ba.savelink
0
0
0
0
1
x
1
0
0
0
mov
0
0
0
0
0
x
x
0
0
1
mov.f32
0
0
0
0
1
x
x
0
0
1
mov
0
0
0
0
x
x
x
0
1
x
mov.f32
0
0
0
0
x
x
x
1
0
x
mov.f32
0
0
0
0
x
0
x
1
1
x
pcoeff
0
0
0
0
x
1
x
1
1
x
ptoff
predicate
modifier1
opcode4
5
4
3
value
0
0
0
1
x
0
0
x
invalid
0
0
0
1
x
0
1
x
invalid
0
0
0
1
x
1
0
x
mov.f32
0
0
0
1
0
1
1
x
pcoeff
0
0
0
1
1
1
1
x
ptoff
Examples:
predicate 001
Instructions:
Encoding
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
1
1
1
0
0
1
7
6
5
4
3
2
1
0
modifier1
opcode2
x
x
x
x
7
6
5
4
3
2
1
0
sub_pred
x
x
x
x
x
x
modifier2
opcode3
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
The predicate does not apply to all instructions.
Fields
opcode2:
predicate
modifier1
opcode3
modifier2
sub_pred
5
4
3
value
0
0
1
0
0
0
x
x
0
0
0
p0 ba
0
0
1
0
1
0
x
x
0
0
0
p0 mov
0
0
1
0
0
1
x
x
0
0
0
p0 ba.savelink
0
0
1
0
1
1
x
x
0
0
0
p0 mov
0
0
1
0
0
x
x
x
0
0
1
mov.f32
0
0
1
0
1
x
x
x
0
0
1
p0 mov
0
0
1
0
x
x
x
x
0
1
x
mov.f32
0
0
1
0
x
x
x
x
1
0
x
mov.f32
0
0
1
0
x
x
0
0
1
1
x
kill
0
0
1
0
x
x
0
1
1
1
x
!p0 kill
0
0
1
0
x
x
1
0
1
1
x
!p1 kill
0
0
1
0
x
x
1
1
1
1
x
p0 kill
predicate
modifier1
sub_pred
5
4
3
value
0
0
1
1
x
x
0
0
x
invalid
0
0
1
1
x
x
0
1
x
invalid
0
0
1
1
x
x
1
0
x
mov.f32
0
0
1
1
0
0
1
1
x
kill
0
0
1
1
0
1
1
1
x
!p0 kill
0
0
1
1
1
0
1
1
x
!p1 kill
0
0
1
1
1
1
1
1
x
p0 kill
Examples:
predicate 010
Instructions:
Encoding:
7
6
5
4
3
2
1
0
opcode1
predicate
1
1
1
1
1
0
1
0
7
6
5
4
3
2
1
0
modifier1
opcode2
opcode4
x
x
modifier1
opcode2
x
x
x
7
6
5
4
3
2
1
0
opcode5
opcode6
x
x
x
modifier2
opcode3
x
x
x
x
x
x
7
6
5
4
3
2
1
0
?
?
?
?
?
?
?
?
Notes:
The predicate does not apply to all instructions.
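Fields:
opcode2:
| predicate | modifier1 | opcode3 | modifier2 | 5 4 3 | value |
|---|---|---|---|---|---|
| 0 1 0 | 0 0 | 0 | 0 | 0 0 0 | p1 ba |
| 0 1 0 | 0 0 | 1 | 0 | 0 0 0 | p1 mov |
| 0 1 0 | 0 0 | 0 | 1 | 0 0 0 | p1 ba.savelink |
| 0 1 0 | 0 0 | 1 | 1 | 0 0 0 | p1 mov |
| 0 1 0 | 0 0 | 0 | x | 0 0 1 | mov.f32 |
| 0 1 0 | 0 0 | 1 | x | 0 0 1 | p1 mov |
| 0 1 0 | 0 0 | x | x | 0 1 x | mov.f32 |
| 0 1 0 | 0 0 | x | x | 1 0 x | mov.f32 |
| 0 1 0 | 0 0 | x | x | 1 1 x | invalid |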
| predicate | modifier1 | opcode4 opcode5 opcode6 | 5 4 3 | value |
|---|---|---|---|---|
| 0 1 0 | 0 1 | 0 x x x x x | 0 0 x | mov.f32 |
| 0 1 0 | 0 1 | 1 0 x 0 0 x | 0 0 x | mov.f32 |
| 0 1 0 | 0 1 | 1 0 x 0 1 0 | 0 0 x | mov.f32 |
| 0 1 0 | 0 1 | 1 0 x 0 1 1 | 0 0 x | invalid |
| 0 1 0 | 0 1 | 1 0 x 1 0 x | 0 0 x | invalid |
| 0 1 0 | 0 1 | 1 0 x 1 1 0 | 0 0 x | invalid |
| 0 1 0 | 0 1 | 1 0 x 1 1 1 | 0 0 x | mov.f32 |
| 0 1 0 | 0 1 | 1 1 0 0 0 x | 0 0 x | mov.f32 |
| 0 1 0 | 0 1 | 1 1 0 0 1 0 | 0 0 x | mov.f32 |
| 0 1 0 | 0 1 | 1 1 0 0 1 1 | 0 0 x | invalid |
| 0 1 0 | 0 1 | 1 1 0 1 0 x | 0 0 x | invalid |
| 0 1 0 | 0 1 | 1 1 0 1 1 0 | 0 0 x | invalid |
| 0 1 0 | 0 1 | 1 1 0 1 1 1 | 0 0 x | mov.f32 |
| 0 1 0 | 0 1 | 1 1 1 x x x | 0 0 x | invalid |
| 0 1 0 | 0 1 | x x x x x x | 0 1 x | invalid |
| 0 1 0 | 0 1 | x x x x x x | 1 0 x | mov.f32 |
| 0 1 0 | 0 1 | x x x x x x | 1 1 x | invalid |
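| predicate | modifier1 | opcode3 | modifier2 | 5 4 3 | value |
|---|---|---|---|---|---|
| 0 1 0 | 1 0 | 0 | 0 | 0 0 0 | p1 ba |
| 0 1 0 | 1 0 | 1 | 0 | 0 0 0 | p1 mov |
| 0 1 0 | 1 0 | 0 | 1 | 0 0 0 | p1 ba.savelink |
| 0 1 0 | 1 0 | 1 | 1 | 0 0 0 | p1 mov |
| 0 1 0 | 1 0 | 0 | x | 0 0 1 | mov.f32 |
| 0 1 0 | 1 0 | 1 | x | 0 0 1 | p1 mov |
| 0 1 0 | 1 0 | x | x | 0 1 x | mov.f32 |
| 0 1 0 | 1 0 | x | x | 1 0 x | mov.f32 |
| 0 1 0 | 1 0 | x | x | 1 1 x | invalid |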
| predicate | modifier1 | 5 4 3 | value |
|---|---|---|---|
| 0 1 0 | 1 1 | 0 0 x | mov.f32 |
| 0 1 0 | 1 1 | 0 1 x | invalid |
| 0 1 0 | 1 1 | 1 0 x | mov.f32 |
| 0 1 0 | 1 1 | 1 1 x | invalid |
Examples:
predicate 011
Instructions:
Encoding:
7 6 5 4 3 2 1 0
opcode1 predicate
1 1 1 1 1 0 1 1

7 6 5 4 3 2 1 0
modifier1 opcode2 x x x x

7 6 5 4 3 2 1 0
sub_pred x x x x x x
opcode4 opcode5 modifier2 opcode3 x x

7 6 5 4 3 2 1 0
? ? ? ? ? ? ? ?
Notes:
The predicate does not apply to all instructions; for example, mov.f32 is listed below without a predicate prefix.
Fields:
opcode2:
| predicate | modifier1 | opcode3 | modifier2 | opcode4 opcode5 | sub_pred | 5 4 3 | value |
|---|---|---|---|---|---|---|---|
| 0 1 1 | 0 | 0 | 0 | x x x x | x x | 0 0 0 | p2 ba |
| 0 1 1 | 0 | 1 | 0 | x x x x | x x | 0 0 0 | p2 mov |
| 0 1 1 | 0 | 0 | 1 | x x x x | x x | 0 0 0 | p2 ba.savelink |
| 0 1 1 | 0 | 1 | 1 | x x x x | x x | 0 0 0 | p2 mov |
| 0 1 1 | 0 | 0 | x | x x x x | x x | 0 0 1 | mov.f32 |
| 0 1 1 | 0 | 1 | x | x x x x | x x | 0 0 1 | p2 mov |
| 0 1 1 | 0 | x | x | x x x x | x x | 0 1 x | mov.f32 |
| 0 1 1 | 0 | x | x | 0 0 0 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 0 | x | x | 0 0 1 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 0 | x | x | 0 1 0 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 0 | x | x | 0 1 1 0 | x x | 1 0 x | mov.f32 |
| 0 1 1 | 0 | x | x | 0 1 1 1 | x x | 1 0 x | invalid |
| 0 1 1 | 0 | x | x | 1 0 0 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 0 | x | x | 1 0 1 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 0 | x | x | 1 1 0 x | x x | 1 0 x | invalid |
| 0 1 1 | 0 | x | x | 1 1 1 x | x x | 1 0 x | invalid |
| 0 1 1 | 0 | x | x | x x x x | 0 0 | 1 1 x | depthf |
| 0 1 1 | 0 | x | x | x x x x | 0 1 | 1 1 x | p0 depthf |
| 0 1 1 | 0 | x | x | x x x x | 1 0 | 1 1 x | p1 depthf |
| 0 1 1 | 0 | x | x | x x x x | 1 1 | 1 1 x | !p0 depthf |
| predicate | modifier1 | opcode4 opcode5 | sub_pred | 5 4 3 | value |
|---|---|---|---|---|---|
| 0 1 1 | 1 | x x x x | x x | 0 0 x | invalid |
| 0 1 1 | 1 | x x x x | x x | 0 1 x | invalid |
| 0 1 1 | 1 | 0 0 0 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 1 | 0 0 1 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 1 | 0 1 0 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 1 | 0 1 1 0 | x x | 1 0 x | mov.f32 |
| 0 1 1 | 1 | 0 1 1 1 | x x | 1 0 x | invalid |
| 0 1 1 | 1 | 1 0 0 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 1 | 1 0 1 x | x x | 1 0 x | mov.f32 |
| 0 1 1 | 1 | 1 1 0 x | x x | 1 0 x | invalid |
| 0 1 1 | 1 | 1 1 1 x | x x | 1 0 x | invalid |
| 0 1 1 | 1 | x x x x | 0 0 | 1 1 x | depthf |
| 0 1 1 | 1 | x x x x | 0 1 | 1 1 x | p0 depthf |
| 0 1 1 | 1 | x x x x | 1 0 | 1 1 x | p1 depthf |
| 0 1 1 | 1 | x x x x | 1 1 | 1 1 x | !p0 depthf |
Examples:
predicate 100
Instructions:
Encoding:
7 6 5 4 3 2 1 0
opcode1 predicate
1 1 1 1 1 1 0 0

7 6 5 4 3 2 1 0
modifier1 opcode2 x x x x

7 6 5 4 3 2 1 0
sub_pred x x x x x
modifier2 opcode3 x x x x x x

7 6 5 4 3 2 1 0
? ? ? ? ? ? ? ?
Notes:
The predicate does not apply to all instructions; for example, mov.f32 is listed below without a predicate prefix.
Fields:
opcode2:
| predicate | modifier1 | opcode3 | modifier2 | sub_pred | 5 4 3 | value |
|---|---|---|---|---|---|---|
| 1 0 0 | 0 | 0 | 0 | x x x | 0 0 0 | p3 ba |
| 1 0 0 | 0 | 1 | 0 | x x x | 0 0 0 | p3 mov |
| 1 0 0 | 0 | 0 | 1 | x x x | 0 0 0 | p3 ba.savelink |
| 1 0 0 | 0 | 1 | 1 | x x x | 0 0 0 | p3 mov |
| 1 0 0 | 0 | 0 | x | x x x | 0 0 1 | mov.f32 |
| 1 0 0 | 0 | 1 | x | x x x | 0 0 1 | p3 mov |
| 1 0 0 | 0 | x | x | x x x | 0 1 x | mov.f32 |
| 1 0 0 | 0 | x | x | 0 0 0 | 1 0 x | mov.u32 |
| 1 0 0 | 0 | x | x | 0 0 1 | 1 0 x | p0 mov.u32 |
| 1 0 0 | 0 | x | x | 0 1 0 | 1 0 x | p1 mov.u32 |
| 1 0 0 | 0 | x | x | 0 1 1 | 1 0 x | p2 mov.u32 |
| 1 0 0 | 0 | x | x | 1 0 0 | 1 0 x | p3 mov.u32 |
| 1 0 0 | 0 | x | x | 1 0 1 | 1 0 x | !p0 mov.u32 |
| 1 0 0 | 0 | x | x | 1 1 0 | 1 0 x | !p1 mov.u32 |
| 1 0 0 | 0 | x | x | 1 1 1 | 1 0 x | Pn mov.u32 |
| 1 0 0 | 0 | x | x | x x x | 1 1 x | invalid |
| predicate | modifier1 | sub_pred | 5 4 3 | value |
|---|---|---|---|---|
| 1 0 0 | 1 | x x x | 0 0 x | invalid |
| 1 0 0 | 1 | x x x | 0 1 x | invalid |
| 1 0 0 | 1 | 0 0 0 | 1 0 x | mov.u32 |
| 1 0 0 | 1 | 0 0 1 | 1 0 x | p0 mov.u32 |
| 1 0 0 | 1 | 0 1 0 | 1 0 x | p1 mov.u32 |
| 1 0 0 | 1 | 0 1 1 | 1 0 x | p2 mov.u32 |
| 1 0 0 | 1 | 1 0 0 | 1 0 x | p3 mov.u32 |
| 1 0 0 | 1 | 1 0 1 | 1 0 x | !p0 mov.u32 |
| 1 0 0 | 1 | 1 1 0 | 1 0 x | !p1 mov.u32 |
| 1 0 0 | 1 | 1 1 1 | 1 0 x | Pn mov.u32 |
| 1 0 0 | 1 | x x x | 1 1 x | invalid |
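The sub_pred field selects a prefix for kill (predicate 001), depthf (predicate 011) and mov.u32 (predicate 100), and the three tables do not use the same value-to-prefix mapping. A minimal Python sketch that copies each mapping exactly as listed in the tables, without trying to reconcile them. The dictionary and function names are hypothetical.

```python
# sub_pred -> prefix, copied as listed in the tables (an empty string
# means the instruction is shown without a prefix).
KILL_SUB_PRED = {0b00: "", 0b01: "!p0", 0b10: "!p1", 0b11: "p0"}
DEPTHF_SUB_PRED = {0b00: "", 0b01: "p0", 0b10: "p1", 0b11: "!p0"}
MOV_U32_SUB_PRED = {
    0b000: "", 0b001: "p0", 0b010: "p1", 0b011: "p2",
    0b100: "p3", 0b101: "!p0", 0b110: "!p1", 0b111: "Pn",
}

SUB_PRED_TABLES = {
    "kill": KILL_SUB_PRED,
    "depthf": DEPTHF_SUB_PRED,
    "mov.u32": MOV_U32_SUB_PRED,
}


def with_sub_pred(mnemonic: str, sub_pred: int) -> str:
    """Render a mnemonic with the prefix its sub_pred value selects."""
    prefix = SUB_PRED_TABLES[mnemonic][sub_pred]
    return f"{prefix} {mnemonic}".strip()


print(with_sub_pred("kill", 0b11))
print(with_sub_pred("depthf", 0b11))
print(with_sub_pred("mov.u32", 0b111))
```

Keeping one dictionary per instruction makes the difference between the kill and depthf orderings explicit instead of hiding it behind a shared decoder.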
Examples:
predicate 101
Instructions:
Encoding:
7 6 5 4 3 2 1 0
opcode1 predicate
1 1 1 1 1 1 0 1

7 6 5 4 3 2 1 0
modifier1 opcode2 x x x x

7 6 5 4 3 2 1 0
modifier2 opcode3 x x x x x x

7 6 5 4 3 2 1 0
? ? ? ? ? ? ? ?
Notes:
The predicate does not apply to all instructions; for example, mov.f32 is listed below without a predicate prefix.
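Fields:
opcode2:
| predicate | modifier1 | opcode3 | modifier2 | 5 4 3 | value |
|---|---|---|---|---|---|
| 1 0 1 | 0 | 0 | 0 | 0 0 0 | !p0 ba |
| 1 0 1 | 0 | 1 | 0 | 0 0 0 | !p0 mov |
| 1 0 1 | 0 | 0 | 1 | 0 0 0 | !p0 ba.savelink |
| 1 0 1 | 0 | 1 | 1 | 0 0 0 | !p0 mov |
| 1 0 1 | 0 | 0 | x | 0 0 1 | mov.f32 |
| 1 0 1 | 0 | 1 | x | 0 0 1 | !p0 mov |
| 1 0 1 | 0 | x | x | 0 1 x | mov.f32 |
| 1 0 1 | 0 | x | x | 1 0 x | mov.f32 |
| 1 0 1 | 0 | x | x | 1 1 x | invalid |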
| predicate | modifier1 | 5 4 3 | value |
|---|---|---|---|
| 1 0 1 | 1 | 0 0 x | invalid |
| 1 0 1 | 1 | 0 1 x | invalid |
| 1 0 1 | 1 | 1 0 x | mov.f32 |
| 1 0 1 | 1 | 1 1 x | invalid |
Examples:
predicate 110
Instructions:
Encoding:
7 6 5 4 3 2 1 0
opcode1 predicate
1 1 1 1 1 1 1 0

7 6 5 4 3 2 1 0
modifier1 opcode2 x x x x

7 6 5 4 3 2 1 0
modifier2 opcode3 x x x x x x

7 6 5 4 3 2 1 0
? ? ? ? ? ? ? ?
Notes:
The predicate does not apply to all instructions; for example, mov.f32 is listed below without a predicate prefix.
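Fields:
opcode2:
| predicate | modifier1 | opcode3 | modifier2 | 5 4 3 | value |
|---|---|---|---|---|---|
| 1 1 0 | 0 | 0 | 0 | 0 0 0 | !p1 ba |
| 1 1 0 | 0 | 1 | 0 | 0 0 0 | !p1 mov |
| 1 1 0 | 0 | 0 | 1 | 0 0 0 | !p1 ba.savelink |
| 1 1 0 | 0 | 1 | 1 | 0 0 0 | !p1 mov |
| 1 1 0 | 0 | 0 | x | 0 0 1 | mov.f32 |
| 1 1 0 | 0 | 1 | x | 0 0 1 | !p1 mov |
| 1 1 0 | 0 | x | x | 0 1 x | invalid |
| 1 1 0 | 0 | x | x | 1 0 x | mov.f32 |
| 1 1 0 | 0 | x | x | 1 1 x | invalid |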
| predicate | modifier1 | 5 4 3 | value |
|---|---|---|---|
| 1 1 0 | 1 | 0 0 x | invalid |
| 1 1 0 | 1 | 0 1 x | invalid |
| 1 1 0 | 1 | 1 0 x | mov.f32 |
| 1 1 0 | 1 | 1 1 x | invalid |
Examples:
predicate 111
Instructions:
Encoding:
7 6 5 4 3 2 1 0
opcode1 predicate
1 1 1 1 1 1 1 1

7 6 5 4 3 2 1 0
modifier1 opcode2 x x x x

7 6 5 4 3 2 1 0
modifier2 opcode3 x x x x x x

7 6 5 4 3 2 1 0
? ? ? ? ? ? ? ?
Notes:
The predicate does not apply to all instructions; for example, mov.f32 is listed below without a predicate prefix.
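Fields:
opcode2:
| predicate | modifier1 | opcode3 | modifier2 | 5 4 3 | value |
|---|---|---|---|---|---|
| 1 1 1 | 0 | 0 | 0 | 0 0 0 | Pn ba |
| 1 1 1 | 0 | 1 | 0 | 0 0 0 | Pn mov |
| 1 1 1 | 0 | 0 | 1 | 0 0 0 | Pn ba.savelink |
| 1 1 1 | 0 | 1 | 1 | 0 0 0 | Pn mov |
| 1 1 1 | 0 | 0 | x | 0 0 1 | mov.f32 |
| 1 1 1 | 0 | 1 | x | 0 0 1 | Pn mov |
| 1 1 1 | 0 | x | x | 0 1 x | invalid |
| 1 1 1 | 0 | x | x | 1 0 x | mov.f32 |
| 1 1 1 | 0 | x | x | 1 1 x | invalid |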
| predicate | modifier1 | 5 4 3 | value |
|---|---|---|---|
| 1 1 1 | 1 | 0 0 x | invalid |
| 1 1 1 | 1 | 0 1 x | invalid |
| 1 1 1 | 1 | 1 0 x | mov.f32 |
| 1 1 1 | 1 | 1 1 x | invalid |
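Across the predicate 000 to 111 tables, the predicate value shows up as a prefix on ba, ba.savelink and mov, while mov.f32 is always listed bare. A minimal Python sketch of that rendering rule, with the value-to-prefix mapping read off the table rows (001 = p0 through 111 = Pn, and 000 = no prefix). The names `PREDICATE_PREFIX`, `PREDICATED` and `render` are hypothetical, and the set of predicated mnemonics is only what these tables show, not a complete list.

```python
# predicate field -> prefix, as it appears in the table rows.
PREDICATE_PREFIX = {
    0b000: "", 0b001: "p0", 0b010: "p1", 0b011: "p2",
    0b100: "p3", 0b101: "!p0", 0b110: "!p1", 0b111: "Pn",
}

# Mnemonics shown with the predicate prefix in these tables
# (assumption: other instructions may behave differently).
PREDICATED = {"ba", "ba.savelink", "mov"}


def render(predicate: int, mnemonic: str) -> str:
    """Prefix the mnemonic with its predicate when the tables show one."""
    if mnemonic not in PREDICATED:
        return mnemonic
    prefix = PREDICATE_PREFIX[predicate]
    return f"{prefix} {mnemonic}".strip()


print(render(0b101, "ba"))
print(render(0b111, "mov"))
print(render(0b111, "mov.f32"))
```

Instructions such as kill, depthf and mov.u32 take their prefix from sub_pred rather than from the predicate field, so they are deliberately left out of `PREDICATED`.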
Examples: