SGX543: Difference between revisions
Revision as of 14:52, 4 March 2018
Instruction set
General Info
Instructions appear to be 8 bytes long. Roughly speaking, the first 4 bytes contain the opcode and addressing mode, and the second 4 bytes contain the operand encoding.
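The instruction groups later in this reference are titled with ranges such as 0x08000000 - 0x10000000, each 0x08000000 wide. A minimal sketch of mapping a word to its range, assuming the ranges refer to the 32-bit word that holds the opcode and that its top 5 bits select the group. The function name `group_range` is hypothetical:

```python
GROUP_SIZE = 0x08000000  # width of each range used as a section title


def group_range(opcode_word: int) -> tuple[int, int]:
    """Return the (start, end) range that a 32-bit opcode word falls into.

    Assumption: the section titles are ranges of the word holding the opcode,
    so the top 5 bits of that word select the instruction group.
    """
    if not 0 <= opcode_word <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit value")
    index = opcode_word >> 27          # same as opcode_word // GROUP_SIZE
    start = index * GROUP_SIZE
    return start, start + GROUP_SIZE


start, end = group_range(0x0A123456)
print(f"0x{start:08X} - 0x{end:08X}")
```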
Bit encoding used in this reference:
value | meaning |
---|---|
0 | bit clear |
1 | bit set |
x | don't care |
? | unknown |
see reference |
Predicates
Not sure about predicates yet, but they are used to mask execution of certain instructions.
Notation is the following:
<predicate> <instruction>
For example:
!p0 mad.f32
To keep the number of examples small, they are listed without predicates.
It is assumed that every predicate applies to every instruction in a group unless stated otherwise.
Operands
Different types of operands exist. They are described in further sections:
Instructions may have up to four operands specified.
In this documentation they will be encoded as:
<op0> <op1> <op2> <op3>
Operand 0
Destination operand 0 can be encoded in different ways.
Usually the following fields are used to encode it:
- alt_opt0 - alter opt0. this bit can be combined with opt0 to produce the following modes for op0:
alt_opt0 | opt0 [1] | opt0 [0] | value | details |
---|---|---|---|---|
1 | 0 | 0 | sa | |
1 | 0 | 1 | {} | op0 encodes CNST6. applicable only with swizzles. |
1 | 1 | 0 | index<N> | op0 encodes IDX6 |
1 | 1 | 1 | index2 mode | op0 encodes RIO6. |
- opt0 - type of operand op0, encoded with Register Selector Indexable RSI2.
or selects other modes for encoding op0 if specified in alt_opt0.
- op0 - encoded with Register R6.
or with Register Index Offset RIO6 using index1 mode if specified in opt0.
or with Constant CNST6 if specified in alt_opt0.
or with Index IDX6 if specified in alt_opt0.
or with Register Index Offset RIO6 using index2 mode if specified in alt_opt0.
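The mode selection described above can be sketched as a pair of lookups. This is a sketch under the assumption that alt_opt0 is a single bit and opt0 is a 2-bit value read most significant bit first. The names `RSI2`, `ALT_OP0` and `op0_mode` are hypothetical:

```python
# Register types selected by opt0 when alt_opt0 is clear (RSI2 encoding).
RSI2 = ["r", "o", "pa", "index1 mode (RIO6)"]

# Modes selected by opt0 when alt_opt0 is set.
ALT_OP0 = ["sa", "constant (CNST6)", "index<N> (IDX6)", "index2 mode (RIO6)"]


def op0_mode(alt_opt0: int, opt0: int) -> str:
    """Describe how the 6-bit op0 field is interpreted."""
    if alt_opt0 not in (0, 1) or not 0 <= opt0 <= 3:
        raise ValueError("alt_opt0 is 1 bit, opt0 is 2 bits")
    return ALT_OP0[opt0] if alt_opt0 else RSI2[opt0]


print(op0_mode(0, 0b10))
print(op0_mode(1, 0b01))
```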
Operand N
Source operand <N> can be encoded in different ways.
Usually the following fields are used to encode it:
- alt_opt<N> - alter opt<N>. this bit can be combined with opt<N> to produce the following modes for op<N>:
alt_opt<N> | opt<N> [1] | opt<N> [0] | value | details |
---|---|---|---|---|
1 | 0 | 0 | index1 mode | op<N> encodes RIO6. |
1 | 0 | 1 | {} | op<N> encodes CNST6. applicable only with swizzles. |
1 | 1 | 0 | immediate | op<N> encodes IMM6. |
1 | 1 | 1 | index2 mode | op<N> encodes RIO6. |
- opt<N> - type of operand op<N>, encoded with Register Selector RS2.
or selects other modes for encoding op<N> if specified in alt_opt<N>.
- op<N> - encoded with Register R6.
or with Register Index Offset RIO6 using index1 mode if specified in alt_opt<N>.
or with CNST6 if specified in alt_opt<N>.
or IMM6 if specified in alt_opt<N>.
or with Register Index Offset RIO6 using index2 mode if specified in alt_opt<N>.
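For source operands the same selection can be sketched so that it also reports which 6-bit encoding the op<N> field uses. It assumes alt_opt<N> is a single bit and opt<N> is a 2-bit value read most significant bit first. The names `RS2`, `ALT_OPN` and `opn_mode` are hypothetical:

```python
# Register types selected by opt<N> when alt_opt<N> is clear (RS2 encoding).
RS2 = ["r", "o", "pa", "sa"]

# (mode, encoding of op<N>) selected by opt<N> when alt_opt<N> is set.
ALT_OPN = [
    ("index1 mode", "RIO6"),
    ("constant", "CNST6"),
    ("immediate", "IMM6"),
    ("index2 mode", "RIO6"),
]


def opn_mode(alt_opt: int, opt: int) -> tuple[str, str]:
    """Return (mode or register type, encoding used by the op<N> field)."""
    if alt_opt not in (0, 1) or not 0 <= opt <= 3:
        raise ValueError("alt_opt<N> is 1 bit, opt<N> is 2 bits")
    if alt_opt:
        return ALT_OPN[opt]
    return RS2[opt], "R6"


print(opn_mode(0, 0b11))
print(opn_mode(1, 0b10))
```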
Registers
- pa - primary attribute register. 32 bit long.
- sa - secondary attribute register. 32 bit long.
- o - output register. 32 bit long.
- r - temporary register. 32 bit long.
- i - internal register. 128 bit long.
Register Selector RS2
This encoding uses 2 bits to encode register type.
selector is encoded as:
1 | 0 | meaning |
---|---|---|
0 | 0 | r |
0 | 1 | o |
1 | 0 | pa |
1 | 1 | sa |
Note that internal registers are not encoded - they are reserved in Register R6
Register Selector Indexable RSI2
This encoding uses 2 bits to encode register type.
selector is encoded as:
1 | 0 | meaning |
---|---|---|
0 | 0 | r |
0 | 1 | o |
1 | 0 | pa |
1 | 1 | index<N> mode |
When index<N> mode is used - there has to be another field that encodes index expression with Register Index Offset RIO6
The index expression is built as:
<reg>[index1 * 2 + <offset>]
Example:
r[index1 * 2 + 8]
Register R6
This encoding uses 6 bits to encode register index.
register is encoded as:
5 | 4 | 3 | 2 | 1 | 0 | index |
---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 1 | 2 |
... | ... | ... | ... | ... | ... | |
1 | 1 | 1 | 0 | 1 | 0 | 116 |
1 | 1 | 1 | 0 | 1 | 1 | 118 |
1 | 1 | 1 | 1 | 0 | 0 | i0 (reserved) |
1 | 1 | 1 | 1 | 0 | 1 | i1 (reserved) |
1 | 1 | 1 | 1 | 1 | 0 | i2 (reserved) |
1 | 1 | 1 | 1 | 1 | 1 | i3 (reserved) |
index is calculated as: value * 2
Register expression is built as:
<reg><index>
Example:
sa68
Specific type of register can be selected with Register Selector RS2
For destination operand op0 specific type of register can be selected with Register Selector Indexable RSI2
Last 4 values are reserved for internal registers i0, i1, i2, i3
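A minimal sketch of building a register expression from an RS2 selector and an R6 value, following the tables above. It assumes the selector is ignored when the R6 value falls in the reserved range for internal registers. The function name `format_register` is hypothetical:

```python
RS2 = ["r", "o", "pa", "sa"]  # register type for each 2-bit selector value


def format_register(selector: int, r6: int) -> str:
    """Build a register expression such as 'sa68' from RS2 and R6 values."""
    if not 0 <= selector <= 3 or not 0 <= r6 <= 63:
        raise ValueError("selector is 2 bits, register is 6 bits")
    if r6 >= 60:
        # Last 4 values are reserved for internal registers i0..i3.
        # Assumption: the selector plays no role in this case.
        return f"i{r6 - 60}"
    return f"{RS2[selector]}{r6 * 2}"  # index is calculated as value * 2


print(format_register(3, 34))
```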
Register RI2
This encoding uses 2 bits to encode internal register.
1 | 0 | value |
---|---|---|
0 | 0 | i0 |
0 | 1 | i1 |
1 | 0 | i2 |
1 | 1 | i3 |
Register Index Offset RIO6
bits | field |
---|---|
5-4 | rt |
3-0 | offset |
rt is encoded as Register Selector RS2
offset is calculated as: value * 2
offset is encoded as:
3 | 2 | 1 | 0 | offset |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 2 |
... | ... | ... | ... | ... |
1 | 1 | 1 | 0 | 28 |
1 | 1 | 1 | 1 | 30 |
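A minimal sketch of decoding RIO6 into the index expression shown in the RSI2 section. It assumes rt occupies bits 5-4 and offset bits 3-0, and that index2 mode uses the same expression form with a different index name. The function name `format_rio6` and the `index_name` parameter are hypothetical:

```python
RS2 = ["r", "o", "pa", "sa"]  # rt is encoded as Register Selector RS2


def format_rio6(rio6: int, index_name: str = "index1") -> str:
    """Build an expression such as 'r[index1 * 2 + 8]' from a 6-bit RIO6 value."""
    if not 0 <= rio6 <= 63:
        raise ValueError("RIO6 is 6 bits")
    rt = (rio6 >> 4) & 0b11          # bits 5-4: register type
    offset = (rio6 & 0b1111) * 2     # bits 3-0: offset is value * 2
    return f"{RS2[rt]}[{index_name} * 2 + {offset}]"


print(format_rio6(0b000100))
```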
Immediates
Immediate IMM6
Some operands may act as immediate values which are encoded using 6 bits.
Indexes
It is not yet clear what the index<N> expression means.
It can be used in 2 places:
- Alternative mode for operand 0
Index IDX6
Uses 6 bits to encode the index<N> expression.
Index is calculated as value * 2. Max index is 126.
Example:
mad.f32 index24, r0, r0, r0
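The calculation above amounts to a single multiplication. A small sketch, with the hypothetical function name `format_idx6`:

```python
def format_idx6(value: int) -> str:
    """Build an index<N> expression from a 6-bit IDX6 value."""
    if not 0 <= value <= 63:
        raise ValueError("IDX6 is 6 bits")
    return f"index{value * 2}"  # index is calculated as value * 2


print(format_idx6(12))
```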
Constants
Constant CNST6
Some operands may act as constant values which are encoded using 6 bits.
Constants are taken from table below.
Constants differ in 32 and 16 bit mode.
32 bit mode
bank 1 is used for channel 1 of op0
for other operands - only bank 0 is used for each channel
f32 mode - bank0:
|
f32 mode - bank1:
|
16 bit mode
Table for 16 bit mode does not have accurate values.
bank 1, bank 2, bank 3 is used for channel 1, channel 2, channel 3 of op0
for other operands - only bank 0 is used for each channel
f16 mode - bank 0:
|
f16 mode - bank 1:
|
f16 mode - bank 2:
|
f16 mode - bank 3:
|
Swizzles
Swizzle notation
There are 2 notations:
- text notation
- constant notation
When some of channels have constants - text notation is used
mul.f32 r0.xyzw, r0.h1xx, r0.xxxx
When all channels have constants - constant notation is used
mul.f32 r0.xyzw, {0.5, 1, 1, 0.5}, r0.xxxx
When channel is masked in text notation it is marked as -
mul.f32 r0.-y-w, r0.-x-x, r0.-x-x
When channel is masked in constant notation it is replaced with zero
mul.f32 r0.-y-w, {0, 1, 0, 0.5}, r0.-x-x
Register Swizzle RSWZ2
This encoding uses 2 bits to encode the swizzle.
Usually the combinations are additionally controlled by 1 or more bits called swz_alt_op
This type of swizzling does not allow precise control on each channel as opposed to RSWZ3
Usually there is a predefined table of swizzles.
Swizzle expression is built as:
<reg><index>.<swizzle>
Example:
r22.x
Register Swizzle RSWZ3
This encoding uses 3 bits to encode the swizzle of a single channel.
channel is encoded as:
2 | 1 | 0 | text notation | constant notation |
---|---|---|---|---|
0 | 0 | 0 | x | x |
0 | 0 | 1 | y | y |
0 | 1 | 0 | z | z |
0 | 1 | 1 | w | w |
1 | 0 | 0 | 0 | 0.0 |
1 | 0 | 1 | 1 | 1.0 |
1 | 1 | 0 | 2 | 2.0 |
1 | 1 | 1 | h | 0.5 |
swizzle expression is built as:
<reg><index>.<swizzle>
Example:
r22.x
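A sketch of turning per-channel RSWZ3 values into the two notations from the Swizzle notation section. It assumes constant notation is chosen only when every channel value is a constant, and it prints constants the way the examples do (0, 1, 2, 0.5). The names `TEXT`, `CONSTANT` and `format_swizzle` are hypothetical:

```python
TEXT = "xyzw012h"                                # RSWZ3 value -> text notation
CONSTANT = {4: "0", 5: "1", 6: "2", 7: "0.5"}    # RSWZ3 value -> constant


def format_swizzle(channels, mask=None):
    """Format RSWZ3 channel values; a False entry in mask marks a masked channel."""
    if mask is None:
        mask = [True] * len(channels)
    if len(mask) != len(channels) or any(not 0 <= c <= 7 for c in channels):
        raise ValueError("expected one 3-bit value and one mask flag per channel")
    if all(c >= 4 for c in channels):
        # Every channel is a constant: constant notation, masked channels become 0.
        parts = [CONSTANT[c] if keep else "0" for c, keep in zip(channels, mask)]
        return "{" + ", ".join(parts) + "}"
    # Otherwise text notation, masked channels are marked with "-".
    return "".join(TEXT[c] if keep else "-" for c, keep in zip(channels, mask))


print(format_swizzle([7, 5, 0, 0]))
print(format_swizzle([7, 5, 5, 7]))
```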
Modifier and dest data format
At the moment it is not known which of the data format fields is dest and which is source.
This is the reason why term modifier is mixed with term dest data format.
0x00000000 - 0x08000000
Instructions
mad
Encoding
Higher 4 bytes
|
|
|
|
Lower 4 bytes
|
|
|
|
Notes
- x bits do not affect instruction or operands. might affect something else?
- what do index<N> mean. are these registers or something?
- looks like there is functionality to switch sign of index expression
- probably can move swizzle masking to generic section? if other instructions use same encodings.
Fields - instruction
data_format:
|
predicate:
|
Fields - operands
- swz_alt_op1 - alter op1 swizzle. consult Swizzles f32 or Swizzles f16.
- alt_opt0 - consult Operand 0
- abs_op1 - add abs modifier to op1. example:
abs(pa38)
- alt_opt2 - consult Operand N.
- alt_opt3 - consult Operand N.
- swz_alt_op3 - alter op3 swizzle. consult Swizzles f32 or Swizzles f16.
- op3_swz - op3 swizzle encoded with Register Swizzle RSWZ2.
- swz_alt_op2 - alter op2 swizzle. consult Swizzles f32 or Swizzles f16.
- swz_mask16 - mask swizzles. consult Swizzle masking.
- swz_mask32 - mask swizzles. consult Swizzle masking.
- swz_en - enables swizzling and controls swizzle masking. consult Swizzle masking.
- abs_op2 - add abs modifier to op2. example:
abs(pa20)
- neg_op2 - negate op2. example:
-pa86
- abs_op3 - add abs modifier to op3. example:
abs(r20)
- neg_op3 - negate op3. example:
-r86
- opt1 - when enabled - selects pa register type. when disabled - selects r register type.
- opt0 - consult Operand 0.
- opt2 - consult Operand N.
- opt3 - consult Operand N.
- op0 - consult Operand 0.
- op2_swz - op2 swizzle encoded with Register Swizzle RSWZ2. consult Swizzles f32 or Swizzles f16.
- op1_swz - op1 swizzle encoded with Register Swizzle RSWZ2. consult Swizzles f32 or Swizzles f16.
- op1 - encoded with Register R6
- op2 - consult Operand N.
- op3 - consult Operand N.
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
- alt_opt0, opt0, op0
- alt_opt2, opt2, op2
- alt_opt3, opt3, op3
Float constants can only be used when swizzling is enabled for particular operand. Consider checking sections Swizzles_f32 and Swizzles_f16.
Constants are taken from tables Constants.
Constants differ between 32 and 16 bit mode.
Swizzle masking
Masking is controlled by control bits:
- control bits: swz_en, swz_mask32, swz_mask16
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
Encoding used in masking table:
value | meaning |
---|---|
0 | channel not selected |
1 | channel selected |
x | channel masked |
Masking table 32 bit mode:
swz_mask32 | swz_en | ch0 | ch1 |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 1 | 1 | 0 |
1 | 0 | x | 1 |
1 | 1 | 1 | 1 |
Masking table 16 bit mode:
swz_mask16 | swz_en | ch0 | ch1 | ch2 | ch3 |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 |
0 | 1 | 1 | 1 | 0 | 0 |
1 | 0 | x | x | 1 | 1 |
1 | 1 | 1 | 1 | 1 | 1 |
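The two masking tables above can be sketched as lookups keyed by the control bits. Each result lists the channel states from ch0 upwards, using the same 0 / 1 / x encoding as the tables. The names `MASK_32`, `MASK_16` and `mad_channel_mask` are hypothetical:

```python
# Keys are (swz_mask32, swz_en); values are the states of ch0, ch1.
MASK_32 = {(0, 0): "00", (0, 1): "10", (1, 0): "x1", (1, 1): "11"}

# Keys are (swz_mask16, swz_en); values are the states of ch0..ch3.
MASK_16 = {(0, 0): "0000", (0, 1): "1100", (1, 0): "xx11", (1, 1): "1111"}


def mad_channel_mask(swz_mask: int, swz_en: int, is_16bit: bool) -> str:
    """Look up channel states: '0' not selected, '1' selected, 'x' masked."""
    table = MASK_16 if is_16bit else MASK_32
    return table[(swz_mask, swz_en)]


print(mad_channel_mask(1, 0, is_16bit=False))
print(mad_channel_mask(0, 1, is_16bit=True))
```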
Swizzles f32
Swizzles of operand 1, operand 2 and operand 3 can not be precisely controlled and have predefined combinations.
Swizzles are controlled with bits:
- swizzle fields: op1_swz, op2_swz, op3_swz
- control bits: swz_alt_op1, swz_alt_op2, swz_alt_op3
Swizzles of operand 0 can not be controlled.
operand 0 | operand 1 | operand 2 | operand 3 |
---|---|---|---|
|
|
|
|
Swizzles f16
Swizzles of operand 1, operand 2 and operand 3 can not be precisely controlled and have predefined combinations.
Swizzles are controlled with bits:
- swizzle fields: op1_swz, op2_swz, op3_swz
- control bits: swz_alt_op1, swz_alt_op2, swz_alt_op3
Swizzles of operand 0 can not be controlled.
operand 0 | operand 1 | operand 2 | operand 3 |
---|---|---|---|
|
|
|
|
Examples
mad.f32 r0, r0, r0, r0
mad.f16 r0, r0, r0, r0
0x08000000 - 0x10000000
Instructions
mul.f32, add.f32, frc.f32, dsx.f32, dsy.f32, min.f32, max.f32, dot.f32
Encoding
Higher 4 bytes
|
|
|
|
Lower 4 bytes
|
|
|
|
Notes
- do predicates apply to all instructions, so that opcode2 is now found?
- test dot instruction and probably describe separately?
Fields - instruction
predicate:
|
opcode2:
|
Fields - operands
- op1_swz_c3x - operand 1 swizzling channel 3 bit 1, 2. encoded as RSWZ3. consult Swizzles - operand 1.
- alt_opt0 - consult Operand 0.
- op1_swz_c30 - operand 1 swizzling channel 3 bit 0. encoded as RSWZ3. consult Swizzles - operand 1.
- alt_opt1 - consult Operand N.
- alt_opt2 - consult Operand N.
- swz_alt_op2 - change op2 swizzle. consult Swizzles - operand 2.
- op2_swz - op2 swizzle encoded with Register Swizzle RSWZ2. consult Swizzles - operand 2.
- swz_mask3 - mask swizzle. consult Swizzle masking.
- swz_mask2 - mask swizzle. consult Swizzle masking.
- swz_mask1 - mask swizzle. consult Swizzle masking.
- swz_en - enables usage of swizzling. consult Swizzle masking.
- abs_op1 - add abs modifier to op1.
- neg_op1 - negate op1.
- abs_op2 - add abs modifier to op2.
- op1_swz_c2x - operand 1 swizzling channel 2 bit 1, 2. encoded as RSWZ3. consult Swizzles - operand 1.
- opt0 - consult Operand 0.
- opt1 - consult Operand N.
- opt2 - consult Operand N.
- op0 - consult Operand 0.
- op1_swz_c20 - operand 1 swizzling channel 2 bit 0. encoded as RSWZ3. consult Swizzles - operand 1.
- op1_swz_c1 - operand 1 swizzling channel 1. encoded as RSWZ3. consult Swizzles - operand 1.
- op1_swz_c0 - operand 1 swizzling channel 0. encoded as RSWZ3. consult Swizzles - operand 1.
- op1 - consult Operand N.
- op2 - consult Operand N.
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
- alt_opt0, opt0, op0
- alt_opt1, opt1, op1
- alt_opt2, opt2, op2
Float constants can only be used when swizzling is enabled for particular operand. Consider checking sections Swizzle_masking.
Constants are taken from tables Constants.
Constants correspond to table for 32 bit mode.
Swizzle masking
Masking is controlled by control bits:
- control bits: swz_en, swz_mask1, swz_mask2, swz_mask3
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
dot.f32 instruction has explicit swizzling in operand 1 and operand 2 so masking does not apply to these operands. number of channels is always 3.
Encoding used in masking table:
value | meaning |
---|---|
0 | channel not selected |
1 | channel selected |
x | channel masked |
Masking table:
swz_mask3 | swz_mask2 | swz_mask1 | swz_en | ch0 | ch1 | ch2 | ch3 |
---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
0 | 0 | 1 | 0 | x | 1 | 0 | 0 |
0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 |
0 | 1 | 0 | 0 | x | x | 1 | 0 |
0 | 1 | 0 | 1 | 1 | x | 1 | 0 |
0 | 1 | 1 | 0 | x | 1 | 1 | 0 |
0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
1 | 0 | 0 | 0 | x | x | x | 1 |
1 | 0 | 0 | 1 | 1 | x | x | 1 |
1 | 0 | 1 | 0 | x | 1 | x | 1 |
1 | 0 | 1 | 1 | 1 | 1 | x | 1 |
1 | 1 | 0 | 0 | x | x | 1 | 1 |
1 | 1 | 0 | 1 | 1 | x | 1 | 1 |
1 | 1 | 1 | 0 | x | 1 | 1 | 1 |
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
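The masking table appears to follow a simple rule: a channel is selected when its own control bit is set, masked when its bit is clear but the bit of any higher channel is set, and not selected otherwise. This is an observation about the table, not a confirmed description of the hardware. A sketch, with the hypothetical function name `channel_states`:

```python
def channel_states(swz_en: int, swz_mask1: int, swz_mask2: int, swz_mask3: int) -> str:
    """Return states of ch0..ch3: '0' not selected, '1' selected, 'x' masked."""
    bits = [swz_en, swz_mask1, swz_mask2, swz_mask3]  # control bit of ch0..ch3
    states = []
    for ch, bit in enumerate(bits):
        if bit:
            states.append("1")          # own control bit set
        elif any(bits[ch + 1:]):
            states.append("x")          # a higher channel is enabled
        else:
            states.append("0")
    return "".join(states)


print(channel_states(swz_en=1, swz_mask1=0, swz_mask2=1, swz_mask3=0))
```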
Swizzles - operand 0
Swizzles of operand 0 can not be controlled and have predefined combinations described below:
value |
---|
xyzw |
Each channel can be masked with control bits. Masking is described in Swizzle_masking.
- swz_en, swz_mask1, swz_mask2, swz_mask3
Swizzles - operand 1
Each channel of operand 1 can be precisely controlled with swizzle fields encoded as RSWZ3.
- op1_swz_c0, op1_swz_c1, op1_swz_c20, op1_swz_c2x, op1_swz_c30, op1_swz_c3x
Each channel can be masked with control bits. Masking is described in Swizzle_masking.
- swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f32 instruction
Swizzles - operand 2
Swizzles of operand 2 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
- op2_swz, swz_alt_op2
Each channel can be masked with control bits. Masking is described in Swizzle_masking.
- swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f32 instruction
swz_alt_op2 | op2_swz | value | ||
---|---|---|---|---|
0 | 0 | 0 | 0 | xxxx |
0 | 0 | 0 | 1 | yyyy |
0 | 0 | 1 | 0 | zzzz |
0 | 0 | 1 | 1 | wwww |
0 | 1 | 0 | 0 | xyzw |
0 | 1 | 0 | 1 | yzww |
0 | 1 | 1 | 0 | xyzz |
0 | 1 | 1 | 1 | xxyz |
1 | 0 | 0 | 0 | xyxy |
1 | 0 | 0 | 1 | xywz |
1 | 0 | 1 | 0 | zxyw |
1 | 0 | 1 | 1 | zwzw |
1 | 1 | 0 | 0 | yzxz |
1 | 1 | 0 | 1 | xxyy |
1 | 1 | 1 | 0 | xzww |
1 | 1 | 1 | 1 | xyz1 |
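The table above can be sketched as a list indexed by the four bit columns read as one value, most significant bit first. How those four columns divide between swz_alt_op2 and op2_swz is not clear from the table header, so the sketch takes the combined value. The names `OP2_SWIZZLES` and `op2_swizzle` are hypothetical:

```python
# Predefined operand 2 swizzles, indexed by the 4 bit columns of the table.
OP2_SWIZZLES = [
    "xxxx", "yyyy", "zzzz", "wwww",
    "xyzw", "yzww", "xyzz", "xxyz",
    "xyxy", "xywz", "zxyw", "zwzw",
    "yzxz", "xxyy", "xzww", "xyz1",
]


def op2_swizzle(bits: int) -> str:
    """Look up the predefined swizzle for a combined 4-bit value."""
    if not 0 <= bits <= 15:
        raise ValueError("expected a 4-bit value")
    return OP2_SWIZZLES[bits]


print(op2_swizzle(0b0100))
```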
Examples
mul.f32 r0, r0, r0
add.f32 r0, r0, r0
frc.f32 r0, r0, r0
dsx.f32 r0, r0, r0
dsy.f32 r0, r0, r0
min.f32 r0, r0, r0
max.f32 r0, r0, r0
dot.f32 r0, r0.xxxx, r0.xxxx
0x10000000 - 0x18000000
Instructions
mul.f16, add.f16, frc.f16, dsx.f16, dsy.f16, min.f16, max.f16, dot.f16
Encoding
Higher 4 bytes
|
|
|
|
Lower 4 bytes
|
|
|
|
Notes
- do predicates apply to all instructions, so that opcode2 is now found?
- test dot instruction and probably describe separately?
Fields - instruction
predicate:
|
opcode2:
|
Fields - operands
- op1_swz_c3x - operand 1 swizzling channel 3 bit 1, 2. encoded as RSWZ3. consult Swizzles - operand 1.
- alt_opt0 - consult Operand 0.
- op1_swz_c30 - operand 1 swizzling channel 3 bit 0. encoded as RSWZ3. consult Swizzles - operand 1.
- alt_opt1 - consult Operand N.
- alt_opt2 - consult Operand N.
- swz_alt_op2 - change op2 swizzle. consult Swizzles - operand 2.
- op2_swz - op2 swizzle encoded with Register Swizzle RSWZ2. consult Swizzles - operand 2.
- swz_mask3 - mask swizzle. consult Swizzle masking.
- swz_mask2 - mask swizzle. consult Swizzle masking.
- swz_mask1 - mask swizzle. consult Swizzle masking.
- swz_en - enables usage of swizzling. consult Swizzle masking.
- abs_op1 - add abs modifier to op1.
- neg_op1 - negate op1.
- abs_op2 - add abs modifier to op2.
- op1_swz_c2x - operand 1 swizzling channel 2 bit 1, 2. encoded as RSWZ3. consult Swizzles - operand 1.
- opt0 - consult Operand 0.
- opt1 - consult Operand N.
- opt2 - consult Operand N.
- op0 - consult Operand 0.
- op1_swz_c20 - operand 1 swizzling channel 2 bit 0. encoded as RSWZ3. consult Swizzles - operand 1.
- op1_swz_c1 - operand 1 swizzling channel 1. encoded as RSWZ3. consult Swizzles - operand 1.
- op1_swz_c0 - operand 1 swizzling channel 0. encoded as RSWZ3. consult Swizzles - operand 1.
- op1 - consult Operand N.
- op2 - consult Operand N.
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
- alt_opt0, opt0, op0
- alt_opt1, opt1, op1
- alt_opt2, opt2, op2
Float constants can only be used when swizzling is enabled for particular operand. Consider checking sections Swizzle_masking.
Constants are taken from tables Constants.
Constants correspond to table for 16 bit mode.
Swizzle masking
Masking is controlled by control bits:
- control bits: swz_en, swz_mask1, swz_mask2, swz_mask3
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
dot.f16 instruction has explicit swizzling in operand 1 and operand 2 so masking does not apply to these operands. number of channels is always 3.
Encoding used in masking table:
value | meaning |
---|---|
0 | channel not selected |
1 | channel selected |
x | channel masked |
Masking table:
swz_mask3 | swz_mask2 | swz_mask1 | swz_en | ch0 | ch1 | ch2 | ch3 |
---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
0 | 0 | 1 | 0 | x | 1 | 0 | 0 |
0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 |
0 | 1 | 0 | 0 | x | x | 1 | 0 |
0 | 1 | 0 | 1 | 1 | x | 1 | 0 |
0 | 1 | 1 | 0 | x | 1 | 1 | 0 |
0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
1 | 0 | 0 | 0 | x | x | x | 1 |
1 | 0 | 0 | 1 | 1 | x | x | 1 |
1 | 0 | 1 | 0 | x | 1 | x | 1 |
1 | 0 | 1 | 1 | 1 | 1 | x | 1 |
1 | 1 | 0 | 0 | x | x | 1 | 1 |
1 | 1 | 0 | 1 | 1 | x | 1 | 1 |
1 | 1 | 1 | 0 | x | 1 | 1 | 1 |
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Swizzles - operand 0
Swizzles of operand 0 can not be controlled and have predefined combinations described below:
value |
---|
xyzw |
Each channel can be masked with control bits. Masking is described in Swizzle_masking.
- swz_en, swz_mask1, swz_mask2, swz_mask3
Swizzles - operand 1
Each channel of operand 1 can be precisely controlled with swizzle fields encoded as RSWZ3.
- op1_swz_c0, op1_swz_c1, op1_swz_c20, op1_swz_c2x, op1_swz_c30, op1_swz_c3x
Each channel can be masked with control bits. Masking is described in Swizzle_masking.
- swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f16 instruction
Swizzles - operand 2
Swizzles of operand 2 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
- op2_swz, swz_alt_op2
Each channel can be masked with control bits. Masking is described in Swizzle_masking.
- swz_en, swz_mask1, swz_mask2, swz_mask3
masking does not apply for dot.f16 instruction
swz_alt_op2 | op2_swz | value | ||
---|---|---|---|---|
0 | 0 | 0 | 0 | xxxx |
0 | 0 | 0 | 1 | yyyy |
0 | 0 | 1 | 0 | zzzz |
0 | 0 | 1 | 1 | wwww |
0 | 1 | 0 | 0 | xyzw |
0 | 1 | 0 | 1 | yzww |
0 | 1 | 1 | 0 | xyzz |
0 | 1 | 1 | 1 | xxyz |
1 | 0 | 0 | 0 | xyxy |
1 | 0 | 0 | 1 | xywz |
1 | 0 | 1 | 0 | zxyw |
1 | 0 | 1 | 1 | zwzw |
1 | 1 | 0 | 0 | yzxz |
1 | 1 | 0 | 1 | xxyy |
1 | 1 | 1 | 0 | xzww |
1 | 1 | 1 | 1 | xyz1 |
Examples
mul.f16 r0, r0, r0
add.f16 r0, r0, r0
frc.f16 r0, r0, r0
dsx.f16 r0, r0, r0
dsy.f16 r0, r0, r0
min.f16 r0, r0, r0
max.f16 r0, r0, r0
dot.f16 r0, r0.xxxx, r0.xxxx
0x18000000 - 0x20000000
Instructions
dot.f32, mad.f32
Encoding - dot.f32
Higher 4 bytes
|
|
|
|
Lower 4 bytes
|
|
|
|
Encoding - mad.f32
Higher 4 bytes
|
|
|
|
Lower 4 bytes
|
|
|
|
Notes
- Only first constant for op0 can be controlled. Other channels have some predefined values. This has to be explained.
- Any way to use constants for op2?
Fields - instruction
opcode2:
|
predicate:
|
dot.f32
Fields - operands
- c3_en - enable channel 3 for swizzles for op1 and op2. by default dot.f32 has only 3 channels for op1 and op2. consult Swizzles - operand 1.
- alt_opt0 - consult Operand 0.
- alt_opt1 - consult Operand N.
- abs_op2 - add abs modifier to op2.
- swz_en_strange1 - force overrides swizzle masking with single channel.
- swz_en_strange0 - force overrides swizzle masking with single channel.
- swz_mask3 - mask swizzle. consult Swizzle masking.
- swz_mask2 - mask swizzle. consult Swizzle masking.
- swz_mask1 - mask swizzle. consult Swizzle masking.
- swz_en - enables usage of swizzling. consult Swizzle masking.
- neg_op1 - negate op1.
- abs_op1 - add abs modifier to op1.
- opt0 - consult Operand 0.
- opt1 - consult Operand N.
- op2i - encoded with RI2
- op0 - consult Operand 0.
- swz_alt_op2 - change op2 swizzle. consult Swizzles - operand 2.
- op2_swz - op2 swizzle encoded with Register Swizzle RSWZ2. consult Swizzles - operand 2.
- op1_swz_c3 - operand 1 swizzling channel 3. encoded as RSWZ3. consult Swizzles - operand 1.
- op1_swz_c2 - operand 1 swizzling channel 2. encoded as RSWZ3. consult Swizzles - operand 1.
- op1_swz_c1 - operand 1 swizzling channel 1. encoded as RSWZ3. consult Swizzles - operand 1.
- op1_swz_c0 - operand 1 swizzling channel 0. encoded as RSWZ3. consult Swizzles - operand 1.
- op1 - consult Operand N.
Constants
Specific operand may be used as float constant. This can be achieved with following groups of bits:
- alt_opt0, opt0, op0
- alt_opt1, opt1, op1
Float constants can only be used when swizzling is enabled for particular operand. Consider checking sections Swizzle_masking.
Constants are taken from tables Constants.
Constants correspond to table for 32 bit mode.
Swizzle masking
Masking is controlled by control bits:
- control bits: swz_en, swz_mask1, swz_mask2, swz_mask3
Each channel can be masked with control bits. Combinations of control bits produce the following masking table.
dot.f32 instruction has explicit swizzling in operand 1 and operand 2 so masking does not apply to these operands. number of channels is 3 or 4 depending on c3_en.
Encoding used in masking table:
value | meaning |
---|---|
0 | channel not selected |
1 | channel selected |
x | channel masked |
Masking table operand 0:
swz_mask3 | swz_mask2 | swz_mask1 | swz_en | ch0 | ch1 | ch2 | ch3 |
---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
0 | 0 | 1 | 0 | x | 1 | 0 | 0 |
0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 |
0 | 1 | 0 | 0 | x | x | 1 | 0 |
0 | 1 | 0 | 1 | 1 | x | 1 | 0 |
0 | 1 | 1 | 0 | x | 1 | 1 | 0 |
0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
1 | 0 | 0 | 0 | x | x | x | 1 |
1 | 0 | 0 | 1 | 1 | x | x | 1 |
1 | 0 | 1 | 0 | x | 1 | x | 1 |
1 | 0 | 1 | 1 | 1 | 1 | x | 1 |
1 | 1 | 0 | 0 | x | x | 1 | 1 |
1 | 1 | 0 | 1 | 1 | x | 1 | 1 |
1 | 1 | 1 | 0 | x | 1 | 1 | 1 |
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Swizzles - operand 0
Swizzles of operand 0 can not be controlled and have predefined combinations described below:
value |
---|
xyzw |
Each channel can be masked with control bits. Masking is described in Swizzle_masking.
- swz_en, swz_mask1, swz_mask2, swz_mask3
Swizzles - operand 1
Each channel of operand 1 can be precisely controlled with swizzle fields encoded as RSWZ3.
- op1_swz_c0, op1_swz_c1, op1_swz_c2, op1_swz_c3
Channel 3 can be enabled with bit:
- c3_en
Swizzles - operand 2
Swizzles of operand 2 can not be precisely controlled and have predefined combinations described below and controlled by swizzle fields:
- op2_swz, swz_alt_op2
Channel 3 can be enabled with bit:
- c3_en
3 channels:
|
4 channels:
|
mad.f32
Fields - operands
- swz_alt_op3_2
- alt_opt0
- alt_opt1
- abs_op2
- op0_strange1
- op0_strange0
- swz_mask3
- swz_mask2
- swz_mask1
- swz_en
- neg_op1
- abs_op1
- neg_op3
- abs_op3
- swz_alt_op2_2
- opt0
- opt1
- op2i
- op0
- swz_alt_op2_x
- op2_swz
- swz_alt_op3_x
- op3_swz
- op3i
- swz_alt_op1
- op1_swz
- op1
Examples
dot.f32 r0, r0.xxx, i0.xxx
mad.f32 r0, r0, i0, i0
0x20000000 - 0x28000000
Instructions: dot, mov, rsq, rcp, exp, log
Encoding:
|
|
|
|
Notes:
Having bit 3 in byte 2 set to 0 produces invalid instruction
Fields:
data_format:
|
predicate:
|
opcode2 (depends on op_sel):
|
|
Examples:
dot.f32
mov.f32
rsq.f32
rcp.f32
exp.f32
log.f32
dot.f16
mov.f16
rsq.f16
rcp.f16
exp.f16
log.f16
0x28000000 - 0x30000000
Instructions: dot, mov, rsq, rcp
Encoding:
|
|
|
|
Notes:
Having bit 3 in byte 2 set to 0 produces invalid instruction
Fields:
data_format:
|
predicate:
|
opcode2:
6 | 5 | 4 | value |
---|---|---|---|
0 | 0 | 0 | invalid |
0 | 0 | 1 | invalid |
0 | 1 | 0 | dot |
0 | 1 | 1 | invalid |
1 | 0 | 0 | invalid |
1 | 0 | 1 | mov |
1 | 1 | 0 | rsq |
1 | 1 | 1 | rcp |
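The opcode2 table above can be sketched as a lookup on the 3-bit field value, with the column headers 6, 5, 4 read as bit positions from most to least significant. The names `OPCODE2` and `decode_opcode2` are hypothetical:

```python
# 3-bit opcode2 value -> mnemonic; values not listed here are invalid.
OPCODE2 = {0b010: "dot", 0b101: "mov", 0b110: "rsq", 0b111: "rcp"}


def decode_opcode2(value: int):
    """Return the mnemonic for a 3-bit opcode2 value, or None if invalid."""
    if not 0 <= value <= 7:
        raise ValueError("opcode2 is 3 bits")
    return OPCODE2.get(value)


print(decode_opcode2(0b101))
```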
Examples:
dot.f32
mov.f32
rsq.f32
rcp.f32
dot.f16
mov.f16
rsq.f16
rcp.f16
0x30000000 - 0x38000000
Instructions: rcp, rsq, log, exp
Encoding:
|
|
|
|
Notes:
modifier should be omitted if data_format matches modifier.
Fields:
opcode2:
|
data_format:
|
modifier:
|
predicate:
|
Examples:
rcp.f32
rsq.f32
log.f32
exp.f32
rcp.f32.fx10
rsq.f32.fx10
log.f32.fx10
exp.f32.fx10
rcp.f16.f32
rsq.f16.f32
log.f16.f32
exp.f16.f32
rcp.f16.fx10
rsq.f16.fx10
log.f16.fx10
exp.f16.fx10
rcp.fx10.f32
rsq.fx10.f32
log.fx10.f32
exp.fx10.f32
rcp.fx10
rsq.fx10
log.fx10
exp.fx10
0x38000000 - 0x40000000
Instructions: mov, cmov, cmov8
Encoding:
|
|
|
|
Notes:
cond is only applicable to cmov and cmov8, since these are conditional moves.
Fields:
opcode2:
|
cond:
|
data_format:
|
predicate:
|
Examples:
mov.i8
mov.i16
mov.i32
mov.fx10
mov.f16
mov.f32
cmov.eqzero.i8
cmov.eqzero.i16
cmov.eqzero.i32
cmov.eqzero.fx10
cmov.eqzero.f16
cmov.eqzero.f32
cmov8.eqzero.i8
cmov8.eqzero.i16
cmov8.eqzero.i32
cmov8.eqzero.fx10
cmov8.eqzero.f16
cmov8.eqzero.f32
cmov.ltzero.i8
cmov.ltzero.i16
cmov.ltzero.i32
cmov.ltzero.fx10
cmov.ltzero.f16
cmov.ltzero.f32
cmov8.ltzero.i8
cmov8.ltzero.i16
cmov8.ltzero.i32
cmov8.ltzero.fx10
cmov8.ltzero.f16
cmov8.ltzero.f32
0x40000000 - 0x48000000
Instructions: pack, (mov)
Encoding:
|
|
|
|
Notes:
When modifier matches data_format it shall be omitted, since it has no effect in terms of packing.
Furthermore, the instruction mnemonic shall be replaced with mov.
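A sketch of the naming rule from the notes above. Since it is not known which of the two format fields is the destination and which is the source, the parameters are simply called `first` and `second` in the order they appear in the mnemonic. The function name `pack_mnemonic` is hypothetical:

```python
def pack_mnemonic(first: str, second: str) -> str:
    """Build the mnemonic from the two format names, in mnemonic order."""
    if first == second:
        # Matching formats: nothing is repacked, so the instruction reads as mov.
        return f"mov.{first}"
    return f"pack.{first}.{second}"


print(pack_mnemonic("u8", "u8"))
print(pack_mnemonic("s16", "u8"))
```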
Fields:
data_format:
|
modifier:
|
predicate:
|
Examples:
mov.u8
pack.s16.u8
pack.u8.s8
pack.s16.s8
pack.u8.o8
pack.s16.o8
pack.u8.u16
pack.s16.u16
pack.u8.s16
mov.s16
pack.u8.f16
pack.s16.f16
pack.u8.f32
pack.s16.f32
0x48000000 - 0x50000000
Instructions: this group only contains illegal instructions
Encoding:
|
|
|
|
0x50000000 - 0x58000000
Instructions: and.u32
Encoding:
|
|
|
|
Fields:
predicate:
2 | 1 | 0 | value |
---|---|---|---|
0 | 0 | 0 | |
0 | 0 | 1 | p0 |
0 | 1 | 0 | p1 |
0 | 1 | 1 | p2 |
1 | 0 | 0 | p3 |
1 | 0 | 1 | !p0 |
1 | 1 | 0 | !p1 |
1 | 1 | 1 | Pn |
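The predicate table above can be sketched as a lookup that prefixes an instruction using the notation from the Predicates section. The value 0 has an empty cell in the table and is treated here as "no predicate", which is an assumption. The names `PREDICATES` and `with_predicate` are hypothetical:

```python
# 3-bit predicate value -> predicate name; index 0 has no name in the table.
PREDICATES = ["", "p0", "p1", "p2", "p3", "!p0", "!p1", "Pn"]


def with_predicate(predicate: int, instruction: str) -> str:
    """Prefix an instruction with its predicate: <predicate> <instruction>."""
    if not 0 <= predicate <= 7:
        raise ValueError("predicate is 3 bits")
    name = PREDICATES[predicate]
    return f"{name} {instruction}" if name else instruction


print(with_predicate(0b101, "and.u32"))
```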
Examples:
and.u32
0x58000000 - 0x60000000
Instructions: xor.u32
Encoding:
|
|
|
|
Fields:
predicate:
2 | 1 | 0 | value |
---|---|---|---|
0 | 0 | 0 | |
0 | 0 | 1 | p0 |
0 | 1 | 0 | p1 |
0 | 1 | 1 | p2 |
1 | 0 | 0 | p3 |
1 | 0 | 1 | !p0 |
1 | 1 | 0 | !p1 |
1 | 1 | 1 | Pn |
Examples:
xor.u32
0x60000000 - 0x68000000
Instructions: shl.u32
Encoding:
|
|
|
|
Fields:
predicate:
2 | 1 | 0 | value |
---|---|---|---|
0 | 0 | 0 | |
0 | 0 | 1 | p0 |
0 | 1 | 0 | p1 |
0 | 1 | 1 | p2 |
1 | 0 | 0 | p3 |
1 | 0 | 1 | !p0 |
1 | 1 | 0 | !p1 |
1 | 1 | 1 | Pn |
Examples:
shl.u32
0x68000000 - 0x70000000
Instructions: shr.u32
Encoding:
|
|
|
|
Fields:
predicate:
2 | 1 | 0 | value |
---|---|---|---|
0 | 0 | 0 | |
0 | 0 | 1 | p0 |
0 | 1 | 0 | p1 |
0 | 1 | 1 | p2 |
1 | 0 | 0 | p3 |
1 | 0 | 1 | !p0 |
1 | 1 | 0 | !p1 |
1 | 1 | 1 | Pn |
Examples:
shr.u32
0x70000000 - 0x78000000
Instructions: rlp.u32
Encoding:
|
|
|
|
Fields:
predicate:
2 | 1 | 0 | value |
---|---|---|---|
0 | 0 | 0 | |
0 | 0 | 1 | p0 |
0 | 1 | 0 | p1 |
0 | 1 | 1 | p2 |
1 | 0 | 0 | p3 |
1 | 0 | 1 | !p0 |
1 | 1 | 0 | !p1 |
1 | 1 | 1 | Pn |
Examples:
rlp.u32
0x78000000 - 0x80000000
Instructions: this group only contains illegal instructions
Encoding:
|
|
|
|
0x80000000 - 0x88000000
Instructions: add.fx8
Encoding:
|
|
|
|
Fields:
predicate:
2 | 1 | value |
---|---|---|
0 | 0 | |
0 | 1 | p0 |
1 | 0 | p1 |
1 | 1 | !p0 |
Examples:
add.fx8
0x88000000 - 0x90000000
Instructions: add.fx8, sub.fx8
Encoding:
|
|
|
|
Notes:
Having bits 2, 3 in byte 2 set to 1 produces invalid instruction
Fields:
opcode2:
|
predicate:
|
Examples:
add.fx8
sub.fx8
0x90000000 - 0x98000000
Instructions: add.fx8, sub.fx8, min.fx8, max.fx8
Encoding:
|
|
|
|
Notes:
Having bit 0 in byte 2 set to 1 produces invalid instruction
Fields:
opcode2:
|
predicate:
|
Examples:
add.fx8
sub.fx8
min.fx8
max.fx8
0x98000000 - 0xA0000000
Instructions: mad.u8
Encoding:
|
|
|
|
Fields:
modifier:
|
predicate:
|
Examples:
mad.u8
mad.sat.u8
0xA0000000 - 0xA8000000
Instructions: mad
Encoding:
|
|
|
|
Fields:
data_format:
|
modifier:
|
predicate:
|
Examples:
mad.u16
mad.u16.sat
mad.i16
mad.i16.sat
0xA8000000 - 0xB0000000
Instructions: mad
Encoding:
|
|
|
|
Fields:
data_format:
|
modifier:
|
predicate:
|
Examples:
mad.u32
mad.u32.sat
mad.i32
mad.i32.sat
0xB0000000 - 0xB8000000
Instructions: this group only contains illegal instructions
Encoding:
|
|
|
|
0xB8000000 - 0xC0000000
Instructions: this group only contains illegal instructions
Encoding:
|
|
|
|
0xC0000000 - 0xC8000000
Instructions: this group only contains illegal instructions
Encoding:
|
|
|
|
0xC8000000 - 0xD0000000
Instructions: mad.u8
Encoding:
|
|
|
|
Fields:
modifier:
|
predicate:
|
Examples:
mad.u8
mad.sat.u8
0xD0000000 - 0xD8000000
Instructions: mad
Encoding:
|
|
|
|
Notes:
Having bit 5 in byte 1 set to 1 produces invalid instruction
Fields:
modifier:
|
data_format:
|
predicate:
|
Examples:
mad.u32.s0 mad.i32.s0 mad.u32.s1 mad.i32.s1
0xD8000000 - 0xE0000000
Instructions: this group only contains illegal instructions
Encoding:
|
|
|
|
0xE0000000 - 0xE8000000
Instructions: tex
Encoding:
|
|
|
|
Fields:
dim:
|
func:
|
modifier:
|
data_format:
|
predicate:
|
Examples:
tex1D tex1D.f16 tex1D.f32 tex1D.minp tex1D.minp.f16 tex1D.minp.f32
tex1DBias tex1DBias.f16 tex1DBias.f32 tex1DBias.minp tex1DBias.minp.f16 tex1DBias.minp.f32
tex1DReplace tex1DReplace.f16 tex1DReplace.f32 tex1DReplace.minp tex1DReplace.minp.f16 tex1DReplace.minp.f32
tex1DGrad tex1DGrad.f16 tex1DGrad.f32 tex1DGrad.minp tex1DGrad.minp.f16 tex1DGrad.minp.f32
tex2D tex2D.f16 tex2D.f32 tex2D.minp tex2D.minp.f16 tex2D.minp.f32
tex2DBias tex2DBias.f16 tex2DBias.f32 tex2DBias.minp tex2DBias.minp.f16 tex2DBias.minp.f32
tex2DReplace tex2DReplace.f16 tex2DReplace.f32 tex2DReplace.minp tex2DReplace.minp.f16 tex2DReplace.minp.f32
tex2DGrad tex2DGrad.f16 tex2DGrad.f32 tex2DGrad.minp tex2DGrad.minp.f16 tex2DGrad.minp.f32
texCube texCube.f16 texCube.f32 texCube.minp texCube.minp.f16 texCube.minp.f32
texCubeBias texCubeBias.f16 texCubeBias.f32 texCubeBias.minp texCubeBias.minp.f16 texCubeBias.minp.f32
texCubeReplace texCubeReplace.f16 texCubeReplace.f32 texCubeReplace.minp texCubeReplace.minp.f16 texCubeReplace.minp.f32
texCubeGrad texCubeGrad.f16 texCubeGrad.f32 texCubeGrad.minp texCubeGrad.minp.f16 texCubeGrad.minp.f32
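The example mnemonics for this group look like the product of the four fields dim, func, modifier and data_format. The sketch below regenerates the list under that reading. The spellings of each field's values are taken from the examples only, and treating a missing func, modifier or data_format as an empty suffix is an assumption. `tex_mnemonics` is a hypothetical helper.

```python
from itertools import product

# Field values as they appear in the example mnemonics; "" means the suffix is absent.
DIMS = ["1D", "2D", "Cube"]
FUNCS = ["", "Bias", "Replace", "Grad"]
MODIFIERS = ["", ".minp"]
DATA_FORMATS = ["", ".f16", ".f32"]


def tex_mnemonics() -> list[str]:
    """Enumerate tex mnemonics as dim x func x modifier x data_format."""
    return [
        f"tex{dim}{func}{modifier}{data_format}"
        for dim, func, modifier, data_format in product(
            DIMS, FUNCS, MODIFIERS, DATA_FORMATS
        )
    ]
```

The result has 3 x 4 x 2 x 3 = 72 entries, in the same order as the example list.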
0xE8000000 - 0xF0000000
Instructions: lda32, ldl32, ldt32
Encoding:
|
|
|
|
Notes:
index is only applicable when the fetch modifier is specified.
Fields:
modifier:
|
index:
|
opcode2:
|
predicate:
|
Examples:
lda32 ldl32 ldt32
lda32.fetch1 lda32.fetch2 lda32.fetch3 lda32.fetch4 lda32.fetch5 lda32.fetch6 lda32.fetch7 lda32.fetch8 lda32.fetch9 lda32.fetch10 lda32.fetch11 lda32.fetch12 lda32.fetch13 lda32.fetch14 lda32.fetch15 lda32.fetch16
ldl32.fetch1 ldl32.fetch2 ldl32.fetch3 ldl32.fetch4 ldl32.fetch5 ldl32.fetch6 ldl32.fetch7 ldl32.fetch8 ldl32.fetch9 ldl32.fetch10 ldl32.fetch11 ldl32.fetch12 ldl32.fetch13 ldl32.fetch14 ldl32.fetch15 ldl32.fetch16
ldt32.fetch1 ldt32.fetch2 ldt32.fetch3 ldt32.fetch4 ldt32.fetch5 ldt32.fetch6 ldt32.fetch7 ldt32.fetch8 ldt32.fetch9 ldt32.fetch10 ldt32.fetch11 ldt32.fetch12 ldt32.fetch13 ldt32.fetch14 ldt32.fetch15 ldt32.fetch16
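The load mnemonics follow a regular pattern: one of three opcodes, optionally followed by a fetch modifier numbered 1 to 16. A minimal sketch of a formatter for that pattern is shown below. It works on names only and assumes nothing about how the fetch count is encoded in the instruction bits. `load_mnemonic` is a hypothetical helper.

```python
from typing import Optional

LOAD_OPCODES = ("lda32", "ldl32", "ldt32")


def load_mnemonic(opcode: str, fetch: Optional[int] = None) -> str:
    """Build a load mnemonic, with an optional fetch modifier in the range 1..16."""
    if opcode not in LOAD_OPCODES:
        raise ValueError(f"unknown load opcode: {opcode}")
    if fetch is None:
        return opcode
    if not 1 <= fetch <= 16:
        raise ValueError("the examples only list fetch1 to fetch16")
    return f"{opcode}.fetch{fetch}"
```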
0xF0000000 - 0xF8000000
Instructions: sta32, stl32, stt32
Encoding:
|
|
|
|
Notes:
index is only applicable when the fetch modifier is specified.
Fields:
modifier:
|
index:
|
opcode2:
|
predicate:
|
Examples:
sta32 stl32 stt32
sta32.fetch1 sta32.fetch2 sta32.fetch3 sta32.fetch4 sta32.fetch5 sta32.fetch6 sta32.fetch7 sta32.fetch8 sta32.fetch9 sta32.fetch10 sta32.fetch11 sta32.fetch12 sta32.fetch13 sta32.fetch14 sta32.fetch15 sta32.fetch16
stl32.fetch1 stl32.fetch2 stl32.fetch3 stl32.fetch4 stl32.fetch5 stl32.fetch6 stl32.fetch7 stl32.fetch8 stl32.fetch9 stl32.fetch10 stl32.fetch11 stl32.fetch12 stl32.fetch13 stl32.fetch14 stl32.fetch15 stl32.fetch16
stt32.fetch1 stt32.fetch2 stt32.fetch3 stt32.fetch4 stt32.fetch5 stt32.fetch6 stt32.fetch7 stt32.fetch8 stt32.fetch9 stt32.fetch10 stt32.fetch11 stt32.fetch12 stt32.fetch13 stt32.fetch14 stt32.fetch15 stt32.fetch16
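The note for this group says that index only applies when the fetch modifier is present. One way to capture that rule in a decoder's data model is sketched below. The meaning and encoding of index are not described here, so it is carried as an opaque integer. `StoreFields` and its method are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

STORE_OPCODES = ("sta32", "stl32", "stt32")


@dataclass(frozen=True)
class StoreFields:
    opcode: str
    fetch: Optional[int] = None   # 1..16 when the fetch modifier is specified
    index: Optional[int] = None   # opaque value; only meaningful together with fetch

    def __post_init__(self) -> None:
        if self.opcode not in STORE_OPCODES:
            raise ValueError(f"unknown store opcode: {self.opcode}")
        if self.fetch is not None and not 1 <= self.fetch <= 16:
            raise ValueError("the examples only list fetch1 to fetch16")
        if self.index is not None and self.fetch is None:
            raise ValueError("index is only applicable when fetch is specified")

    def mnemonic(self) -> str:
        """Render the mnemonic; index is not part of the example mnemonics."""
        if self.fetch is None:
            return self.opcode
        return f"{self.opcode}.fetch{self.fetch}"
```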
0xF8000000 - 0xFF000000
Notes:
This instruction group is much more complex than the others, so it is described with "glued" truth tables, one per predicate value, instead of independent truth tables.
predicate 000
Instructions:
Encoding:
|
|
|
|
Fields:
opcode2:
|
|
Examples:
predicate 001
Instructions:
Encoding:
|
|
|
|
Notes:
predicate does not apply to all instructions
Fields:
opcode2:
|
|
Examples:
predicate 010
Instructions:
Encoding:
|
|
|
|
Notes:
predicate does not apply to all instructions
Fields:
opcode2:
|
|
|
|
Examples:
predicate 011
Instructions:
Encoding:
|
|
|
|
Notes:
predicate does not apply to all instructions
Fields:
opcode2:
|
|
Examples:
predicate 100
Instructions:
Encoding:
|
|
|
|
Notes:
predicate does not apply to all instructions
Fields:
opcode2:
|
|
Examples:
predicate 101
Instructions:
Encoding:
|
|
|
|
Notes:
predicate does not apply to all instructions
Fields:
opcode2:
|
|
Examples:
predicate 110
Instructions:
Encoding:
|
|
|
|
Notes:
predicate does not apply to all instructions
Fields:
opcode2:
|
|
Examples:
predicate 111
Instructions:
Encoding:
|
|
|
|
Notes:
predicate does not apply to all instructions
Fields:
opcode2:
|
|
Examples: