Instruction alignment effects

Most 16-bit instructions can be replaced by their 32-bit equivalent by appending .w to the mnemonic. This shifts all subsequent code by 16 bits, which provides a way to influence alignment.

Comparison of unaligned and aligned move instructions.

sample_a.s
sample_a:
    mov r8, r8
    mov.w r8, r8
    mov.w r8, r8
    mov.w r8, r8
    bx  lr

Minimal example of unaligned instructions.

sample_b.s
sample_b:
    mov.w r8, r8 // Changed
    mov.w r8, r8
    mov.w r8, r8
    mov.w r8, r8
    bx  lr

Minimal example of aligned instructions.

Performance comparison of unaligned and aligned move instructions.

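| Example               | Sample a | Sample b |
|-----------------------|----------|----------|
| Instructions executed | 4        | 4        |
| LSU count             | 0        | 0        |
| CPI count             | 1        | 0        |
| Fold count            | (-) 0    | (-) 0    |
| Cycle count           | 5        | 4        |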

While the operations are identical and the same number of instructions is executed, the code from sample a is one cycle slower. A closer look at the results shows that the difference is in the CPI counter, even though neither sample executes any instruction with a variable execution time.
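The counter values in the table can be cross-checked against each other. The sketch below assumes the counts come from the Cortex-M DWT profiling counters, for which Arm documents the relation instructions = CYCCNT - CPICNT - EXCCNT - SLEEPCNT - LSUCNT + FOLDCNT. The exception and sleep counters are not listed in the table and are assumed to be zero here; the function name `instructions_executed` is hypothetical.

```python
def instructions_executed(cycles, cpi, lsu, fold, exc=0, sleep=0):
    """Estimate executed instructions from DWT-style counter deltas.

    Assumption: exc and sleep default to zero, because the table
    does not list exception or sleep counts.
    """
    return cycles - cpi - exc - sleep - lsu + fold


# Counter values copied from the performance comparison table.
sample_a = {"cycles": 5, "cpi": 1, "lsu": 0, "fold": 0}
sample_b = {"cycles": 4, "cpi": 0, "lsu": 0, "fold": 0}

for name, counters in (("sample a", sample_a), ("sample b", sample_b)):
    print(name, instructions_executed(**counters))
```

Under these assumptions both samples come out at 4 instructions, so the extra cycle of sample a is fully accounted for by the single CPI count.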

Disassembly of unaligned moves (sample a).

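| adr       | bytes    | mnemonic | operands | exec_count | aligned |
|-----------|----------|----------|----------|------------|---------|
| 0x80001b0 | c046     | mov      | r8, r8   | 1          | True    |
| 0x80001b2 | 4fea0808 | mov.w    | r8, r8   | 1          | False   |
| 0x80001b6 | 4fea0808 | mov.w    | r8, r8   | 1          | False   |
| 0x80001ba | 4fea0808 | mov.w    | r8, r8   | 1          | False   |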

Disassembly of aligned moves (sample b).

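| adr       | bytes    | mnemonic | operands | exec_count | aligned |
|-----------|----------|----------|----------|------------|---------|
| 0x80001b0 | 4fea0808 | mov.w    | r8, r8   | 1          | True    |
| 0x80001b4 | 4fea0808 | mov.w    | r8, r8   | 1          | True    |
| 0x80001b8 | 4fea0808 | mov.w    | r8, r8   | 1          | True    |
| 0x80001bc | 4fea0808 | mov.w    | r8, r8   | 1          | True    |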

Moves between lower registers can be encoded using 16 bits (narrow). Moves with immediate values cannot be encoded as a 16-bit instruction and are forced to use the 32-bit (wide) encoding. However, a flag-setting move (movs) with an immediate value can be encoded as a 16-bit instruction. Most instructions can be encoded both ways, and a change to the operands can change the encoding and break the alignment of the instructions that follow. When optimizing for time, always check that instructions are aligned after assembling them.
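The alignment check can be automated once the size of each instruction is known. The following is a minimal sketch, assuming that an instruction counts as aligned when all of its bytes sit inside one 4-byte word; this matches the aligned column of the disassembly tables, but it is an assumption about how that column was derived. The helper name `layout` is hypothetical, and the sizes are the narrow (2 bytes) and wide (4 bytes) encodings of the samples.

```python
def layout(start, sizes):
    """Return (address, size, aligned) for consecutive instructions.

    Assumption: an instruction is aligned when it does not cross a
    4-byte boundary, so a 16-bit instruction is always aligned.
    """
    rows = []
    address = start
    for size in sizes:
        first_word = address // 4
        last_word = (address + size - 1) // 4
        rows.append((address, size, first_word == last_word))
        address += size
    return rows


# Sizes in bytes: mov is 2 (narrow), mov.w is 4 (wide).
sample_a = layout(0x80001B0, [2, 4, 4, 4])
sample_b = layout(0x80001B0, [4, 4, 4, 4])

for address, size, aligned in sample_a:
    print(hex(address), size, aligned)
```

With the sizes of sample a, the single narrow mov pushes every following mov.w across a word boundary, while all four instructions of sample b stay aligned.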

Visualization of the bytecode of unaligned and aligned instructions.

Bytes encoding a single instruction are marked in the same weight (bold or normal). The weight is swapped at the start of each instruction.

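| Address   | Sample a      | Sample b      |
|-----------|---------------|---------------|
| 0x80001b0 | **c046** 4fea | **4fea 0808** |
| 0x80001b4 | 0808 **4fea** | 4fea 0808     |
| 0x80001b8 | **0808** 4fea | **4fea 0808** |
| 0x80001bc | 0808          | 4fea 0808     |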
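The rows of this visualization can be reproduced by concatenating the encodings and cutting the byte stream every four bytes. The sketch below does this for sample a, using the byte strings listed in its disassembly; the helper name `fetch_words` and the 4-byte row width are assumptions made for illustration.

```python
def fetch_words(start, encodings, word_size=4):
    """Group concatenated instruction bytes into word-sized rows."""
    stream = bytes.fromhex("".join(encodings))
    rows = []
    for offset in range(0, len(stream), word_size):
        chunk = stream[offset:offset + word_size]
        rows.append((hex(start + offset), chunk.hex()))
    return rows


# Encodings as listed in the disassembly of sample a.
sample_a = ["c046", "4fea0808", "4fea0808", "4fea0808"]

for address, word in fetch_words(0x80001B0, sample_a):
    print(address, word)
```

Each printed row corresponds to one row of the Sample a column, which makes it easy to see that every mov.w is split across two rows.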