10/05/02 20:10:58 rgIg+qs9
> - Improve fetch bandwidth by reducing code size.
> This is accomplished by:
> Replacing "packed double (PD)" and "packed integer (DQ)" forms of some
> SSEx/AVX instructions with "packed single (PS)" forms, where the "PS" form
> of the instruction is 1 byte shorter than the "PD" or "DQ" form.
> This applies to
> - use movaps instead of movapd/movdqa.
> - use movups instead of movupd/movdqu.
> - use xorps/andps/orps instead of xorpd/andpd/orpd.
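To make the "1 byte shorter" claim in the quote concrete, here is a minimal sketch that tabulates the legacy (non-VEX) SSE encodings for the register-to-register form `xmm0, xmm1` and compares lengths. The choice of operands and the names `ENCODINGS`, `REPLACEMENTS`, and `bytes_saved` are my own illustration, not part of the quoted patch. It covers only the legacy SSE encodings and makes no claim about the VEX-encoded AVX forms.

```python
# Legacy (non-VEX) SSE encodings for "op xmm0, xmm1".
# ModRM byte 0xC1 = mod 11, reg = xmm0, rm = xmm1 (operand choice is an assumption).
ENCODINGS = {
    "movaps": bytes([0x0F, 0x28, 0xC1]),
    "movapd": bytes([0x66, 0x0F, 0x28, 0xC1]),
    "movdqa": bytes([0x66, 0x0F, 0x6F, 0xC1]),
    "movups": bytes([0x0F, 0x10, 0xC1]),
    "movupd": bytes([0x66, 0x0F, 0x10, 0xC1]),
    "movdqu": bytes([0xF3, 0x0F, 0x6F, 0xC1]),
    "xorps":  bytes([0x0F, 0x57, 0xC1]),
    "xorpd":  bytes([0x66, 0x0F, 0x57, 0xC1]),
    "andps":  bytes([0x0F, 0x54, 0xC1]),
    "andpd":  bytes([0x66, 0x0F, 0x54, 0xC1]),
    "orps":   bytes([0x0F, 0x56, 0xC1]),
    "orpd":   bytes([0x66, 0x0F, 0x56, 0xC1]),
}

# PD/DQ form -> PS form, following the substitutions listed in the quote.
REPLACEMENTS = {
    "movapd": "movaps", "movdqa": "movaps",
    "movupd": "movups", "movdqu": "movups",
    "xorpd": "xorps", "andpd": "andps", "orpd": "orps",
}

def bytes_saved(old: str) -> int:
    """Return how many bytes the PS replacement saves over the PD/DQ form."""
    return len(ENCODINGS[old]) - len(ENCODINGS[REPLACEMENTS[old]])

if __name__ == "__main__":
    for old, new in REPLACEMENTS.items():
        print(f"{old:7s} -> {new:7s} saves {bytes_saved(old)} byte(s)")
```

The saving comes entirely from dropping the mandatory 0x66 (or 0xF3) prefix; the opcode and ModRM bytes are otherwise the same within each pair, except that the integer moves use opcode 0x6F rather than 0x28/0x10.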
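Is this kind of tuning really still necessary?
What is the 32B/clk instruction fetch even for, lol
By the way, this is a textbook example of a coding pattern that degrades performance on Intel processors.
Oh, I see. It slows down the competition, so in a sense it is an "optimization".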