author | Chandler Carruth <chandlerc@gmail.com> | Fri, 3 Oct 2014 21:38:49 +0000 (21:38 +0000)
committer | Chandler Carruth <chandlerc@gmail.com> | Fri, 3 Oct 2014 21:38:49 +0000 (21:38 +0000)
commit | 91ea3e41ae46348d520e9cdf8123748d01b2a46a
tree | 1c46a7f4385502e0f2873ed9d35b86e2f67b7b67
parent | 69ee7cb4c3a7736574587d007b8002c5aa02914e
[x86] Adjust the patterns for lowering X86vzmovl nodes which don't
perform a load to use blendps rather than movss when it is available.

For non-loads, blendps is *much* faster. It can execute on two ports in
Sandy Bridge and Ivy Bridge, and *three* ports on Haswell. This fixes
one of the "regressions" from aggressively taking the "insertion" path
in the new vector shuffle lowering.

This does highlight one problem with blendps -- it isn't commuted as
heavily as it should be. That's future work though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219022 91177308-0d34-0410-b5e6-96231b3b80d8
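
To illustrate why the two lowerings are interchangeable for the non-load case, here is a minimal scalar sketch in C++. It is not LLVM code and not the TableGen patterns touched by this commit. The names `Vec4`, `movss_model`, `blendps_model`, `vzmovl_via_movss`, and `vzmovl_via_blendps` are hypothetical, introduced only for this illustration. The sketch assumes a 4 x float vector and models X86vzmovl as "keep lane 0, zero lanes 1 through 3".

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4 x float vector type used only in this sketch.
using Vec4 = std::array<float, 4>;

// Scalar model of register-to-register MOVSS:
// lane 0 is taken from src, lanes 1..3 are kept from dst.
Vec4 movss_model(const Vec4& dst, const Vec4& src) {
    return {src[0], dst[1], dst[2], dst[3]};
}

// Scalar model of BLENDPS with an immediate mask:
// bit i of imm set -> lane i from src, otherwise lane i from dst.
Vec4 blendps_model(const Vec4& dst, const Vec4& src, std::uint8_t imm) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = ((imm >> i) & 1u) ? src[i] : dst[i];
    }
    return out;
}

// X86vzmovl semantics (assumed here): low element of v, upper lanes zeroed.
// Both helpers start from an all-zero vector and insert lane 0 of v.
Vec4 vzmovl_via_movss(const Vec4& v) {
    return movss_model(Vec4{}, v);
}

Vec4 vzmovl_via_blendps(const Vec4& v) {
    return blendps_model(Vec4{}, v, 0x1);  // mask 0b0001: only lane 0 from v
}
```

In this model both helpers produce the same vector for any input, which is the property the pattern change relies on. The model also shows what commuting a blend means: swapping the two operands and inverting the low four mask bits gives the same result, so `blendps_model(a, b, 0x1)` equals `blendps_model(b, a, 0xE)`.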