

If you're running an ancient or custom OS that doesn't support saving XMM regs on context switches, it won't have set the SSE-enabling bits in the machine control registers. In that case all instructions that touch xmm regs will fault. Took me a sec to find, but the OSDev wiki's SSE page explains how to alter CR0 and CR4 to allow SSE instructions to run on bare metal without #UD.

My first thought on your old version of the question was that you might have compiled your program with -mavx, -march=sandybridge or equivalent, causing the compiler to emit the VEX-encoded version of everything. A Pentium M has no AVX, so VEX-encoded instructions would raise #UD as well. See Intel's instruction set reference manual for details.

Most real-world kernels are built with options that stop the compiler from using SSE or x87 instructions on its own, for example gcc -mgeneral-regs-only. Or in older GCC, -mno-sse -mno-mmx, and avoid any use of float or double types so the compiler never emits x87 instructions.

If the operating system does not provide adequate system-level support for SSE, executing an SSE or SSE2 instruction can also generate #UD. Peter Cordes pointed me to the SSE OSDev wiki, which describes how to enable SSE by writing to both the CR0 and CR4 control registers, starting with clearing the CR0.EM bit (bit 2).

Note that writing to these registers in protected mode requires privilege level 0. Another answer explains how to test this: in protected mode, that is, when bit 0 (PE) in CR0 is set to 1, check bits 0 and 1 of the CS selector, which should both be 0.

Finally, the custom OS must properly handle XMM registers during context switches, by saving and restoring them when necessary.
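For reference, here is a minimal sketch in C of the control-register changes and the privilege-level test discussed here. The bit positions are the ones given in the Intel manual and on the OSDev wiki SSE page (clear CR0.EM, set CR0.MP, set CR4.OSFXSR and CR4.OSXMMEXCPT). The function names and the KERNEL_BUILD macro are hypothetical, invented for this example. The inline assembly assumes GCC-style syntax on 32-bit x86 and only works at privilege level 0, so it is guarded and the bit arithmetic is kept in plain functions that can be checked on any host.

```c
#include <stdint.h>
#ifndef KERNEL_BUILD            /* hypothetical macro: defined only for the kernel build */
#include <assert.h>             /* host-side checks only; absent in a freestanding build */
#endif

#define CR0_PE         (1u << 0)   /* protected mode enable */
#define CR0_MP         (1u << 1)   /* monitor coprocessor */
#define CR0_EM         (1u << 2)   /* x87 emulation; must be 0 for SSE */
#define CR4_OSFXSR     (1u << 9)   /* OS supports FXSAVE/FXRSTOR */
#define CR4_OSXMMEXCPT (1u << 10)  /* OS handles unmasked SIMD FP exceptions */

/* Hypothetical helper: CR0 value with EM cleared and MP set. */
static uint32_t cr0_for_sse(uint32_t cr0)
{
    cr0 &= ~CR0_EM;
    cr0 |= CR0_MP;
    return cr0;
}

/* Hypothetical helper: CR4 value with OSFXSR and OSXMMEXCPT set. */
static uint32_t cr4_for_sse(uint32_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}

/* In protected mode, bits 0 and 1 of the CS selector hold the current
 * privilege level. */
static unsigned selector_cpl(uint16_t cs)
{
    return cs & 3u;
}

/* Returns 1 if control registers may be written: either CR0.PE is clear
 * (real mode, no privilege levels) or the privilege level is 0.
 * Assumption: virtual-8086 mode is ignored in this sketch. */
static int can_write_control_regs(uint32_t cr0, uint16_t cs)
{
    if (!(cr0 & CR0_PE))
        return 1;
    return selector_cpl(cs) == 0;
}

#ifndef KERNEL_BUILD
/* Host-side sanity check of the bit arithmetic. */
static void sse_bits_self_check(void)
{
    assert((cr0_for_sse(CR0_EM) & CR0_EM) == 0);
    assert((cr4_for_sse(0) & CR4_OSFXSR) != 0);
}
#endif

#if defined(KERNEL_BUILD) && defined(__i386__) && defined(__GNUC__)
/* Privileged part: read, modify and write back CR0 and CR4. */
static inline void enable_sse(void)
{
    uint32_t cr0, cr4;
    __asm__ volatile("mov %%cr0, %0" : "=r"(cr0));
    __asm__ volatile("mov %0, %%cr0" : : "r"(cr0_for_sse(cr0)) : "memory");
    __asm__ volatile("mov %%cr4, %0" : "=r"(cr4));
    __asm__ volatile("mov %0, %%cr4" : : "r"(cr4_for_sse(cr4)) : "memory");
}
#endif
```

In a kernel, something like enable_sse() would be called once during early boot, before the first SSE instruction executes. It does not cover the context-switch side: the OS still has to save and restore the XMM registers itself.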
Trying to execute any SSE instruction results in interrupt 6, illegal opcode (which on Linux would cause a SIGILL, but this isn't Linux), referred to in the Intel architectures software developer's manual (IASDM from now on) as #UD, Invalid Opcode (Undefined Opcode).

Edit: Peter Cordes identified the right cause and pointed me to the solution, which I summarize above.


I have a Pentium M CPU and a custom OS which so far used no SSE instructions, but I now need to use them. (This question was originally about the CVTSI2SD instruction and the fact that I thought it didn't work on the Pentium M CPU, but in fact it's because I'm using a custom OS and I need to manually enable SSE.)
