It’s been a while since I posted something, but recently I have not worked on anything at all; I have just been playing some Zelda and reading some books (and yes, I will post about the books someday).
Right to the matter at hand. It has been pointed out to me that my C skills are rusty (god damn it!), so I decided to play with C a little by implementing a small disassembler for ARMv7 binaries (just ARM instructions, no Thumb or anything else). And indeed, I found myself in some trouble.
Objective
I want to have a function to create bitmasks of uint32_t type. Something like
gen_bmask(int hi,int lo)
It has to match nicely with bit positions for uint32_t, the highest bit being number 31 and lowest bit 0.
gen_bmask(31,0)
-> 0xFFFFFFFF
gen_bmask(3,0)
-> 0x0000000F
gen_bmask(31,4)
-> 0xFFFFFFF0
etc..
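To pin the objective down, here is a minimal sketch of a reference helper that builds the mask one bit at a time. The name ref_bmask is hypothetical (it is not part of the disassembler), and it assumes 0 <= lo <= hi <= 31. It is slow but easy to trust, so it can serve as an oracle when testing faster versions.

```c
#include <stdint.h>

/* Hypothetical reference helper, not from the original code:
 * sets every bit from lo up to hi, one at a time.
 * Assumes 0 <= lo <= hi <= 31. */
uint32_t ref_bmask(int hi, int lo) {
    uint32_t mask = 0;
    for (int bit = lo; bit <= hi; bit++) {
        /* The shift count never exceeds 31, so each shift is well defined. */
        mask |= UINT32_C(1) << bit;
    }
    return mask;
}
```

Called with the examples above, ref_bmask(31,0), ref_bmask(3,0) and ref_bmask(31,4) give 0xFFFFFFFF, 0x0000000F and 0xFFFFFFF0.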
What my function looks like
uint32_t gen_bmask(int hi, int lo){
    uint32_t mask = ~(~0 << ((hi+1) - lo)) << lo;
    return mask;
}
At first glance, something like this should do the trick.
An example with hi=7 and lo=4:
~0                          //0xFFFFFFFF
~0 << ((7+1) - 4)           //0xFFFFFFF0
~(~0 << ((7+1) - 4))        //0x0000000F
~(~0 << ((7+1) - 4)) << 4;  //0x000000F0
A first direct approach to this:
#include <stdio.h>
#include <stdint.h>
uint32_t get_mask(int hi, int lo){
    return (uint32_t) ~(~0 << ((hi+1) - lo)) << lo;
}
int main(int argc, char* argv[]){
printf("%02X \n",get_mask(7,0)); //expect 0xFF
printf("%02X \n",get_mask(7,4)); //expect 0xF0
printf("%02X \n",get_mask(15,0));//expect 0xFFFF
printf("%02X \n",get_mask(31,0)); //expect 0xFFFFFFFF
}
I compile it directly with gcc and give it a test run:
gcc testing.c
jordi@ThinkTwice:~/REPOS/armdiss$ ./a.out
FF
F0
FFFF
00
Checking the output… Good, good, good… WHAT?!
What is happening?!
That is what I would like to know! 🙂
On a smaller example:
printf("%02X \n",get_mask(31,0)); //expect 0xFFFFFFFF
uint32_t tmp = ~(~0 << ((31+1) - 0)) << 0;
printf("%02X \n",tmp);
testing.c:11:5: warning: left shift count >= width of type [enabled by default]
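Compiling this time gives a nice hint: I am trying to shift by a count equal to the width of the type. (Strictly speaking, the C standard leaves a shift by the full width or more undefined, which is what the warning is about.) My assumption was that shifting every bit out would leave 0x00000000 before the negation, so the final mask would be all ones. The output of this new piece of code looks promising:
jordi@ThinkTwice:~/REPOS/armdiss$ ./a.out
00
FFFFFFFF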
Look at that: it seems the compiler evaluated the constant expression at compile time and produced the value I expected, but when the shift is computed at run time, something goes wrong.
At this moment I could go to Stack Overflow and look for the answer (I am very confident that it’s already there), but I am interested in working it out myself.
Clearly the problem lies in the shifting operation, so I will check what the compiler is doing with it.
gcc -O0 -o test.s -S test.c
pushq %rbx
movl %edi, -12(%rbp)
movl %esi, -16(%rbp)
movl -16(%rbp), %eax
movl -12(%rbp), %edx
movl %edx, %ecx
subl %eax, %ecx
movl %ecx, %eax
addl $1, %eax
movl $-1, %edx
movl %edx, %ebx
.cfi_offset 3, -24
movl %eax, %ecx
sall %cl, %ebx
movl %ebx, %eax
notl %eax
movl %eax, %edx
movl -16(%rbp), %eax
movl %edx, %ebx
movl %eax, %ecx
sall %cl, %ebx
movl %ebx, %eax
popq %rbx
popq %rbp
In the result (which I stripped down to show just the function), the two sall instructions are the x86 shifts. A couple of sall, as expected… but they do not behave as I want when the shift count reaches the width of the register. The code seems okay, and there is nothing weird in it, so it is time to check the Intel documentation, and yes! On page 1295 I found my answer.
"The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used)."
(Intel® 64 and IA-32 Architectures Software Developer Manuals)
Here we have it! When I shift by 32, the 5-bit mask turns my shift count into 0:
b 0010 0000
&
b 0001 1111
————
b 0000 0000
The sall 32 Rd is actually a sall 0 Rd in disguise!!
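The masking can be modelled in portable C. This is only a sketch: x86_shl32 is a hypothetical name, and it imitates what the quoted passage describes for a 32-bit operand (the count is reduced to its low 5 bits before shifting). It is not what the C language itself guarantees.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit x86 shift left:
 * the count is masked to its low 5 bits (0..31) before shifting,
 * so a count of 32 behaves like a count of 0. */
uint32_t x86_shl32(uint32_t value, unsigned count) {
    return value << (count & 31u);
}
```

With this model, x86_shl32(0xFFFFFFFF, 32) leaves the value untouched, so negating it gives 0x00000000, which matches the 00 printed for get_mask(31,0).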
Fixed Function
I am very lazy, so... if I can’t shift by 31+1=32 in one go, I will shift twice: once by hi - lo, and once more by one.
uint32_t get_mask(int hi, int lo){
    return (uint32_t) ~(~0 << (hi - lo) << 1) << lo;
}
Problem solved.
In this case I do not bother to check the input values. My precondition is that the values are in the range 0..31 and that hi > lo always.
This has been a nice afternoon; I had not opened the Intel reference manuals since I was a student 😛
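For anyone who does want the precondition enforced, here is a minimal sketch of a checked variant. The name get_mask_checked and the assert are my additions for illustration, not part of the disassembler. It also starts from an unsigned all-ones value instead of ~0, because in C a left shift of a negative signed int is undefined as well; the double shift is the same trick as above. Note that hi == lo also works and gives a single-bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical checked variant of get_mask.
 * Precondition (asserted): 0 <= lo <= hi <= 31. */
uint32_t get_mask_checked(int hi, int lo) {
    assert(0 <= lo && lo <= hi && hi <= 31);
    uint32_t ones = UINT32_MAX; /* unsigned all-ones, avoids shifting a negative int */
    /* Shift by (hi - lo) and then by 1: both counts stay below 32. */
    return (uint32_t)~(ones << (hi - lo) << 1) << lo;
}
```

For example, get_mask_checked(31,0) gives 0xFFFFFFFF and get_mask_checked(7,4) gives 0x000000F0.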
References
Intel reference manuals