C Mask Creation & Intel sal

It’s been a while since I posted something, but recently I have not been working on anything at all, just playing some Zelda and reading some books (and yes, I will post about the books someday).

Right to the matter at hand. It has been pointed out to me that my C skills are rusty (god damn it!), so I decided to play with C a little by implementing a small disassembler for ARMv7 binaries (just ARM instructions, no Thumb or anything else). And indeed, I found myself in some trouble.


I want to have a function to create bitmasks of uint32_t type. Something like

gen_bmask(int hi,int lo)

It has to match nicely with bit positions for uint32_t, the highest bit being number 31 and lowest bit 0.

gen_bmask(31,0) -> 0xFFFFFFFF
gen_bmask(3,0) -> 0x0000000F
gen_bmask(31,4) -> 0xFFFFFFF0


What my function looks like

uint32_t gen_bmask(int hi,int lo){
    uint32_t mask = ~(~0 << ((hi+1) - lo)) << lo;
    return mask;
}

At first glance, something like this should do the trick.

An example with hi=7 and lo=4

~0                          //0xFFFFFFFF
~0 << ((7+1) -4)            //0xFFFFFFF0
~(~0 << ((7+1) -4))         //0x0000000F
~(~0 << ((7+1) -4)) << 4;   //0x000000F0
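
The same four steps can be written as a small C sketch that stores each intermediate value. The helper name mask_steps and the out array are made up for illustration, and the sketch only makes sense while (hi + 1) - lo stays below 32.

```c
#include <stdint.h>

/* Hypothetical helper: records each intermediate value of the mask
 * expression. Only meaningful while (hi + 1) - lo is below 32. */
void mask_steps(int hi, int lo, uint32_t out[4]) {
    uint32_t ones = ~(uint32_t)0;               /* all 32 bits set */
    unsigned width = (unsigned)((hi + 1) - lo); /* number of bits in the mask */

    out[0] = ones;            /* ~0 */
    out[1] = ones << width;   /* clear the low `width` bits */
    out[2] = ~out[1];         /* keep only the low `width` bits */
    out[3] = out[2] << lo;    /* move the mask up to bit `lo` */
}
```

Calling mask_steps(7, 4, out) fills out with 0xFFFFFFFF, 0xFFFFFFF0, 0x0000000F and 0x000000F0, matching the lines above.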

A first direct approach to this:


#include <stdio.h>
#include <stdint.h>

uint32_t get_mask(int hi, int lo){
    return (uint32_t) ~(~0 << ((hi+1) - lo )) << lo;
}

int main(void){
    printf("%02X \n",get_mask(7,0));  //expect 0xFF
    printf("%02X \n",get_mask(7,4));  //expect 0xF0
    printf("%02X \n",get_mask(15,0)); //expect 0xFFFF
    printf("%02X \n",get_mask(31,0)); //expect 0xFFFFFFFF
    return 0;
}

I compile it directly with gcc and give it a test run:
gcc testing.c

jordi@ThinkTwice:~/REPOS/armdiss$ ./a.out 

Checking the output… Good, good, good… WHAT?!

What is happening?!

That is what I would like to know! 🙂
On a smaller example:

    printf("%02X \n",get_mask(31,0)); //expect 0xFFFFFFFF

    uint32_t tmp = ~(~0 << ((31+1) - 0)) << 0;
    printf("%02X \n",tmp); 

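testing.c:11:5: warning: left shift count >= width of type [enabled by default]

Compiling this time gives a nice hint: I am trying to shift by a count that is as large as the width of the type (32 bits). In C, shifting by a count greater than or equal to the width of the type is undefined behavior. My assumption was that shifting by that much would simply push every bit out and leave 0x00000000, so the rest of the expression would still work. The output of this new piece of code looks promising: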
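
To make the "everything shifted out gives zero" assumption explicit instead of relying on the shift operator, one option is a tiny wrapper. This is only a sketch; shl32_or_zero is a made-up name and is not part of the original code.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original code): a left shift that is
 * defined for every count. Once the count reaches 32, all bits have been
 * shifted out, so the result is 0. */
uint32_t shl32_or_zero(uint32_t value, unsigned count) {
    return (count >= 32u) ? 0u : (value << count);
}
```

With this wrapper, shl32_or_zero(0xFFFFFFFF, 32) is 0 by construction, which is the behavior the expression was silently assuming.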

jordi@ThinkTwice:~/REPOS/armdiss$ ./a.out 

Look at that: it seems the compiler evaluated the constant expression at compile time, but when the same shift is computed at run time, something goes wrong.
At this point I could go to Stack Overflow and look for the answer (I am very confident that it’s already there), but I am interested in working it out myself.

Clearly the problem lies in the shift operation, so I will check what the compiler is doing with it.
gcc -O0 -o test.s -S test.c

	pushq	%rbx
	movl	%edi, -12(%rbp)
	movl	%esi, -16(%rbp)
	movl	-16(%rbp), %eax
	movl	-12(%rbp), %edx
	movl	%edx, %ecx
	subl	%eax, %ecx
	movl	%ecx, %eax
	addl	$1, %eax
	movl	$-1, %edx
	movl	%edx, %ebx
	.cfi_offset 3, -24
	movl	%eax, %ecx
	sall	%cl, %ebx
	movl	%ebx, %eax
	notl	%eax
	movl	%eax, %edx
	movl	-16(%rbp), %eax
	movl	%edx, %ebx
	movl	%eax, %ecx
	sall	%cl, %ebx
	movl	%ebx, %eax
	popq	%rbx
	popq	%rbp

In the result (which I stripped down to show just the function), the two sall instructions are the x86 shift instructions. A couple of sal, as expected… but this is not working as I want when the shift count reaches the width of the register. The code seems okay, and there is nothing weird in it, so it is time to check the Intel documentation, and yes! On page 1295 I found my answer.

The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used).

Intel® 64 and IA-32 Architectures Software Developer Manuals

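Here we have it! When I shift by 32, the 5-bit mask is applied to my shift count:
b 0010 0000   (count = 32)
b 0001 1111   (mask = 31)
b 0000 0000   (count AND mask = 0)

The sall 32 Rd is actually a sall 0 Rd in disguise!! With a count of 0, ~0 stays 0xFFFFFFFF, and the outer ~ turns it into 0x00000000 instead of the 0xFFFFFFFF I expected.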
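
The count masking can be mimicked in plain C to see how it affects the mask expression. This is a rough model, not what the compiler emits: x86_sal32 and get_mask_on_x86 are hypothetical names, and the only thing modeled is the "count is masked to 5 bits" rule quoted above.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit x86 shift: only the low 5 bits of the
 * count are used, as described in the Intel manual excerpt. */
uint32_t x86_sal32(uint32_t value, unsigned count) {
    return value << (count & 31u);
}

/* The original mask expression, with both shifts routed through the model. */
uint32_t get_mask_on_x86(int hi, int lo) {
    uint32_t shifted = x86_sal32(~(uint32_t)0, (unsigned)((hi + 1) - lo));
    return x86_sal32(~shifted, (unsigned)lo);
}
```

Under this model get_mask_on_x86(7, 4) still gives 0x000000F0, while get_mask_on_x86(31, 0) gives 0x00000000, because the count of 32 is reduced to 0 before the shift happens.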

Fixed Function

I am very lazy, so… if I can’t shift by 31+1=32, I will shift twice: once by hi - lo, and one extra shift by one.

uint32_t get_mask(int hi, int lo){
    /* ~0u instead of ~0: left-shifting a negative int is undefined in C */
    return (uint32_t) ~(~0u << (hi - lo ) << 1 ) << lo;
}

Problem solved.

In this case I do not bother to check the input values. My precondition is that both values are in the range 0..31 and that hi > lo always.

This has been a nice afternoon; I had not opened the Intel reference manuals since I was a student 😛
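
If checking the inputs ever becomes worthwhile, a possible sketch is the same two-step shift with the preconditions asserted. The name get_mask_checked is made up, and as an assumption of this sketch hi == lo is also accepted, which gives a single-bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical checked variant: same two-step shift as get_mask, with the
 * preconditions asserted. Assumption of this sketch: hi == lo is allowed. */
uint32_t get_mask_checked(int hi, int lo) {
    assert(lo >= 0 && hi <= 31 && hi >= lo);

    uint32_t ones = ~(uint32_t)0;  /* unsigned, so the shifts are well defined */

    /* (hi - lo) is at most 31, so neither shift count ever reaches 32 */
    return ~(ones << (hi - lo) << 1) << lo;
}
```

For the examples at the top of the post, get_mask_checked(31, 0), get_mask_checked(3, 0) and get_mask_checked(31, 4) give 0xFFFFFFFF, 0x0000000F and 0xFFFFFFF0.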


Intel reference manuals ⇒GO

