The QNX Community Portal

Regarding ThreadCtl runmask

Post by lullaby » Tue Feb 26, 2013 6:24 am

Hi all,

Regarding my previous query in http://www.openqnx.com/phpbbforum/viewt ... =7&t=13625
My code segment looks like the following:

*cpu_run_mask = 0x1;
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);
ClockCycles();
Write to disk and perform other calculation.....
ClockCycles();
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);

I have a question. I understood from the ThreadCtl help page that "By default, a thread's runmask is set to all ones, which allows it to run on any available processor. A value of 0x01 would, for example, force the thread to run only on the first processor."
My question is: if the first processor is not available for some reason, what happens when ThreadCtl is called to lock the thread to the first processor?
Does this cause a system hang?
What happens if I replace cpu_run_mask with the default value 0xFFFF (all ones)? Does this mean the operations that follow (within the ThreadCtl block) run on whichever processor is available at that time? Or is it equivalent to not doing any thread locking? I mean, if the run mask is set to the default all-ones value, the thread can run on any CPU, so no processor affinity is enforced.
I am not clear on the concept of the run mask. Please help. I am also trying to analyse a system hang issue. :cry:

Thanks,
Lullaby
lullaby
Active Member
 
Posts: 65
Joined: Fri Jun 15, 2012 6:19 am

Re: Regarding ThreadCtl runmask

Post by maschoen » Tue Feb 26, 2013 8:32 pm

lullaby wrote: Hi all,
*cpu_run_mask = 0x1;
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);
ClockCycles();
Write to disk and perform other calculation.....
ClockCycles();
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);


I don't understand why you are calling ThreadCtl() a second time. The first call limits your thread to the first processor. The second call does the same thing.

lullaby wrote: I have a question. I understood from the ThreadCtl help page that "By default, a thread's runmask is set to all ones, which allows it to run on any available processor. A value of 0x01 would, for example, force the thread to run only on the first processor."
My question is: if the first processor is not available for some reason, what happens when ThreadCtl is called to lock the thread to the first processor?


Your use of the word "lock" already suggests a confusion. Yes, you are locking the thread to cpu 1. This is not the same as a mutex.

lullaby wrote: Does this cause a system hang?


No, it would only cause your thread to hang.

lullaby wrote: What happens if I replace cpu_run_mask with the default value 0xFFFF (all ones)?


Where are you doing the replacement?

lullaby wrote: Does this mean the operations that follow (within the ThreadCtl block) run on whichever processor is available at that time?


This is really confusing, are you saying:

ThreadCtl( (Not real code) 0xffff)
Thread-block
ThreadCtl( 0xffff)

?????
This would do nothing.


lullaby wrote: Or is it equivalent to not doing any thread locking?


See, here the word "locking" is getting you into trouble.

Most programs would do one of two things with respect to cpu affinity.
1) Leave the default 0xffff
2) Set it at startup once

While there might be a hardware reason, in general, jacking around the affinity makes no sense.

lullaby wrote: I mean, if the run mask is set to the default all-ones value, the thread can run on any CPU, so no processor affinity is enforced.

This is making something very simple, very confusing. You set the affinity mask to limit the thread to specific processors. That is all it does.
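
To make the bit layout concrete, here is a minimal sketch in plain C of how a runmask value maps to processors. It assumes CPUs are numbered from bit 0, as the help-page quote about 0x01 implies. The names runmask_allows and runmask_describe are made up for illustration and are not part of the QNX API.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* True if bit `cpu` is set in the runmask, i.e. the thread may run on that
   CPU. CPUs are counted from 0, so 0x1 selects only the first processor. */
bool runmask_allows(unsigned mask, unsigned cpu)
{
    /* Guard the shift: shifting by the full width of the type is undefined. */
    return cpu < sizeof(mask) * CHAR_BIT && ((mask >> cpu) & 1u) != 0;
}

/* Print which of the first `ncpus` processors the runmask permits and
   return how many of them are permitted. */
unsigned runmask_describe(unsigned mask, unsigned ncpus)
{
    unsigned allowed = 0;

    printf("runmask 0x%X:", mask);
    for (unsigned cpu = 0; cpu < ncpus; cpu++) {
        if (runmask_allows(mask, cpu)) {
            printf(" cpu%u", cpu);
            allowed++;
        }
    }
    printf("\n");
    return allowed;
}
```

On a quad-core machine, runmask_describe(0x1, 4) reports only cpu0, while runmask_describe(0xFFFF, 4) reports all four, which is the same as leaving the default in place.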

lullaby wrote: I am not clear on the concept of the run mask. Please help. I am also trying to analyse a system hang issue. :cry:


Well now something that makes sense. Read my comments above and the documentation on ThreadCtl().
It doesn't sound like cpu affinity is something you want or need to do. In fact, it should be needed rarely.
I'd be curious to know why you think you need it.
maschoen
QNX Master
 
Posts: 2503
Joined: Wed Jun 25, 2003 5:18 pm

Re: Regarding ThreadCtl runmask

Post by lullaby » Wed Feb 27, 2013 4:53 am

Hi,
Thank you for the detailed reply. Please see the answers to your questions below:-

maschoen wrote: I don't understand why you are calling ThreadCtl() a second time. The first call limits your thread to the first processor. The second call does the same thing.

maschoen wrote: Your use of the word "lock" already suggests a confusion. Yes, you are locking the thread to cpu 1. This is not the same as a mutex.


[Lullaby] >> I took the word "locking" from the QNX help page for ClockCycles(), where I read the following. Maybe my interpretation is wrong; please correct me.
Symmetric MultiProcessing systems
This function, depending on the CPU architecture, returns a value from a register that's unique to each CPU in an SMP system — for instance, the TSC (Time Stamp Counter) on an x86. These registers aren't synchronized between the CPUs. So if you call ClockCycles(), and then the thread migrates to another CPU and you call ClockCycles() again, you can't subtract the two values to get a meaningful time duration.

If you wish to use ClockCycles() on an SMP machine, you must use the following call to “lock” the thread to a single CPU:

ThreadCtl(_NTO_TCTL_RUNMASK, ...)


I need to measure the execution time of the write() system call, so I call ClockCycles() before and after the write() call. Since my multithreaded application runs on a multi-core machine, I implemented this thread-locking mechanism. When I read about ThreadCtl, I interpreted it as something like a mutex lock/unlock. Please also read the following extract from the ThreadCtl help page:
_NTO_TCTL_RUNMASK_GET_AND_SET
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, data)

Get and set the runmask (the processor affinity) to a proper value for the calling thread in a multiprocessor system. The data parameter is a pointer to the runmask. On input, the pointer to value is used to set the new runmask for the thread (see _NTO_TCTL_SET_RUNMASK for details). After the function has completed, the contents of *data will be replaced with the previous runmask for the thread. Calling ThreadCtl again with the same pointer will restore the runmask to the state before the call.


So my code looks somewhat like this:
*cpu_run_mask = 0x1;
while(1){ .....
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);
ClockCycles();
Write to disk and perform other calculation.....
ClockCycles();
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);
.....
}

I mean that by the first call I am locking the thread to the first processor, and by the second I am unlocking it.
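
For comparison, here is a hedged sketch of that lock/measure/unlock pattern, following the help-page extract above, which describes data as a pointer to the runmask. The sketch therefore passes the address of the mask rather than the mask value cast to a pointer. The helper names time_pinned and sample_work are hypothetical. The #else branch holds stand-ins that only imitate the documented swap so the sketch compiles off-target; they are assumptions, not QNX code.

```c
#ifndef __QNXNTO__
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() in the stand-in */
#endif
#endif
#include <stdint.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>   /* ThreadCtl(), ClockCycles(), _NTO_TCTL_* */
#else
/* Off-target stand-ins (assumptions, NOT the QNX implementation). They mimic
   only the documented behaviour: *data is swapped with the current mask. */
#include <time.h>
#define _NTO_TCTL_RUNMASK_GET_AND_SET 1
unsigned standin_runmask = ~0u;   /* pretend default: all ones */

int ThreadCtl(int cmd, void *data)
{
    unsigned *mask = data;
    unsigned previous = standin_runmask;

    (void)cmd;
    standin_runmask = *mask;
    *mask = previous;
    return 0;
}

uint64_t ClockCycles(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}
#endif

/* Hypothetical workload: bump a counter 1000 times. */
void sample_work(void *arg)
{
    unsigned *counter = arg;

    for (int i = 0; i < 1000; i++)
        (*counter)++;
}

/* Pin the calling thread to the first CPU, time work(arg) in cycles, then
   restore the previous runmask. Returns 0 on success, -1 on failure. */
int time_pinned(void (*work)(void *), void *arg, uint64_t *cycles)
{
    unsigned runmask = 0x1;   /* first processor only */

    /* Pass the ADDRESS of the mask; on return it holds the old runmask. */
    if (ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, &runmask) == -1)
        return -1;

    uint64_t start = ClockCycles();
    work(arg);
    *cycles = ClockCycles() - start;

    /* Same pointer again: the saved mask is swapped back in. */
    if (ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, &runmask) == -1)
        return -1;
    return 0;
}
```

Keeping the mask in a local variable means the second call always restores whatever the first call saved, even when the function is called repeatedly in a loop.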

maschoen wrote: No, it would only cause your thread to hang.

[Lullaby] >> In our case, when my application runs overnight on a quad-core machine, the whole system hangs. Even the system time is not updated. I have yet to find the cause. While analysing this issue, I have come to suspect the ThreadCtl() function.

maschoen wrote: Where are you doing the replacement?

[Lullaby] >> I mean, I just set the content of cpu_run_mask to 0xFF and build and run the application again.
The code looks like:
*cpu_run_mask = 0xFF;
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);
ClockCycles();
Write to disk and perform other calculation.....
ClockCycles();
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);

maschoen wrote: This is really confusing, are you saying:

ThreadCtl( (Not real code) 0xffff)
Thread-block
ThreadCtl( 0xffff)

?????
This would do nothing.


[Lullaby] >> Yes, that is what I mean. In my code, the above ThreadCtl block is inside a while loop, because I need to read ClockCycles around every write operation. Could this cause my system to hang after a long run? I understood that, with the operations enclosed in a ThreadCtl block, they will run on any processor available at that time if the run mask is set to 0xFF.
If that is not so, could you please clarify a bit? Are you saying that the second call is not needed, and that if ThreadCtl is called once at the start of a thread, that thread will run on only one processor (no matter whether 0x1 or 0xFF is given as the run mask)? Is my new understanding correct? If so, can I rewrite my code as follows?

On thread startup,
*cpu_run_mask = 0xFF;
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);
:
:
:
while(1)
{
ClockCycles();
Write to disk and perform other calculation.....
ClockCycles();
:
:
}

with no need to call ThreadCtl() again. Is that right?

maschoen wrote: See, here the word "locking" is getting you into trouble.

Most programs would do one of two things with respect to cpu affinity.
1) Leave the default 0xffff
2) Set it at startup once

While there might be a hardware reason, in general, jacking around the affinity makes no sense.


[Lullaby] >> So are you saying that if I call
*cpu_run_mask = 0xFF;
ThreadCtl(_NTO_TCTL_RUNMASK_GET_AND_SET, (void*)*cpu_run_mask);
it is the same as not locking the thread?
Or could you please tell me whether, for my specific requirement of measuring clock cycles in every loop iteration, my rewritten code above with ThreadCtl() called only at startup will work?


By now, I think my requirement is pretty clear to you. Could you please tell me if my interpretation is wrong?

Thanks,
Lullaby

Re: Regarding ThreadCtl runmask

Post by maschoen » Wed Feb 27, 2013 8:49 am

Well in spite of what the documentation says, ThreadCtl() is merely locking the thread to a cpu. It's not anything like a mutex lock.

Yes, if you want to use ClockCycles() on a multiprocessor, it would be a good thing to lock the thread to a single cpu.
You will be measuring the time between the call and return of the system write().
This measurement will be fairly pointless if you understand how the whole thing works.

If you open a file normally and write to it in this manner, then most likely the system will just copy the data to a cache and return. So in that case you will be measuring mostly the message passing data rate from one process (yours) to another (the file system). I say mostly because if the amount of data is small, then the overhead will dominate the measurement. Is that what you want to measure?

Of course if the cache is full, which could happen, then what happens is a little different. The file system will need to flush some "stale" data to disk, not necessarily yours, to free up space in the cache. It probably will not be flushing the same amount of data to disk as you have requested in your write. In this case, your measurement will be almost meaningless.

There are some file settings that will force the data to disk immediately, but if you measure that, you are not measuring the real throughput of the file system.

In other words, it's fairly hard to measure the speed of the file system, because it operates asynchronously.

This all sounds suspiciously like a long, drawn-out discussion that took place a few weeks ago, in which we tried to tell the poster that the only way to get a reasonable average I/O speed measurement would be to write a large amount of data. If you do that, there is no need to use ClockCycles(). The regular time functions will give you more than enough accuracy, and you can throw away the whole ThreadCtl() strategy.
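
As a rough illustration of that approach, here is a minimal sketch that times one large write with the ordinary POSIX clock instead of ClockCycles(). The function name time_bulk_write, the 64 KiB chunk size and the fill byte are arbitrary choices for the example, not anything prescribed by QNX. No fsync() is issued, so part of the data may still be sitting in the cache when the timer stops.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Write total_bytes to path in 64 KiB chunks and return the elapsed
   wall-clock time in seconds, or -1.0 on error. */
double time_bulk_write(const char *path, size_t total_bytes)
{
    static char chunk[64 * 1024];
    struct timespec start, end;
    size_t remaining = total_bytes;

    memset(chunk, 0xA5, sizeof chunk);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);

    while (remaining > 0) {
        size_t want = remaining < sizeof chunk ? remaining : sizeof chunk;
        ssize_t done = write(fd, chunk, want);

        if (done <= 0) {          /* error or no progress: give up */
            close(fd);
            return -1.0;
        }
        remaining -= (size_t)done;
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Dividing the byte count by the returned time gives an average rate. The more data written relative to the cache size, the less the per-call overhead and caching effects described above distort the figure.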

