The QNX Community Portal

Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Sat Apr 23, 2016 7:29 pm
by sheran.vaz
Hi,

How do I debug a page fault reported by a QNX kernel dump message?

Regards
Sheran

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Sun Apr 24, 2016 6:03 pm
by maschoen
I'm not sure how to read a kernel dump; the problem doesn't come up very often. Here are some possible causes to consider, from most to least likely.

1) Bug in a user written driver.

2) Failure of custom hardware.

3) Failure of standard hardware.

4) Bug in the QNX kernel.

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 1:53 pm
by sheran.vaz
How do I map .sym files to the kernel dump? Would that help debug the issue?

Regards
Sheran

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 2:19 pm
by nico04
maschoen wrote: I'm not sure how to read a kernel dump, the problem doesn't come up very often. Here are some things to consider the possibility of, from most to least likely.

1) Bug in a user written driver.

2) Failure of custom hardware.

3) Failure of standard hardware.

4) Bug in the QNX kernel.


I'd add QNX's own drivers to the list, especially the ones that manipulate memory, such as graphics drivers.

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 6:24 pm
by sheran.vaz
Thanks for your input.

How do I map .sym files to the kernel dump? Would that help debug the issue?

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 6:30 pm
by sheran.vaz
Every time, I get the same dump:

instruction[f007f0fe]
c3 e9 3c fc ff ff cd 28 c3 e9 34 fc ff ff b8 5c 00 00 00 f7 05 88 a7 09 f0 00

I'm looking for a way to map the last executed instructions to a process.

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 9:21 pm
by sheran.vaz
The registers edi, edx, eax, eip, efl, cs and ss are the same for every crash.

edi = 00000010 (SAME)
esi = efead7ac
ebp = effcef2c
exx = efef1988
ebx = efead708
edx = f007f0fe (SAME)
ecx = effcee80
eax = 00000000 (SAME)
eip = f007f0fe (SAME)
cs = 0000001d (SAME)
efl = 00001246 (SAME)
esp = effcee80
ss = 00000099 (SAME)

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 9:44 pm
by maschoen
Based on this, there is a little information I can provide. Your eip is f007f0fe. The high bits of this address indicate you are running at the highest protection level, which means either procnto is running or you are in the kernel. The code segment 0000001d is the code segment of procnto.

Given that it repeats, it seems most likely to be caused by an interrupt handler bug. Have you written any drivers that use interrupts?
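
As a side note on reading the cs and ss values: on x86, a segment selector packs a descriptor index, a table-indicator bit (GDT or LDT) and a requested privilege level into 16 bits. The sketch below only applies that standard layout to the two values from the dump. The name decode_selector is made up for illustration, and the decoding says nothing about what a given descriptor index means on QNX.

```python
def decode_selector(selector: int) -> dict:
    """Split an x86 segment selector into its three architectural fields."""
    return {
        "index": selector >> 3,                      # descriptor table index (bits 3..15)
        "table": "LDT" if selector & 0x4 else "GDT", # table indicator (bit 2)
        "rpl": selector & 0x3,                       # requested privilege level (bits 0..1)
    }

# cs and ss as reported in the kernel dump quoted in this thread.
for name, value in (("cs", 0x1D), ("ss", 0x99)):
    print(name, hex(value), decode_selector(value))
```

For cs, the low two bits are the privilege level the faulting code was running at, which is a quick way to cross-check a statement about protection levels against the dump itself.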



sheran.vaz wrote: The registers edi, edx, eax, eip, efl, cs and ss are all same for every crash.

edi = 00000010 (SAME)
esi = efead7ac
ebp = effcef2c
exx = efef1988
ebx = efead708
edx = f007f0fe (SAME)
ecx = effcee80
eax = 00000000 (SAME)
eip = f007f0fe (SAME)
cs = 0000001d (SAME)
efl = 00001246 (SAME)
esp = effcee80
ss = 00000099 (SAME)

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 10:05 pm
by sheran.vaz
I'm only using the binaries and drivers that ship with QNX 6.5.0 SP1. I don't have any custom drivers that use interrupts.

Regards
Sheran

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 10:25 pm
by sheran.vaz
When I run "pidin backtrace", I see:

1-01 f007f0f9
1-02 f007f0f9
1-03 f007f0f9
1-04 f007f0fc
1-05 f007eeb4
1-06 f007eeb4
1-07 f007f0fc
1-08 f007eeb4
1-09 f007eeb4
1-11 f007eeb4
1-13 f007eeb4
1-14 f007eeb4
1-15 f007eeb4
1-17 f007eeb4
1-18 f007f0fc

Does this mean it belongs to thread 4 of procnto-smp? The target is dual-core with hyperthreading enabled, so it has 4 CPUs.
Thread 4 is a special idle thread with priority 0 for the 4th CPU.

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Mon Apr 25, 2016 10:33 pm
by sheran.vaz
Sorry, f007f0fc should be the address of some routine, and I can see it being called by multiple threads in different instances. It is not tied to a particular thread.

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Tue Apr 26, 2016 1:52 am
by sheran.vaz
Virtual address of the procnto-smp code segment ==> f0018000
Routine pointed to by the instruction pointer ==> f007f0fe
Offset of the routine pointed to by the instruction pointer ==> 670fe

If I take an objdump of procnto-smp, I get:

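
The offset above is just the faulting eip minus the code segment base. A minimal sketch of that arithmetic, using the two addresses from this post (the function name image_offset is made up for illustration):

```python
CODE_BASE = 0xF0018000   # virtual address of the procnto-smp code segment (from the post)
FAULT_EIP = 0xF007F0FE   # eip reported in the kernel dump

def image_offset(address: int, base: int = CODE_BASE) -> int:
    """Return the offset of a runtime address within the image loaded at base."""
    if address < base:
        raise ValueError("address lies below the image base")
    return address - base

print(hex(image_offset(FAULT_EIP)))   # offset to look up in the objdump listing
```

The same helper can be applied to any other kernel address from the dump or from pidin, as long as it belongs to the same image.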
....
....
000670e4 <__Ring0>:
   670e4:       b8 02 00 00 00          mov    $0x2,%eax
   670e9:       f7 05 00 00 00 00 00    testl  $0x400,0x0
   670f0:       04 00 00
   670f3:       ba fe 70 06 00          mov    $0x670fe,%edx
   670f8:       74 0a                   je     67104 <__Ring0+0x20>
   670fa:       89 e1                   mov    %esp,%ecx
   670fc:       0f 34                   sysenter
   670fe:       c3                      ret
   670ff:       e9 fc ff ff ff          jmp    67100 <__Ring0+0x1c>
   67104:       cd 28                   int    $0x28
   67106:       c3                      ret
   67107:       e9 fc ff ff ff          jmp    67108 <__Ring0+0x24>

0006710c <SchedCtl>:
   6710c:       b8 5c 00 00 00          mov    $0x5c,%eax
   67111:       f7 05 00 00 00 00 00    testl  $0x400,0x0
   67118:       04 00 00
   6711b:       ba 26 71 06 00          mov    $0x67126,%edx
   67120:       74 0a                   je     6712c <SchedCtl+0x20>
   67122:       89 e1                   mov    %esp,%ecx
   67124:       0f 34                   sysenter
   67126:       c3                      ret
   67127:       e9 fc ff ff ff          jmp    67128 <SchedCtl+0x1c>
   6712c:       cd 28                   int    $0x28
   6712e:       c3                      ret
   6712f:       e9 fc ff ff ff          jmp    67130 <SchedCtl+0x24>
.....
.....

The code from 670fe (the ret in __Ring0) to 67111 (the testl in SchedCtl) seems similar to the instructions in the kernel dump, but not exactly the same.

0:  c3                      ret
1:  e9 3c fc ff ff          jmp    0xfffffc42
6:  cd 28                   int    0x28
8:  c3                      ret
9:  e9 34 fc ff ff          jmp    0xfffffc42
e:  b8 5c 00 00 00          mov    eax,0x5c
13: f7 05 88 a7 09 f0


Does this mean the kernel was executing the __Ring0 and SchedCtl routines?
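
One way to compare the two listings is to resolve the relative jumps in the dump to absolute addresses. The objdump output shows e9 fc ff ff ff and a 0x0 memory operand, which is the usual look of operands that have not been filled in by relocation yet, so the operand bytes are expected to differ from the running kernel while the opcodes match. The sketch below takes the dump bytes and the two addresses from this thread and decodes the two jmp rel32 instructions at offsets 1 and 9; the helper name jmp_rel32_target is made up for illustration.

```python
import struct

# Bytes printed after "instruction[f007f0fe]" in the kernel dump.
DUMP = bytes.fromhex(
    "c3 e9 3c fc ff ff cd 28 c3 e9 34 fc ff ff "
    "b8 5c 00 00 00 f7 05 88 a7 09 f0 00"
)
DUMP_BASE = 0xF007F0FE   # eip, taken as the address of DUMP[0]
CODE_BASE = 0xF0018000   # procnto-smp code segment base quoted in the thread

def jmp_rel32_target(code: bytes, offset: int, base: int) -> int:
    """Return the absolute target of the 'jmp rel32' (opcode E9) at code[offset]."""
    if code[offset] != 0xE9:
        raise ValueError("no jmp rel32 (opcode E9) at this offset")
    (rel,) = struct.unpack_from("<i", code, offset + 1)  # signed 32-bit displacement
    next_insn = base + offset + 5                        # jmp rel32 is 5 bytes long
    return (next_insn + rel) & 0xFFFFFFFF

for off in (1, 9):
    target = jmp_rel32_target(DUMP, off, DUMP_BASE)
    print(f"jmp at +{off:#x}: target {target:#010x}, image offset {target - CODE_BASE:#x}")
```

Both jumps resolve to the same address, so the image offset that is printed is the next place to look up in the objdump listing. Note that the dump simply shows the bytes in memory starting at eip, so it runs past the end of __Ring0 into whatever function follows it.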

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Tue May 10, 2016 3:14 am
by sheran.vaz
I have a Qt GUI application that updates a set of numbers every 200 ms. hogs shows about 70% CPU utilisation for this process. I'm not sure the utilisation really needs to be that high. Could this GUI process be causing the page fault, given that it should be accessing the video buffer at 0xa0000?

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Tue May 10, 2016 5:16 am
by maschoen
sheran.vaz wrote: Have a qt GUI application, which has a bunch of numbers getting updated every 200ms. Hogs shows utilisation of 70% for this process. Not sure if it is really required for the CPU utilisation to go that much. And can it be this GUI process causing the page fault as it should be accessing the video buffers 0xa0000.


I don't know if you are using Qt under screen (usually 6.6 or later) or under gf (6.4-6.5). Either way, Qt is not writing directly to 0xa0000, but rather is using a screen or gf call.

The way Qt works is that it renders graphics to a RAM image and then blits it using the screen or gf interface. Depending on what you are updating and how smart Qt is, that could be a small or a large amount of data.

One thing you might consider is why you need to update a number 5 times a second. True, the human eye can detect changes this rapidly, but it's a number, not a bad guy you need to shoot. One easy enhancement would be to check whether the number has changed before updating.
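
The "check whether the number has changed" idea can be sketched as a small gate placed in front of whatever call redraws the widget. This is only an illustration in Python, not Qt code: ChangeGate and its render callback are hypothetical names, and the list append stands in for the real display update.

```python
from typing import Callable

_UNSET = object()  # sentinel meaning "nothing has been drawn yet"

class ChangeGate:
    """Forward a value to render() only when it differs from the last one drawn."""

    def __init__(self, render: Callable[[object], None]) -> None:
        self._render = render
        self._last = _UNSET

    def update(self, value: object) -> bool:
        """Return True if the value was redrawn, False if the redraw was skipped."""
        if self._last is not _UNSET and value == self._last:
            return False
        self._last = value
        self._render(value)
        return True

# Demo: six 200 ms ticks, but only three of them carry a new value.
drawn = []
gate = ChangeGate(drawn.append)   # drawn.append stands in for the real widget update
for sample in [42, 42, 42, 43, 43, 42]:
    gate.update(sample)
print(drawn)
```

With one gate per displayed number, the render and blit work happens only on ticks where something actually changed.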

Re: Kernel Dump msg S/C/F 11/2/11 - Page fault

Posted: Tue May 10, 2016 6:15 am
by sheran.vaz
I'm not sure whether my GUI is actually contributing to the page fault, but is it possible to reduce the GUI's CPU utilisation while keeping my application as it is? Also, I don't see any CPU utilisation by the io-display driver.