BULLETIN NUMBER:  RH-SWB-002

 

DATE ISSUED:  October 31, 2007

 

DATE CLOSED:  N/A (this problem will never be solved)

 

AFFECTED SYSTEMS:  iHawk systems using Nvidia chipsets on the motherboard

                           NOTE: this does not apply to Nvidia graphics controllers

 

RELEASE LEVEL:  RedHawk Linux 4.1 and all 4.1 updates

 

EXPLANATION:  IRQ0 shielding is broken on some Nvidia-based iHawk systems.

 

 Timer overrides on some Nvidia chipsets do not work.  The breakage can

 depend on the BIOS version.

 

 The result is that IRQ0 is programmed as an XT-PIC IRQ.

 

 XT-PIC is legacy 8259 (PIC) mode and does not support multiprocessor IRQ

 handling.

 

 It has been observed that this prevents shielding IRQ0 on CPU0 (presumably

 the boot CPU). There may be other associated problems, such as constant

 generation of parallel port interrupts that also cannot be shielded.

 

 

 The condition is fairly easy to spot in /proc/interrupts:

 ---------------------------------------------------------

 

           CPU0       CPU1       CPU2       CPU3

  0:      92639        937        190         69          XT-PIC  timer
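
 The check above can be automated. The following is a minimal sketch,
 assuming the /proc/interrupts layout shown in this bulletin (a header row
 of CPU columns, then one row per IRQ). The function names are hypothetical
 and are not part of RedHawk.

```python
def irq0_controller(interrupts_text):
    """Return the controller name listed for IRQ 0, or None if not found."""
    lines = interrupts_text.splitlines()
    if not lines:
        return None
    # The header row has one column per CPU (CPU0, CPU1, ...).
    ncpus = sum(1 for tok in lines[0].split() if tok.startswith("CPU"))
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "0:":
            # Skip the IRQ label and the per-CPU counters.
            rest = fields[1 + ncpus:]
            return rest[0] if rest else None
    return None


def irq0_uses_ioapic(interrupts_text):
    """True for IO-APIC routing, False for XT-PIC, None if undetermined."""
    ctrl = irq0_controller(interrupts_text)
    if ctrl is None:
        return None
    if ctrl.startswith("IO-APIC"):
        return True
    if ctrl.startswith("XT-PIC"):
        return False
    return None
```

 On a live system the input would be the contents of /proc/interrupts. A
 result of False corresponds to the XT-PIC condition described here.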

 

                

 This will be accompanied by the following dmesg output (pre-4.1.11):

 --------------------------------------------------------------------

 

 Nvidia board detected. Ignoring ACPI timer override.

 

            <snip>

 

 ..MP-BIOS bug: 8254 timer not connected to IO-APIC

 failed

 

 

 The 2.6.15.4 Linux kernel assumes _ALL_ Nvidia chipset timer overrides

 are bogus and therefore always forces the "acpi_skip_timer_override"

 policy whenever the boot code discovers an Nvidia chipset.

 

 This may or may not be the correct action; which policy is correct has been

 observed to change depending on the BIOS revision.

 

 The acpi_skip_timer_override policy has been relaxed in later kernels,

 depending on the presence of an HPET timer and/or the _exact_ chipset

 id.

 

 More recent motherboards with Nvidia chipsets likely support ACPI

 timer overrides.

 

RESOLUTION:  There are a few workarounds

 

 1.  Update to RedHawk 4.2 or later

 

 RedHawk 4.2 and later kernels are "tickless" and do not use global timer

 interrupts (IRQ0), so this problem should not affect RedHawk 4.2, even though

 similar boot messages may appear and IRQ0 may still be programmed as an

 XT-PIC interrupt.

 

 2.  Update to RedHawk 4.1.11 or later

 

 The RedHawk 4.1.11 default behavior is to allow ACPI timer overrides.  This

 policy should be the most appropriate for more recent motherboards.

 

 The RedHawk 4.1.11 default behavior can be overridden by booting with the kernel

 boot parameter:

 

                        acpi_skip_timer_override

 

 The correct policy to allow or disallow ACPI timer overrides must be

 determined for each individual system.

 

 *** NOTE:  The default 4.1.11 Nvidia chipset policy may break some systems!

            The breakage can be reversed by using the boot parameter.
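
 To confirm whether a running system was booted with the parameter, the
 kernel command line can be inspected. This is a minimal sketch, assuming
 the command line is read from /proc/cmdline; the function name is
 hypothetical.

```python
def has_boot_param(cmdline_text, name="acpi_skip_timer_override"):
    """True if `name` appears as a whole word on the kernel command line."""
    # Parameters are whitespace-separated; match whole tokens only so that
    # a longer parameter containing the same substring is not counted.
    return name in cmdline_text.split()
```

 For example, has_boot_param(open("/proc/cmdline").read()) reports whether
 the current boot ignored ACPI timer overrides by request.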

 

 

 The 4.1.11 boot output now contains the following messages:

 

 Default:

 --------

 Nvidia board detected. Allowing ACPI timer override.

 If you have timer trouble try acpi_skip_timer_override.

 

 Using the acpi_skip_timer_override boot parameter:

 --------------------------------------------------

 Nvidia board detected. Ignoring ACPI timer override.

 WARNING: acpi_skip_timer_override may break shielding of IRQ0.
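
 The boot messages quoted in this bulletin can be matched in saved dmesg
 output to determine which policy the kernel applied. This is a minimal
 sketch; the function names are hypothetical, and the matched strings are
 only those shown in this bulletin.

```python
ALLOW_MSG = "Allowing ACPI timer override"
IGNORE_MSG = "Ignoring ACPI timer override"
MP_BIOS_MSG = "8254 timer not connected to IO-APIC"


def timer_override_policy(dmesg_text):
    """Return 'allowed', 'ignored', or None if no Nvidia message is found."""
    policy = None
    for line in dmesg_text.splitlines():
        if "Nvidia board detected" not in line:
            continue
        if ALLOW_MSG in line:
            policy = "allowed"
        elif IGNORE_MSG in line:
            policy = "ignored"
    return policy


def has_mp_bios_timer_bug(dmesg_text):
    """True if the MP-BIOS 8254 timer message appears in the output."""
    return MP_BIOS_MSG in dmesg_text
```

 A result of "ignored" together with the MP-BIOS message matches the
 failing case described under EXPLANATION.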

 

 

 The correct choice can be verified by looking at /proc/interrupts:

 ------------------------------------------------------------------

 

           CPU0       CPU1       CPU2       CPU3

  0:        137          3         24      37812    IO-APIC-edge  timer

 

 

 Note that the timer interrupt above is correctly programmed as "IO-APIC-edge".

 

 

 3.  Upgrading and/or downgrading the BIOS rev may or may not change the

     result.

 

 CAUTION:  ALWAYS HAVE A BACKUP COPY OF YOUR CURRENT BIOS BEFORE ATTEMPTING

 TO CHANGE THE VERSION.