
Author Topic: System hangs when idle, ACPI errors in dmesg  (Read 2477 times)


Offline xanadu

  • Jr. Member
  • **
  • Posts: 66
  • Karma: 4
  • New Forum User
    • View Profile
  • Peppermint version(s): Peppermint 6
System hangs when idle, ACPI errors in dmesg
« on: June 13, 2015, 01:00:09 pm »
I'm not sure if this is hardware related or a software bug. I have been getting random system hangs while the computer is idle (usually after 8+ hours of idling). When this happens, it stops responding to input and SSH sessions disconnect. If I go to check on the computer (rather than logging in via SSH), sometimes I can still see the desktop but everything is frozen; other times I turn on the monitor and nothing happens at all. To recover, I have to hold Alt+SysRq and type REISUB in order to reboot.

I suspect the CPU is overheating (the sensors command reports the PCI temp as 86C when idle), but I'm seeing various errors related to ACPI as well. I did some searching, and there have been bug reports filed that seem related to this.

Also, I don't think these problems are specific to Peppermint. I ran Peppermint 3 for 2 years with only a few system hangs, and now that I've upgraded to 5, it hangs every day. I also tried running Peppermint 6 and Chromixium (iirc, a fork of Ubuntu 14.04.2 LTS), and they also ran into the same problems. This is why I'm not sure if it's a kernel bug or a hardware issue.

I guess what I'd like to know is how to proceed to diagnose this. Here's the relevant output from dmesg:

Code:
[   75.766045] nouveau  [  PTHERM][0000:01:00.0] temperature (90 C) hit the 'fanboost' threshold

Code:
[    0.370082] pci 0000:00:1f.0: address space collision: [io  0x0800-0x087f] conflicts with ACPI CPU throttle [??? 0x00000810-0x00000815 flags 0x80000000]
[   16.575813] ACPI Warning: 0x00000828-0x0000082f SystemIO conflicts with Region \GLBC 1 (20131115/utaddress-251)
[   16.575828] ACPI Warning: 0x00000828-0x0000082f SystemIO conflicts with Region \SACT 2 (20131115/utaddress-251)
[   16.575838] ACPI Warning: 0x00000828-0x0000082f SystemIO conflicts with Region \SSTS 3 (20131115/utaddress-251)
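
When a log is longer than the few lines quoted above, it can help to sort the messages into rough groups before reading them. The following is only a sketch: the sample lines are the ones quoted above, while the function names and the three categories ("thermal", "acpi", "other") are made up for illustration and are not part of any tool.

```python
import re

# Sample lines copied from the dmesg output quoted above.
SAMPLE_LINES = [
    r"[   75.766045] nouveau  [  PTHERM][0000:01:00.0] temperature (90 C) hit the 'fanboost' threshold",
    r"[    0.370082] pci 0000:00:1f.0: address space collision: [io  0x0800-0x087f] conflicts with ACPI CPU throttle [??? 0x00000810-0x00000815 flags 0x80000000]",
    r"[   16.575813] ACPI Warning: 0x00000828-0x0000082f SystemIO conflicts with Region \GLBC 1 (20131115/utaddress-251)",
    r"[   16.575828] ACPI Warning: 0x00000828-0x0000082f SystemIO conflicts with Region \SACT 2 (20131115/utaddress-251)",
    r"[   16.575838] ACPI Warning: 0x00000828-0x0000082f SystemIO conflicts with Region \SSTS 3 (20131115/utaddress-251)",
]

# "[   16.575813] message" -> seconds since boot, message text
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
# Matches the temperature wording used in the nouveau line above.
TEMPERATURE = re.compile(r"temperature \((\d+) C\)")


def classify(line):
    """Return (seconds_since_boot, kind, message), or None without a timestamp."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    seconds = float(match.group(1))
    message = match.group(2)
    if TEMPERATURE.search(message):
        kind = "thermal"
    elif "ACPI" in message:
        kind = "acpi"
    else:
        kind = "other"
    return seconds, kind, message


def summarize(lines):
    """Count how many lines fall into each (illustrative) category."""
    counts = {}
    for line in lines:
        parsed = classify(line)
        if parsed is None:
            continue
        counts[parsed[1]] = counts.get(parsed[1], 0) + 1
    return counts


if __name__ == "__main__":
    for kind, count in sorted(summarize(SAMPLE_LINES).items()):
        print(f"{kind}: {count}")
```

To run it against a real log, the saved dmesg output could be read from a file and passed to summarize() line by line.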

I'm going to try re-seating the CPU and re-applying thermal paste at some point. Even if that isn't the solution, it should help reduce the CPU temp.
« Last Edit: June 13, 2015, 03:37:10 pm by xanadu »

Offline PCNetSpec

  • Administrator
  • Hero
  • *****
  • Posts: 26468
  • Karma: 2866
  • "-rw-rw-rw-" .. The Number Of The Beast
    • View Profile
    • PCNetSpec
  • Peppermint version(s): Peppermint 10
Re: System hangs when idle, ACPI errors in dmesg
« Reply #1 on: June 13, 2015, 04:51:39 pm »
Do you know how to temporarily apply a kernel boot parameter for a single boot?

if so you may want to try the
processor.nocst=1
boot parameter .. and see if that helps bring down the CPU temps and get rid of the ACPI warnings.

If you don't know how to do this .. ask.

and keep an eye on temps whilst testing this parameter
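
One way to keep an eye on temps without extra tools is to read the kernel's thermal zones from sysfs, which report values in millidegrees Celsius. This is a minimal sketch, assuming the machine exposes zones under /sys/class/thermal (not every board does, and the sensors command mentioned earlier may show more). The function names are made up for illustration, and the 80 C limit is just an example figure.

```python
from pathlib import Path


def parse_millidegrees(raw):
    """Convert a sysfs temperature string such as '86000\\n' to degrees C."""
    return int(raw.strip()) / 1000.0


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone."""
    temps = {}
    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
        try:
            temps[temp_file.parent.name] = parse_millidegrees(temp_file.read_text())
        except (OSError, ValueError):
            continue  # zone not readable on this machine; skip it
    return temps


def zones_above(temps, limit_c):
    """Return the names of zones at or above limit_c, sorted by name."""
    return sorted(name for name, value in temps.items() if value >= limit_c)


if __name__ == "__main__":
    current = read_zone_temps()
    for name, value in current.items():
        print(f"{name}: {value:.1f} C")
    # 80 C is an arbitrary example limit, not a recommended threshold.
    print("at or above 80 C:", zones_above(current, 80))
```

Running it once before and once after booting with the parameter gives two sets of readings to compare.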



FYI
http://www.linuxtopia.org/online_books/linux_kernel/kernel_configuration/re91.html
(FADT = Fixed ACPI Description Table)
and
http://www.hardwaresecrets.com/article/Everything-You-Need-to-Know-About-the-CPU-C-States-Power-Saving-Modes/611

WARNING: You are logged into reality as 'root' .. logging in as 'insane' is the only safe option.

Team Peppermint
PCNetSpec

Online VinDSL

  • Global Moderator
  • Hero
  • *****
  • Posts: 5513
  • Karma: 968
  • Peppermint Mod
    • View Profile
  • Peppermint version(s): Developmental Builds
Re: System hangs when idle, ACPI errors in dmesg
« Reply #2 on: June 13, 2015, 04:54:52 pm »
Sorry, in advance, for being rudimentary, but...

For years, around the advent of summer (in the northern hemisphere), I start reading about heat-related issues on various sites.

Here in Arizona, I've had one CPU (portable) and one GPU (desktop) shut down my machines in late May.

Taking them out in the garage and blowing out the 'dustbunnies' with a 125 psi air compressor took care of the probs (always does).

Just saying...

Offline PCNetSpec

Re: System hangs when idle, ACPI errors in dmesg
« Reply #3 on: June 13, 2015, 05:20:35 pm »
Well yeah, for SURE that should be the first step (and worth mentioning) .. check the heatsink and fan are clear and working.

I just took that as a given that he'd know that when he started talking about thermal paste ;)

Sometimes it's easy to forget others may read a topic too .. so yeah worth mentioning



One other thing needs mentioning now - if you're gonna blast 125PSI through your heatsink and fan .. lock the fan in position first .. you can seriously damage a fan by over-spinning it.
« Last Edit: June 13, 2015, 05:59:34 pm by PCNetSpec »

Online VinDSL

Re: System hangs when idle, ACPI errors in dmesg
« Reply #4 on: June 13, 2015, 05:38:35 pm »
Quote
[...] you can seriously damage a fan over spinning it.

Yup!  I do it all the time, on fans I don't care about - ones I'm replacing anyway.

I've never borked a bearing, doing this, but a blade or two usually breaks off the hub, rendering them useless and out of balance.  It's great fun, really!   ;D

On wrecking bearings, one thing I might mention -- make sure to blow the dust in the same direction as the air normally flows.  I've accidentally ruined several fans, in the past, before I got wise to this.  I *think* it contaminates the lubricant in the bearing(s), ensuring failure shortly thereafter, but that's just an educated guess.

Offline AndyInMokum

  • Global Moderator
  • Hero
  • *****
  • Posts: 4853
  • Karma: 1028
  • "Keep on Rockin' in the Free World"
    • View Profile
  • Peppermint version(s): PM 9 & PM 10 (64-bit)
Re: System hangs when idle, ACPI errors in dmesg
« Reply #5 on: June 13, 2015, 05:52:01 pm »
Quote
...One other thing needs mentioning now - if you're gonna blast 125PSI through your heatsink and fan .. lock the fan in position first .. you can seriously damage a fan over spinning it.

Yeah, too many people either don't realize or forget that the fan is connected to a little electric motor.  If you spin up the fan with a blast of air, it generates its own potential difference and induces a current in a closed circuit.  This can damage components.  I use a very high tech solution: a toothpick inserted between the fan vanes.  Blast away to your heart's content  ;)!!
« Last Edit: June 13, 2015, 05:53:50 pm by AndyInMokum »
Backup! Backup! Backup! If you're missing any of these -  you ain't Backed Up!

Offline PCNetSpec

Re: System hangs when idle, ACPI errors in dmesg
« Reply #6 on: June 13, 2015, 06:05:31 pm »
Quote
On wrecking bearings, one thing I might mention -- make sure to blow the dust in the same direction as the air normally flows.  I've accidentally ruined several fans, in the past, before I got wise to this.  I *think* it contaminates the lubricant in the bearing(s), ensuring failure shortly thereafter, but that's just an educated guess.

Oddly I do it the other way .. I see it like this - If you blow in the same direction all you're doing is forcing the dust blanket that forms on the fan side of the heatsink fins further into the heatsink.
I blow backwards whilst blocking the fan from spinning with a pin .. then use the pin to remove any chunks of dust through the inlet slots.
The back of the fan (including its bearing/lubricant) should be protected from dust blowback by the motor which is underneath it.

Offline xanadu

Re: System hangs when idle, ACPI errors in dmesg
« Reply #7 on: June 24, 2015, 05:49:23 pm »
I re-applied thermal paste (Noctua brand), and cleaned out an absurd amount of dust that was compacted in between the fan and the heatsink. This was the 2nd time in the last couple of weeks that I opened it up and cleaned out the dust. The first time, in which I cleaned out most of the dust, had no effect.

This last time seemed to do the trick, though (the dust was really blocking the air flow from the heatsink to the fan). This brought down the temps by about 10C (it idles at ~75C during hot summer days, and around 55-65C at night when it's cooler). Since doing this, I have not experienced any system hangs. I suspect that there were some overheating issues, as the temps were approaching the warning threshold before I cleaned out the case and re-applied the paste. They are still higher than I would like, but probably typical given that it's 82-88F in this room most days in the summer (also, it's a 10-year-old computer).
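
Since the room temperature above is given in Fahrenheit and the CPU temperatures in Celsius, a quick conversion puts them on the same scale. This is just a sketch of the arithmetic using the figures from this post; the function name is made up for illustration.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


if __name__ == "__main__":
    # Room range quoted in the post: 82-88F.
    room_low_c = fahrenheit_to_celsius(82)
    room_high_c = fahrenheit_to_celsius(88)
    print(f"Room: {room_low_c:.1f} to {room_high_c:.1f} C")

    # Daytime idle temperature quoted in the post: about 75C.
    daytime_idle_c = 75
    print(
        "Daytime idle is roughly "
        f"{daytime_idle_c - room_high_c:.0f} to {daytime_idle_c - room_low_c:.0f} C above the room"
    )
```

So the room sits at roughly 28 to 31C, which leaves the daytime idle temperature a little over 40C above ambient.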

I have been hesitant to run it all night, as it has been hanging while I'm asleep. Since it seems to run fine during the heat of the day, I will leave it running 24/7 for a few days to see what happens.

How do I pass parameters to the kernel? I'd like to try your suggestion to see if it has any effect on the CPU temperature. I haven't done this in ages. I seem to remember either editing grub's config or dropping into a command line prompt from the boot menu and passing the parameters that way, but that was many years ago now.
« Last Edit: June 25, 2015, 01:48:22 am by xanadu »

Offline PCNetSpec

Re: System hangs when idle, ACPI errors in dmesg
« Reply #8 on: June 26, 2015, 12:21:25 pm »
See here for how to apply the kernel boot parameter for a single boot:
http://forum.peppermintos.com/index.php/topic,1917.msg18411.html#msg18411

obviously you're replacing "nomodeset" with "processor.nocst=1"

For now remember

a) keep an eye on temps while you test this parameter.
b) at any time you can reboot and that setting will automagically be undone.
c) if you reboot and want it applied again, you'll have to reapply it manually

If it *does* help, let me know and I'll explain how to apply it permanently
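
Because the setting only lasts for one boot, it is worth confirming that it actually took effect before watching temps. The kernel exposes the command line it was booted with in /proc/cmdline, so a check can be sketched like this. The function names are made up for illustration, and the example command line in the comment is hypothetical, not taken from the machine in this thread.

```python
def has_boot_param(cmdline, param):
    """True if param appears as a whole token in a kernel command line."""
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # Hypothetical example of what /proc/cmdline might contain:
    #   BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro quiet splash processor.nocst=1
    try:
        active = has_boot_param(read_cmdline(), "processor.nocst=1")
        print("processor.nocst=1 active:", active)
    except OSError as error:
        print("could not read /proc/cmdline:", error)
```

If it reports False after a reboot, that matches point (b) above: the parameter was only applied for the previous boot.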

Offline mattosensei

  • Member
  • ***
  • Posts: 222
  • Karma: 28
  • Linux with 'L' plates
    • View Profile
  • Peppermint version(s): 6
Re: System hangs when idle, ACPI errors in dmesg
« Reply #9 on: July 01, 2015, 06:29:22 pm »
Quote
On wrecking bearings, one thing I might mention -- make sure to blow the dust in the same direction as the air normally flows.  I've accidentally ruined several fans, in the past, before I got wise to this.  I *think* it contaminates the lubricant in the bearing(s), ensuring failure shortly thereafter, but that's just an educated guess.

Oddly I do it the other way .. I see it like this - If you blow in the same direction all you're doing is forcing the dust blanket that forms on the fan side of the heatsink fins further into the heatsink.
I blow backwards whilst blocking the fan from spinning with a pin .. then use the pin to remove any chunks of dust through the inlet slots.
The back of the fan (including its bearing/lubricant) should be protected from dust blowback by the motor which is underneath it.

Hmm the fact I tend to use the vacuum cleaner hose to try and remove the dust off the fans on my main desktop is probably proof enough that I shouldn't be hanging out on the Advanced forum...  :-[ ::) :P

Hadn't crossed my mind that forcing the fan to spin would cause a charge that might damage a circuit, or that the ball bearings can be affected.   :-\