FPSNetwork  

#1
08-02-2007, 02:04 PM
andyofne
Commander in Briefs

Join Date: Aug 2006
Location: Omaha.
Posts: 772
Dell Precision Workstation with unusual crashing events

In the office I have a Dell PWS 530 running Windows 2000 Pro with a 1.8 GHz Xeon processor, 1 GB of RDRAM, a Matrox G550 dual-head video card, and a SCSI hard drive.

About two weeks ago the system started crashing unexpectedly.

There are no entries for the crashes in the Event Viewer. No message from the BIOS shows up when you restart (something like "System shut down was caused by a thermal event"), and there is no real rhyme or reason to what causes the system to shut down.

I've run the Dell Diagnostics CD on the system for literally hours and everything reported back fine. I ran memtest on the RAM and it came up clean.

I've also:

- Switched hard drives out
- Switched processor/heat sinks
- Switched video cards
- Tried to install a clean copy of Windows 2000 on a new hard disk (fails)
- Tried to install Red Hat Linux on a new hard disk (fails)
- Tried removing extra devices (nothing really extra installed)
- Ordered a replacement motherboard (refurb) only to find out it was the wrong revision and doesn't support my available processors.

I cannot reproduce the shutdown on demand either. Sometimes it will run fine for a while, and sometimes it shuts down as soon as you double-click a desktop icon.

It seems to be able to stay up and running fine for hours so long as you don't touch it.

It will run in Safe Mode seemingly fine for extended periods of time. However, I can't run the applications I need to use in Safe Mode.

Right now, I'm running a SLAX live Linux CD to see how long I can run the system this way.

So, to recap: the system just instantly shuts off and does not report anything, anywhere. There is no indication of what is causing it.

Up to this point, the system has run fine for the last 4 years.

It has a mission-critical application, vendor installed, that I cannot reinstall on another machine without the vendor's support.

My boss decided not to pay the annual maintenance agreement so now it will cost between $5,000 and $10,000 to get the system reinstalled or repaired through the vendor.

Has anyone ever seen a system drop dead like this?

I do not believe this is heat related because I created a 'heat' situation and the system shut down BUT posted a message on reboot saying "The system was shut down due to a thermal event". I do not see this message when the system crashes 'normally'.

If you can think of any other tests I can run before I get the replacement motherboard next week, I'll likely try whatever you can suggest.

Thank you, that is all.

#2
08-02-2007, 02:36 PM
*Xx~Vlad~xX*
Been 'round..

Join Date: Aug 2006
Location: Mobile
Posts: 53
Have you tried c:\>chkdsk /f from the command prompt (or chkdsk /r, which also scans for bad sectors) to see if there are any bad blocks that the Dell diagnostics didn't catch? I have seen systems do this when the OS tries to page to a sector on disk with a bad block or two. Sometimes it would simply shut down or BSOD while opening an application or saving a text file.

I wonder if a Ghost image of the old drive restored onto a new drive would work (though Ghosting might present a challenge with SCSI HDs). I have never tried Ghost with SCSI drives before, just SATA/EIDE.

It also sounds like a short somewhere on the system board, or the power supply not delivering the right DC voltages to the system board.

#3
08-02-2007, 02:52 PM
Ghanzafar
Ok, so I've posted

Join Date: Feb 2007
Location: Catonsville
Posts: 27
Also, do not forget to check the power supply and make sure the voltages are correct. Many problems come from a faulty power supply. It might work for a while, but once the circuits heat up, crashes can occur. A lot of shit today is made with cheap labor and poor workmanship; it could well be a cold solder joint. It is hard to test the power supply unless you have a tester and a voltmeter, so try swapping it out and see if that is the problem.
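
If you do get a meter on it, the check itself is simple. Here is a minimal sketch in Python, assuming the standard ATX tolerances (5% either way on the +3.3 V, +5 V and +12 V rails, 10% on the -12 V rail). Whether a proprietary workstation supply uses the same rails is an assumption, the function name check_rails is made up for the example, and the readings are invented for illustration:

```python
# Allowed fractional deviation per nominal rail voltage.
# Assumption: standard ATX tolerances; a proprietary supply may differ.
ATX_TOLERANCE = {3.3: 0.05, 5.0: 0.05, 12.0: 0.05, -12.0: 0.10}


def check_rails(readings):
    """Return the rails whose measured voltage is out of tolerance.

    readings maps nominal volts to measured volts.
    Each result is (nominal, measured, low_limit, high_limit).
    """
    bad = []
    for nominal, measured in readings.items():
        tol = ATX_TOLERANCE[nominal]
        # sorted() keeps low < high for the negative rail as well.
        low, high = sorted((nominal * (1 - tol), nominal * (1 + tol)))
        if not low <= measured <= high:
            bad.append((nominal, measured, low, high))
    return bad


# Hypothetical multimeter readings, not taken from any real machine.
example = {3.3: 3.31, 5.0: 5.08, 12.0: 11.2, -12.0: -11.5}
for nominal, measured, low, high in check_rails(example):
    print(f"{nominal:+} V rail reads {measured} V (allowed {low:.2f} to {high:.2f})")
```

Keep in mind that a supply can read fine at idle and still sag under load, so an in-spec reading doesn't clear it.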

#4
08-02-2007, 03:12 PM
andyofne
Quote:
Originally Posted by *Xx~Vlad~xX*
Have you tried c:\>chkdsk /f from the command prompt (or chkdsk /r, which also scans for bad sectors) to see if there are any bad blocks that the Dell diagnostics didn't catch? I have seen systems do this when the OS tries to page to a sector on disk with a bad block or two. Sometimes it would simply shut down or BSOD while opening an application or saving a text file.

I wonder if a Ghost image of the old drive restored onto a new drive would work (though Ghosting might present a challenge with SCSI HDs). I have never tried Ghost with SCSI drives before, just SATA/EIDE.

It also sounds like a short somewhere on the system board, or the power supply not delivering the right DC voltages to the system board.

Well, I've actually switched SCSI disks to a new, working disk and it still crashes during the installation phase at the point where it tries to save the configuration... seconds before it finishes.

I have made a ghost image but I haven't tried to put it on the new disk because the system doesn't stay up long enough or accept an OS.

#5
08-02-2007, 03:14 PM
andyofne
Quote:
Originally Posted by Ghanzafar
Also, do not forget to check the power supply and make sure the voltages are correct. Many problems come from a faulty power supply. It might work for a while, but once the circuits heat up, crashes can occur. A lot of shit today is made with cheap labor and poor workmanship; it could well be a cold solder joint. It is hard to test the power supply unless you have a tester and a voltmeter, so try swapping it out and see if that is the problem.

Agreed.

However, this is a Precision Workstation that has a special, heavy-duty power supply. You can't simply put in another ATX supply.

I may be able to swap that out tomorrow with another PWS 530 but I'm not 100% certain about that. I'm going to give it a shot.

Also, the system runs fine in Safe Mode and it's been running fine off the Linux live CD since my first post.

I don't know what I can do to "stress test" the system with the live CD, but I'm trying to make it crash.
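
One way to load the box from a live CD, sketched in Python. This assumes a Python 3 interpreter is available on the live system, which is not guaranteed on SLAX, and the function names cpu_burn, disk_roundtrip and stress are made up for this example. It keeps the CPU busy hashing in several threads (hashlib releases the GIL on large buffers) while writing a temp file and reading it back with a checksum comparison:

```python
import hashlib
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_burn(seconds):
    """Hash a 64 KiB buffer in a loop for roughly `seconds`; return the round count."""
    deadline = time.monotonic() + seconds
    data = os.urandom(1 << 16)
    rounds = 0
    while True:
        # 32-byte digest repeated 2048 times gives a fresh 64 KiB buffer.
        data = hashlib.sha256(data).digest() * 2048
        rounds += 1
        if time.monotonic() >= deadline:
            return rounds


def disk_roundtrip(size_mb=8):
    """Write random data to a temp file, read it back, and compare checksums."""
    block = os.urandom(1 << 20)
    written = hashlib.sha256()
    read_back = hashlib.sha256()
    with tempfile.NamedTemporaryFile() as f:
        for _ in range(size_mb):
            f.write(block)
            written.update(block)
        f.flush()
        os.fsync(f.fileno())
        f.seek(0)
        for chunk in iter(lambda: f.read(1 << 20), b""):
            read_back.update(chunk)
    return written.digest() == read_back.digest()


def stress(seconds=60, workers=None):
    """Run CPU load in `workers` threads alongside one file round trip."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(cpu_burn, seconds) for _ in range(workers)]
        disk_ok = disk_roundtrip()
        rounds = [f.result() for f in futures]
    return rounds, disk_ok
```

Calling stress(seconds=600) would run it for about ten minutes. On a live CD the temp directory usually lives in RAM, so the file round trip exercises memory rather than the hard disk, and a sudden power-off mid-run is itself the result you are looking for.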

#6
08-02-2007, 03:25 PM
Ghanzafar
I had the same issue a while back. I replaced everything (memory, hard drive, CPU), and the last thing I replaced was the power supply. Go figure: the cheapest part was the last one I replaced. I repacked the memory and CPU and shipped them back to TigerDirect but kept the Raptor hard drive.

#7
08-02-2007, 04:35 PM
andyofne
Well, after talking with Reaper on the phone, I took the board out of the case and examined the capacitors with a magnifying glass (I have old eyes) and I found one that looks like it may have 'burst'.

When I've seen blown capacitors in the past they've been visibly fattened around the middle or actually exploded, with shredded paper and black 'burn' marks. This one looks like it puffed out a bit at the top but it's still 'sort of' functioning.

Looking directly down on the top, you can clearly see that the capacitor is cracked open. It isn't clear from this angle.

#8
08-02-2007, 09:36 PM
hoser
Administrator

Join Date: Aug 2006
Posts: 541
I told you motherboard two days ago.

Bastage.
__________________


Plan comparison chart

Ape shall not kill ape.

#9
08-02-2007, 09:46 PM
andyofne
I ordered a motherboard over a week ago, if you recall, but I got the wrong one.

So there.

#10
08-02-2007, 10:12 PM
hoser
I told you motherboard over a week ago.

All times are GMT -5.