Hardware Error Report and Decode Tool (HERD) 3.0 for Linux

In the case of correctable ECC memory errors, both reports should correctly identify the CPU slot and DIMM number on which the memory error occurred. For example, with the following command: herd -d -e 0

Identifying CPU and DIMMs With MCEs

If an MCE occurred before HERD was installed on a system, use the HERD tool

About the author: Vivek Gite is a seasoned sysadmin and a trainer for Linux/Unix and shell scripting.

These dependencies include the openssl libraries or the OpenIPMI scripts.

All rights reserved.

For more information, see Chapter 2, Using the hd Utility on Oracle Solaris; Chapter 3, Using the hd Utility on Windows; or Chapter 4, Using the hd Utility on Linux.

Jan 14 18:57:32 host herd: Please contact your hardware vendor
Jan 14 18:57:32 host herd: CPU 0 4 northbridge
Jan 14 18:57:32 host herd: Northbridge Watchdog error
Jan 14 18:57:32 host

# Use this as a start to creating your own,
# and remove these comments.
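The syslog lines above share one shape: a timestamp, a host name, the tag herd:, and a message. The following is a minimal Python sketch for pulling only the HERD messages out of a log, assuming that layout holds for every HERD entry. The function name herd_messages and the extra non-HERD sample line are illustrative and not part of HERD.

```python
import re

# Assumed layout, taken from the sample lines: "Mon DD HH:MM:SS host herd: message".
HERD_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+herd:\s+(?P<message>.*)$"
)


def herd_messages(lines):
    """Return (timestamp, host, message) tuples for lines tagged 'herd:'."""
    found = []
    for line in lines:
        match = HERD_LINE.match(line.strip())
        if match:
            found.append((match["stamp"], match["host"], match["message"]))
    return found


sample = [
    "Jan 14 18:57:32 host herd: Please contact your hardware vendor",
    "Jan 14 18:57:32 host herd: CPU 0 4 northbridge",
    "Jan 14 18:57:32 host herd: Northbridge Watchdog error",
    "Jan 14 18:57:33 host sshd: unrelated line",  # hypothetical non-HERD entry
]

for stamp, host, message in herd_messages(sample):
    print(stamp, host, message)
```

Lines from other daemons are skipped, so the same function can be fed a whole system log file opened for reading.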

Sun xVM Ops Center manages multi-platform x64 and SPARC systems, facilitates many aspects of compliance reporting (ITIL) and data center automation, and enables the simultaneous management of systems.

The utility resides in the /tools/linux/herd directory.

RAID Utilities

RAID allows you to manage an array of disks for increased redundancy, recoverability, and performance.

nawab (April 28, 2010, 8:28 pm): If I run your script, I get this error:

/etc/cron.hourly/mcelog.cron
Usage:
  mcelog [-k8|-p4|-generic] [-syslog] [mcelogdevice]
  mcelog [-k8|-p4|-generic] -ascii
Decode machine check error records

Reports disk drive failures, Field Replaceable Unit (FRU) information, and hotplug events to the host's service processor (SP).

For more information on the SIA, see the Sun Installation Assistant for Windows and Linux User's Guide.

Node: BL280c-G6
1) plcg298: MCE 0
plcg298: HARDWARE ERROR.

This utility performs IPMI functions with a kernel device driver or over a LAN interface. For more information, go to docs.sun.com.

force_cpu: Sets the CPU version information.

MCE is a feature of AMD and Intel 64-bit systems that is used to detect an unrecoverable hardware problem.

After installation, the HERD daemon is automatically set up to run after system boot.

Scott Davenport (June 19th, 2009): Sun also puts out a Hardware Error Report & Decode (HERD) tool that does some additional processing on the mcelog.

Adapters that appear in the Intel PROSet teaming wizard can be included in a team.

When you install the NIC Teaming supplemental software for your Sun server, Intel PROSet software configuration tabs are automatically added to the network adapters listed in Device Manager.

HERD supports a debug option (-d) that gives more system information, including the Opteron CPU identification data, for example:

# herd -d -e 0x000008000000
2 cores found, family 15, model 5,

The utility discussed in the post (mcelog) is pretty sweet, and provides a portion of the capabilities that are currently available in the Solaris FMA architecture.

HERD reads the PCI configuration data of the system DRAM controllers from the corresponding files in that directory.

Starting the HERD Daemon

All RPMs that are provided come with the appropriate SysV init scripts.

Both a BSoD and a kernel panic can be generated by a Machine Check Exception (MCE).

On systems that have a 128-bit configured DRAM interface, HERD can only identify DIMM pairs rather than individual DIMM modules.

It does not try to interpret the MCE data; it just alerts other apps.

Linux kernel panic source code.
man mcelog
Machine check exception support information for MS-Windows Server 2003 and XP operating systems.

Controls connect/disconnect events and logs these events in syslog and, more importantly, in the service processor logs (SDR, FRU, SEL).

x64 Servers Utilities Reference Manual

CHAPTER 7

Hardware Error Report and Decode Tool (HERD) 3.0 for Linux

Hardware Error Report and Decode (HERD) 3.0

Sun xVM Ops Center

Sun xVM Ops Center, part of Oracle Solaris Management Tools, is used to provision, update, and manage the systems.

The ipmiflash utility provides methods to upgrade the ILOM service processor and BIOS remotely over the management network and locally from the server.

The SunVTS software runs diagnostic tests and outputs log files that are used to determine the problem with the server.

For more information, see the Sun LSI 106x RAID User's Guide.

Machine checks can indicate failing hardware, system overheating, bad DIMMs, or other problems.

It supports the same command-line options and uses the same format to report errors to the system log.

To install the VMware or VMware ESX Server ISO image, you must first download an ISO image of the software installation CD.

The RAID-0 volumes that are mirrored are called submirrors.

This is *NOT* a software problem!

The hd utility is included in the SUNWhd package and is preinstalled on your server.

Some MCEs are fatal and generally cannot be survived without a reboot and hardware replacement, but I was able to catch lots of bad hardware before a crash with this tool.

mcat -

Please contact your hardware vendor
CPU 0 4 northbridge TSC aeffd2efa9f1db
ADDR 65bc76a0
Northbridge Chipkill ECC error
Chipkill ECC syndrome = 84ac
bit32 = err cpu0
bit46 = corrected ecc error
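The decoded record above packs several fields into a few words: a CPU number, a second number followed by a unit name, a TSC value, an address, an ECC syndrome, and a list of set status bits. The sketch below turns such a record into a dictionary. It assumes the second number after CPU is the bank (as the "CPU 11 BANK 5" form of the same record suggests) and that TSC, ADDR, and the syndrome are hexadecimal. The function name parse_northbridge_record is hypothetical.

```python
import re


def parse_northbridge_record(text):
    """Extract CPU, bank, TSC, address, syndrome, and bit flags from a record."""
    record = {}
    head = re.search(r"CPU (\d+) (\d+) (\w+)", text)
    if head:
        record["cpu"] = int(head.group(1))
        record["bank"] = int(head.group(2))  # assumption: second number is the bank
        record["unit"] = head.group(3)
    # Assumption: these three values are printed in hexadecimal.
    for key, pattern in (
        ("tsc", r"TSC ([0-9a-fA-F]+)"),
        ("addr", r"ADDR ([0-9a-fA-F]+)"),
        ("syndrome", r"syndrome = ([0-9a-fA-F]+)"),
    ):
        found = re.search(pattern, text)
        if found:
            record[key] = int(found.group(1), 16)
    # "bitNN = description" pairs; each description runs up to the next pair.
    record["flags"] = {
        int(bit): desc.strip()
        for bit, desc in re.findall(
            r"bit(\d+) = (.+?)(?=\s+bit\d+ = |\s*$)", text, flags=re.S
        )
    }
    return record


sample = (
    "Please contact your hardware vendor CPU 0 4 northbridge "
    "TSC aeffd2efa9f1db ADDR 65bc76a0 Northbridge Chipkill ECC error "
    "Chipkill ECC syndrome = 84ac bit32 = err cpu0 bit46 = corrected ecc error"
)

record = parse_northbridge_record(sample)
print(record["cpu"], record["bank"], hex(record["addr"]), hex(record["syndrome"]))
print(record["flags"])
```

The parser accepts the record either on one line or split across several, because every pattern matches on the field keyword rather than on line position.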

CHAPTER 1

Applications and Utilities for x64 Servers

This book describes some applications and utilities that

NIC Teaming

NIC teaming (also known as IEEE 802.3ad Link Aggregation) for Windows is the grouping of Network Interface Cards (NICs) into one logical interface to increase availability and enable load

Although they should at least indicate what architecture it works on. You mention mcelog only works with 64-bit operating systems.

TABLE 1-1 Supported Applications and Utilities by Platform

Server (* -- EOL) HERD hd Utility RAID RAID 0/1 NIC Teaming NAM DCMU
Sun Fire X2100*
Sun Fire X2100 M2* -- S SLW

This chapter has the following sections:
* Downloading HERD
* About HERD
* Installing HERD
* Starting the HERD Daemon
* Using HERD

http://download.oracle.com/docs/cd/E19962-01/820-1120-22/chapter7.html

--HPT
kingman, 2011-08-26 18:35:14 UTC

Note - HERD is supported on platforms with AMD processors.

plcg298: Please contact your hardware vendor
plcg298: CPU 11 BANK 5 TSC 7d0a8fb75c06bd [at 2934 Mhz 138 days 20:43:18 uptime (unreliable)]
plcg298: MISC 1091 ADDR 61797b458
plcg298: MCG status:
plcg298: MCi
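The bracketed uptime in the plcg298 record can be cross-checked against the TSC value printed beside it. The sketch below assumes the uptime is simply the TSC tick count divided by the reported clock rate. That is an assumption about how the figure is derived, and the record itself labels it unreliable. The function name tsc_uptime is hypothetical.

```python
def tsc_uptime(tsc_hex, mhz):
    """Convert a hexadecimal TSC count and a clock rate in MHz to an uptime.

    Assumes the counter started at zero at boot and ticked at a constant rate.
    Returns (days, hours, minutes, seconds).
    """
    ticks = int(tsc_hex, 16)
    seconds = ticks // (mhz * 1_000_000)
    days, rest = divmod(seconds, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, secs = divmod(rest, 60)
    return days, hours, minutes, secs


# Values copied from the record: TSC 7d0a8fb75c06bd at 2934 MHz.
days, hours, minutes, secs = tsc_uptime("7d0a8fb75c06bd", 2934)
print(f"{days} days {hours}:{minutes:02d}:{secs:02d}")
```

With the values from the record, the result agrees with the logged uptime on the day and hour (138 days, 20 hours). The minutes differ, plausibly because the clock rate in the record is rounded to whole MHz.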