I just love it when I'm building a high-end PC and something fails, so I have to start hunting for where exactly the problem occurs and what is preventing the system from working properly. That dual Xeon system with 20 cores and 40 logical threads, equipped with 64 GB of RAM, starts to behave weirdly as soon as it is built, so I start testing the processors, the motherboard and the memory to see what might be causing it.
Building dual-processor systems can often lead to issues. Sometimes they are software related, sometimes hardware related... it could be the motherboard BIOS, the two processors not working well together, the memory controller not liking the RAM modules, or a hardware fault in one of the components. Finally, I moved the Xeon processors to a single-CPU X99 motherboard with four different RAM modules to test all four memory channels of each processor separately, and that is when the problem suddenly appeared...
The problem is one of the channels of the memory controller in one of the CPUs. I'm relieved to have finally found it, but it took me a couple of hours of testing everything to figure out where the issue lies. I had started without all of the memory channels populated, and that was a mistake on my part, as everything worked just fine with memory in only two of the four channels. Fortunately the other processor is fine, so only one needs to be replaced... that will take some extra time, and I'm not very happy about it, but that's just the way things are.
If you have a question or want to add something, then please leave a comment below.