Today, many discussions of computing system architecture conclude that we must step away from designs based on the roughly seventy-year-old von Neumann concept. There is a broad consensus that future architectures will be memory driven, not processor driven. The only question is which implementation method will prove more successful. In this article I try to collect the two main concepts that I currently find the most straightforward.
Method 1: Memory centric architecture
This week I had the opportunity to attend the HPE Discover event, where I saw the concept of The Machine. This concept centralizes memory in a fault-tolerant, highly available layer that is accessible to all processing units, regardless of the technology used (GPU, CPU, FPGA, etc.). Of course, each computing unit has local memory, but all data is accessed online from the collective persistent memory. You can find the presentation below:
Development is still in progress. Last year they presented a software-simulated model; now they have working hardware built from currently available parts, and every component can be replaced once newer technology becomes generally available.
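To make the memory-centric idea more concrete, here is a minimal sketch in Python. It is only a toy model: the class names `SharedMemoryPool` and `ComputeUnit` are hypothetical and have nothing to do with HPE's actual software or hardware interfaces. It only shows that different kinds of processing units can work on the same central data while keeping a small local memory of their own.

```python
from dataclasses import dataclass, field


@dataclass
class SharedMemoryPool:
    """Hypothetical stand-in for the central persistent memory layer."""
    cells: dict = field(default_factory=dict)

    def load(self, key):
        return self.cells[key]

    def store(self, key, value):
        self.cells[key] = value


@dataclass
class ComputeUnit:
    """Any processor type (CPU, GPU, FPGA); the pool does not care which."""
    kind: str
    pool: SharedMemoryPool
    local: dict = field(default_factory=dict)  # small private scratch memory

    def run(self, func, src, dst):
        # Fetch the operand from the shared pool into local memory,
        # compute, then publish the result back to the pool.
        self.local[src] = self.pool.load(src)
        self.pool.store(dst, func(self.local[src]))


pool = SharedMemoryPool()
pool.store("samples", [1, 2, 3, 4])

cpu = ComputeUnit("CPU", pool)
gpu = ComputeUnit("GPU", pool)

# The "GPU" squares the samples, the "CPU" sums the result.
gpu.run(lambda xs: [x * x for x in xs], "samples", "squares")
cpu.run(sum, "squares", "total")

print(pool.load("total"))  # 30
```

Neither unit ever talks to the other directly; the shared pool is the only meeting point, which is the essence of the memory-centric approach as I understood it.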
Method 2: Memory decentralized architecture
The other method is the IBM way, which the company started in early 2011 with the brain-inspired SyNAPSE architecture. They also began with a software simulation of their concept, and later created their own processor, named TrueNorth. These processors can be combined into clusters. The main advantages of the architecture are low power consumption and high processing speed. As the whole world is moving in the direction of machine learning and machine intelligence, I think we need to optimize most systems for these areas.
With this concept we are able to run billions of threads in parallel. This is extremely massive parallel computing. What I really appreciate in this concept is its similarity to the human brain. A leading researcher from IBM recently presented a publication about using this architecture for deep learning.
Link to the publication
My vision regarding these architectures
My main concern with the HPE concept is the performance of the link to the memory. They also presented an "Edge computing" concept, where everything should be processed at the edge before the data is stored elsewhere. If we combine the two concepts, processing has to be placed near the memory. But this goes against the basic idea of a centralized memory layer. The main problems will always be parallelism and the bandwidth of the interconnects.
I find the IBM method easier to relate to, but it needs to evolve on several levels to become useful for future needs. Currently it is mainly used by DARPA, and recently there have been some publications about Samsung using it.
The only problem with this design is the growing number of layers between synapses, but this is similar to the human brain.
In my vision, the solution might lie in the interconnect layer. We need to connect all synapses, each of which has memory and a processing unit, to a giant crossbar, where connected synapses communicate using the same color of light. With this possible solution we are able to connect a large number of units dynamically over the same cabling. The main goal is not only the diversity of possible connections, but also dynamic cluster reconfiguration between the synapse units.
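The crossbar idea can be sketched as a small toy model in Python. Everything here is an assumption made for illustration: the class `OpticalCrossbar`, the unit names and the color labels are hypothetical, and real optical hardware is of course far more complex. The sketch only shows the principle that units sharing one medium form a cluster when they are tuned to the same color, and that retuning a unit changes the cluster without touching the cabling.

```python
from collections import defaultdict


class OpticalCrossbar:
    """Toy model: one shared medium, logical groups selected by wavelength."""

    def __init__(self):
        self.color_of = {}              # unit id -> wavelength label
        self.inbox = defaultdict(list)  # unit id -> received messages

    def tune(self, unit, color):
        # Retuning a unit moves it to another cluster without recabling.
        self.color_of[unit] = color

    def send(self, sender, message):
        # Every unit sits on the same medium, but only those tuned to
        # the sender's color actually receive the message.
        color = self.color_of[sender]
        for unit, unit_color in self.color_of.items():
            if unit != sender and unit_color == color:
                self.inbox[unit].append((sender, message))


bar = OpticalCrossbar()
for unit, color in [("s1", "red"), ("s2", "red"), ("s3", "blue")]:
    bar.tune(unit, color)

bar.send("s1", "spike")        # reaches s2 only
bar.tune("s3", "red")          # dynamic reconfiguration: s3 joins the red cluster
bar.send("s1", "spike again")  # reaches s2 and s3
```

After the retuning step, `s3` receives messages from `s1` although no physical connection was changed, which is the kind of dynamic cluster reconfiguration I have in mind.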