The above image was made with Stable Diffusion using the prompt 'hyenas and computer chips.'
Today I've been thinking about code. I have a working knowledge of web languages and an intermediate grasp of Python, but I'm definitely not a pro. More of a hobbyist. I make simple static sites for IPFS and play around with graphs. I've messed around a fair bit with machine learning and natural language processing, but in the pre-transformer era. Much has changed in the last few years. Getting caught up on how the new GPTs work has involved a steep learning curve.
As a neural network architecture, transformers rely on positional encoding, attention, and self-attention. Because of the rate at which the technology is progressing, I'm hesitant to spend too much time learning their specifics. But the impact transformers are having is undeniable. Here's a quote from a good page on the subject:
This is where Transformers changed everything. They were developed in 2017 by researchers at Google and the University of Toronto, initially designed to do translation. But unlike recurrent neural networks, Transformers could be very efficiently parallelized. And that meant, with the right hardware, you could train some really big models. How big? Bigly big. GPT-3, the especially impressive text-generation model that writes almost as well as a human, was trained on some 45 TB of text data, including almost all of the public web.
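Of those mechanisms, positional encoding is the easiest one to see in code. Here's a minimal sketch of the sinusoidal version from the original 2017 transformer paper (the function name and toy dimensions are my own choices, not from any particular library):

```python
import numpy as np

def positional_encoding(n_positions, d_model):
    """Sinusoidal positional encoding: each position gets a unique
    pattern of sines and cosines, so the model can tell token order
    apart (attention by itself is order-blind)."""
    pos = np.arange(n_positions)[:, None]          # (n_positions, 1)
    i = np.arange(d_model // 2)[None, :]           # (1, d_model/2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((n_positions, d_model))
    pe[:, 0::2] = np.sin(angles)                   # even dims: sine
    pe[:, 1::2] = np.cos(angles)                   # odd dims: cosine
    return pe

pe = positional_encoding(16, 8)
print(pe.shape)  # (16, 8): one 8-dim encoding per position
```

These vectors just get added to the token embeddings before the attention layers see them.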
Hyena
Transformers have input limits, in part because of their reliance on attention. According to this paper, "Attention is fundamentally a quadratic operation, as it compares each pair of points in a sequence. This quadratic runtime has limited the amount of context that our models can take." Basically, it would be nice to feed a large language model an entire textbook or even a vast collection of recorded keystrokes and have it predict the next item in the sequence, but this is beyond the GPTs' capabilities.
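The quadratic cost is easy to see in a toy version of self-attention. In this sketch (untrained, with the usual learned projections omitted for brevity), the score matrix holds one entry per pair of tokens, so n tokens means an n-by-n intermediate:

```python
import numpy as np

def self_attention(x):
    """Minimal single-head self-attention on an (n, d) token matrix.
    The (n, n) scores matrix is the quadratic bottleneck: doubling
    the sequence length quadruples its size."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)     # n*n pairwise comparisons
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x                # weighted mix of all tokens

x = np.random.default_rng(0).normal(size=(8, 4))  # 8 tokens, 4 dims
out = self_attention(x)
print(out.shape)  # (8, 4), but the hidden scores matrix was (8, 8)
```

At 8 tokens the 8x8 matrix is trivial; at a textbook's worth of tokens it isn't.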
Enter Hyena, which promises to "significantly increase context length in sequence models." It apparently drops attention's explicit pairwise computation and replaces it with implicitly parameterized long convolutions and data-controlled gating, which scale subquadratically with sequence length. That's probably as much as I'll ever need to know about it. While I doubt Hyena will blow away GPT, I do think there will be at least one awesome Hyena-based app by this time next year.
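For intuition on why convolutions help, here's a minimal sketch of an FFT-based long convolution, the O(n log n) primitive behind Hyena-style operators. This is the general idea only, not Hyena's actual implementation, which also learns the filter implicitly and interleaves gating:

```python
import numpy as np

def long_conv(u, h):
    """Circular convolution of a length-n signal u with a length-n
    filter h via the FFT: O(n log n) work, versus the O(n^2) of
    comparing every pair of positions as attention does."""
    n = len(u)
    return np.fft.irfft(np.fft.rfft(u) * np.fft.rfft(h), n=n)

u = np.arange(8.0)
h = np.zeros(8)
h[1] = 1.0                   # a filter that shifts everything by one slot
print(long_conv(u, h))       # a circularly shifted copy of u
```

Every output position still "sees" every input position, but through one shared filter instead of a fresh pairwise comparison.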
RISC-V
Every computer chip has an instruction set architecture (ISA). For a long time, two proprietary ISAs were basically the only game in town: Intel's x86 and Arm Ltd's ARM. But in recent years, a new open-source ISA has been quietly spreading through tech industry startups. It's called RISC-V, pronounced 'risk five.'
NASA recently embraced RISC-V for spaceflight computers. Alibaba now has a RISC-V laptop called the Roma listed for sale. If I had infinite time, I'd buy this just to play with it. According to this article, "The appeal of RISC-V extends beyond the fact that chip developers can access the architecture for free — it also has a modular setup that makes it easier to create chips that are optimized for certain functions without any unnecessary extras."
So Hyena reads longer context with less computing power, and RISC-V's modular design lets chipmakers build leaner processors than its proprietary counterparts allow. Both are small but important steps in our technological development. In five years, I might have a RISC-V laptop, but I expect AI to have advanced well beyond transformers and Hyena. With AI, we've been in a stage of just going bigger, but we may see a breakthrough come from something more fundamental. A discovery in neuroscience, perhaps, applied to machine intelligence.
Read my novels:
- Small Gods of Time Travel is available as a web book on IPFS and as a 41 piece Tezos NFT collection on Objkt.
- The Paradise Anomaly is available in print via Blurb and for Kindle on Amazon.
- Psychic Avalanche is available in print via Blurb and for Kindle on Amazon.
- One Man Embassy is available in print via Blurb and for Kindle on Amazon.
- Flying Saucer Shenanigans is available in print via Blurb and for Kindle on Amazon.
- Rainbow Lullaby is available in print via Blurb and for Kindle on Amazon.
- The Ostermann Method is available in print via Blurb and for Kindle on Amazon.
- Blue Dragon Mississippi is available in print via Blurb and for Kindle on Amazon.
See my NFTs:
- Small Gods of Time Travel is a 41 piece Tezos NFT collection on Objkt that goes with my book by the same name.
- History and the Machine is a 20 piece Tezos NFT collection on Objkt based on my series of oil paintings of interesting people from history.
- Artifacts of Mind Control is a 15 piece Tezos NFT collection on Objkt based on declassified CIA documents from the MKULTRA program.