This weekend, I have been preparing a system to produce some Hive Engine MongoDB dumps in order to help anyone who wants to start a new full node.
Although this system does not have much storage space, it has some CPU cores to spare. So I decided to give it a try and restore an 875GB dataset (which is how much my uncompressed MongoDB consumes for a full node of Hive Engine).
Yes, this consumes TONS of CPU and takes a few hours to send over a 1Gbps link, but the outcome is rather nice! A 1.65x compression ratio is really life-changing in many cases, if you have CPU to spare!
```
# zfs get all dapool/he_mongodb_bkp | egrep -e " logicalused" -e " compressratio" -e " compression" -e " used"
dapool/he_mongodb_bkp  used                  531G     -
dapool/he_mongodb_bkp  compressratio         1.65x    -
dapool/he_mongodb_bkp  compression           zstd-11  local
dapool/he_mongodb_bkp  usedbysnapshots       0B       -
dapool/he_mongodb_bkp  usedbydataset         531G     -
dapool/he_mongodb_bkp  usedbychildren        0B       -
dapool/he_mongodb_bkp  usedbyrefreservation  0B       -
dapool/he_mongodb_bkp  logicalused           875G     -
```
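As a quick sanity check on those numbers: dividing `logicalused` by `used` should reproduce the reported `compressratio`:

```shell
# logicalused (875G) / used (531G) should match the 1.65x compressratio above.
awk 'BEGIN { printf "%.2fx\n", 875 / 531 }'
# prints 1.65x
```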
And in my case, all I want is to dump a precise point-in-time MongoDB snapshot, so I don't really care about the day-to-day performance of this.
FYI, this is coming (thanks for the patience)... both full node snapshots of HE and HE History at Hive block 103693705 and Hive Engine block 56676215 (for the history).
I really like what this project is trying to do, so I am going to pitch in with some dedication and help the cause.
Will be providing the link again soon through the usual channel.
Note: Don't forget that when you send ZFS snapshots around, the target pool can have different options from the origin. This makes send/receive super helpful for "changing" the compression of a dataset (in fact, it is the only way to re-compress data that is already written, since changing the compression property only affects new writes). So if, say, tomorrow I want to change from zstd-11 to something else, I just need to send the snapshot into a new dataset that has a different compression algorithm.
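As a sketch of that re-compression trick (dataset names and the zstd-3 level here are hypothetical; adjust to your own pool layout), it looks roughly like this:

```shell
# Take a point-in-time snapshot of the source dataset.
zfs snapshot dapool/he_mongodb_bkp@recompress

# Receive into a NEW dataset, overriding the compression property on
# the way in, so the data is rewritten with the new algorithm.
zfs send dapool/he_mongodb_bkp@recompress \
  | zfs recv -o compression=zstd-3 dapool/he_mongodb_bkp_zstd3
```

These commands need root and an actual ZFS pool, so treat them as a template rather than something to paste blindly.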
I haven't tried a bigger recordsize in ZFS, but in theory it should give you even better results; I am using the default 128K here. And this was done on a two-NVMe system (overkill for this amount of compression) with a Zen 2 8-core CPU almost flat out the whole time. If you have beefier hardware, it will be much faster.
The thing about compression/decompression is that if you have more cores, they will be used (assuming you have enough I/O to/from the disks)!
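You can see the same multi-core behavior with the standalone `zstd` CLI (this is an illustration, not what ZFS runs internally; the file names are made up). `-T0` tells zstd to spawn one worker per core:

```shell
# Generate a 100 MB test file (all zeros, so it compresses trivially).
head -c 100000000 /dev/zero > /tmp/sample.bin

# -11: same level as the zstd-11 dataset above; -T0: use all cores.
zstd -q -f -11 -T0 /tmp/sample.bin -o /tmp/sample.bin.zst

# Round-trip to verify the data survives intact.
zstd -q -d -f /tmp/sample.bin.zst -o /tmp/sample.out.bin
cmp /tmp/sample.bin /tmp/sample.out.bin && echo "round-trip OK"
```

Watch `htop` while it runs on a large enough file and you will see all cores light up during the compress step.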
Anyway, that's my weekend geek time. I thought I'd change content gears a bit to see if anyone is interested.
Let me know if you like this kind of content
...and I will try to be a bit more diverse.
Please provide feedback in any "meaningful" way. If you don't understand any of this, and you are not looking to understand it, don't bother replying. You can, but you will be wasting your time.
