DIY All-Flash NAS Vs. Commercial Hardware

[Jeff Geerling] has tried building his own network-attached storage before, but found that the Raspberry Pi just wasn’t able to keep pace with his demands. He’s back with a new all-flash NAS build, and put his new design to the test against proper store-bought gear.

His build is based around the ROCK 5 Model B, which is able to truck data around far faster than most other single-board computers. Internally, it can top 1 GB/sec without too much hassle. He decided to build a NAS rig using the board, putting it up against the turn-key ASUSTOR AS-T10G3.

Using OpenMediaVault to run the ROCK 5 as a NAS, [Jeff] was able to get decent performance out of the setup. With a 3-drive RAID 5 configuration, he recorded write and read speeds of 100 MB/sec and 200 MB/sec respectively, over a 2.5 Gbps network connection. There were also some spikes and curious performance wobbles. While speeds were better than in [Jeff]’s previous Raspberry Pi experiments, they fell short of the doubling or tripling he’d hoped for. In comparison, the ASUSTOR solution was capable of much greater speeds. It topped out at 600 MB/sec write speeds, and 1.2 GB/sec on reads.

If you’re looking to build a high-performance DIY NAS, the ROCK 5 may be a better solution than most Raspberry Pi boards. However, if you want speed over all else, existing commercial NAS solutions really have the edge. Video after the break.

 

37 thoughts on “DIY All-Flash NAS Vs. Commercial Hardware”

      1. It might not be worth doing. It’ll affect performance and consume a lot of space, but its benefit is not as complete as it might seem. When traditional disks completely fail, as they sometimes do, you can replace them and be okay as long as the rebuild time is less than the time before the next failure – modern spinning rust drives are too slow relative to their very high capacity to be sure they’ll survive long enough to rebuild. But at least the fact that the drive is very obviously dead lets you know which data to trust and which to throw out. Solid-state drives don’t fry very often, but they do wear out, and there may be erroneous data coming out of them before they get replaced.

        When you discover an error but the drives are all seemingly functional, RAID 5 gives you only a single set of parity information, striped across all the drives. So there’s a chance the data is incorrect and a chance the parity is what’s wrong, and you need an extra source of truth to figure out which it is. You can try rereading and hoping for a different result, you can try to judge which version looks valid, or you can test the block and see if it looks bad – but if you had a second source of parity, as in RAID 6, you wouldn’t need to. We used to have drives with larger blocks that used the extra space for redundant data, so the drive itself would know if its block’s data was bad. If a block was bad, the drive would report it, which made the job easier for the rest of the array. That, plus better rebuild times, made RAID 5 a better option than it is now.

        (This explanation is not fully complete but gives at least some idea of the potential pitfalls.)
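        To make that ambiguity concrete, here is a rough Python sketch (not anything from the article or video) of RAID-5-style XOR parity: a scrub can tell that a stripe is inconsistent, but with only one parity block there is nothing that says *which* member is lying. The block contents are invented purely for illustration.

        ```python
        # Toy RAID-5-style stripe: three data blocks plus one XOR parity block.
        # Values are made up purely to illustrate the single-parity ambiguity.

        def xor_parity(blocks):
            """XOR all blocks together byte by byte."""
            out = bytearray(len(blocks[0]))
            for blk in blocks:
                for i, b in enumerate(blk):
                    out[i] ^= b
            return bytes(out)

        d0 = bytes([0x11] * 4)
        d1 = bytes([0x22] * 4)
        d2 = bytes([0x44] * 4)
        p  = xor_parity([d0, d1, d2])          # parity written at stripe creation

        # Silent corruption: one byte flips somewhere, the drive reports no error.
        d1_bad = bytes([0x22, 0x2A, 0x22, 0x22])

        # A scrub sees that data XOR parity is no longer all zeros...
        print("stripe inconsistent:", xor_parity([d0, d1_bad, d2, p]) != bytes(4))  # True

        # ...but it cannot tell whether d0, d1, d2 or p itself is the bad block.
        # Only when a member is *known* missing (a dead drive) does XOR pin it down:
        print("rebuild of d1 matches:", xor_parity([d0, d2, p]) == d1)              # True
        ```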

      2. RAID 5 and 6 are the worst for performance. The only exception is if you have an enterprise hardware RAID card, and even then I don’t recommend it. RAID 10 is the way to go, unless you need the space and can’t get the extra disks for it.

  1. > In comparison, the ASUSTOR solution was capable of much greater speeds. It topped out at 600 MB/sec write speeds, and 1.2 GB/sec on reads.

    On a 2.5Gb/s network this is completely impossible, since 1.2GB/s is ~10Gb/s – and it’s hard to hit even on a 10Gb/s Ethernet link.

    1. That’s kind of the point. The store-bought solution comes with 10Gb/s built in and it can saturate the link on read operations. The DIY option has 2.5Gb/s, and even though that connection is 1/4 the theoretical speed, the little SoC can’t fully utilize it.

      1. 1GB/s is 8 Gb/s.
        Bytes vs. bits.
        The article doesn’t keep them straight, which makes the comparison difficult – it’s not clear what they’re talking about.

        200MB/s is pretty good. 200Mb/s not so much.

        1. Usually in serial communications, 10 bits = 1 byte. It started with RS-232, which has a start bit, a stop bit and 8 data bits, but the rule of thumb seems to carry over to Ethernet, fibre, etc. in more recent times because of the average overhead of transmitting each byte. E.g. I have a 100Mb/s ISP contract (don’t laugh), and I get about 10 MBytes per second continuous throughput.

          1. Yes, serial ports basically converged on 8N1, which results in packing 8 physical bits into 10 single-level symbols, which means to get the maximum payload data rate you divide the baud rate by 10. And yes, a lot of future encodings *also* ended up with that same 8/10 ratio, although for very different reasons.

            Ethernet’s spec is *always* in payload rate, though. Fast Ethernet (100 Mbit/s) actually runs at 125 Mbit/s with 4b5b – 4 payload bits in 5 physical bits – encoding (with additional encoding for the twisted-pair version to improve the spectral efficiency). Which again works out to be 10 physical bits in 1 byte, but here, Fast Ethernet is really truly 100 Mbit/s = 12.5 MB/s.

            Later versions of Ethernet change the encoding, so the line overhead gets better, but the spec is payload rate. If I connect 2 FPGAs with a 10 Gbit/s capable PHY, I’ll get really close to 1.25 GB/s throughput.

            The total throughput you get, however, gets reduced by the overhead of the other layers: so what you’re saying is “the upper bound of Ethernet always seems to be about 90% throughput as a rule of thumb.” Which is a pretty good rough estimate.
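            If anyone wants to sanity-check the numbers in this thread, here is a small back-of-the-envelope Python sketch. It assumes a standard 1500-byte MTU, TCP/IPv4 with no options, and no retransmits, so the ~95% framing efficiency is an approximation rather than a measurement.

            ```python
            # Best-case TCP goodput over Ethernet. Link rates below are the spec'd
            # payload rates (line coding already excluded), so only Ethernet framing,
            # IP and TCP headers are subtracted. Assumes 1500-byte MTU, no jumbo frames.

            MSS        = 1500 - 20 - 20              # TCP payload per frame (IP + TCP headers removed)
            WIRE_SIZE  = 1500 + 14 + 4 + 8 + 12      # + Ethernet header, FCS, preamble, inter-frame gap
            EFFICIENCY = MSS / WIRE_SIZE             # ~0.949

            for name, link_bps in [("Fast Ethernet", 100e6),
                                   ("2.5GbE",        2.5e9),
                                   ("10GbE",         10e9)]:
                print(f"{name:13s} ~{link_bps * EFFICIENCY / 8 / 1e6:7.1f} MB/s best-case TCP goodput")

            # Fast Ethernet ~   11.9 MB/s  (the divide-by-ten rule of thumb lands a little lower)
            # 2.5GbE        ~  296.7 MB/s  (~2.37 Gb/s, the ceiling mentioned elsewhere in this thread)
            # 10GbE         ~ 1186.6 MB/s  (so 1.2 GB/s reads really are wire speed)
            ```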

        2. The article does in fact keep them straight. The DIY solution does 200MB/s reads, which is roughly 1.6Gb/s – pretty good, but not saturating its 2.5Gb/s link. The ASUSTOR has a 10Gb/s network link and manages 1.2GB/s which, considering overhead, is maxing out the connection.

          200MB/s is pretty good. 1200MB/s is better.

      2. I’m pretty sure the SOC isn’t the limit there. If it’s ever hitting ~2.1 Gbps like the article says, it’s fully capable of pegging 2.5GbE, which tops out at around ~2.3-2.4 Gbps after encoding and other overhead.

        1. From the video the DIY device is capable of hitting that rate briefly but not sustaining it. Without a very deep dive it’s hard to say what the bottleneck is. The drives should be good for that level of performance, especially in parallel, so my bet is that the limiting factor is the SoC performance, but we really can’t be sure.

          Either way the point stands that you get way more performance per dollar with the off the shelf solution.

          1. I’m not sure what part you were looking at. Yes, he does say “the read speeds were a bit bursty” but it’s sustaining close to 300 MB/s (which is *2.4* Gbit/s!) for multiple seconds.

            That’s not going to be the SOC itself. It’s not like it gets tired or something (unless it’s getting too hot and throttling, which is a separate question).

            I’m much more convinced the issues were either with the drives or the software. For instance, it’s just not possible for the SoC to be the reason the write speed dropped after *300 GB* of writing. It doesn’t have any place to cache 300 GB of data!

        2. I think the SoC would be a limiting factor. It only appears to have 4 PCIe lanes, and one is used by the 2.5G Ethernet.
          The 6 drives are then going to be limited to sharing 2 lanes between them. I don’t think you can configure it to supply 3 lanes on one channel; it’s x1, x2 or x4 only.

          Internal bus limitations are a real thing, especially in these media/phone-focused SoC designs.
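          Taking those lane counts at face value (they’re the assumption here – the real mapping depends on the SoC and the carrier board), the rough bus math looks like this sketch in Python:

          ```python
          # Rough ceiling math for an assumed PCIe 3.0 x2 link shared by six drives,
          # versus what a single 2.5GbE port can actually move. Numbers are estimates.

          PCIE3_LANE_MBps = 8e9 * (128 / 130) / 8 / 1e6    # ~984.6 MB/s per PCIe 3.0 lane
          drive_link = 2 * PCIE3_LANE_MBps                 # assumed x2 link: ~1969 MB/s aggregate
          per_drive  = drive_link / 6                      # ~328 MB/s each if all six are busy
          net_2g5    = 2.5e9 * 0.95 / 8 / 1e6              # ~297 MB/s usable on 2.5GbE (TCP goodput)

          print(f"shared drive link : ~{drive_link:.0f} MB/s")
          print(f"per drive (6-way) : ~{per_drive:.0f} MB/s")
          print(f"2.5GbE goodput    : ~{net_2g5:.0f} MB/s")

          # Even an x2 link comfortably out-runs one 2.5GbE port, so lane sharing alone
          # doesn't explain a sub-300 MB/s plateau, though it would cap a 10GbE upgrade.
          ```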

  2. ASUSTOR AS-T10G3 is a 10G network card, not a NAS device.

    The NAS itself is called ASUSTOR Flashstor 12 PRO (there is a smaller, cheaper version with 6 drives and 2×2.5G networking, the Flashstor 6).

  3. Honestly, just buy a Synology or whatever. I’m someone who does like building this sort of stuff, but having moved on from my 20s, I’m now just more focused on getting the job done.

    Products like Synology are exceptionally good now, and while some could argue that they need x, y or z extra feature that isn’t offered… I think that for most people, “I want to store a bunch of data with redundancy & maybe back it up to the cloud” is most easily done by just buying something pre-built.

    The Seagate Ironwolf drives are going strong, but damn are they noisy. Always nostalgic, though, because the Synology has the only HDDs in the house.

    1. Until Synology decides to brick the compatible version of DSM because they want more money from you.

      Or else they decide that any drives that aren’t theirs in drive bays MUST be failing and report as such during drive tests.

      Or else they decide that 3-year-old drives need to be replaced and show errors on drive checks, preventing users from seeing if there’s an actual problem with the drive (WD already did this, by the way; expect others to follow).

      Or else they decide to use dependency licenses as an excuse to brick NAS boxes that are just a couple years old because they aren’t new enough.

      I could go on, but I think you can see my point.

      1. @Alex said: “I could go on, but I think you can see my point.”

        You make your point well – planned obsolescence comes to network storage! The problem, however, is HOW DO WE PROVE IT? We need simple drive emulators made by hackers that can be programmed to plug into M.2 or SATA interfaces and generate errors that look like real-life failures, but in a predictable way. Only then can we put these NAS “solutions” – hardware, software, or both – to the test and see which are messing with us on purpose, for profit.

      2. Your point is pretty invalid, as all your points about HDDs apply no matter what the box is or who makes it.

        Plenty of packages have been abandoned that have nothing to do with Synology, etc.

      Your point is pretty pointless… everything you stated can happen whether it’s a homebrew build or a store-bought appliance.

      3. Well, my Synology NAS from 2012 is still going strong, with frequent DSM updates. Sometimes I think of getting something faster, but the thought just vanishes after a while because the thing is so reliable. So I do wonder whether this actually happened to you, or whether your point is a hypothetical one.

    2. I’ve been wary of commercial NAS solutions, especially RAID ones, and especially ones that are marketed for home use (and budget-priced). I’ve also been wary of hardware RAID cards.

      If the hardware dies can I get my data back? How is it stored on the drive(s)? Is it some proprietary file or RAID system?

      With a homebrew solution with Linux and software RAID that’s not a worry. As long as the drive is still readable I can pop it in any computer and get the data.

      1. And the best thing: if the device dies after 5-7 years and you want to buy the same model to drop the disks into and get your data back, there can be a surprise, like the model being too old and out of stock. The new one has a different, incompatible proprietary system, or at least reversed/changed bay numbering in firmware, ready to ruin your data the moment the disks are connected. Compare that with DIY: just plug the disks into whatever has the same connectors, zpool import -f, and you’re good to go. Commercial storage solutions are great when it’s enterprise storage under warranty: if it works but the warranty has expired, just replace it with something new; if it dies while under support, it’s HP/Dell/whoever’s problem to fix. It works differently for home/SOHO hardware, though – no support, no compatibility, no parts, no documentation. They might replace the unit during the 2-year warranty, but not your data: hand over the money and pray, IMO. And add top-grade security like a forgotten admin/admin login, plus exposure to the cloud, as a bonus.

  4. The worst thing with any DIY NAND NAS: SATA SSDs are nearly dead. It’s no-name or low-grade drives now – the MX500 is the only exception that comes to mind, and Samsung and WD have dropped them from their consumer lines entirely. NVMe drives are cheaper, faster and better now, but there is no silent, cheap, low-power base to install at least 2-3 NVMe drives on, even over PCIe x1. To connect a PCIe switch to an SBC/ITX/NUC we need an M.2-to-full-PCIe adapter and then an M.2 carrier card, and then a DIY case for all of it to secure the long chain of cables and risers. mATX can handle it without any problem, but an mATX box for two little M.2 disks is a bit bigger than it needs to be. If there were something like an ASM switch board with a few M.2 slots and a U.2-style cable or ribbon to the SBC, it would be great for a project like this. But nobody will make it – there’s no mass interest in such a thing.

    1. But there is one (depending on your definition of low power): the five-slot x8 LGA 2011 ‘mining motherboard’. I ran it on a 2-core. The hardest part is that the cheapest adapters hold 4 NVMe drives, while the board obviously has 5 x8 slots. That is still 8 or 10 drives (you need a high-speed network port somewhere). And they are right around $40. I got a 32GB LRDIMM (single memory slot).

      Of course, it would have been interesting if that hard-drive mining had taken off; we might have seen NVMe-only motherboards. The cheapest way is still e-waste processors with lots of PCIe and bifurcation. Some desktop boards support “hyper M.2” x4/x4/x4/x4 bifurcation (4-NVMe-to-x16 cards are $25), as well as having a few NVMe slots onboard. That’s 5 NVMe direct to the processor, possibly 2.5GbE onboard, and another 3 NVMe off the PCH.

      1. It’s not something low-power for the home, though maybe OK for a non-enterprise office. I have LGA 2011 as a workstation and it’s too hot and hungry to use as just an NVMe-to-2.5Gb-Ethernet “adapter” without a real production load. I’d rather see something like an i3-x100T there, not HEDT.

        1. No, this is a 12V-only-powered, 2-core-only slim board, the “BTC x79 h61”. Not HEDT like you’re thinking of. It was designed as the bare minimum to get 5 x8 electrical slots out of the 2011 platform. Worth a look, because you get bifurcation and, if you shop around, it’s only $40-50.

          Power it off an Xbox 12V power brick, for example. The stock BIOS will only use 2 cores, but all the first-gen E5 26xx processors are around $5, so you can pick from the 3.6GHz 2637, the low-power 2648L or 2650L, or the 20MB-cache 2650.

          https://www.aliexpress.us/item/3256805393944754.html

    2. I almost forgot: look into a Lenovo P330 motherboard. It has an x8 PCIe slot and dual NVMe drive bays on board. (The M720q, M920q and M920x are technically the same motherboard and may or may not have the ports actually soldered down, although sometimes you can solder them on yourself.)

      Use it with, e.g., a 5400T processor and you’ll have 3 NVMe drives. You might need to break out the WiFi socket to an x1 PCIe slot for the network, or use an M.2 2.5GbE card if such a thing exists.

  5. I recently started using a Synology DS1621+ (48 TB RAID 6) over a 10 Gb/s connection after trying the DIY stuff. There are other commercial NAS appliances, but Synology is best known for its software (and the price reflects that), not leading-edge hardware innovation.

    The NAScompares website and their YT channel compare lots of NAS options, including DIY.

  6. The SoC is not the issue here. The RK3588 is pretty much a mini beast.

    Put a decent NVMe SSD in that M.2 slot instead of the SATA + (probably) SATA port multiplier that this multi-SATA M.2 adapter uses, and it will be able to saturate both of the 2.5Gb network interfaces without breaking a sweat.

    If this multi-M.2 adapter used an x4 PCIe switch instead of SATA + (probably) a SATA port multiplier, it might be worth getting, depending on the price (it would cost quite a bit more to build it with the PCIe switch, though).

  7. Still waiting for the day that someone just releases a small device with a bunch of M.2 etc. interfaces on it, and you just throw all your old NVMes etc. into it and it just “sorts it out”. Why can’t I just stick ten old drives in there and have it work things out and – after I select how much “security” I want – expand to an array of an appropriate size automatically? OMV doesn’t do this, as far as I’m aware.

    The closest I’ve found is the older ReadyNAS, but they aren’t that cheap, and they are usually limited in the number of bays. But when you put drives in, it doesn’t really care what they are or what size they are; it just rebuilds to the greatest capacity it can safely achieve, sometimes by creating an equal RAID5/6 partition on all the disks, and then any “extra” space on some of the drives is used as RAID1/0 and joined on, so you get more space with the same level of data security. And it does it all with standard filesystems and mdadm and scripting, from what I can tell.

    Honestly, I have just bought a rack unit for 5 Pi’s, and I look at it and think “Why can’t they be made to just work it out?”. 5 Pi-like devices. Each attached to a bunch of disks of various kinds, speeds and sizes, and *THEY* should do the maths to figure out the best way to provide as much space as possible and tolerate the failure of a Pi, of a bunch of disks, etc. etc. etc. without me having to go near mdadm, lvm etc.

    Make it modular and if I’m running out of space, I just throw another disk at it. If I run out of ports, I throw another Pi at it. That would be proper storage clustering, but even Windows S2D / cluster storage doesn’t work that simply.

    1. The architecture used by Compellent SANs could be adapted pretty easily to do this. They used traditional RAID, but on virtualized devices — it would essentially divide each disk up into equal size partitions, and then dynamically create (and destroy) RAID sets of different types from those partitions. Each RAID set was always striped across the correct number of disks, but the specific disks in each RAID set were different. That meant that an 8TB drive could participate in 2x as many RAID sets as a 4TB drive.

      For a hobbyist-oriented NVMe-only system, you could probably standardize on 256GB chunks managed by LVM to simplify things.
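      For the curious, here is a toy Python sketch of that chunk-based allocation, using made-up drive sizes and the 256GB chunk suggested above: split every disk into chunks, then repeatedly take one chunk from each of the N distinct disks with the most unused chunks and tie them into a RAID-5-style set. Larger disks naturally land in more sets, which is the Compellent behaviour described above.

      ```python
      # Toy "RAID over chunks" allocator: each disk is carved into fixed-size chunks,
      # and every RAID set takes one chunk from N *different* disks.
      # Disk sizes (in GB) and the 256GB chunk are illustrative assumptions.

      CHUNK_GB  = 256
      SET_WIDTH = 4                                   # disks (chunks) per RAID-5-style set

      disks = {"nvme0": 4000, "nvme1": 4000, "nvme2": 8000, "nvme3": 8000, "nvme4": 2000}
      free = {name: size // CHUNK_GB for name, size in disks.items()}

      raid_sets = []
      while True:
          # Greedily pick the SET_WIDTH distinct disks with the most chunks left.
          picks = sorted((n for n in free if free[n] > 0),
                         key=lambda n: free[n], reverse=True)[:SET_WIDTH]
          if len(picks) < SET_WIDTH:
              break                                   # not enough distinct disks remain
          for name in picks:
              free[name] -= 1
          raid_sets.append(tuple(picks))

      usable_gb = len(raid_sets) * (SET_WIDTH - 1) * CHUNK_GB     # one chunk per set is parity
      print(f"{len(raid_sets)} sets, ~{usable_gb} GB usable")
      print("sets per disk:", {n: sum(n in s for s in raid_sets) for n in disks})
      ```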
