If someone brought you an odd piece of electronic hardware and you wanted to identify it, you’d probably look for markings on the outside first. If that didn’t work out, you might look under the cover and read some markings on the board or key components. However, in a tough case, you might dump the firmware and try to guess what the device is or what it does by examining the code that makes it run. That’s kind of what [Ciro] did. Wanting to determine the bacteria in a water sample led to using relatively inexpensive DNA sequencing hardware to look at the DNA present in the samples. This would have been a huge undertaking for a well-funded lab just a few short years ago. Now it just takes a USB device and some software.
Of course, inexpensive is in the eye of the beholder. The nanopore sequencer costs about $500 and has a one-time-use consumable cost of about $500, although that’s enough to process about 10 human genomes. The technology depends on using a small pore only large enough to pass one strand of DNA at a time. Blocks of nucleotides cause different amounts of electrical current to flow through the pore.
A lot of blood cell counters work in a similar fashion, but have an easier mechanism as they look for the tiny aperture’s partial blockage. Determining which DNA components are passing through the pore requires very precise measurements.
Of course, such a thing is hardly plug and play. First, the water samples needed reduction using several filtration and separation techniques. Then a polymerase chain reaction clipped the part of the DNA that would allow for the identification of unique bacteria types. Truly, the work in concentrating the DNA samples seemed to dwarf the actual sequencing.
Honestly, we aren’t as knowledgeable about DNA science as we’d like to be, but we were impressed at the results you could get for less than you’d spend on a big PC. Polymerase chain reaction technology has become cheap and simple, maybe this is the next frontier on biohacking. We’ve looked at how you can get started with PCR several times.
I nearly sent an article on this in. It’s like a bar code reader for DNA, achieved by pulling it through a tiny hole, or nanopore. Some signal processing is required to turn the tiny impulses into useful data. Super cool: the cost to sequence the first human genome was $5 billion; the cost with this, ~20 years later, is 10 for $500, so $50 per… that’s knocking a zero off every couple of years. In 2 years it might be $5, in another two 50 cents.
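The back-of-the-envelope arithmetic in that comment can be checked with a few lines of Python. This is only a sketch using the commenter's own figures ($5 billion for the first genome, $500 for 10 genomes, roughly 20 years apart); the function and variable names are made up for illustration.

```python
import math


def years_per_tenfold_drop(start_cost, end_cost, years):
    """Average number of years it took the cost to fall by a factor of ten."""
    orders_of_magnitude = math.log10(start_cost / end_cost)
    return years / orders_of_magnitude


# Figures quoted in the comment, not independently verified.
first_genome_cost = 5e9        # dollars
flow_cell_cost = 500           # dollars
genomes_per_flow_cell = 10
elapsed_years = 20

cost_per_genome = flow_cell_cost / genomes_per_flow_cell
rate = years_per_tenfold_drop(first_genome_cost, cost_per_genome, elapsed_years)

print(f"Cost per genome: ${cost_per_genome:.0f}")
print(f"One zero knocked off every {rate:.1f} years on average")
```

On those numbers the drop is eight orders of magnitude over 20 years, so "a zero every couple of years" is about right as an average.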
.. and oh yah, the tech is being open sourced.
The StackOverflow blog has a good explanation of how these work: https://stackoverflow.blog/2021/02/03/sequencing-your-dna-with-a-usb-dongle-and-open-source-code/
These little MinION are pretty cool and they have come a long way since their inception when raw-read accuracy rates were <60% (i.e. ~40% of all base pairs would be called incorrectly). They are now around 80-90% which is great (and I've been hoping to give one a field test soon), but still orders of magnitude less accurate than the much more expensive platforms like Illumina. Of course biology is messy and there are all sorts of caveats to these numbers, but in general it explains why they are not industry standard yet.
You can mitigate this issue by over-sequencing only the most common things in your DNA library at the expense of missing the rarer sequences. This tradeoff between accuracy and depth is really the main limitation of nearly all DNA sequencing projects. And often the rare stuff is the most interesting, but if your accuracy is poor it can be hard to know for sure if the rare thing you found is real…
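To put a rough number on the "missing the rarer sequences" side of that tradeoff, here is a minimal sketch. It assumes every read is an independent draw from the library in proportion to each organism's relative abundance, which ignores PCR and library-prep bias, so treat it as an idealized model rather than a description of any real run. The function name is hypothetical.

```python
def chance_of_seeing_taxon(relative_abundance, total_reads):
    """Probability that at least one read comes from a given taxon.

    Assumes independent reads sampled in proportion to abundance
    (an idealization; real libraries are biased).
    """
    chance_of_missing_every_read = (1 - relative_abundance) ** total_reads
    return 1 - chance_of_missing_every_read


# A taxon that makes up 0.1% of the library, at two sequencing budgets.
for reads in (1_000, 10_000):
    p = chance_of_seeing_taxon(0.001, reads)
    print(f"{reads:>6} reads: {p:.2%} chance of at least one read")
```

Under this model a 0.1% taxon shows up in only about 63% of 1,000-read runs, and a single read from it is exactly the kind of hit that poor accuracy makes hard to trust.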
Oh and finally there is some contention about using the term metagenomics for these 16S studies. Technically metagenomics is when you sequence the full genomic content of all the organisms in a sample. When you do amplicon sequencing on a single gene (16S) it really is "metaphylogenomics" or "metabarcoding". The sequencing companies like to call it metagenomics because it's more of a buzzword, but it's really not accurate. Overall cool project though! I'm looking forward to seeing the raw data when it's posted.
@Al Williams said: “Polymerase chain reaction technology has become cheap and simple, maybe this is the next frontier on biohacking.”
Oh goody, home-brew pathogen ‘gain of function’ research. Just add a wild donor bat and keep an eye on the mutations with PCR and the cheap open-source sequencer you bought on AliExpress. Yeah, there’s nothin’ to worry about here. Just remember to wear a mask.
Bats are old news, now it’s killer Chimp germs.
Chimps don’t need germs. They are dangerous enough already.
PCR has been cheap and simple since it was first invented and commercialized in the 80s… It’s just become increasingly obvious to the home crowd, but even that is news that’s 10+ years old.
I don’t get it… microbiology labs have been able to identify bacteria for a very long time without doing DNA sequencing. Unless you have plans to modify the DNA, what’s the use? Speed? Checking for variants? Because you can?
The primary advantage of metagenomics is you don’t necessarily need to know what you’re looking for when you collect the samples. You get an “unbiased” (modulo experimental technique bias) sample of the genetic material, possibly containing dozens of species yet to be formally identified. This can be very useful in studying things like the gut microbiome: you can compare composition across people and identify which microbial genetic material correlates with a particular phenotype, and once you identify something interesting, go back and try to figure out what biological entity contributes it to your samples.
Because who /wants/ to deal with enterotubes and other flowchart-based trial-and-error methods of identifying environmental sample sources?
How do you identify the ones we don’t know how to culture?
“enough to process about 10 human genomes” ?? Perhaps that figure ought to be rechecked. The last group I recall performing a whole human genome seq used about 25-30 flowcells. That was 3 or 4 years ago, and while output quantity and quality have both definitely improved, I’m not sure it’s gotten to 6 billion base pairs yet.
I’m not a nanopore expert, but I suspect they imply 10 genomes at 1x coverage. My understanding is that in practice, you can get usable quality data for 1 sample from 1 of the flow cells.
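The relationship between "10 genomes at 1x" and "1 sample per flow cell" is just coverage arithmetic: coverage is the total number of bases sequenced divided by the genome size. The sketch below assumes a haploid human genome of about 3.1 billion base pairs (the 6 billion figure in the comment above counts both copies) and does not assume any particular flow cell yield. The function names are made up for illustration.

```python
HUMAN_GENOME_BP = 3.1e9  # approximate haploid size; an assumption for this sketch


def required_yield_gb(genome_size_bp, coverage, samples=1):
    """Gigabases of sequence needed to reach a coverage for some number of samples."""
    return genome_size_bp * coverage * samples / 1e9


def coverage_from_yield(total_bases, genome_size_bp):
    """Average coverage obtained if all sequenced bases come from one genome."""
    return total_bases / genome_size_bp


ten_at_1x = required_yield_gb(HUMAN_GENOME_BP, coverage=1, samples=10)
print(f"10 genomes at 1x needs about {ten_at_1x:.0f} Gb")

# The same number of bases spent on a single sample instead:
one_sample = coverage_from_yield(ten_at_1x * 1e9, HUMAN_GENOME_BP)
print(f"That output on one genome is about {one_sample:.0f}x coverage")
```

So the two readings are consistent: the yield that would cover ten genomes once each covers a single genome roughly ten times over, which is much closer to usable data.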
You are right, sequencing 10 humans costs much more, be it with nanopore or the classical (shotgun) methods!
Shotgun sequencing a human whole genome at sufficient precision (the amount of evidence for each position, a.k.a. read depth) still costs about $800. Nanopore sequencing is not suitable (yet) for whole genomes, as it has an error rate between 1 and 10 percent and hence needs a large read depth to be reliable.
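To see why a 1 to 10 percent per-read error rate pushes up the required read depth, here is a deliberately simple model, not how real consensus callers work. It assumes errors in different reads are independent and that a position is miscalled only when more than half of the reads covering it are wrong. Real nanopore errors can be systematic, so this likely understates the depth needed. The function name is hypothetical.

```python
from math import comb


def consensus_error(per_read_error, depth):
    """Chance a majority vote over `depth` reads gets one position wrong.

    Simplified binomial model: independent errors, and the call is wrong
    only if strictly more than half of the reads are wrong.
    """
    needed_wrong = depth // 2 + 1
    return sum(
        comb(depth, k)
        * per_read_error ** k
        * (1 - per_read_error) ** (depth - k)
        for k in range(needed_wrong, depth + 1)
    )


# Error rates from the comment above, at a few odd read depths.
for error_rate in (0.01, 0.10):
    for depth in (1, 5, 15, 31):
        p = consensus_error(error_rate, depth)
        print(f"per-read error {error_rate:.0%}, depth {depth:>2}: {p:.2e}")
```

Even in this optimistic model, the noisier the individual reads, the more of them you need stacked on each position before the consensus becomes trustworthy, and every extra layer of depth costs sequencing output.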