Sequencing the Cannabis Genome
The August announcement that Massachusetts-based company Medicinal Genomics had sequenced the entire genome of Cannabis sativa L. received much national attention. While the development makes for attention-grabbing headlines—“Marijuana Genome Sequenced for Health, Not Highs,” “Science Cracks the Cannabis Genome”—how will it actually impact research and public health?
Medicinal Genomics founder Kevin McKernan became interested in decoding the cannabis genome while working with a clinical oncologist to sequence the DNA of tumors and of cancer patients.
“As a result of this,” said McKernan, “[I] had a few friends with cancer ask about medical marijuana” (email, September 29, 2011).
Then he read Spanish scientist Manuel Guzmán’s research documenting that cannabinoids, some of the biologically active compounds in cannabis, have a favorable therapeutic index in cancerous cell cultures and animal models. McKernan said this “really drove it home,” as such a finding is rare among potential cancer drugs. Additionally, McKernan read Etienne de Meijer’s work emphasizing that the cannabis chemotype is strictly governed by genetics, “but we only knew CBD [cannabidiol] and THC [tetrahydrocannabinol] synthase sequences to date.”
“I naively figured we could sequence the whole genome for under $50k and that this had to be a priority,” he said. “Turned out to be a far more complicated genome than one could gather from the literature.”
Though other scientists and organizations have been working on sequencing cannabis, Medicinal Genomics is said to have produced the “largest known gene collection” at more than 131 billion bases of sequence. The C. sativa sequence data were made available on August 18 on Amazon EC2 (a public cloud computing service) via Nimbus Informatics, an open-source data management website. A data assembly is also available for download.