Thursday, August 30, 2007

Horizontal Gene Transfer from Bacteria to Eukaryotes

Can bacteria transfer genes to eukaryotes? Many people may remember the rather rash claim, made in the initial human genome paper, that there was evidence they can (and the subsequent debunking of that claim by people I used to work with at TIGR).

But just because one study was flawed doesn't mean that such horizontal transfers don't happen, and today in the advance publication section of Science a new study shows solid evidence that the endosymbiotic bacterium Wolbachia has integrated parts (in some cases quite large portions) of its genome into that of numerous strains of Drosophila, the wasp Nasonia, and the worm Brugia malayi. These aren't mere cases of "BLASTology" -- they were confirmed by PCR. Even more stunningly, RT-PCR suggests that in some cases the integrated genes are actually expressed.

Again, there is a TIGR/JCVI connection -- one of the co-first authors, Julie Dunning Hotopp, has a cube only a few feet from mine -- and all day today I could hear her talking on the phone with science reporters -- so congratulations to her (and best wishes that the reporters don't screw things up). Also, congratulations to all the other authors, particularly the other co-first author, Michael Clark, whom I haven't met.

The results of the study have many implications both theoretical and practical. From an evolutionary perspective, it is interesting to ponder if the ancestors of many current eukaryotic genes came from such bacterial integrations. And from a practical perspective, it really makes one wonder if discarding bacterial sequences during the assembly of eukaryotic genomic data as "obvious contamination" (as is commonly done) is really the right thing to do.

Sunday, August 26, 2007

Into the Great Wide Open

My mother has spent the past 25 years working in the field of association management. What's that, you might ask? Well, if you're a scientist you probably belong to one or more associations such as the American Society for Microbiology (ASM). And it isn't just science; most professional fields have such organizations. There's even the American Society of Association Executives (ASAE), to which the people (such as my mom) who run the other associations belong. Associations, regardless of their topic, tend to fulfill the same functions, one of which is publishing journals. And that leads me into the topic of this posting.

My mother recently sent me a hardcopy of an article from ASAE's "Associations Now" magazine. It's entitled "Into the Great Wide Open" and it consists of an interview with Patrick Brown, one of the co-founders of PLOS. It is really significant that a publication of the ASAE would run such an article, because traditionally, the enemies of open access have not just been commercial publishers like Springer and Elsevier, but also many non-profit associations with publishing divisions. So I was expecting a hostile attack on open access, but actually it's quite a fair interview. In fact, they even printed the following exchange:

One thing that you've mentioned several times and I think is a big concern for the society publishers—especially societies that use income from their journal to subsidize other aspects of what they do for their members—is financial sustainability. What can you tell association publishers to show them that this transition can be sustainable?

There's a bunch of issues there. Number one, a lot of societies that make that claim—I would encourage people to look at their Form 990s. I get great enjoyment out of reading the Form 990s of scientific societies that talk about how important it is to preserve the income from their journals to do all these wonderful things they do, when, very often, the wonderful things they do, taken in aggregate, don't add up to the cost of their chief executive officer.

But let's just take that at face value—that their only motivation is to do good for the world and for science and for their community. One of the questions is, how important are those things that you're trying to fund with profits in your journal, compared to the good that you do for your mission through publishing itself and making access as freely available as possible?

The whole interview is interesting reading. And don't miss the informative sidebar containing a glossary of various Open Access terms. I have to admit I have a hard time remembering what the Berlin Declaration, etc. are, and I imagine most people who aren't professionally involved in the Open Access movement do too.

Thursday, August 09, 2007

ISMB 2007 Vienna -- Part II Interesting Talks

I'd just like to comment on some talks that I also found particularly noteworthy or interesting.

Atul Butte gave an interesting talk on nosology. No, that's not the study of noses, but rather the classification of diseases. What Butte has done is cluster gene expression patterns from various diseases. And the results were surprising. For example, he discovered that cervical cancer expression clustered with that of an autoimmune disease and muscular dystrophy clustered with some forms of heart disease. These connections were unexpected and may lead to better understanding of these diseases. I liked the idea (even though I am not normally a disease person) because I found it analogous to the way unexpected phylogenetic relationships can be informative.

Christian von Mering talked about phylogenetic analysis of samples from metagenomics projects. He is interested in this problem because contrary to what was generally believed prior to environmental sequencing, specific microbial species (& higher taxonomic levels) are not found everywhere it would be possible for them to live -- in other words, microbes have meaningful biogeography, just like plants and animals. Von Mering and colleagues have created an interesting pipeline for the interpretation of metagenomic data. Rather than try to analyze each of the millions of reads in detail, his method first identifies standard phylogenetically informative marker genes, then adds them to existing high quality alignments, and uses a custom phylogenetic program to test all possible phylogenetic positions on a reference tree. Both the limitation to markers and the custom phylogenetic component make this pipeline much more efficient than the typical phylogeny based methods.
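
To make the "test all possible positions on a reference tree" step concrete, here is a toy sketch in Python. It is not von Mering's software: I'm assuming a rooted binary reference tree written as nested tuples, a query sequence already added to the alignment, and I've swapped in Fitch parsimony as the scoring function purely for brevity (the talk described a custom phylogenetic program, and I don't know its scoring details). The names placements, fitch_score and best_placement are all made up for illustration.

```python
def placements(tree, query):
    """Yield every tree made by attaching `query` as sister to one subtree."""
    yield (tree, query)  # attach on the edge above this subtree
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in placements(left, query):
            yield (new_left, right)
        for new_right in placements(right, query):
            yield (left, new_right)


def fitch_score(tree, seqs):
    """Minimum number of changes on `tree` (Fitch parsimony), summed over sites."""
    def site(node, i):
        # Returns (possible ancestral states, changes so far) for one column.
        if isinstance(node, str):
            return {seqs[node][i]}, 0
        (l_states, l_cost) = site(node[0], i)
        (r_states, r_cost) = site(node[1], i)
        shared = l_states & r_states
        if shared:
            return shared, l_cost + r_cost
        return l_states | r_states, l_cost + r_cost + 1

    n_sites = len(next(iter(seqs.values())))
    return sum(site(tree, i)[1] for i in range(n_sites))


def best_placement(tree, query, seqs):
    """Score every candidate position and return (best tree, its score)."""
    best = min(placements(tree, query), key=lambda t: fitch_score(t, seqs))
    return best, fitch_score(best, seqs)


# Made-up reference tree and aligned marker sequences; "Q" is the read.
reference = (("A", "B"), ("C", "D"))
alignment = {"A": "AAAA", "B": "AATT", "C": "GGAA", "D": "GGAA", "Q": "AATT"}
tree, score = best_placement(reference, "Q", alignment)
print(tree, score)
```

With this made-up alignment the read ends up as sister to B, which is what you'd hope since the two sequences are identical. The efficiency point from the talk shows up even here: the reference tree is fixed, so the number of candidates grows only linearly with the size of the tree rather than requiring a fresh tree search for every read.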

Haipeng Li talked about a new maximum likelihood method for inferring positive selection and demographic history from chromosome-wide SNP data. His test case was Drosophila melanogaster and according to his model, the European population split off from the African lineage 16,000 years ago and underwent some sort of bottleneck. Current low levels of X-linked diversity (as opposed to the autosomes) in the European population suggest a large excess of males in this population. This latter assertion was questioned by several members of the audience as it doesn't seem to be congruent with empirically measured numbers of males in the wild. Still, I find such studies fascinating. I would love to do a similar demographic study using bacteria, but I doubt we have enough data, even for things like E. coli or Bacillus anthracis.
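
For anyone wondering why low X-linked diversity would point to an excess of males, here is a back-of-the-envelope sketch. It uses the textbook effective population size formulas for an idealized population with unequal numbers of breeding males and females, not Li's actual model, and the function names and population sizes are just made up for illustration. Since neutral diversity scales with effective population size, the X-to-autosome ratio of effective sizes gives the expected ratio of diversities.

```python
def ne_autosome(n_males, n_females):
    """Textbook effective population size for autosomal loci."""
    return 4 * n_males * n_females / (n_males + n_females)


def ne_x(n_males, n_females):
    """Textbook effective population size for X-linked loci."""
    return 9 * n_males * n_females / (4 * n_males + 2 * n_females)


def x_to_autosome_ratio(n_males, n_females):
    """Expected ratio of X-linked to autosomal neutral diversity."""
    return ne_x(n_males, n_females) / ne_autosome(n_males, n_females)


# Hypothetical breeding populations of 1,000 flies with different sex ratios.
for males, females in [(500, 500), (900, 100), (100, 900)]:
    print(males, females, round(x_to_autosome_ratio(males, females), 3))
```

With equal numbers of each sex the ratio is the familiar 3/4. An excess of males pushes it down towards 9/16, and an excess of females pushes it up towards 9/8, because males carry only one X. So an X-to-autosome diversity ratio well below 3/4 is at least in the direction an excess of males would predict, though bottlenecks and selection can also distort the ratio.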

And Alissa Resch, who works at the NCBI (just down the road from JCVI), also gave an interesting talk about positive selection. Odd that I'd have to travel thousands of miles to hear it though. Resch developed a new statistical test for positive selection that uses the rates of synonymous substitution in nearby intronic regions as a background. This allows detection of positive selection even at synonymous sites. She then applied it to orthologous gene pairs in mouse and rat.

Sunday, August 05, 2007

ISMB 2007 Vienna, Part I - Keynotes

I realize that I'm a bit late in covering a conference that occurred July 21-25, but I combined the conference with post-conference trips to Prague, Leipzig, and Berlin and have been busy at work since returning to Washington earlier this week.

There doesn't seem to have been much coverage of ISMB 2007 in the blogosphere (I've only found the coverage at Suicyte Notes and Fungal Genomes -- a few others had humorous posts about misadventures getting to Vienna, etc., but didn't cover any of the talks -- let me know if I missed any science-related postings).

In this first posting, I'll give my thoughts on the keynote talks that made the biggest impression on me.

First of all, there was Michael Eisen's talk. Having worked with his brother, Jonathan, for some years, it was fun to finally see (and after the talk, briefly meet) the "other" Eisen. As would be expected from Mike's papers, his keynote largely dealt with the evolution of regulatory sequences in Drosophila. In the midst of this he made an offhand comment which generated both cheers and boos from the audience to the effect that bioinformaticians should stop working on microarray analysis methods "as nobody will be using microarrays in a couple of years". Well, we'll see.

Then, there was John Mattick's talk, which made me angry. Not because I disagree with his assertion that our traditional picture of gene regulation is incomplete, and that non-coding RNA-mediated gene regulation is going to be an important part of our revised picture, but because I found his attitude towards non-eukaryotes (and even just non-mammals) annoying. According to John, the reason why "prokaryotes" are "simple" is that their gene regulation is just protein-based, while "higher organisms" require the use of RNA as well. He even had graphs where "complexity" (how exactly is that defined objectively, now?) was the y-axis -- humans at the top, of course -- just like the medieval "Great Chain of Being" (minus the superhuman levels). The simple fact is that RNA-based regulation isn't limited to eukaryotes - non-coding RNA has been shown to be important in bacteria and archaea as well. Our picture of gene regulation in all domains of life is changing. Those on the quixotic quest to show that the evolutionary branch leading to humans is somehow "special" will have to look elsewhere.

And then there was Terry Speed's talk, which was one of those "historical" talks that some people see as a waste of time. Maybe it's because I once seriously considered becoming a historian of science, or perhaps it's just that I'm getting old, but I like such talks. And I learned a lot from it. For example, I always assumed that HMMs first entered biology from computer science in the context of sequence analysis/gene finding in the early 1990s, but Terry showed how people in pedigree analysis (including, if I recall from the talk, Eric Lander in his pre-genomics career) brought them into biology earlier.

I will write further on ISMB in subsequent posts.