While considering ways to disseminate RNAi screening data, I found out that PubChem now contains two RNAi screening datasets – AIDs 1622 and 1904. These screens reuse the PubChem bioaassay formats – which is both good and bad. For example, while there are a few standardized columns (such as PUBCHEM_ACTIVITY_SCORE), the bulk of the user deposited columns are not formally defined. In other words, you’d have to read the assay description. While not a huge deal, it would be nice if we could use pre-existing formats such as MIARE, analogous to MIAME for microarray data. That way we could determine the number of replicates, normalization method employed and other details of the screen. As far as I can tell all aspects an RNAi screen are still not fully defined in the MIAME vocabulary, and there don’t seem to be a whole lot of examples. But it’s a start.
But of course, nothing is perfect. Why, oh why, would a tab delimited format be contained within multiple worksheets of an Excel workbook!