Design of a whole-genome Drosophila chip
The design files are available to the public. If you use our
files or programs in your work, please cite NIH grant 5R24GM065513.
A poster that we presented at the Indiana Bioinformatics conference in May 2004
is available in
GIF and
PNG formats.
There have been three designs done for the chip. Version 1 was done outside of Purdue; we
analyzed the chip design. Version 2 was Purdue's initial design. Version 3 -- in progress --
is a revised design by Purdue.
The following links represent our work for the past 8 months.
Version 1 analysis done in
June & July, 2003
plus a subsequent one done in
late July, 2003.
As part of this analysis we did a preliminary
match to FlyBase.
Version 2 design
finished in Nov., 2003.
The annotation is a CSV (comma-separated)
file suitable for loading into Excel or SAS. Accompanying the annotation
is a Fasta-format file of the
original source sequences.
There is a description (PDF) of the design process
and a
flowchart (PDF).
The Version 3 design is complete.
There were many design files
involved. The
processing steps are complex.
The final data [20773 probes including 5 negative controls]
is available in
FastA and CSV form. The
CSV file has additional information that the FastA file does not contain.
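As a rough illustration of working with the two formats, here is a minimal sketch in Python using only the standard library. The sample records and the column names ("name", "note") are made up for the example; the real files have their own probe names and columns.

```python
import csv
import io


def read_fasta(handle):
    """Return a dict mapping each header line (text after '>') to its sequence."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def read_csv_rows(handle):
    """Return the CSV rows as dicts keyed by whatever header row the file has."""
    return list(csv.DictReader(handle))


# Made-up sample data (assumption): not taken from the actual design files.
fasta_text = ">probe_one\nACGT\nTTGA\n>probe_two\nGGCC\n"
csv_text = "name,note\nprobe_one,first\nprobe_two,second\n"

sequences = read_fasta(io.StringIO(fasta_text))
rows = read_csv_rows(io.StringIO(csv_text))

# Columns that exist in the CSV but have no counterpart in the FASTA records.
extra_fields = [key for key in rows[0] if key != "name"]
```

For the real files, replace the `io.StringIO` objects with `open(...)` on the downloaded FastA and CSV files.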
Because the probes came from various source files, the probe names are a mixture of
nomenclatures, but they all follow the convention 'A_B_C|P', where 'A',
'B' and 'C' are gene names and 'P' is the position. In this example,
three genes made up a common region from which the probe was created.
Most probes come from a single gene and thus have the form 'A:P'.
A handful of genes (or clusters) have more than one probe created for them;
the position number differentiates between these probes.
Examples of gene names: 'CG3856-PB', 'gi|41618549|tpg', 'negative-control-7', etc.
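The naming convention can be unpacked with a short parser. The following is a minimal Python sketch, not the code used to build the chip. It makes three assumptions. First, since the text above writes the separator before the position as both '|' and ':', it accepts either. Second, it treats the position as a run of digits. Third, it assumes gene names never contain '_'. The function name `parse_probe_name` and the gene name 'CG1234-PA' are hypothetical, introduced only for illustration.

```python
import re

# Greedy ".+" makes the LAST '|' or ':' followed by digits the separator,
# so gene names that themselves contain '|' (e.g. 'gi|41618549|tpg') survive.
_PROBE_RE = re.compile(r"^(?P<genes>.+)[|:](?P<pos>\d+)$")


def parse_probe_name(name):
    """Split a probe name into (list of gene names, position number).

    Assumptions: the position is a run of digits after the final '|' or ':',
    and gene names are joined by '_' and contain no '_' themselves.
    """
    match = _PROBE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised probe name: {name!r}")
    genes = match.group("genes").split("_")
    return genes, int(match.group("pos"))


single = parse_probe_name("CG3856-PB:1")
shared = parse_probe_name("CG3856-PB_CG1234-PA|2")  # 'CG1234-PA' is made up
piped = parse_probe_name("gi|41618549|tpg:3")
```

One caveat: a bare gene name such as 'gi|41618549' with no position would be misread as gene 'gi' at position 41618549, so the sketch should only be applied to full probe names.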
Agilent Annotation Program - Get the latest auto-generated annotations as well as
the program that generates them.