Get some general background on DNA microarrays and protein microarrays first! NCBI covers scientific applications in more depth as well. A scientific overview of the opportunities and challenges of working with microarrays is available, too.
Probe design has to follow several principles to optimize the specificity and sensitivity of the probes:
Most commercial vendors, e.g. Agilent or Affymetrix, have their own microarray design platform.
However, third-party software that is independent of any vendor is also available. Some of it is free (such as Picky, developed here at Iowa State University; registration required), and some is commercial (such as Array Designer).
In this course, we will use Picky. You can download a trial version for free. If you want to keep using it, your PI will have to request an academic license.
The team has provided excellent tutorials online, so we encourage you to walk through these now.
The original Picky paper, published in Bioinformatics, covers additional background and the design criteria for the probes. It is valuable reading if you want to use Picky for your own projects, and it also elaborates on probe design concerns.
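To make the idea of a design criterion a bit more concrete, here is a minimal sketch in base R that screens candidate probes by GC content. GC content is only an assumed example of a sequence property that is commonly checked; the helper name gc_content, the example sequences, and the 0.4 to 0.6 window are all made up for illustration. The Picky paper remains the authority on what Picky actually evaluates.

```r
# Fraction of G and C bases in a probe sequence (hypothetical helper).
gc_content <- function(probe) {
  bases <- strsplit(toupper(probe), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

# Made-up example probes, not taken from any real array design.
probes <- c(p1 = "ATGCGCGTTAGC", p2 = "ATATATATTTAA", p3 = "GGGCCCGGGCCC")
gc <- sapply(probes, gc_content)

# Keep probes whose GC fraction lies in an arbitrary illustrative window.
keep <- gc >= 0.4 & gc <= 0.6

print(round(gc, 2))
print(names(probes)[keep])
```

Real design tools weigh several properties at once, so treat a single-criterion filter like this as a toy.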
Many microarray experiments have been made available to the general public. Several excellent sources are available online to retrieve these datasets:
Okay, you've created an array and you used it in a project and you have some data... Or you simply downloaded one of the datasets from the above-mentioned databases. Now what?
In this course, we will rely in part on software that was designed to run under R, a widespread, popular, and freely available statistics software package.
This means that you'll need to familiarize yourself with R! If you haven't installed R yet on your system, please do so now. [local copy]
An excellent introduction to R is provided by the development team itself. [local copy]
However, this course is about microarrays! Garrett Dancik, a former graduate student in the BCB program, therefore wrote an excellent tutorial himself, along with exercises. He was nice enough to provide answers, too.
In order to work efficiently with microarray data in R, you'll also need to install BioConductor on top of R. During the installation, your screen will look somewhat like this. After installing BioConductor, you can check whether everything went well by typing library("affy") at the command line in R. If you get an error message such as Error in library("affy") : there is no package called 'affy', things didn't go so well and you should try to re-install the software.
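If you would rather get a plain yes or no than an error message, the same check can be sketched with requireNamespace(), which reports whether a package can be loaded without stopping on failure. The helper name has_package is hypothetical.

```r
# Returns TRUE if the named package can be loaded, FALSE otherwise.
# has_package is a hypothetical helper name, not part of R or BioConductor.
has_package <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (has_package("affy")) {
  message("affy is installed; BioConductor looks fine.")
} else {
  message("affy is missing; try re-installing BioConductor.")
}
```

On current BioConductor releases, packages are usually installed with BiocManager::install("affy"). The installation instructions linked from this course may describe an older procedure, so follow whichever matches your R version.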
Once you're through with installing BioConductor, you can install the sample data as per instructions.
There are three additional libraries you need: twilight, hgu95av2 and hgu95av2cdf.
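A quick way to see which of these three are still missing is to compare their names against the packages R already knows about. This is a minimal sketch; it assumes the package names are spelled exactly as below (R is case-sensitive, and the lower-case spelling twilight is an assumption here).

```r
# Package names as R expects them; lower-case "twilight" is an assumption.
required <- c("twilight", "hgu95av2", "hgu95av2cdf")

# Names of every package installed in the current library paths.
installed <- rownames(installed.packages())

# Whatever is required but not installed.
missing_pkgs <- setdiff(required, installed)

if (length(missing_pkgs) == 0) {
  message("All three additional libraries are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```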
If you want to see what libraries you've installed so far, you can use the library() command. Another useful R-tip: Ctrl+L will clear your screen.
Yay! You made it to this part of the course and you're still breathing (although you may experience some slight elevation in blood pressure).
Yes, that was hard. However, the amount of ready-made software that you need to install before you can actually analyze your data is a testament to the extent to which microarrays have become standard tools for unraveling biological and medical mysteries. It also means that you don't have any programming to do: other people have already done everything for you. Isn't that great? Unfortunately, it does mean that you have some catching up to do and will occasionally have to grind your way through software installation protocols like the ones we just went through.
Ready to get started with some actual analysis? Try loading the following packages:
Now load the twilight module as well (we're not going to tell you everything).
If you get no error messages, you're ready to start the real work.
A first tutorial can be found here. Make sure to adapt the path names in it to your own system, though!
After working through this, you're ready for the second tutorial, on estrogen. It can be found here [local copy].
By the way, .cel files are huge! The sample files that come with the estrogen library are 17243 x 19525 pixels in size. While you can open them in graphics packages such as GIMP, this is not recommended. You're better off leaving them to specialized software such as BioConductor or geWorkbench. The image files are black and white.
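As a rough sketch of what leaving this to BioConductor can look like, the snippet below reads the sample .cel files with ReadAffy() from the affy package. It assumes that both affy and estrogen are installed and that the sample files sit in the extdata folder of the estrogen package; if either package is missing, it does nothing beyond printing a note.

```r
# Check that both packages are available before touching any files.
have_both <- requireNamespace("affy", quietly = TRUE) &&
  requireNamespace("estrogen", quietly = TRUE)

if (have_both) {
  # Assumption: the sample .cel files live in the package's extdata folder.
  cel_dir <- system.file("extdata", package = "estrogen")
  print(list.files(cel_dir, pattern = "\\.cel$", ignore.case = TRUE))

  # Read every .cel file found in that folder into one AffyBatch object.
  raw <- affy::ReadAffy(celfile.path = cel_dir)
  print(raw)
} else {
  message("Install affy and estrogen first to try this.")
}
```

The estrogen tutorial walks through its own way of loading the data, so use this only as a quick sanity check that the files can be read.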
A guided R exercise is available here.
Explorase is an advanced package for R that was developed by Michael Lawrence at Iowa State University. A paper is available as well. [local copy]
Before we can use Explorase, we'll need to install it. To speed up the process, you can install GTK+ and GGobi individually from local images. After this, please go to the Explorase website, click on the Download link and follow the instructions.
If all went well, close R and restart it. Start Explorase by typing library(explorase), followed by explorase().