A Gene Expression Experiment Practical
November 2008
Richard Mott
1. Repeat the analysis of the liver data set from the lecture. The commands are repeated here, but you should refer to the lecture notes as well.
(i) Load the data (note: these are binary files)
> load("liver.exp.RData")
> load("liver.cov.RData")
(ii) Threshold the data
> liver.median <- apply(liver.exp, 2, median)
> hist(liver.median, breaks=50)
> liver.subset <- liver.exp[, liver.median > 6]
(If your computer has enough RAM, try a bigger data set by lowering the threshold from 6 to, e.g., 5.)
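Before committing memory to a larger subset, it can help to tabulate how many transcripts survive at several thresholds. The sketch below runs on a small simulated matrix (toy.exp, a made-up stand-in for liver.exp with individuals in rows and transcripts in columns); the helper name count.kept is hypothetical and is not part of the lecture code.

```r
# Simulated stand-in for liver.exp: 20 individuals (rows) x 200 transcripts (columns).
# Each transcript gets its own made-up mean expression level between 3 and 9.
set.seed(1)
toy.means <- runif(200, 3, 9)
toy.exp <- matrix(rnorm(20 * 200, mean = rep(toy.means, each = 20)), nrow = 20)
colnames(toy.exp) <- paste("T", 1:200, sep = "")

# Median expression of each transcript, as in step (ii).
toy.median <- apply(toy.exp, 2, median)

# Hypothetical helper: number of transcripts whose median exceeds a threshold.
count.kept <- function(threshold, medians) sum(medians > threshold)

thresholds <- c(4, 5, 6, 7)
kept <- sapply(thresholds, count.kept, medians = toy.median)
names(kept) <- thresholds
print(kept)
```

With the real data, replace toy.median by liver.median; the counts show how quickly the subset grows as the threshold drops.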
(iv) Look for sex effects
> tfunc <- function( X, GENDER ) { tt <- t.test( X ~ GENDER ); return(tt$p.value) }
> sex.pvalue <- apply(liver.subset, 2, tfunc, liver.cov$GENDER)
> hist( sex.pvalue, breaks=50)
> sum(sex.pvalue<1.0e-5)
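To check that the apply/tfunc pattern behaves as expected, it may be useful to run it on data where the answer is known. The following is a minimal sketch on simulated data: toy.exp and toy.gender are made-up stand-ins for liver.subset and liver.cov$GENDER, and the sex effect on the first five transcripts is planted artificially.

```r
# Same function as in step (iv): p-value of a two-sample t-test of X by GENDER.
tfunc <- function(X, GENDER) { tt <- t.test(X ~ GENDER); return(tt$p.value) }

# Simulated stand-in data: 20 individuals (10 F, 10 M) x 50 transcripts.
set.seed(5)
toy.gender <- factor(rep(c("F", "M"), each = 10))
toy.exp <- matrix(rnorm(20 * 50, mean = 7), nrow = 20)
colnames(toy.exp) <- paste("T", 1:50, sep = "")

# Made-up sex effect, planted on the first five transcripts only.
toy.exp[toy.gender == "M", 1:5] <- toy.exp[toy.gender == "M", 1:5] + 6

# apply() passes each column as X and forwards toy.gender as the GENDER argument.
toy.pvalue <- apply(toy.exp, 2, tfunc, toy.gender)
print(sum(toy.pvalue < 1.0e-5))
print(names(toy.pvalue)[toy.pvalue < 1.0e-5])
```

The transcripts reported should mostly be the planted ones; the remaining p-values should look roughly uniform in a histogram.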
(v) Heritability
> anova.pvalue <- function( X, factor ) { a <- anova(lm( X ~ factor)); return(a[1,5])}
> family.pvalue <- apply(liver.subset, 2, anova.pvalue, liver.cov$Family)
Extra exercise – report the % variance explained instead of the p-value
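One possible approach to the extra exercise, offered as a sketch rather than the intended solution: the ANOVA table returned by anova() holds the sums of squares in its "Sum Sq" column, so the percentage of variance explained is the factor's sum of squares divided by the total. The helper name anova.pctvar and the simulated data are assumptions made for illustration; the argument is called group here rather than factor to avoid masking the base R function factor().

```r
# Hypothetical helper: percentage of the variance in X explained by a grouping
# variable, computed from the sums of squares in the one-way ANOVA table.
anova.pctvar <- function(X, group) {
  a <- anova(lm(X ~ group))
  ss <- a[, "Sum Sq"]            # first row: group, last row: residuals
  return(100 * ss[1] / sum(ss))
}

# Simulated stand-in data: 10 families of 4 individuals, 3 transcripts.
set.seed(2)
family <- factor(rep(1:10, each = 4))
fam.effect <- rnorm(10)[as.integer(family)]   # one made-up shift per family
toy.exp <- cbind(heritable = fam.effect + rnorm(40, sd = 0.3),
                 noise1 = rnorm(40),
                 noise2 = rnorm(40))

family.pctvar <- apply(toy.exp, 2, anova.pctvar, family)
print(round(family.pctvar, 1))
```

For a single-factor model this quantity equals 100 times the R-squared reported by summary(lm(...)), which gives an independent check.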
(vi) Weight
> weight.pvalue <- apply(liver.subset, 2, anova.pvalue, liver.cov$EndNormalBW)
> hist(weight.pvalue, breaks=50)
(vii) GO analysis
> go1 <- read.delim("GO.Ensembl.01.txt")
> go1$transcript <- paste("LIVER.express.", make.names(go1$Transcript), sep="")
> go2name <- read.delim("GO2name.txt", sep="\t")
> intersect <- colnames(liver.subset)[match(go1$transcript, colnames(liver.subset), nomatch = 0)]
> intersect <- unique(sort(intersect))
> liver.subset.intersect <- liver.subset[, match(intersect, colnames(liver.subset))]
> go.intersect <- go1[match(intersect,go1$transcript),]
> sex.ids <- colnames(liver.subset)[sex.pvalue<0.01]
> sex.intersect <- intersect[match(sex.ids, intersect, nomatch=0)]
> sex.idx <- go.intersect$transcript %in% sex.ids
> fisher.func <- function( X, sex.idx) { X <- as.factor(X) ; if ( nlevels(X) == 2 ) {f <- fisher.test(X, sex.idx); return (f$p.value)} else return(1) }
> fish <- apply( go.intersect[,4:ncol(go.intersect)], 2, fisher.func, sex.idx )
> fish[fish < 0.01]
> data.frame(pvalue=fish[fish<0.01], desc=as.character(go2name$desc[match(names(fish)[fish<0.01], go2name$go)]))
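For a single GO term, fisher.func amounts to a Fisher exact test on the 2x2 table of term membership against the sex-effect flag. The sketch below shows this on simulated data: toy.go.term and toy.sex.idx are made-up stand-ins for one column of go.intersect and for sex.idx, and the enrichment is planted artificially.

```r
# Simulated stand-in data: 300 transcripts, 60 of them flagged as sex-associated.
set.seed(3)
toy.sex.idx <- rep(c(TRUE, FALSE), times = c(60, 240))

# Made-up enrichment: the GO term annotates about 50% of the flagged
# transcripts but only about 10% of the others.
toy.go.term <- rbinom(300, 1, ifelse(toy.sex.idx, 0.5, 0.1))

# The 2x2 contingency table underlying the test.
tab <- table(GO = toy.go.term, sex = toy.sex.idx)
print(tab)

# Fisher exact test on the table.
f <- fisher.test(tab)
print(f$p.value)

# The same test through the two-vector interface used inside fisher.func.
f2 <- fisher.test(as.factor(toy.go.term), toy.sex.idx)
print(f2$p.value)
```

Both calls give the same p-value. The nlevels(X) == 2 guard in fisher.func skips GO terms that annotate either none or all of the transcripts, for which no 2x2 table exists.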
2. Now repeat the analysis on the lung data set.
3. Investigate these data sets using your own initiative. For example, are there differences in expression between chromosomes? (You will need the file mouse.transcripts.genes.txt; read it with read.delim().)
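As one possible starting point for the chromosome question, the sketch below summarises each transcript by its mean expression, attaches a chromosome label, and compares the groups with a Kruskal-Wallis test. Everything here is simulated: toy.annot is a hypothetical stand-in for the annotation file, and its column names (transcript, chromosome) are assumptions, so check the real column names in mouse.transcripts.genes.txt before adapting it.

```r
# Simulated stand-in for the expression matrix: 15 individuals x 90 transcripts.
set.seed(4)
toy.exp <- matrix(rnorm(15 * 90, mean = 7), nrow = 15)
colnames(toy.exp) <- paste("T", 1:90, sep = "")

# Hypothetical annotation table; the real file may use different column names.
toy.annot <- data.frame(transcript = colnames(toy.exp),
                        chromosome = rep(c("1", "2", "X"), each = 30))

# Made-up shift for illustration: raise expression of chromosome X transcripts.
on.x <- toy.annot$chromosome == "X"
toy.exp[, on.x] <- toy.exp[, on.x] + 1

# Mean expression per transcript, then look up each transcript's chromosome.
toy.mean <- apply(toy.exp, 2, mean)
chrom <- factor(toy.annot$chromosome[match(names(toy.mean), toy.annot$transcript)])

# Average expression by chromosome, and a rank-based test for any difference.
print(tapply(toy.mean, chrom, mean))
k <- kruskal.test(toy.mean, chrom)
print(k$p.value)
```

A boxplot of toy.mean by chrom is a natural visual companion to the test.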