A Gene Expression Experiment Practical

November 2008

Richard Mott

1. Repeat the analysis of the liver data set from the lecture. The commands are reproduced here, but you should also refer to the lecture notes.

(i) Load the data (note: these are binary R data files, so they are read with load() rather than read.delim())

> load("liver.exp.RData")

> load("liver.cov.RData")

(ii) Threshold the data

> liver.median <- apply(liver.exp, 2, median)

> hist(liver.median, breaks=50)

> liver.subset <- liver.exp[, liver.median > 6]

(If your computer has enough RAM, try a bigger data set by lowering the threshold from 6 to, e.g., 5.)
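The thresholding step can be tried on a small simulated matrix before running it on the real data. This is only a sketch: toy.exp, toy.median and toy.subset are made-up names, and it assumes (as the commands above imply) that rows are samples and columns are transcripts.

```r
# Toy stand-in for liver.exp: 20 samples (rows) by 5 transcripts (columns).
# Each column is simulated around a different mean expression level.
set.seed(1)
toy.exp <- matrix(rnorm(20 * 5, mean = rep(c(4, 5, 6.5, 7, 8), each = 20), sd = 0.2),
                  nrow = 20,
                  dimnames = list(NULL, paste0("tx", 1:5)))

# Median expression of each transcript (MARGIN = 2 works column by column).
toy.median <- apply(toy.exp, 2, median)

# Keep only the transcripts whose median exceeds the threshold.
# drop = FALSE keeps a matrix even if only one column survives.
toy.subset <- toy.exp[, toy.median > 6, drop = FALSE]
dim(toy.subset)
```

Lowering the threshold keeps more columns, which is why the later apply() calls take longer and need more memory.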

(iv) Look for sex effects

> tfunc <- function( X, GENDER ) { tt <- t.test( X ~ GENDER ); return(tt$p.value) }

> sex.pvalue <- apply(liver.subset, 2,tfunc, liver.cov$GENDER )

> hist( sex.pvalue, breaks=50)

> sum(sex.pvalue<1.0e-5)
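To see what tfunc and apply() are doing, it may help to run the same commands on simulated data where the answer is known. The sketch below uses hypothetical names (gender, toy, toy.pvalue) and builds one transcript with no sex effect and one with a large effect.

```r
set.seed(2)
gender <- factor(rep(c("F", "M"), each = 15))

# Two simulated transcripts: one unrelated to sex, one shifted upwards in males.
toy <- cbind(null.tx = rnorm(30),
             sex.tx  = rnorm(30, mean = ifelse(gender == "M", 5, 0)))

# Same function as in the practical: p-value of a two-sample t-test
# (t.test uses the Welch unequal-variance version by default).
tfunc <- function(X, GENDER) {
  tt <- t.test(X ~ GENDER)
  return(tt$p.value)
}

# apply() passes each column in turn as X; gender is passed on as GENDER.
toy.pvalue <- apply(toy, 2, tfunc, gender)
toy.pvalue
sum(toy.pvalue < 1.0e-5)
```

With thousands of transcripts, a histogram of p-values that is roughly flat with a spike near zero suggests a subset of transcripts with real sex effects.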

(v) Heritability

> anova.pvalue <- function( X, factor ) { a <- anova(lm( X ~ factor)); return(a[1,5])}

> family.pvalue <- apply( liver.subset, 2, anova.pvalue, liver.cov$Family )

Extra exercise: report the percentage of variance explained instead of the p-value
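One possible approach to the extra exercise, offered as a sketch rather than the intended solution: in a one-way ANOVA table the "Sum Sq" column holds the between-group and residual sums of squares, so their ratio gives the fraction of variance explained. The function name anova.pctvar and the simulated family and x are assumptions made for illustration.

```r
# Percentage of variance explained by a single term, from the ANOVA table.
# Row 1 is the term of interest; the last row is the residual.
anova.pctvar <- function(X, factor) {
  a <- anova(lm(X ~ factor))
  return(100 * a[1, "Sum Sq"] / sum(a[, "Sum Sq"]))
}

# Simulated check: 5 families of 6 animals, with a family-specific shift.
set.seed(3)
family <- factor(rep(paste0("fam", 1:5), each = 6))
x <- rnorm(30) + as.numeric(family)

pct <- anova.pctvar(x, family)
pct
```

For a model with one term this is the same quantity as 100 times the R-squared reported by summary(lm(...)). It can be used with apply() in the same way as anova.pvalue.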

(vi) Weight

> weight.pvalue <- apply( liver.subset, 2, anova.pvalue, liver.cov$EndNormalBW )

> hist(weight.pvalue, breaks=50)

(vii) GO analysis

> go1 <- read.delim("GO.Ensembl.01.txt")

> go1$transcript <- paste("LIVER.express.", make.names(go1$Transcript), sep="")

> go2name <- read.delim("GO2name.txt", sep="\t")

> intersect <- colnames(liver.subset)[match(go1$transcript, colnames(liver.subset), nomatch = 0)]

> intersect <- unique(sort(intersect))

> liver.subset.intersect <- liver.subset[, match(intersect, colnames(liver.subset))]

> go.intersect <- go1[match(intersect,go1$transcript),]

> sex.ids <- colnames(liver.subset)[sex.pvalue<0.01]

> sex.intersect <- sex.ids[match(sex.ids,intersect,nomatch=0)]

> sex.idx <- go.intersect$transcript %in% sex.ids

> fisher.func <- function( X, sex.idx) { X <- as.factor(X) ; if ( nlevels(X) == 2 ) {f <- fisher.test(X, sex.idx); return (f$p.value)} else return(1) }

> fish <- apply( go.intersect[,4:ncol(go.intersect)], 2, fisher.func, sex.idx )

> fish[fish < 0.01]

> sig <- fish[fish < 0.01]

> data.frame( pvalue=sig, desc=as.character(go2name$desc[match(names(sig), go2name$go)]))
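The Fisher test step asks, for each GO term, whether transcripts annotated with that term are over-represented among the sex-associated transcripts. A minimal sketch on invented data follows; go.toy and its column names are hypothetical, and the 0/1 coding of the GO columns is an assumption about the layout of the real file.

```r
set.seed(4)
n <- 200

# 40 of 200 transcripts are flagged as sex-associated.
sex.idx <- rep(c(TRUE, FALSE), times = c(40, 160))

# Hypothetical GO indicator columns (1 = transcript annotated with the term).
go.toy <- data.frame(
  # 30 of the 40 sex-associated transcripts carry this term, 10 of the other 160.
  GO.enriched = c(rep(c(1, 0), times = c(30, 10)), rep(c(1, 0), times = c(10, 150))),
  # Annotation unrelated to sex.
  GO.null     = rbinom(n, 1, 0.2),
  # Term with no annotated transcripts: only one level, so the test is skipped.
  GO.empty    = 0
)

# Same function as in the practical.
fisher.func <- function(X, sex.idx) {
  X <- as.factor(X)
  if (nlevels(X) == 2) {
    f <- fisher.test(X, sex.idx)
    return(f$p.value)
  } else return(1)
}

fish.toy <- apply(go.toy, 2, fisher.func, sex.idx)
fish.toy[fish.toy < 0.01]
```

The nlevels(X) == 2 check matters because fisher.test() stops with an error when either variable has fewer than two levels.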

2. Now repeat the analysis on the lung data set

3. Investigate these data sets using your own initiative. For example, does expression differ between chromosomes? (You will need the file mouse.transcripts.genes.txt; read it with read.delim().)
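As one possible starting point for the chromosome question, the sketch below matches a transcript annotation table to the expression columns and compares median expression per chromosome. Everything here is simulated: the column names transcript and chromosome are hypothetical, and the real mouse.transcripts.genes.txt may be laid out differently.

```r
set.seed(5)

# Toy expression matrix: 10 samples (rows) by 6 transcripts (columns).
toy.exp <- matrix(rnorm(10 * 6, mean = 7), nrow = 10,
                  dimnames = list(NULL, paste0("tx", 1:6)))

# Hypothetical annotation table, deliberately in a different order
# from the expression columns.
annot <- data.frame(transcript = paste0("tx", 6:1),
                    chromosome = rep(c("1", "2", "X"), each = 2))

# Align the annotation to the expression columns by transcript ID.
chrom <- annot$chromosome[match(colnames(toy.exp), annot$transcript)]

# Median expression per transcript, then summarised per chromosome.
tx.median <- apply(toy.exp, 2, median)
chr.median <- tapply(tx.median, chrom, median)
chr.median

# Rank-based test for a difference in expression between chromosomes.
kt <- kruskal.test(tx.median, factor(chrom))
kt$p.value
```

On the real data, boxplot(tx.median ~ chrom) gives a quick visual comparison before any formal test.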