Scientists and Sceptics: blogs, science and reproducible research

David Nichols

Department of Computer Science, University of Waikato, Hamilton, New Zealand

Scientists and Sceptics: blogs, climate science and reproducible research
or
Watching the sceptics attack the global warming consensus by accusing climate scientists of not being proper scientists
or
Blogs I’ve been reading for the past few years

A longitudinal ethnographic observation of a networked community of practice

Data Communities (TDCT)

• Starbucks
  – 1. regular user, no talk, usual drink, not part of community
  – 2. meet colleague, use their infrastructure, not part of community, bring my own community, library-like
• Cheers
  – Talk to customers and bartenders, serendipities
• Gary’s Old Towne Tavern
  – Access same beer/data
  – Hate the Cheers community, think they are wrong

Topic Overview

• Global warming science • Sceptic Community: development & activities • Data and code access • Data quality detectives • Citizen scientists/sceptics: surfacestations.org • Who is in control of the data? • Reproducible research – Replicate or validate or verify or audit The start: MBH98 and MBH99

• Mann, M.E., Bradley, R.S. and Hughes, M.K., 1998. “Global-Scale Temperature Patterns and Climate Forcing Over the Past Six Centuries”, Nature, 392, 779-787. (GS: 815 cites)
• Mann, M.E., Bradley, R.S. and Hughes, M.K. 1999. “Northern Hemisphere Temperatures During the Past Millennium: Inferences, Uncertainties, and Limitations”, Geophysical Research Letters, 26, 759-762. (GS: 811 cites)

Intergovernmental Panel on Climate Change (IPCC) Third Assessment Report, 2001, Summary for Policy Makers, Figure 1b, pg 3

HS in

http://www.climateaudit.org/wp-content/uploads/2007/11/gore_a1.jpg AIT DVD

Faulty y-axis labelling is corrected in the book version

http://www.climateaudit.org/wp-content/uploads/2007/11/gore_a2.jpg Gore, A. (2006) An Inconvenient Truth: The Planetary Emergency of Global Warming and What We Can Do About It.

• “But as Dr Thompson's thermometer shows, the vaunted Medieval Warm Period (the third little red blip from the left below) was tiny in comparison to the enormous increases in temperature in the last half-century - the red peaks at the far right of the graph. These global-warming skeptics - a group diminishing almost as rapidly as the mountain glaciers - launched a fierce attack against another measurement of the 1000 year correlation between CO2 and temperature known as the "hockey stick", a graphic image representing the research of climate scientist Michael Mann and his colleagues. But in fact scientists have confirmed the same basic conclusions in multiple ways with Thompson's ice core record as one of the most definitive.” http://www.climateaudit.org/?p=2328 • “It turns out that the Gore Hockey Stick has not derived from Thompson data at all; what it represents is a splicing of the MBH99 reconstruction (taken to 10-year averages) and a version of the CRU temperature history overlaid directly and merged with the MBH99 reconstruction. Thus the confirmation of MBH99 is ironically MBH99 itself.”

http://www.climateaudit.org/?p=2335 Sceptics’ Viewpoint

• The AIT incident is typical: – ‘Consensus’ science (deliberately) misrepresents and overstates the (AGW) science – Computer models are full of unrealistic assumptions – Data selection is partial and biased – Temperature records are unreliable – Temperature proxy records are unreliable – IPCC process is basically a (biased) literature review – Climate scientists are poor statisticians • many ‘results’ are not actually results and are artifacts of dubious statistical methodology and ...

• Many of the important results are – Not reproducible • Due to: – Poor methodological descriptions – Poor data descriptions – Lack of data sharing – Lack of code sharing – ‘club’ attitude which is biased against those who disagree with the consensus • Reviewing, data access In economics:

• “the book by Giles and Tedds (2002) estimated the size of the Canadian underground economy to be 3.4% of GDP in 1976, rising to 15.6% in 1995. The policy implications of such a result are staggering. Yet, when this work was reanalyzed and replicated, it was found that ‘the overall level of their estimates is a result of numerical accidents’ (Breusch 2005). The experiences of economists who have sought to replicate published research suggest that Panglossian faith in economists’ commitment to the scientific method is unwarranted.”

– Anderson et al (2008) Scientists’ Viewpoint

• Warming of the climate system is unequivocal, as is now evident from observations of increases in global average air and ocean temperatures, widespread melting of snow and ice and rising global average sea level – IPCC 2007
• And virtually all that happens on the sceptic blogosphere is irrelevant, uninformed etc.

Sub-topics

• Climate Audit blog – origin and sample issues
• Data Detectives – finding errors in data sets
• Access to data/code – archiving, IP, FoI
• Data Misuse – poss. examples of data/code misuse
• Citizen science – surfacestations project
• More data access examples (journals)
• UHI data fraud allegations
• CA post title sampling

Climate Audit blog

www.climateaudit.org Steve McIntyre

• MBH ‘Hockey Stick’ – Used by IPCC, AIT and many others

• In 2003 he emailed Mann to get a copy of MBH98 proxy data

Hockey Stick as backdrop to IPCC press conference

http://images.ctv.ca/archives/CTVNews/img2/20070815/160_to_blogger_070815.jpg 2003 http://www.climateaudit.org/?p=3099 Michael Mann (Penn State)

• “All of the proxy data (tree-rings, coral, ice cores, and historical documents) used in Mann et al. (1998) has been available since May 2000 on this public website: ftp://holocene.evsc.virginia.edu/pub/MBH98”

Link rot note: Mann moved to Penn State so all the Virginia URLs are dead

http://holocene.meteo.psu.edu/Mann/home/mann_treering.jpg Mann (2005) http://www.realclimate.org/Mann_response_to_Barton.pdf Bitter dispute about data access

• “The spreadsheet file they used was a complete distortion of the actual Mann et. al. proxy data set, and was essentially useless, particularly in the earlier centuries. The authors had access to the full data, which has been available on a public ftp site for nearly two years” - Mann • “It is self-evident that Mann’s comments are a pastiche of false statements. (1) We requested an FTP location, not an Excel spreadsheet. (2) The FTP site location is anonymous. At no point prior to the identification of the FTP site on Oct. 29, 2003 by David Appell, was the site identified for us nor did we have access to the full data. ” - McIntyre & McKitrick – Mis-use of ‘anonymous’ to mean unidentified

http://www.uoguelph.ca/~rmckitri/research/MM-nov12-part1.pdf So

• Data/code was not archived at publication with Nature (though there was an SI)
  – or in a single coherent location elsewhere
  – or for 5 years after (or 2 years)
• Depending on who you believe
• No-one had asked for the data before
  – From the authors
• No-one had replicated MBH98
  – Using the original data or code
• ftp://holocene.evsc.virginia.edu/pub/MBH98/TREE/ITRDB/NOAMER/pca-noamer.f
• “The question presumes that in order to replicate scientific research, a second researcher has to have access to exactly the same computer program (or “code”) as the initial researcher. This premise is false. The key to replicability is unfettered access to all of the underlying data and methodologies used by the first researcher”

Mann (2005) http://www.realclimate.org/Mann_response_to_Barton.pdf Mann

“It also bears emphasis that my computer program is a private piece of intellectual property, as the National Science Foundation and its lawyers recognized ... Whether I make available my computer programs is irrelevant to whether our results can be reproduced. And whether I make my computer programs publicly available or not is a decision that is mine alone to make.”

C     Back transformation
      IF (NU .EQ. 0) GO TO 510
      DO 500 KK=1,N
      K=N1-KK
      IF (B(K) .EQ. zero) GO TO 500
      Q=-A(K,K)/CdABS(A(K,K))
      DO 460 J=1,NU
  460 U(K,J)=Q*U(K,J)
      DO 490 J=1,NU
      Q=dcmplx(zero,zero)
      DO 470 I=K,M
  470 Q=Q+dCONJG(A(I,K))*U(I,J)
      Q=Q/(CdABS(A(K,K))*B(K))
      DO 480 I=K,M
  480 U(I,J)=U(I,J)-Q*A(I,K)
  490 CONTINUE

Excerpt from: http://holocene.meteo.psu.edu/shared/research/MANNETAL98/METHODS/multiproxy.f “And neither McIntyre nor McKitrick is a trained climate scientist. According to the biographical data on their websites, Mr. McIntyre is a mining industry executive with no formal training in any discipline related to climate research and Mr. McKitrick is an economist with no scientific training, hardly credentials that lend force to their academic arguments” Mann response to Barton Data access dispute

• V. Complicated
• Lots of heat and light – and little clarity
• Dispute about both data and code
  – Accusations of incomplete data and of data containing errors
  – Lack of code access
  – Code implementation not matching the description in the code
  – “More importantly for efforts to verify MBH98, the files pcproxy.txt and pcproxy.mat show that collation errors were embedded at Professor Mann’s UVA site long before our request for data” M & M

Archive Malleability Accusations

• “On November 8, 2003, we re-visited this site and discovered the following changes: (1) the file pcproxy.mat had been deleted from Mann’s FTP site; (2) the file pcproxy.txt no longer was displayed under the directory, although it could still be retrieved through an exact call if one knew the exact file name; (3) without any notice, a new file named “mbhfilled.mat” was inserted into the directory.” Materials Complaint to Nature

“We have sought such clarification from Professor Mann without success. With reference to the policies stated at http://www.nature.com/nature/submit/policies/index.html, in particular item number 6, we are writing to advise you of a persistent refusal to comply with the guidelines and other issues.”

Nov 2003 http://www.climateaudit.org/correspondence/nature.031117.htm

“... we requested other particulars on the computational methodology from Professor Mann and were refused. Accordingly, we attempted to assess the impact of the data problems by following the methodology publicly disclosed in MBH98. Professor Mann then criticized us for failing to replicate previously undisclosed details of his methodology.”

“... We have been systematically and deliberately stymied by Professor Mann on the most elementary requests: a proper listing of his data series and the exact computational procedures used. In the process of trying to obtain this information we have concluded that the disclosure at the Nature SI [Supplementary Information] site is not merely inadequate, but in some cases it contradicts what is now revealed at the University of Virginia FTP site.”

http://www.climateaudit.org/correspondence/nature.031117.htm Data archive under researcher control

– Moves about (link rot), no PURL/DOI/Handle – Not associated with publication venue (Nature) – No change history, no versioning • Impossible to verify the accusations • Can’t do a primary source historical study – No guarantee of longevity Leading to ...

• Michael E. Mann, Raymond S. Bradley & Malcolm K. Hughes, (2004) Corrigendum: Global-scale temperature patterns and climate forcing over the past six centuries, Nature 430, 105. doi:10.1038/nature02478 – Lists some data set errors – Has much larger Supplementary Information (SI) than MBH98 • McIntyre, S.; McKitrick, R. (2005) Hockey sticks, principal components, and spurious significance, Geophysical Research Letters 32: L03710. doi:10.1029/2004GL021750 • RealClimate blog realclimate.org – “Climate science from climate scientists” • Climate Audit blog climateaudit.org – Set up by McIntyre – Become a focal point for AGW sceptics CA Vocab

• Hockey Stick • Hockey Team → Team • Hey it’s climate “science” → hey • Mannian statistics • Mannomatic • Starbucks Hypothesis • whether a dendrochronologist can have a latte in the morning and still carry out a sampling program Is CA weird or useful?

“It took me a while to decide what climateaudit was about: a group of "trial lawyers" looking to poke holes in greenhouse warming theory motivated by trying to preserve a fossil fuel based economy, or a bonafide group of scientific skeptics focusing on the applications of statistics motivated by the need for a better assessment of uncertainty in policy relevant science. I am convinced that SteveM and a number of the other principals are in the bonafide skeptic category. However, there is a substantial amount of noise on the site that gives the impression of "trial lawyers". Knee jerk reflexive characterizations of a scientists' policy views or meddling simply because they accept greenhouse warming as a theory do not help. And to "bonafide skeptics" in the group, I hope that you can engage in a meaningful diaglogue about Wegman's recommendations, outside of the context of the "pissing match" with Mike Mann.”

Judith Curry, Chair, School of Earth and Atmospheric Sciences, Georgia Institute of Technology http://www.climateaudit.org/?p=852 Figure 6.10 b, Chapter 6, Palaeoclimate, Working Group I Report "The Physical Science Basis" IPCC Fourth Assessment Report, 2007 http://www.ipcc.ch/ Data Quality Detectives

• Sceptics believe climate scientists are poor at checking their data is free from errors – Or that their processing hasn’t introduced extra errors • They have spotted several errors in climate science data sets – And take great pleasure from ‘showing up’ the proper scientists Where’s the rain?

• “the instrumental precipitation record for the New England gridcell used in MBH98 did not match any historical data from the area or from the citation, but did match historical precipitation from Paris, France”

• "The rain in Maine falls mainly in the Seine" – “In Mannian statistics, incorrect geographical locations "don't matter" because of teleconnections”.

http://www.climateaudit.org/?p=954 And the rain in Spain?

• “he flipped latitude and longitude and it's really from Spain. The flipping is very odd because the coordinates are correct in the PNAS Supplementary Information (the SI where the Briffa MXD series are wrong), but they are reversed in rtable1209 which seems to be used for the plot.” • “rain in Spain falls mainly in the plain - but in the plains of Kenya.”

http://www.climateaudit.org/?p=3791 NASA GISS 2007

• Recently it was realized that the monthly more-or-less-automatic updates of our global temperature analysis had a flaw in the U.S. data. We wish to thank Stephen McIntyre for bringing to our attention that this flaw might be present. ... we included improvements that NOAA had made in station records in the U.S., their corrections being based mainly on station-by-station information about station movement, change of time-of-day at which max-min are recorded, etc.

http://data.giss.nasa.gov/gistemp/updates/200708.html • Unfortunately, we didn't realize that these corrections would not continue to be readily available in the near-real-time data streams ... Obviously, combining the uncorrected GHCN with the NOAA-corrected records for earlier years caused jumps in 2000 in the records at those stations, some up, some down http://www.climateaudit.org/?p=4318 Russia is hot

• “NASA has just reported record warmth in October throughout Russia, with many sites experiencing similar temperatures in October as in September ... Many stations had exactly the same monthly temperatures in October as in September.”
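The pattern in this report (many stations with exactly the same value in consecutive months) is the kind of thing a simple automated check can flag. The following is a minimal sketch, not the GISS or Climate Audit procedure; the function name, the station ids and the temperature values are all hypothetical.

```python
from typing import Dict, List


def stations_with_repeated_month(
    monthly: Dict[str, Dict[str, float]],
    earlier: str,
    later: str,
) -> List[str]:
    """Return station ids whose `later` value is identical to `earlier`."""
    flagged = []
    for station in sorted(monthly):
        values = monthly[station]
        if earlier in values and later in values:
            # Exact equality is deliberate: a copied value is identical,
            # whereas two genuine monthly means rarely are.
            if values[earlier] == values[later]:
                flagged.append(station)
    return flagged


# Hypothetical monthly means in degrees C (not real GISS/GHCN values).
sample = {
    "station_a": {"2008-09": 7.4, "2008-10": 7.4},
    "station_b": {"2008-09": 9.1, "2008-10": 3.2},
    "station_c": {"2008-09": 5.0, "2008-10": 5.0},
    "station_d": {"2008-09": 6.3, "2008-10": 1.8},
}

flagged = stations_with_repeated_month(sample, "2008-09", "2008-10")
share = len(flagged) / len(sample)
print(flagged, share)
```

One repeated value can be coincidence; a large share of stations repeating in the same month pair is what would suggest a processing problem rather than weather.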

NASA GISS Surface Temperature Analysis Oct 2008 initial Error corrections

• Some processing error had copied the Sept data into Oct • The corrected version was released – And then corrected again:

http://www.theregister.co.uk/2008/11/19/nasa_giss_cockup_catalog/print.html • “Current staffing from the GISTEMP analysis is about 0.25 FTE [Full-Time Employee] on an annualised basis (I’d estimate - it is not a specifically funded GISS activity)” - Gavin Schmidt

http://www.climateaudit.org/?p=4325 Antarctica is cold

• Steig et al 09 analyses Antarctic weather stations Attribution

• Sun Feb 1st
  – McIntyre notes on CA one weather station (Harry AWS) looked weird
  – Actually mixed up with another station’s data (Gill)
• Later on Sun: someone else notifies British Antarctic Survey (BAS) of errors in Harry data set
  – Which they fix
• Coincidence?

Gavin Schmidt at RC

• [Response: People will generally credit the person who tells them something. BAS were notified by people Sunday night who independently found the Gill/Harry mismatch. SM could have notified them but he didn’t. My ethical position is that it is far better to fix errors that are found than play around thinking about cute names for follow-on blog posts. That might just be me though. - gavin]

Dirty Harry 4: When Harry met Gill Attribution Wars

• “the person who tells them something” – Was actually Gavin Schmidt • CA crowd accuse Schmidt of stealing the recognition for error detection – He appears to have worked late into SuperBowl Sunday evening • Quickly CA crowdsourcing identifies several other data errors – BAS page turns largely red with corrections – CA post ‘Carnage’ Data Quality summary

• Crowdsourced (sceptical) data quality inspection works
  – People who are convinced errors are likely to be present are very persistent in looking for them
• Error metadata:
  – Corrections on personal servers and not with the published paper
    • http://faculty.washington.edu/steig/nature09data/
  – Corrections not integrated with online data sets
  – Initially BAS don’t keep old versions online – restore old data after complaints
• Errors are used for publicity
  – Whether or not they have an impact on results
• Detected errors reinforce sceptics’ beliefs that climate scientists don’t do data quality properly

Data Archiving
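Several of the archive complaints so far (files removed or added without notice, no change history, old versions taken offline) come down to being unable to show what an archive contained at a given time. A minimal sketch of one way to make such changes detectable is a checksum manifest taken at each visit and compared later. This is an illustration only, not a tool any of the parties used; the function names are hypothetical, and the file names are borrowed from the FTP dispute with dummy contents.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Report files removed, added, or changed between two manifests."""
    shared = set(old) & set(new)
    return {
        "removed": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "changed": sorted(name for name in shared if old[name] != new[name]),
    }


# Demonstration on a throwaway directory with dummy file contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pcproxy.mat").write_bytes(b"dummy contents 1")
    (root / "pcproxy.txt").write_bytes(b"dummy contents 2")
    before = build_manifest(root)

    (root / "pcproxy.mat").unlink()
    (root / "mbhfilled.mat").write_bytes(b"dummy contents 3")
    after = build_manifest(root)

report = diff_manifests(before, after)
print(report)
```

A manifest like this only helps if it is stored somewhere the archive owner cannot silently edit, for example with the publisher or in a dated third-party deposit.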

• Accusations of poor practice – and their consequences Data Gathering

• Physical capture • Processing to a data set • Various adjustments – e.g. For Time of day of observations • Further data sets – Some archived – e.g. ITRDB – http://www.ncdc.noaa.gov/paleo/treering.html

http://www.crrel.usace.army.mil/sid/personnel/perovichweb/HotraxWeb/images/coring1.jpg Prof. Lonnie Thompson

– Distinguished University Professor, School of Earth Sciences, The Ohio State University. – US National Medal of Science, Tyler Prize, Science Advisory Board for An Inconvenient Truth, one of Time’s Heroes of the Environment (2008)

• “There's not much in climate science that annoys me more than the sniveling acquiescence of government bureaucrats in Lonnie Thompson's flouting of data archiving policies.” – McIntyre (2008) – http://www.climateaudit.org/?p=2686 • I've been trying since 2003 to get detailed sample information from Lonnie Thompson on his tropical ice cores, some drilled 20 years ago. – http://www.climateaudit.org/?p=1552

– Thompson, L. G., E. Mosley-Thompson, H. Brecher, M. Davis, B. Leon, D. Les, P. N. Lin, T. Mashiotta, and K. Mountain (2006) Abrupt tropical climate change: Past and present. Proc. Nat. Acad. Sci. USA, 103, 10,536–10,543 PNAS policy

• Unique Materials: Authors must make Unique Materials (e.g., cloned DNAs; antibodies; bacterial, animal, or plant cells; viruses; and computer programs) promptly available on request by qualified researchers for their own use. • Databases: Before publication, authors must deposit large data sets (including microarray data, protein or nucleic acid sequences, and atomic coordinates for macromolecular structures) in an approved database and provide an accession number for inclusion in the published paper. When no public repository exists, authors must provide the data as Supporting Information online or, in special circumstances when this is not possible, on the author’s institutional web site, provided that a copy of the data is provided to PNAS. http://www.pnas.org/site/misc/iforc.shtml#submission • I request that you ensure that Thompson et al comply with your data policy by forthwith archiving the large datasets used in the PNAS article for each individual ice core (Dunde, Dasuopu, Guliya, Puruoganri, Quelccaya, Sajama, Huascaran) and for the entire suite of isotopes and chemistry. In addition, because the discrepancies may result from changing algorithms for dating the ice cores, I further request that the dating procedure for each core be made available under your Unique Materials policy. – http://www.climateaudit.org/?p=1552 PNAS response

• I was able to reach him via phone the other day, however, and can now address your query. According to Dr. Thompson, the data you seek have all been deposited in the archive you specifically mentioned as well as being mirrored on his own website. Let me know if you have any further questions.
• Unfortunately, the following response from Dr Thompson is simply false: "According to Dr. Thompson, the data you seek have all been deposited in the archive you specifically mentioned as well as being mirrored on his own website"
• I am perfectly aware of the highly incomplete summary information archived at WDCP and at Dr Thompson’s website. ... You can readily verify for yourself that Dr Thompson’s answer is false.
• In a responsive data archive, you could identify the sample number, top, bottom, isotope, chemistry and other indicators. Since several thousand samples were taken for each core, there would be several thousand lines in the archive. ...
• I re-iterate my request that PNAS ensure that Thompson comply with PNAS policies on these data sets.
• The smoothness of the Dunde data used here contrasts with the smoothness in other data sets - here's a spaghetti graph that I've shown previously. Obviously it's not that the underlying Dunde data is all that smooth; it's that Yang et al 2002 has used a "grey" version available in 50 year intervals, the most recent values being …1840, 1890, 1940, 1990. ... Yang's use of this smoothed 50-year version shows once again the impact of Thompson's abysmal archiving practices. Had Thompson properly archived his data, then Yang would presumably have used a sensible version of the data.

http://www.climateaudit.org/?p=2686 ‘Grey’ Data

• To his credit, Thompson has collected unique data. To his shame, Thompson has failed to archive data collected as long as 20 years ago. This would be bad enough if the versions were consistent in all publications on Dunde. But Thompson seems to have tinkered with his results over the years so that there has been an accumulation of inconsistent versions, compromising any ability to properly use this unique data. Needless to say, mere compromising of the data hasn't stopped climate scientists from using Thompson data.

http://www.climateaudit.org/?p=2686 http://www.climateaudit.org/?p=2686 • The figure below shows 5 inconsistent Thompson versions, two from Thompson articles in 2003 and 2006 and the others from grey versions archived or obtained from multiproxy authors. The most detailed information comes from the MBH98 archive (annual); the Thompson 2006 PNAS article (red) archived 5-year averages; the Thompson 2003 Clim Chg article (blue) archived 10 year averages (only after I caused a data policy to be implemented at Climatic Change and complained about Thompson); the low frequency curves are from Yang (email) and Crowley (email). I've done a re-scaling estimate for the Crowley version, as he had lost his original data and only had a rescaled and smoothed version.

http://www.climateaudit.org/?p=1396 • A caveat for twq: the Dunde archiving situation is a fiasco. There are multiple inconsistent versions of Dunde (and other Thompson ice cores such as Guliya). Last year we discussed three inconsistent Guliya versions used in peer-reviewed 2006 articles - Dunde is just as bad. • I just noticed that Yao et al 2006 introduces yet another inconsistent Thompson version. I've complained to Science, NSF, NAS and gotten nowhere. So twq, before you start using Dunde data, you should ponder which Dunde you're using. • Every Dunde sample datum should be archived immediately. Thompson should be asked to reconcile the various versions. In business, if a company presented rolling inconsistent versions of their financial statements, you know what the outcome would be. The acquiescence of Science, NSF and Ralph Cicerone of NAS in these shenanigans is a disgrace.

http://www.climateaudit.org/?p=1396 The (in)famous Jones quote

• “Even if WMO [World Meteorological Organization] agrees, I will still not pass on the data. We have 25 or so years invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it.” 21 Feb 2005, Professor Phil Jones, Director of the Climatic Research Unit (University of East Anglia, UK), to “amateur climate researcher” (and sceptic) Warwick Hughes http://en.wikipedia.org/wiki/Phil_Jones http://www.climateaudit.org/?p=403

Jones quote is totemic

page vii of Patrick J. Michaels, Robert C. Balling, Jr. (2009) Climate of Extremes: Global Warming Science They Don't Want You to Know, Cato Institute. ALA Code of Ethics • We provide the highest level of service to all library users through appropriate and usefully organized resources; equitable service policies; equitable access; and accurate, unbiased, and courteous responses to all requests. • We uphold the principles of intellectual freedom and resist all efforts to censor library resources. • We respect intellectual property rights and advocate balance between the interests of information users and rights holders. • We distinguish between our personal convictions and professional duties and do not allow our personal beliefs to interfere with fair representation of the aims of our institutions or the provision of access to their information resources. Clauses I, II, IV & VII “Climate scientists should think about data quality more often, says Jones, so that there is no opportunity for incorrect data to sow seeds of doubt in people’s minds about the reality of climate change.”

http://www.climateaudit.org/?p=3119 Accusations of Data Misuse

• Tiljander lake sediment
• Baker speleothem
• Algorithm/Technique misuse – PCA / Jolliffe
• TILJANDER, MIA, M. SAARNISTO, AEK OJALA, and T. SAARINEN. 2003. A 3000-year palaeoenvironmental record from annually laminated sediment of Lake Korttajärvi, central Finland, Boreas 32, no. 4: 566-577

• "Since the early 18th century, the sedimentation has clearly been affected by increased human impact and therefore not useful for paleoclimate research". – Tiljander PhD, pg 24 • By flipping the data opposite to the interpretation of Tiljander et al, Mann shows the Little Ice Age in Finland as being warmer than the MWP, 100% opposite to the interpretation of the authors and the paleoclimate evidence. The flipping is done because the increase in varve thickness due to construction and agricultural activities is interpreted by Mann et al as a "nonlocal statistical relationship" or "teleconnection" to world climate.

http://www.climateaudit.org/?p=3967 SU967 speleothem

Orientation in Mann et al 08

Orientation in original publication

Proctor C. J., Baker A., Barnes W. L. & Gilmour M. A. (2000) A thousand year speleothem proxy record of North Atlantic climate from Scotland. Climate Dynamics, 16: 815-820

“Andy [Baker] reported that narrow widths are associated with warm, wet climate” http://www.climateaudit.org/?p=5766

The PCA Dispute

• MBH98 paper describes standard PCA
• 2003: code is released
• “MM05 noted that MBH98 normalized their data unconventionally prior to the PCA, by centering the time series relative to the instrumental-period mean, 1902–1980, instead of relative to the whole available period” – Von Storch and Zorita, 2005, GRL
• On Real Climate blog authors defend use of decentred PCA by reference to Jolliffe presentation – Jolliffe is PCA expert

In 2008 Jolliffe becomes fully aware of issue...
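The centring difference in the Von Storch and Zorita quote can be illustrated with simple arithmetic. For a series x of length n with full-period mean m, the sum of squares about any other centre c satisfies Σ(x − c)² = Σ(x − m)² + n(m − c)². The sketch below checks that identity on two hypothetical series, taking c as the mean over a short final window. It is not the MBH98 code or a full PCA, and all names and values are illustrative assumptions.

```python
from statistics import fmean
from typing import Sequence, Tuple


def sum_of_squares(series: Sequence[float], centre: float) -> float:
    """Sum of squared deviations of the series from a chosen centre."""
    return sum((x - centre) ** 2 for x in series)


def compare_centring(series: Sequence[float], window: slice) -> Tuple[float, float, float]:
    """Return (conventional, decentred, mean-offset term) for one series."""
    full_mean = fmean(series)
    window_mean = fmean(series[window])
    conventional = sum_of_squares(series, full_mean)   # centred on whole period
    decentred = sum_of_squares(series, window_mean)    # centred on sub-period
    mean_offset = len(series) * (full_mean - window_mean) ** 2
    return conventional, decentred, mean_offset


# Hypothetical series: one with no trend, one with a rise in the last window.
flat = [0.2, -0.2, 0.2, -0.2, 0.2, -0.2, 0.2, -0.2]
late_rise = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
calibration = slice(-2, None)  # stands in for a short "instrumental period"

for name, series in (("flat", flat), ("late_rise", late_rise)):
    conventional, decentred, mean_offset = compare_centring(series, calibration)
    print(name, round(conventional, 3), round(decentred, 3), round(mean_offset, 3))
```

With sub-period centring, a series whose recent mean differs from its long-term mean picks up the extra n(m − c)² term, so the quantity being summed is no longer a pure variance.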

“... the author says that Wegman is ‘just plain wrong’ and goes on to say ‘You shouldn’t just take my word for it, but you *should* take the word of Ian Jolliffe, one of the world’s foremost experts on PCA, author of a seminal book on the subject. He takes an interesting look at the centering issue in this presentation.’ It is flattering to be recognised as a world expert, and I’d like to think that the final sentence is true, though only ‘toy’ examples were given. However there is a strong implication that I have endorsed ‘decentred PCA’. This is ‘just plain wrong’.”

http://www.climateaudit.org/?p=3601

“It [the presentation] certainly does not endorse decentred PCA ... my main concern is that I don’t know how to interpret the results when such a strange centring is used? Does anyone? What are you optimising? A peculiar mixture of means and variances? It therefore seems crazy that the MBH hockey stick has been given such prominence and that a group of influential climate scientists have doggedly defended a piece of dubious statistics.”

“Misrepresenting the views of an independent scientist does little for their case either. It gives ammunition to those who wish to discredit climate change research more generally.” – Prof. Ian Jolliffe, Sept 2008

Citizen Scientists: surfacestations.org

• Sceptics believe the surface temperature record is unreliable • Specifically they think Urban Heat Islands (UHI) and micro-site biases have contaminated the data • 2 approaches: – Show the weather stations are unreliable – Prove the UHI adjustments are unreliable surfacestations.org

• Embarked on a survey of every weather station in the USA
  – Started April 2007
  – United States Historical Climatology Network
• 70% complete
  – 854/1221
  – By Feb 2009

A “Good Site”

http://gallery.surfacestations.org/main.php?g2_itemId=564&g2_imageViewsIndex=1

Orland, CA, USA http://gallery.surfacestations.org/main.php?g2_itemId=56 Photographs and metadata

Marysville, CA, USA http://gallery.surfacestations.org/main.php?g2_itemId=831&g2_imageViewsIndex=1 Summary

• Many of the stations violate the guidelines for placing/maintaining weather stations
  – Distance from buildings, heat sources, asphalt
• Effect on data is probably unimportant
• Snapshot data
  – Historical metadata is sometimes missing
• Often used as publicity to imply that temperature records are faulty
• Impressive crowdsourced (meta)data gathering
  – Probably doesn’t achieve what they want

(UHI) data

• IPCC cites Jones et al 91 • Jones 91 uses lots of Chinese station data • 2 later reports are claimed to be inconsistent with Jones et al 91 Fraud allegations

• Keenan asks Wang for UHI data, Wang refuses
• Keenan uses UK Freedom of Information Act request to get the data from a research collaborator
• Leading to...
• Keenan, D.J. (2007) "The fraud allegation against some climatic research of Wei-Chyung Wang", Energy & Environment, 18(7/8) 985–995.
• Reports alleged fraud to SUNY Albany
• Initial inquiry which says a full investigation is needed

Full investigation

• University performs investigation and clears Wang
• “I am confused about the purpose of your letter of May 23rd. Your letter says that the Investigation Committee has completed its work and asks me for comments. Yet I am not allowed to see the report on the investigation, nor learn anything about the investigation's deliberations, until after my comments have been submitted. I believe that this can be fairly described as Kafkaesque.” – Keenan June 2008
• Keenan believes that the report is a cover-up and that SUNY Albany has not followed their own research integrity processes
• Keenan sends a report to Public Integrity Bureau at the Office of the Attorney General of New York State, “alleging criminal fraud”
• Also: “I report the fraud and the university's apparent cover up to the Office of Inspector General at the DOE” and the Research Foundation of SUNY
  – Alleging fraud and malfeasance by SUNY Vice President for Research Prof. Lyn Videka

http://www.informath.org/apprise/a5620.htm

Prof. Eric Poehlman

• College of Medicine, University of Vermont • On June 28, 2006, Poehlman was ordered to serve a year and a day in federal prison for using falsified data in federal research grants (NIH, USDA) • government prosecutors stated that Poehlman had defrauded agencies out of $2.9 million. • first academic in the USA to be jailed for falsifying data in a grant application

http://en.wikipedia.org/wiki/Eric_Poehlman
http://ori.hhs.gov/misconduct/cases/press_release_poehlman.shtml

Would open data have helped?

• Uncertain, possibly
• Statistical data mining could have picked up the pattern of fabricated data
  – Forging realistic data is difficult
• “In the initial spreadsheet, many patients showed an increase in HDL from the first visit to the second. In the revised sheet, all patients showed a decrease. Astonished, DeNino read through the data again. Sure enough, the only numbers that hadn’t been changed were the ones that supported his hypothesis.”

http://www.nytimes.com/2006/10/22/magazine/22sciencefraud.html?_r=1

More Data Access Issues

• Faculty and journals

“We get considerable criticism from paleoclimate scientists that complying with requests for data and methods sufficient to permit replication is much too onerous and distracts them from "real work". However, the problem is not our request, but that any request should be necessary in the first place. In my opinion, a replication package should have been archived at the time of original publication so that any subsequent researcher can replicate the results without needing to contact the original author.”

http://www.climateaudit.org/?p=350

“As a further point on data: I would much prefer that climate scientists comply with a "best practices" standard e.g. that of leading econometric journals, in which the journals require them to archive functioning code and data as used as a condition of review. This entire song and dance engaging in quasi-litigation for data is a total waste of time.”

http://www.climateaudit.org/?p=6160

“CA readers are well aware that Phil Jones of CRU has jealously refused to provide the surface temperature data set used in the prominent HadCRU temperature index, going so far as to repudiate Freedom of Information requests. Efforts to obtain this data have been chronicled here from time to time. As a result of the previous round of FOI inquiries, we managed to get a (mostly) complete list of station names (but not data) - even the names being refused in the first round of polite requests and subsequent FOI inquiries.”

http://www.climateaudit.org/?p=5962

“Part of the refusals are defensiveness. However, I think that some of the refusals are simply litigation strategy. The more time that I have to spend fighting to get the data, the less time I have to actually analyse the data. As long as the conversation is limited to the journals, they can get away with this, because (1) the journals either do not have or do not enforce proper policies; (2) the journals will reject comments referring to data issues as "non-scientific" e.g. the reviews of this part of our Santer comment. The popularity of climate blogs makes this strategy less effective. Every time that one of these guys refuses a data request, I'll publicize it. It gets the frustration off my chest and places the refusal in the sunlight. The refusers hate the exposure. The journals hate the exposure as well.”

http://www.climateaudit.org/?p=6160

“The public judges the refusals entirely differently than the "peers" and whenever this sort of issue arises, there's a lot of bad publicity for the scientist in question and a lot of piling on at the blogs. The scientist invariably blames me for making an issue of the data refusal, rather than looking in the mirror and asking whether he ought to have just archived the data in the first place or, at a minimum, when asked. Sometimes, the scientists lose track of what actually happened. In the case at hand, I obviously sent an email to Steig requesting data (and code). And yet only a few days later, Steig demanded that I be disciplined for criticizing the failure to provide data without even asking for it - a patently untrue allegation.”

http://www.climateaudit.org/?p=6160

“what we are talking about here? About [a] scientist who didn't provide source code, data and method for independent parties to replicate his results (what is most basic requirement for anything to qualify as "scientific result"). And we are arguing back and forth if the request from one blogger was "gracious" enough or not, to "justify" refusal of scientist not only to disclose the data and method, but to break communication with third party? Don't we somehow assume that Steig has moral right to refuse disclosing the data without dire consequences for his scientific credibility? Is it really possible that nuances in Steve's tone are more important that Steig's refusal to give him the data and description of method used? I am afraid with such an approach we would get nowhere. Let's face reality - Community are not nice people but corrupted people ready to manipulate science in order to get public attention, funding and prestige. They demonstrated this many times before.”

Comment at CA http://www.climateaudit.org/?p=6160#comment-343729

Dear Sir,
The manuscript Cycles and shifts: 1,300 years of multi-decadal temperature variability in the Gulf of Alaska?, by Rob Wilson, Greg Wiles, Rosanne D’Arrigo, Chris Zweck contains a table providing the location of the dendrochronological series, so that any laboratory can go to the place and duplicate the work. Archiving raw data is a normal process and should follow accepted practices, but this is not the responsibility of journal editors.
Sincerely
Jean-Claude Duplessy [Editor, Climate Dynamics]

http://www.climateaudit.org/?p=1447

“AGU has perfectly good data citation policies, essentially prohibiting the use of "grey" data. Unfortunately these are not followed at AGU publications, like GRL or JGR. This is a slightly different issue than the replication archive as a full data citation includes the URL for a digital source version - citation of print publications for digital sources is not adequate under AGU policies for obvious reasons, but is still usual paleoclimate practice.”

http://www.climateaudit.org/?p=350

Today I finally received a reply this time to "Dr McIntyre" stating that "it was not the policy of the International Journal of Climatology to require that data sets used in analyses be made available as a condition of publication" and the matter was now "closed".

Dear Dr McIntyre
DELETED BY REQUEST
Regards
Glenn McGregor

My original request to McGregor was for a copy of the data policies of the journal. I guess that his answer is that there is no policy. In the present situation, I notified McGregor that Santer had already refused to provide the requested data. Now McGregor says that I am "encouraged to communicate directly with the authors".

http://www.climateaudit.org/?p=4742

“Dear Dr McIntyre; In response to your question about data policy my position as Chief Editor is that the above paper has been subject to strict peer review, supporting information has been provided by the authors in good faith which is accessible online (attached FYI) and the original data from which temperature trends were calculated are freely available. It is not the policy of the International Journal of Climatology to require that data sets used in analyses be made available as a condition of publication. Rather if individuals are interested in the data on which papers are based then they are encouraged to communicate directly with the authors. With this email I consider this matter closed. Regards, Glenn McGregor”

http://fabiusmaximus.wordpress.com/2009/01/22/peer-review/

Sceptics’ Requirements

• Open data and code from publicly-funded research
  – Or research used in public policy decisions
• Mandatory Universal URL citing of data sets
  – With versions
  – Available at publication date
  – Preferably available at review stage as well
• Attribution for error detection

Ideas
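
One way to picture the "URL citing of data sets, with versions, available at publication date" requirement is a citation record that pins the exact file that was analysed. The following is a minimal sketch only: the `DataCitation` record, its field names, the example URL and the sample data are all hypothetical illustrations, not something proposed in the talk or used by any journal.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical record for citing one data set version (illustrative only)."""
    url: str        # where the archived file lives (placeholder URL below)
    version: str    # version label assigned by the archive
    published: str  # ISO date on which this version was available
    sha256: str     # checksum of the exact bytes that were analysed


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches(citation: DataCitation, data: bytes) -> bool:
    """True if the bytes in hand are the version the citation refers to."""
    return sha256_of(data) == citation.sha256


# Made-up station data standing in for an archived file.
archived = b"station,year,temp_c\nA,1990,11.2\nA,1991,11.4\n"
citation = DataCitation(
    url="https://example.org/data/stations-v1.csv",  # placeholder, not a real archive
    version="1.0",
    published="2009-01-01",
    sha256=sha256_of(archived),
)

# The cited bytes verify; a file that was later revised does not.
print(matches(citation, archived))
print(matches(citation, archived + b"A,1992,11.9\n"))
```

Storing a checksum next to the URL and version means that anyone attempting a replication can tell whether the file they downloaded is the one the paper used, even if the archive has since been updated.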

• Try replicating your own work from a few years ago
• Try replicating some other research from around campus
• Try embedding GSLIS students into UIUC labs with data and get them to curate the data
• & think about access and replication issues

Climate Audit post titles

• A sampling of blog post titles from climateaudit.org

Post titles

• Connolley co-author: "Unfortunately we have deleted all the NetCDF files…"
• Glenn McGregor: Data Archiving not required by the International Journal of Climatology
• Supplementary Information and Flaccid Peer Reviewing
• Help UCAR Find the Lost Cities of Chile
• Top Fifteen Reasons for Withholding Data or Code
• Bring the Proxies Up to Date!!
• Replication #3: What if a step is not replicable?
• An Example of MBH "Robustness"
• Why peer reviewed publication is not enough

Post titles 2

• Jacoby's "Lost" Gaspé Cedars
• Eli Rabett Explains Why RealClimate Scientists Can't Update the Proxies
• Dunde: Will the Real Slim Shady Please Stand Up?
• Dirty Harry 4: When Harry Met Gill
• Gavin's "Mystery Man" Revealed
• Is Gavin Schmidt Honest?
• Gavin Schmidt: "The processing algorithm worked fine"
• Survivor Season 8: the Hockey Team - the Mann Overboard Episode
• Spot the Hockey Stick!

Post titles 3

• Making Hockey Sticks the Jones Way
• The Wikipedia Spaghetti Graph and the Hockey Team
• Re-Fried Greenland Ice Cores
• Is Briffa Finally Cornered?
• Briffa Archives Tree Ring Data!
• Why did Steig use a cut-off parameter of k=3?
• How IPCC AR4 authors defended the Briffa data deletions
• WSJ: Hockey Stick "on ice"
• Fortress Met Office
• East Anglia Refusal Letter

Post titles 4

• Santer Refuses Data Request
• Santer's Boss Seeks to "Clarify Mis-Impressions"
• Gore Scientific "Adviser" says that he has no "responsibility" for AIT errors
• More on NSF Data Archiving Policies
• You Can't Make This Stuff Up
• Cunning IPCC Bureaucrats
• Did IPCC Review Editor Mitchell Do His Job?
• Where did IPCC 1990 Figure 7c Come From?
• Mannian CPS: Stupid Pet Tricks
• Materials Complaint on Moberg: Update
• More Changes at the Mann 2008 SI

Post titles 5

• Potential Academic Misconduct by the Euro Team
• Low Head and the Gnomes of Norwich
• Fortress CRU #2: Confidential Agent Ammann
• CRU Reveals Station Identities
• A Try for Thompson Data at PNAS
• Is Hughes in compliance with data archiving requirements?
• Santer and the Closet Frequentist
• Juckes and the Pea under the Thimble (#1)
• Hugues Goosse and the Unresponsiveness of Juckes
• Juckes - Meet the Durbin-Watson Statistic
• "Mannian" PCA Revisited #1
• Errors Matter #3: Preisendorfer's Rule N