ewsletter June 2006 Number 115 N statistical society of australia incorporated ASC/NZSA 2006 – Statistical Connections!

Professor Peter Hall of the Australian National University will talk on “PCA For FDA”. In the closing plenary session on Thursday, Professor Xiao-Li Meng of Harvard University will talk on “(Data) Size Does Matter, But You Might Be In for a Surprise ...”. The remaining plenary session will be the Foreman lecture to be given by Professor Ray Chambers of the University of Wollongong. His talk will address Models and Auxiliary Information in Survey Sampling. The Foreman lecture will kick off a full day theme devoted to Official Statistics. The Foreman lecture will be followed by an invited session featuring Alastair Scott and Kirk Wolter. After lunch there will be an Official Statistics Forum featuring the New Zealand Government Statistician Brian Pink and the Australian Statistician Dennis Trewin, followed by a contributed session. Other contributed sessions on various aspects of official statistics will be held throughout the conference. The conference has attracted a wide range of leading invited speakers. The 25 Invited Speaker Sessions will ASC/NZSA 2006, the international joint conference feature leading international and local speakers presenting a between the SSAI and the NZSA to be held in Auckland wide range of the latest research in Statistical and Machine over July 3-7 has attracted substantial registrations to its Learning, Statistical Inference, Statistics in Ecology and the outstanding scientific program. Features of the program Environment, Statistics in Biological Sciences and Medicine, include 4 plenary addresses, a full day focussing on official Stochastic Processes, Statistical Inference Econometrics and statistics, over 50 invited talks and over 100 contributed Finance, Computational Statistics and High Dimensional talks as well as poster presentations. Complementing the Data Modelling. conference scientific program are workshops on specialist Support for the Contributed program has been topics and optional tours and social events. For up-to-date overwhelming. Over 30 contributed paper sessions and poster and detailed information on the 2006 Program visit: sessions have been organized to complement the Plenary . and Invited Sessions. Young Statisticians have organized an Apart from its scientific program, the ASC/NZSA 2006 invited and contributed sessions and there will be a focal conference will provide a great opportunity for statisticians point for them at the Welcome Reception. from Australia and New Zealand to renew or develop links. There will be a Welcome Reception, included in registration, Associated workshops on Distance Sampling, Stochastic on Monday evening to allow delegates to get to know each Processes and the R-Fest Workshops on R Graphics, other. The optional Gala Dinner on Tuesday night will be S programming and Bioconductor. Find out more on the held in the Auckland Room at SKY CITY. This will be an workshops at the following link: ideal opportunity to mingle and meet fellow delegates while enjoying a sumptuous three course meal and New Zealand A series of tours has been organized to allow delegates the wines. opportunity to experience first hand beauty of Auckland and The four Plenary Sessions will feature outstanding surrounding areas. Please visit the website: international speakers. On the first day, Professor David for more Donoho of Stanford University will deliver his lecture on details on these. “Sparsity in Estimation and Detection”. On the third day We look forward to seeing you in Auckland! SSAI Review of Statistics at Australian Universities – Follow-up

Since the review report was • Issues relating to Statistics in the various stakeholders (government published in December 2005, it has Schools departments, universities, professional been distributed to all SSAI members • Issues relating to the Management bodies) as well as statisticians along with a broad range of other and Organisation of Universities representing a broad cross section of view stakeholders and interested parties. • Issues relating to the University/ points. Two main roles are anticipated: An electronic copy is readily available Employer Interface firstly, the groups will connect various from the SSAI website address http:// networks together so that there will www.statsoc.org.au/Review0405/ • Issues relating to Marketing, be well-informed discussion at various ReviewofStatsFinalReport.pdf Further Communications and Lobbying. levels and secondly, they will act as hard copy booklets are available from SSAI is setting up two implementation forums where ideas and proposals can Jane Waslin at SSAI. advisory groups with one to focus be debated. As a professional body, The SSAI review was discussed by on the recommendations related to SSAI has a responsibility to ensure a number of people as part of their Statistics in Schools and a second that there is open constructive debate representations to the Review of group to focus on the second and third Mathematical Sciences during February main recommendations that concern along with a propensity to use objective this year. universities and the university/employer methods of validation. There are many ‘schools of thought’, passions, biases Discussions are planned with senior interface. The final recommendation and opinions and these must be heard DEST (Department of Education, will be looked after by a working party Science and Training) officials in that will focus on the more direct and considered but ultimately we must conjunction with the advisor (Dr aspects that SSAI and related bodies be swayed by what ‘works’. Of course Jade Sharples) to the new Minister of can do to improve the information defining what is meant by this is part of Education, Science and Training (Hon available to school students, university the debate and teasing out the various Julie Bishop MP). students and careers advisors. contexts will only help us all move There are four main recommendations The two main advisory groups will forward. in the SSAI review report: be made up of representatives from Neville Bartlett

In this issue ABS Statistical Scholarships ASC/NZSA 2006 1 awarded in Canberra SSAI Review of Statistics at Australian Universities 2 On Thursday 4 May 2006, four statistical subjects in their first and President’s Corner 3 students at the Australian National second years. These universities University (ANU) were awarded are the University of Adelaide, ABS Statistical Scholarships an Australian Bureau of Statistics the University of Wollongong, the awarded in Canberra 3 (ABS) Statistical Scholarship. These University of Queensland, and the students were Matthew Beale, ANU. Each scholarship is to the Editorial 4 Taleitha Coon, Adam Franklin and value of $4000, and is offered for Jessica Twigg. The ceremony and a full year of study in statistics. Three Doors 5 accompanying lunch were hosted by Application forms for students who Notice of AGM 6 the School of Finance and Applied wish to apply for the next round can Statistics (FAS) at the ANU, with be obtained from the Careers Office Update on AusCan Scholar visit 7 well over 100 persons in attendance. or College/Faculty staff from the After an introduction by Dr Michael relevant universities, or by contacting ASC/NZSA Financial Assistance for Martin of FAS, the awards were the ABS, for example by emailing Young Statisticians 7 presented by Rebecca Cassidy and to [email protected]. The Eileen Fahey of ABS. closing date for applications in 2006 ABS Centenary Celebrations Several ABS Statistical is early September. The ABS places Conclude 8 Scholarships are awarded each year no restrictions of future employment by the ABS to students at selected on winners of the award. Branch Reports 10 universities who have excelled in Borek Puza

2 SSAI Newsletter – June 2006 President’s Corner

Reduced student membership rates from 2006 ASC/NZSA 2006 At the end of last year the SSAI Executive Committee of I hope you have all Central Council took the decision to reduce annual membership seen the varied and fees for student members. This means that for $15 capitation interesting program for per year to the central body, student members receive copies the joint SSAI/NZSA of every issue of the SSAI Newsletter and electronic access Conference in Auckland to the Australian and New Zealand Journal of Statistics. I on 3-6 July. If you congratulate those branches that only charge this amount as haven’t already decided to they carry any additional cost at the local level. attend, you might like to reconsider that decision. SSAI encourages student involvement in its activities Contrary to my advice in and offers an opportunity for networking with experienced the last newsletter, the statisticians as well as other students and statisticians starting SSAI Annual General out in their careers. In recent years branches and sections of Meeting will be held on SSAI have organized workshops, short courses and symposia Wednesday, 5 July. Some of the major items for discussion on a wide range of topics. Members of SSAI are able to take will be the recommendations and resulting outcomes from advantage of reduced registration fees for all such events. our Review, the way we elect our office-bearers, capitation We also have an active Young Statisticians Section which is fees, and the future of our journal. If you have any issues that planning a conference for 2007. Young Statistician events you would like addressed, please contact me or our Executive are held regularly in each Branch – members and prospective Officer Jane Waslin. members should contact their local branch for details of Kaye Basford forthcoming events. There is a link to all branches from the E-mail: [email protected] SSAI website – www.statsoc.org.au.

The latest in statistics from Australia, New Zealand and the world online.

Published on behalf of the Statistical Society of Australia Inc. and the New Zealand Statistical Association

Your Research Your Journal Managing Editor: Professor Kerrie Mengersen Print ISSN: 1369-1473 Read it online at Online ISSN: 1467-842X Current Volume: 48 www.blackwell-synergy.com/loi/anzs Frequency: 4 issues per volume

SSAI Newsletter – June 2006 3 Editorial

Have you ever wondered what the process of producing the newsletter involves? After the deadline for copy passes, the Editors meet with the Society’s Executive Officer to collate the material sent in, select an item for the front page and edit items so that everything fits within a multiple of four pages. The material is sent to the printers who produce a draft version of the entire newsletter. This is proofread again by the editors, PO Box 5111, any necessary changes are made and the draft sent back to the printers for correction. Braddon ACT 2612 This is usually the last stage of the iterative process, and if the second draft looks fine it Phone (02) 6249 8266 is cleared for printing. The printing company also has a mailing house that uses mailing Fax (02) 6249 6558 address sheets provided by the Executive Officer that are sent with your copy. The plan Email: [email protected] is always to have the newsletter for a given month posted to members by the middle of Society Web Page the month. http://www.statsoc.org.au The Editors are grateful to Branch correspondents to supply meeting reports, to committee chairs who report on the work of committees within the Society, and to individuals who supply news of members. We are always on the lookout for more of Editors this sort of material, so don’t be shy! If you attended an interesting statistical event that took place in your area, let us know! If you have carried out an interesting or slightly Alice Richardson, School of ISE, University of Canberra, controversial data collection or analysis, let us know! The Newsletter can only print the PO Box 1, Belconnen ACT 2616 material that you send in. We look forward to a deluge of fascinating articles in the months to come. Michael Adena, Covance Pty Ltd We also hope you like the new design for the Newsletter, which gives a more modern PO Box 5125, Braddon, ACT 2612 look and feel.

Correspondence Please direct all editorial correspondence to Alice Richardson. Email: [email protected]

Disclaimer The views of contributors to this Newsletter should not be attributed Looking for a job? to the Statistical Society of Australia, Inc.

Subscriptions For a listing of current statistical The Newsletter of the Statistical vacancies in Australia or New Zealand visit: Society of Australia is supplied free to all members of the society. Any others wishing to subscribe to the http://www.statsci.org/jobs Newsletter may do so at an annual cost of A$30.00 for an issue of four numbers. With 255 positions listed last year. Advertising Advertising will be carried in the Newsletter on any matters which Do you have a job to the Editors feel are of interest to the members of the Society. For advertise on the website? details of advertising rates, etc. contact the SSAI Executive Officer at [email protected] Email a position description to Printer [email protected]. Listing is free! National Capital Printing 22 Pirie Street, Fyshwick ACT 2609 This service is proudly brought to you by:

DEADLINE FOR NEXT ISSUE: 10 August 2006 Statistical Society of Australia Inc.

4 SSAI Newsletter – June 2006 Three Doors with Borek Puza (Edition 6)

Welcome to the sixth edition of Three N, the number of persons who have a follows. Let X be the number of persons Doors. In the last edition I presented shared birthday. To 5 decimals, this pmf who share a birthday on 1 January, so The Birthdays Puzzle. Unfortunately, no- is given by: that X ~ Binomial (20,1/365), and define one formally emailed a correct solution f(0) = P(365,20)/36520 = 0.58856 Y = XI (X > 1), where I (.) is the standard to [email protected] (although f(1) = 0 (exactly) indicator function. Then the expected several persons had a go at the puzzle f(2) = 0.32320 number of persons who share a birthday and one submitted a correct solution to f(3) = 0.00559 on 1 January is me personally but deemed it ‘not clever f(4) = 0.07132 EY = EX - E{XI(X = 0)} - E{XI(X = enough’). Anyway, the solution to this f(5) = 0.00218 1)} puzzle is presented below. The next puzzle f(6) = 0.00823 = 20(1/365) - 0 - 20(1/365) follows. It has an interesting solution f(7) = 0.00033 (364/365)19 = 0.00278307. and was brought to my attention by f(8) = 0.00054 Hence the expected total number of Steve Clarke, who previously submitted f(9) = 0.00002 persons who have a shared birthday is a correct solution to The Chess Puzzle. f(10) = 0.00002 EN = 365EY = 1.0158, as before. Thanks Steve. f(11) = 0.00000 (meaning < 0.000005) The Passenger Puzzle The Birthdays Puzzle f(11) = 0.00000 (courtesy of Steve Clarke) In a room there are 20 persons. (a) Find ...... There is an aeroplane with 100 seats and the probability that exactly 6 of these persons f(20) = 0.00000. 100 passengers, each with a seat assignment. have a shared birthday (i.e. a birthday which The first passenger to enter the plane loses his is the same as at least one other birthday). (c) The expected number of persons seat assignment, and so picks a random seat. (b) Then find and tabulate to 5 decimals who have a shared birthday is Each of the following passengers, entering the complete probability distribution of the EN = 0f(0) + 1f(1) + 2f(2) + ... + 20f(20) one by one, takes their seat if available; number of persons with a shared birthday. = 1.0158. otherwise they pick randomly an available (c) Using (b), or otherwise, find the expected Discussion seat. What is the probability the 100th number of persons with a shared birthday. The above working for N ’s distribution passenger sits in their own seat? (It may be assumed that the 20 birthdays in (a) and (b) was rather tedious, and it For your chance to win a fabulous are distributed independently and uniformly does not generalise easily. There does mystery prize, please send your solution over a year with 365 days.) exist an explicit formula for N ’s pmf, f (n), to [email protected]. Solution to The Birthdays Puzzle but it involves Stirling numbers, is fairly References (a) Six persons can have a shared complicated, and so will not be presented Puza, B.D., and O’Neill, T.J. (2006). birthday in one of 4 ways: here. See Puza and O’Neill (2006) for Two birthday problems. Communications (i) all 6 on the same day more details. By contrast, N ’s expected in Statistics - Theory and Methods, 35, (ii) 4 on one day and 2 on another value can be obtained very easily, as 415-422. (iii) 3 on each of 2 days (iv) 2 on each of 3 days. Our task is to count the number of sample points corresponding to each way. As regards (i), there are 365 days in Conferences the year, C(20,6) = 20!/(6!14!) ways to choose 6 persons from 20, and P(364,14) ASC/NZSA 2006 Statistical Connections, Auckland, New Zealand = 364!/350! ways in which 14 persons 3-7 July 2006. www.statsnz2006.com can have different birthdays on 364 days. Thus the number of ways that 6 persons ICOTS July 2006 (even though it clashes with ASC, but then so does can have a shared birthday, with all 6 on Mathsport). URL is http://www.maths.otago.ac.nz/icots7/icots7.php. the same day, is (i) 365C(20,6)P(364,14). BioInfoSummer, ANU, Canberra Likewise, the other three numbers of Dec 4-8. There’s no website yet but the email is sample points are: [email protected] (ii) P(365,2)C(20,4)C(16,2)P(363,14) 8th Australasian Conference on Mathematics and Computers in (iii) C(365,2)C(20,3)C(17,3)P(363,14) (iv) C(365,3)C(20,2)C(18,2)C(16,2) Sport, Coolangatta, Queensland P(362,14). 3-5 July 2006. Contact: [email protected] Summing all 4 numbers and dividing ISI2007, Lisbon, Portugal by the total number of sample points, http://www.ine.pt 36520, we obtain the required probability, 0.00823. Diary note – SSAI Young Statisticians section is planning a conference for (b) Repeating the logic in (a) for April 2007 – see future issues or the SSAI website for updates. numbers other than 6, we can obtain the probability mass function (pmf ) of

SSAI Newsletter – June 2006 5 NOTICE The Annual General Meetings of The Statistical Society Of Australia Inc and The Australian Statistical Publishing Association Inc. will be held on Wednesday 5 July 2006 at 4.15pm at SKY CITY, Auckland, New Zealand

SSAI Annual General ASPAI Annual General Meeting — Agenda Meeting — Agenda

1. Apologies and Proxies 1. Apologies and Proxies Proxies must be given in writing as per form inserted Proxies must be given in writing as per form inserted in the June 2006 issue of SSAI Newsletter. Proxy with June 2006 issue of SSAI Newsletter. Proxy forms must be received by the SSAI Executive forms must be received by the SSAI Executive Officer for passing to the Secretary no later than Officer for passing to the Secretary no later than 24 24 hours before the time of the meeting. hours before the time of the meeting. 2. Confirmation of the Minutes – Minutes of the 2. Confirmation of the Minutes – Minutes of the meetings as circulated meetings as circulated 3. Matters arising 3. Matters arising 4. Reports 4. Presentation of the Annual Report by the Editor 4.1 President of the Australian and New Zealand Journal of Statistics 4.2 Treasurer 5. Presentation of the Annual Report by the Newsletter 4.3 Branches Editors 4.4 Sections 6. Treasurer’s Report 5. Accreditation 7. Appointment of signatories 5.1 Report from Accreditation Committee 8. Special Business 6. Conferences Consider and, if thought fit, approve amendments 6.1 ASC 2006 (joint with NZSA) to the Rules of the Australian Statistical Publishing 6.2 ASC 2008 Association Inc. (**) 7. Election of Section Chairs 7. Other business Nominations for Section Chairs should be received 8. Time and place of next meeting. at the SSAI office no later than 23 June 2006. Nomination Forms have been inserted in each copy of the June issue of SSAI Newsletter. All nominations require a seconder and a statement from the nominee that she or he is prepared to stand. (**) Copies of the amendments to the Rules of the 8. Appointment of signatories Statistical Society of Australia Inc. and of the Australian Statistical Publishing Association Inc. may be obtained 9. Special Business from the Society’s office, or viewed on the Society’s Consider and, if thought fit, approve amendments Web site at www.statsoc.org.au/WhatsNew/ to the Rules of the Statistical Society of Australia Incorporated. (**) 10. Other business 11. Time and place of next meeting.

6 SSAI Newsletter – June 2006 Update on AusCan ASC/NZSA Scholar visit Financial The inaugural AusCan Scholar, Dr Mu Zhu, is planning on visiting statistics groups in Australia between mid September and early November this year. An AusCan Scholar liaison group has recently been established in Australia to Assistance help communicate and coordinate Dr Zhu’s visit to statistical groups in specific cities including arranging opportunities for Dr Zhu to present seminars at branch for Young meetings wherever possible and appropriate. The group comprises representatives from each branch, see below: Statisticians QUEENSLAND VICTORIA Melissa Dobbie Debra Partington The March 2006 issue of the SSAI ([email protected]) ([email protected]) Newsletter contained advertisements SOUTH AUSTRALIA WESTERN AUSTRALIA for funding available to assist Margaret Swincer Aloke Phatak undergraduate students and young ([email protected]) ([email protected]) statisticians to attend the ASC/ ACT NEW SOUTH WALES NZSA Conference in Auckland in Brent Henderson Caro Badcock July this year. ([email protected]) ([email protected]) Tilman Davies from Western Australia was awarded funds from In his most recent proposed itinerary, Dr Zhu expects to visit Sydney, Brisbane, the CSIRO pool for undergraduate Perth, , Adelaide and Canberra. Specific dates of travel are still to be students. established and these will be communicated to the AusCan Scholar liaison group when available. Any individuals or groups interested in meeting and/or having Joesph De Livera, Joanne Potts Dr Zhu spend some time with them during his trip should contact their branch and Sharon Lau were awarded funds respresentative (listed above) as soon as possible to register your interest and to from the Data Analysis Australia help Dr Zhu develop a program that will allow him to fulfil the objectives of the pool for recent graduates. scholarship. Watch a future issue of the Melissa Dobbie Newsletter for their reports on their [email protected] experiences. CSIRO Mathematical and Information Sciences

Hosted By Scientific Program The Statistical Society of Australia (SSAI) and The A stimulating and cutting edge Scientific Program is being New Zealand Statistical Association (NZSA). developed to cover a wide range of topics relevant to all statisticians. The program will provide practical knowledge Contact Details and insights from prominent international and Australasian ASC/NZSA 2006 Conference Managers speakers and will address the latest developments in statistical GPO Box 128, Sydney NSW 2001 research, education and practice. Phone: +61 2 9265 0700 Fax: +61 2 9267 5443 Email: [email protected] Workshops Technical workshops that are of particular interest Expression of Interest to practitioners will be included in the Conference Program. The Scientific Program Committee is seeking If you are interested in attending the Conference, please register your interest potential workshop presenters. If you are interested in on-line www.statsnz2006.com contributing please contact David Scott at d.scott@ auckland.ac.nz.

SSAI Newsletter – June 2006 7 Australian Bureau of Statistics Centenary Celebrations Conclude

The Australian Bureau of Statistics has practical ability required for this type He was also a significant, caring and just wrapped-up its centenary celebrations, of work was not yet obtainable from generous mentor for several generations with a finale on December 8. Like all big universities in Australia. of young statistical officers in the Bureau. anniversaries, it was a time to reflect on In Informing a Nation, a young Ken “He was a legend, he had some god- the past and to look forward. Foreman is pictured during field work like qualities”, Trewin said. Since colonial times, Australia’s in Papua New Guinea. Wearing a well- One of the earliest examples of the official statistics have been regarded as worn khaki shirt and broad-rimmed hat, Bureau’s contribution to international first class. To a large extent, this has he resembles an Indiana Jones-like figure. statistical methods was the publication been because sound and contemporary Indeed, he saw some real-life adventure. The Mathematical Theory of Population. statistical methods have been embraced Foreman joined the Royal Australian Air It was released in 1917 as an appendix in the compilation of official statistics. Force in 1943 and became a navigator in to the Statistician’s Report for the 1911 Over the past 50 years, the statistical a bomber. He was badly injured when he Census. methods used by the ABS have been was shot down in a Lancaster in 1943 “It comprised 450 pages, so it was some heavily influenced by developments in and had to endure a long convalescence. appendix”, Trewin said. mathematical statistics. He returned to Australia, and graduated This publication had a twofold purpose Other Centenary celebrations include with a Bachelor of Economics with of attempting to establish suitable methods the release of the ABS history book, Honours at the University of Sydney, but for analysing census and demographic Informing a Nation, a National Statistics by mid-1952 he went overseas to study data, and actually using those techniques Day (8 December), the Centenary Ball in the US Bureau of the Census under to analyse the 1911 census results. and the annual Ken Foreman lecture. Morris Hansen and Bill Hurwitz, two of the most famous names in sampling at Trewin said that elements of this work In delivering the latter, the Australian the time. He also had the opportunity to are still used in demographic work in Statistician, Dennis Trewin, reviewed the meet W. Edwards Deming who provided Australia and elsewhere. “This really is a development of statistical methods in the inspiration for his work on quality book that has aged very well. It has stood official statistics over the past 100 years. control. the test of time”, he said. He also speculated how mathematical statistical techniques might impact on the Despite his wartime injuries, Foreman “Even from those early days, the ABS future work of the ABS. travelled overseas several times to study was interested in methodology ... even the latest developments in sampling in the though we were a small young country, He began by taking a look at the United States. In this way the Bureau’s our statistical services were regarded as legendary figure for whom the lecture sampling ability remained abreast of world class.” is named. The Bureau’s long history of developments in theory and technique commitment to research and methodology The ABS was involved in a world being made overseas. is in no small part attributable to the first when Roland Wilson, then second efforts of many statistical officers in As Foreman himself said in 1977: “The in charge at the Bureau, developed his pursuit of excellence, and none more application of mathematical statistical balance of payments statistics. Based on his PhD work, it was released as a chapter so than Ken Foreman. Foreman led the techniques to an increasing range of in the 1933 Year Book. widespread introduction of mathematical the Bureau’s operations has resulted in statistical methods, especially sample many notable innovations which, in total, Developments during the Foreman era surveys, to the work of the Bureau. have significantly altered the Bureau’s included: capabilities and economics”. As former Australian Statistician Bill 1952 – sample to develop income tax McLennan described him: “Foreman Today Foreman is remembered in statistics (led to creation of methodology was undoubtedly the father of statistical the Bureau with respect, awe, gratitude unit) methods at the ABS. He was the leader of and fondness. He raised the level of 1960s – development of quarterly the first methodology unit and continued methodological competence in the labour force survey to foster, expand and lead it until his Bureau, and was a key player in four 1969 – introduction of generalised retirement from the ABS in 1984”. important developments: survey systems • implementation of significant Foreman introduced the Bureau to 1972 – introduction of social surveys sampling techniques; formal sampling methodology. This 1977 – use of post-enumeration surveys included formalising the techniques used • design and implementation of the to adjust for the Census undercount. for current samples and standardising household survey system; Trewin pointed out that Australia had those to be employed for future sampling • pioneering work in development and pioneered the latter exercise. “The US work. application of seasonal adjustment; still has not got around to adjusting their He developed comprehensive training and Census undercount”, he said. for the benefit of those working in the • the introduction of data “I think we were the first country to do area, as the level of knowledge and management. it way back in 1977 ... we just thought it

8 SSAI Newsletter – June 2006 was common sense. There has been no • Emerging statistical outputs (e.g. criticism of it since then.” environmental statistics, longitudinal Society Secretaries “But it does mean that the way analysis) requiring the application of Central Council novel sampling methods President: Prof K. Basford population statistics are used in Australia Secretary: Dr D. Shaw is much better than in many other • Combining survey and administrative Email: [email protected] countries. Subsequently, countries such as data Canada have copied what we’ve done.” New South Wales • Extracting information from large President: Ms Caro Badcock The key development in the 1980s data bases (e.g. EFTPOS, scanner Secretary: Eric Beh was the introduction of synchronised data bases, tax data bases) Email: [email protected] sampling (variant of collocated sampling • Use of analytical- or model- Canberra to business surveys). based methods to produce official President: Dr Brent Henderson Trewin said that the reason the ABS had statistics Secretary: Dr Ray Lindsay been successful with its methodological • Data linking will be increasingly Email: [email protected] developments was that there had been used to provide official statistics Victoria a strong alignment between the ABS’s • Confidentiality and microdata access President: Dr Brian Phillips statistical and methodological areas. introducing new confidentiality Secretary: Dr Ann Maharaj Email: “Ken Foreman was very keen that challenges methodological people interact with the [email protected] • Internet and web-based surveys – a other business areas so that they could number of new challenges in both South Australia be seen and heard; and that they weren’t President: Ms Margaret Swincer survey design and form design working in a vacuum – they could actually Secretary: Ms Janine Jones understand the context of the work they • Small area statistics – satisfying the Email: [email protected] increased data had been doing”, he said. Western Australia “The other sense of alignment is that • International surveys – ensuring the President: Dr Brenton Clarke the other business areas … had a working data are really comparable. Secretary: Mr Martin Firth Email: [email protected] knowledge of methodology ... so they He said that the other half of the ABS could have meaningful communications charter is as a statistical leader. This Queensland with the methodologists. role would include producing manuals President: Dr Ross Darnell of good practice; training in statistical Secretary: Dr Helen Johnson “In some statistical offices, that does Email: [email protected] not happen. The methodologists are methods; and providing advisory services. poked away in the corner to do their own Despite such a proud history of thing and have a much smaller influence achievements over a century, the ABS Section Chairs will not rest on reputation. And that on the work of the office, than what our Statistics in the Medical Sciences methodologists have.” includes ensuring the calibre of its Mr Peter Howley staff remains world class. Trewin said He said that support for methodology [email protected] he was concerned about the supply of by senior ABS managers has been strong. Statistics in the Biological Sciences mathematical statistics graduates. “The past two Australian Statisticians Dr Simon Barry had a methodological background, so “We feel the supply is dwindling at the Email: [email protected] it [that support] was probably more same time as the demand is increasing”, Survey and Management straightforward then, but it actually he said. Dr Robert Clark happened right back from Keith Archer’s “The ABS has taken a strong interest Email: [email protected] days in the 1950s and 1960s”, Trewin in seeing what can be done. We want Statistical Education said. to try and get a more uniform approach Dr Michael Martin [email protected] “There was from the senior ABS to teaching of statistics at schools, so management, support for methodology you have a framework for developing Statistical Computing resource materials and a framework for Associate Professor Kuldeep Kumar development. They [methodologists] Email: [email protected] weren’t regarded with suspicion by the training of teachers of statistics.” Industrial Statistics senior managers, they were regarded as With such an enviable and sustained Mr Ross McVinish important.” record of quality, and a focus on the future, Email: [email protected] another generation should celebrate the Turning to the future, Trewin said that Young Statisticians as a statistical producer, the Bureau’s ABS’s second century with pride. Ms Janice Wooton methodological challenges included the This article originally appeared in Email: [email protected] following: AMSTAT News. Bayesian Professor Kerrie Mengersen Email: [email protected]

Further contact details for Society Secretaries and Section Chairs can be obtained by contacting the Society on (02) 6249 8266

SSAI Newsletter – June 2006 9 Branch Reports

SOUTH AUSTRALIA a current honours student in Statistics a weight which is calculated differently is now receiving informal support depending on whether an adverse event from one of the more senior Young has occurred or not. Thus the weight will Young Statisticians Dinner (SA) Statisticians with similar interests. This be negative for a successful outcome and is just one example of the importance positive (seen as a penalty) for a failure The first ever Young Statisticians event of networking, particularly for Young or adverse outcome. The weight forms in SA was held in March this year. The Statisticians. It is hoped that future events a log-likelihood ratio which is based fantastic turnout exceeded all expectations, for Young Statisticians will result in more on individual’s prior risk (explaining the with more than 20 people attending. The collaborations such as this and ultimately ‘risk-adjusted’ terminology) which is evening kicked off with pre-dinner drinks contribute to the future of Statistics in generally estimated by logistic regression. at a local pub, which gave everyone an Australia. The log-likelihood ratio is then used opportunity to introduce themselves to for testing whether the observations on new faces and catch up with old friends consecutive patients are more supportive in a relaxed environment. From there, SA AGM of a hypothesis of adverse events occurring people proceeded to Café Buongiorno At the SA Branch Annual General at the accepted benchmark rate – or at for dinner. The SA branch of the SSAI meeting A/Prof Peter Baghurst, some (pre-specified) higher rate. kindly sponsored the evening, with a epidemiologist from the Public Health variety of pizza provided for everyone Research Unit of the Women’s and Peter went on to discuss the best choice to share. This idea worked well and Children’s Hospital in North Adelaide for setting control limits for the cusum contributed to the casual atmosphere of gave an interesting talk about sensitive given that the traditional alpha levels of the evening. methods for monitoring the outcomes of 5% are not appropriate for sequential Everyone who attended the dinner health-care. testing. He introduced the Average Run was asked to complete a short survey, Some events such as the Bristol Length (ARL) which can be considered prepared by Penny Bennett, to assist with Paediatric Cardiac Surgery scandal, analogous to the type 1 error rate e.g. planning for future Young Statisticians and the notorious Dr Harold Shipman, when the process is in the H0 state the activities in SA. All suggestions were have given impetus to refining statistical ARL should be long before a ‘signal’ greatly appreciated and the difficulty now techniques to take case-mix into account; occurs (crosses the control limit) and is is deciding which suggestion should be and developing control charts to enable likely to represent a false alarm and if the taken up first! shifts in the incidence of adverse events process has shifted to the alternative state it should have short run lengths before The whole evening was a great success, to be detected much more rapidly than ever before. a ‘signal’ occurs. Therefore, a control with many people commenting on how limit far from zero will lead to longer much they enjoyed themselves. It gave Peter introduced a monitoring tool ARL’s and what is considered long will everyone an opportunity to find out called the risk-adjusted cumulative sum depend on the circumstance. The ARL where other Young Statisticians in SA or ‘cusum’ which has been developed can be approximated by Markov Chain are working or studying, the types of particularly for detecting adverse events methodology and is calculated based methodology being used and some of over time. The cusum is essentially a on a pre-specified alternative rate. Peter the statistical problems that people are score over time which begins at zero referenced a paper by Steiner (2000). faced with. As a result of the dinner, then accumulates over time by adding Various examples were shown throughout the talk. One example demonstrated the cusum by using fourteen anonymous Paediatric Intensive Care Units across Australia for the adverse event of death after forty-five minutes of admission in PICU. This clearly demonstrated the importance of risk-adjustment given different PICU’s had different exposures of risk. The talk was then wrapped up with interesting discussion of the cusum used in practice.

Incorporating LASSO Effects into a Mixed Model for QTL Detection The speaker for April was Scott Foster, a current PhD student at BiometricsSA. He began his talk by explaining what QTL (quantitative trait loci) are and why they are important. QTL are positions or regions of the genome that are statistically related to a quantitative trait. Usually this L to R: Amy Glen, Paul Eckermann, Jono Tuke and Sam Cohen at the Young trait is of economic importance, such as Statisticians Dinner in SA. accelerating breeding in agriculture. QTL

10 SSAI Newsletter – June 2006 Branch Reports Continued

can be used to narrow the search for March meeting of the WA branch of the annual load profile and set probabilities causative genes. SSAI. His talk was entitled “Statistical of outages for each generator. This In QTL experiments, populations are Analysis at Western Power”, and covered method was used until advances in created which are genetically diverse. some of the statistical techniques he has computing power allowed for Monte Genetic differences are assessed for given used at Western Power. Commercial Carlo simulation to simulate the outages positions of the genome. It is then possible confidentiality prevented Ross from going of each generator. to look for differences in traits based on into too much detail, but he was able to The second example of Ross’s work was differences in loci and hence identify present some of the methodologies he the weather correction of energy and peak QTL. Scott described a motivating data has used, with five different projects as load data. Weather correction of peak set, where the aim of the experiment was examples. loads is especially useful, as the peak loads to identify genes for traits of economic The first example involved was are strongly dependent on the weather, importance related to beef production, in calculating the unserved energy estimates and it can be difficult to determine any particular the birth weight of cattle. The from power system simulators. This long-term trends in the raw peak load statistical problem arises when trying to estimation is important when scheduling data. Two different methods are used for identify the genes which are associated maintenance of generators to ensure that weather-correcting the peak load values. with birth weight, while controlling for there is sufficient generation capacity to The first uses nominal distribution theory, experimental effects, such as gender and meet demand if any of the generators comparing 40 potential peak loads to the location. breaks down (a forced outage). This median of the yearly peak load. Similar The trait of interest can be modelled estimation also allows for the prediction to the unserved energy methodologies, as a linear combination of genetic and of future plant investment. Ross covered the second method of bootstrapping only experimental effects. However, there are two different methods that were used to became feasible with the advancement of many explanatory variables which may be estimate this unserved energy – firstly, computing technology. Ross described the included in the model, creating a model a method of cumulants, using a fixed bootstrapping of previous year’s weather selection problem. The LASSO (least absolute selection and shrinkage operator) may be used to produce estimates of genetic effects from a random model where the random effects are assumed Thinking Statistically to come from a double exponential distribution so some estimates will be Elephants Go to School ‘shrunk’ to zero. The level of shrinkage A UNIQUE TEXTBOOK depends on the dispersion parameter. By In order to include fixed and normally A new way to learn distributed random experimental effects Sarjinder Singh statistics using pictures, in the model, the LASSO effects can be incorporated into a standard linear St. Cloud State University jokes, and tales. mixed effects model. This type of model Department of Statistics A lot of learning with performs well in terms of identifying St. Cloud, MN 56301-4498 QTL while maintaining an acceptable fun through 676 pages. number of false positive results, relative to U.S.A. other currently available methods. Good for all ages After the talk, a farewell dinner was + Good for all libraries held for Scott at a Thai restaurant. He + Good for all majors is moving to Tasmania to take up a + Good for all schools position with the Tasmanian Institute of Agricultural Research. Scott has been + Good for you too a valued member of the SA branch of PLEASE HAVE A LOOK the Statistical Society for a number of years and his presence at meetings will be greatly missed. We wish him all the best for the move and the new job. Lisa Yelland

WESTERN AUSTRALIA Kendall/Hunt Publishing Company 4050 Statistical Analysis at Western Power Westmark Drive, P.O Box 1840 Dubuque Ross Bowden, a senior analyst at the Iowa 52004-1840, U.S.A. electricity retailing company Synergy (previously known as the Retail division www.kendallhunt.com of Western Power) presented at the

SSAI Newsletter – June 2006 11 Branch Reports Continued

results to obtain a distribution of possible NEW SOUTH WALES decision rule, based on properties of the peak load values for the day, week or independent components (IC) scores, month of interest. From this distribution divides the data into two contrasting of peak loads, a central estimate could Understanding the Illicit Drugs groups. then be calculated. Market (or Learning with Crucial to the success of the proposed Ross’s third example of his presentation Independent Components Analysis) learning method is the careful selection concerned queuing models and simulation. by Stuart Gilmour and Inge Koch. of the lower dimensional subspace and This work was performed on data from In the May 2005 meeting of the NSW the number of IC directions required for the decision making process, as the Call Centre at Western Power Branch, Stuart Gilmour from the National independent component analysis is a very relating staffing provisions, expected Drug and Alcohol Research Centre powerful technique which finds non- number of calls and estimated waiting and Inge Koch from the University of Gaussian directions even in Gaussian times. Ross described the problems with New South Wales presented interesting data. They found that a reduction from this simulation method, stating that the research using independent components analysis to model the illicit drugs market. 66 to 6 dimensions and a search for the method did not reproduce reality. 3 most non-Gaussian IC directions lead Illicit drugs such as heroin, cocaine and In the fourth example, Ross covered to a stable partitioning of the data into 2 amphetamines are a significant health the use of graphical representations of contrasting and interpretable groups. problem for some young Australians. survey data. Ross explained that he often Globally, researchers have spent Gilmour and Koch noted some has to explain statistical results from considerable energy trying to understand interesting results from their two survey data to staff without a statistical the size of user populations, the spread partitions. Partition 1 contained heroin background, and found that describing of these drugs through communities and and cocaine possession offences, series results graphically was an effective way the effect of changes in the markets. relating to overdose and psychoses and of explaining his point. He covered such Gilmour and Koch were interested in serious offences such as robbery with graphical tools as bands on graphics answering questions such as whether a weapon. These represented direct to quickly show differences between or not drug markets undergo epidemic measures of the illicit drug market. On categorical variables. Another graphical patterns and whether the markets can the other hand partition 2 appeared to technique Ross covered was the use of be divided into isolated, competing or represent indirect measures of the illicit bootstrapping with perceptual maps and cooperating partitions. Previous work has drug market. Possession offences for correspondence analysis, to represent used time series analysis to investigate the amphetamines appeared in partition 2 and were thought to represent the mixed two-dimensional projections as clouds of consequences of the heroin shortage, for nature of the amphetamine market, which boot-strapped results. example. is only partly connected to street-based The data presented comprised 17 The final example that Ross covered illicit drug markets, and partly to a more indicator data series collected over 66 was the method to determine the least- functional market connected to the party months between 1997 and 2002. The cost meter reading cycle, involving a drug scene. series included indicators such as drug balance between billing, collections, use or possession offences, prostitution In conclusion Gilmour and Koch found reading and credit costs. Ross explained offences, ambulance attendance to a that independent component analysis that this was a typical analysis problem at heroin overdose and break and enter provides a powerful tool for unsupervised Western Power, with more emphasis on dwellings. learning with non-Gaussian data. mathematics than statistics. They also found that the method was Unlike the time series framework, sensitive to parameter settings and that Ross finished the presentation with Gilmour and Koch treated the data as interpretation of the results depended a quick overview of analysis at Western 17 samples with 66 dimensions each. on good model selection. ICA was able Power. He explained that while most Analysis of such HDLSS (high dimension to identify a key subset of series that managers are statistically challenged, they low sample size) problems requires measured the direct effect of illicit drug do respect sophisticated analysis. For dimension reduction. Gilmour and Koch markets. It was also able to identify sub- analysts, while analytical skills are very proposed a 3 step learning approach populations of illicit drug users such as important, so are communication skills. (END) consisting of an Extraction step amphetamine markets with serious health Ross concluded his presentation is his which reduces the dimension, a Non- and social consequences from the “party normalising step, which finds the most usual manner, with a quick joke about drug” amphetamine markets. punk statisticians. non-Gaussian directions in the lower dimensional data, and a Decision step Petra Graham A number of questions were raised which separates the data in contrasting when the discussion was thrown open to groups. Principal component analysis is November 2005 the meeting and an informal discussion used in the first step. Combined with JB Douglas Postgraduate Awards continued immediately after the meeting a careful application of independent November 23rd 2005 saw the NSW and at the usual post-meeting dinner at a component analysis, this leads to a set of Branch hold its 6th annual JB Douglas nearby restaurant. non-Gaussian direction vectors spanning Postgraduate Awards day and annual Christopher Milne the lower dimensional space. A simple dinner. Once again the quality of the

12 SSAI Newsletter – June 2006 Branch Reports Continued

NSW Branch Annual Dinner: 100 years of official statistics – a review of the development of statistical methods over this time. After the presentation of the JB Douglas Postgraduate awards the NSW Branch held its annual dinner with guest speaker Dennis Trewin, head of the Australian Bureau of Statistics. Dennis shared with us the fascinating history of official statistics in Australia which began in 1905 with formation of the Commonwealth Bureau of Census and Statistics. Pre World War II developments included the release of the Mathematical Theory of Population (a 450 page appendix to the 1911 Year Book) by George Knibbs, the development of Balance of Payments by Roland Wilson, the creation of research officer positions and a development branch during the Presenters and special guests for the annual postgraduate awards day. Back row (from 1930’s. left): Daniel Smith, Alex Woolaston, Frank Tuyl and Nathan Pearce. Front row: Maree O’Sullivan, Maureen Morris, Christine Curran, Alun Pope and Jim Douglas. Following the Second World War the Bureau entered what is known as the Foreman Era – a period of substantial students from around the state was very tasks for courses in introductory development of statistical methods high and the judges (Dr Alun Pope, Mr statistics. for the official data collection. Under Dennis Trewin and Ms Melissa Cassar) After some deliberation the judges the leadership of Ken Foreman, the were given a difficult job of choosing the awarded Daniel Smith the Peter Wright methodology unit introduced and best presentation. prize for best student presentation with developed a variety of survey systems The day began with Maree O’Sullivan Nathan Pearce and Maree O’Sullivan including those for quarterly surveys of (Macquarie University, Statistics) winning joint second prize. We thank retail establishments, income tax statistics presenting her work on whether differing all who attended and our generous and social surveys. Other important microarray platforms produced concordant sponsors CSIRO, Macquarie University, developments during the Foreman Era results using data on paediatric acute Roche Pharmaceuticals, SAS Australia, included the introduction of statistical lymphoblastic leukaemia. Next Alex the University of Sydney school of quality control techniques for the census, Mathematics and Statistics, the University the introduction of seasonal adjustment Woolaston, from the University of New of Newcastle and the Wollongong and experimental design for survey England (Mathematics, Statistics and Statisticians without whom this event development and a report detailing the Computer Science) described methods would not have been possible. need for a data warehousing scheme. for eliminating the spatial trends found on cDNA microarray slides and Nathan Pearce from the University of New South Wales (Statistics) introduced novel algorithms for training and cross- validating support vector machines. After a quick afternoon tea Daniel Smith from the University of Sydney (Econometrics and Business Statistics) presented his research comparing binary Markov random fields via applications in image restoration followed by Frank Tuyl from the University of Newcastle (Statistics) who talked about the caution needed in choosing parameters for the Beta distribution in Bayesian intervals. Maureen Morris from the University of Wollongong (Statistics) completed the presentations describing her research in the development of effective assessment Branch members enjoying festivities at the annual dinner.

SSAI Newsletter – June 2006 13 Branch Reports Continued

The mid-1970’s saw the Bureau become an operation in July 1962. Suffering The importance of this book cannot be a statutory agency independent of other from extreme myopia he had very poor understated because it is here that Fisher government departments and known eyesight, which may have contributed described his new paradigm for analyzing as the Australian Bureau of Statistics. to his exceptional geometric ability and data. In it Fisher introduced the concepts Key statistical developments of the capacity to solve complex mathematical of randomization, replication, blocking post Foreman Era included using total problems entirely within his mind’s eye. and ANOVA to a mass audience. survey design that takes account of non- Concepts that are considered fundamental sampling errors as well as sampling errors, We heard a story about a lecture when Fisher was asked a hairy problem by in modern statistical analysis, but at the forms design, the integration of survey time were quite revolutionary. designs to improve cohesion of economic one of his students. Rather then fluff statistics and the further development of an answer using academic double speak, Chris Howden methods for time series analysis. Fisher stopped talking and pondered the QUEENSLAND Dennis reflected on how important question at the front of the lecture hall in it was for the Bureau to invest in complete silence, for 10 minutes! Until he methodology and that the strong at last said ‘yes, yes I think you’re right’ February Branch meeting alignment between statistical and and continued with his lecture. On 7th February Bob Mayer, Principal methodological areas resulted in Fisher was interested in a diverse range Biometrician, Department of Primary innovations that “… significantly altered of subjects that included statistical theory, Industries and Fisheries, Townsville spoke the Bureau’s capabilities and efficiencies”. inference and methods, experimental on ‘Developing biometrical expertise for Finally Dennis described a variety of design, evolution and genetics. He was PNG agricultural research’ at the IMB future challenges and directions for the also quite prolific and published several Biosciences precinct at the University Australian Bureau of Statistics including campus in St Lucia. the requirement of more graduates in hundred papers and seven books, which Bob has worked with the DPI&F mathematical statistics since current included the influential ‘Statistical biometry unit since graduation, in demand outstrips supply! Statistical Methods for Research Workers’ original Brisbane (briefly), Toowoomba and challenges include the use of analytical published in 1925. This book has been Townsville. Since 1999 a substantial part or model based methods for producing republished at least 14 times and in of his role has been as a member of an official statistics, the use for data linkage 5 different languages (French, German, over time and the combination of survey AusAID supported overseas aid project Italian, Japanese and Spanish). Anthony team helping develop agricultural research and administrative data to support highly recommended it as a good read official statistics, and the potential use of expertise in PNG. This has included that introduces the underlying principles web-based surveys. As the leader of the reviewing the existing statistical support of experimental design with a minimal of National Statistical system the Bureau for agricultural research, several visits aims to produce manuals of best practice mathematics. to PNG agricultural research stations, and standards as well as provide statistical training and advisory services. This recognises that other agencies are going to increasingly be producing statistical outputs for public use. After thanking Dennis Trewin for his interesting historical perspectives on the Australian Bureau of Statistics, branch members and guests celebrated the end of another year at the annual dinner. Petra Graham

Professor Anthony Edwards The NSW branch started the year by looking backwards, at the work and influence of RA Fisher. Professor Anthony Edwards was Fisher’s last undergraduate and was attached to the Statistical Laboratory at the University of Cambridge from 1968-70. Anthony not only talked about Fisher’s work but also gave us a rare glimpse into the man himself. Fisher was born in 1890 at East Finchley and moved to Adelaide in 1959 where he remained until his death following Bob Mayer with a map of the agro-ecological zones of Papua New Guinea.

14 SSAI Newsletter – June 2006 Branch Reports Continued

presenting courses on basic statistics and CANBERRA use of statistical software, guiding the professional development of staff to take on the biometrician role, and statistical Talk on bioinformatics by consulting to research projects. The talk Geoff McLachlan covered various aspects of his experiences on this project, including background to At the monthly meeting of the Canberra it and some of the special and on-going Branch of the SSAI on 21 February challenges associated with it. 2006, Professor Geoff McLachlan of Members and guests had a curry at a the Department of Mathematics and nearby restaurant after the meeting. Institute for Molecular Bioscience at the University of Queensland gave a talk March Branch meeting titled “Some Applications of Statistics in Bioinformatics”. Geoff has authored The AGM was held on the 21st March a long list of statistical papers and at the University of Queensland, and the following officer bearers were nominated five books, the most recent one being and appointed. “Analyzing Microarray Gene Expression Data” (2004; Hobokin, New Jersey: President Dr Ross Darnell Geoff McLachlan Wiley; co-authored with K.-A. Do and C. Secretary Dr Helen Johnson Treasurer Dr Melissa Dobbie Ambroise). He is also a chief investigator There are ways of dealing with this in the ARC Centre in Bioinformatics. COUNCILLORS: problem which are less restrictive than Geoff began by describing DNA using the Bonferroni correction, one Dr Peter Baker microarray technology, which was being the Benjamini-Hochberg procedure Prof. John A. Eccleston originally conceived to detect expressions which controls the false discovery rate (Immediate past president) (FDR) using an ordering of the many Mr James McBroom of thousands of genes simultaneously. observed p-values. Another approach Dr Michele Haynes Microarrays present special problems for Dr Miranda Mortlock statistics because the data are very highly involves minimizing the estimated Bayes’ Dr Tony Swain dimensional with very little replication, risk in the context of a two-component Dr Ian Wood and the challenge is to extract useful mixture model where the mixing proportions are the probabilities that After the meeting Dr Michael Bulmer, information such as gene functions, the genes are, or are not, differentially Lecturer, School of Physical Sciences, gene interactions, regulatory pathways, University of Queensland spoke on metabolic pathways, etc. That challenge expressed. With this approach, a local ‘Teaching Spaces for Learning Statistics has been increasingly taken up in recent FDR can be provided for each gene. The in Context’, and the talk was held in years, and Geoff presented a graph problem of simultaneous significance the new lecture space about which he showing how the number of papers testing, especially in the context of was talking. Dr Michael Bulmer is a published on microarrays skyrocketed microarray data, has recently received a Senior Lecturer at the University of from virtually none in 1997 to well over lot of attention from several prominent Queensland. He has recently been given 4000 in 2003. The graph also showed statistical researchers, including Professor an award for excellence in teaching at the how the proportion of those papers that Bradley Efron. university. His main research interests are are statistical grew dramatically to about One of the datasets which Geoff in symbolic computation, optimisation 90% in 2003. used to illustrate the two-component methods for statistics and operations The output of gene-expression mixture model comprised measurements research and mathematics education. levels from a microarray experiment on about 3,000 genes in breast cancer In 2005 the University of Queensland can be represented by a matrix, with tissues from women who were carriers opened a new “Collaborative Learning rows representing genes and columns of the hereditary BRCA1 or BRCA2 Centre”. A large statistics service course representing samples. Attention was gene mutations, with the aim of finding was taught in this space recently as part focussed on the case where the tissue genes which are differentially expressed of a trial for this Collaborative Learning between the two different tumour types. Centre. Michael treated us to a hands-on samples can be divided into two groups corresponding to two different Geoff also discussed an application of the session in which we got to do one of the theory to HIV data, and briefly discussed statistical activities that he had developed populations of cells, and the aim is to a second problem in bioinformatics: for this Centre. SSAI members were compare the gene expressions for those clustering of correlated gene profiles, with given tape measures, stop watches and two populations. However, separate an application to the yeast cell cycle. three small furry animals with which to hypothesis tests conducted to this end measure the acceleration due to gravity may be complicated by the potentially Borek Puza in groups of three. He discussed the large number of false positives. For pros and cons of this teaching space, and example, if there are 10,000 genes and Talk on deep vein thrombosis by discussed some of the research results the significance level is 5% then under Niels Becker from his study of its effectiveness. the null hypothesis of no difference one Following the AGM of the Canberra Miranda Mortlock would expect about 500 false positives. Branch of the SSAI on 28 March 2006,

SSAI Newsletter – June 2006 15 Branch Reports Continued

Professor Niels Becker of the National VTE than on a flight-free day. This is which involves replacing each missing Y Centre for Epidemiology and Population a serious concern for frequent flyers, but value by a random draw from the set of Y Health (NCEPH) at the Australian the elevated risk for infrequent flyers is values having the same X value. National University (ANU) gave a talk only slight. Ideally one would also like After dismissing the first two options titled “Analysis of a case-study, with to estimate the relevant ‘dose-response as inadequate (for example, the first application to air travel and the risk of curve’, but data on duration of flight are leads to highly biased predictions), Ray DVT”. The talk focussed on work done elusive. detailed the imputation option, some jointly by Niels, Chris Kelman and Agus variants of it, such as nearest neighbour Salim, as published recently in the British Talk on estimating the distribution of imputation (NNI), and multiple Medical Journal, Biostatistics and the hourly pay by Ray Chambers imputation (which is better than single Australian and New Zealand Journal of At the monthly meeting of the Canberra imputation). He then discussed the Public Health. A sumptuous Indian feast Branch of the SSAI on 2 May 2006 strong links between imputation and the concluded the evening’s program whose Professor Ray Chambers of the Centre technique of ‘weighting’; and this, in venue was the Faculty Suite and Seminar for Statistical and Survey Methodology turn, led him to consider a fourth option Room of the recently completed HW at the University of Wollongong gave involving estimation as an alternative to Arndt Building at the ANU. a talk titled “Estimation vs Imputation imputation. Niels began by describing deep vein for the Distribution of Hourly Pay”. The There are at least two ways to estimate thrombosis (DVT) and noting that it talk was a summary of research which an unknown population distribution is an illness which – whether due to Ray carried out whilst at the University function: the model-based prediction natural causes or due to air travel – has of Southampton, where he was Director approach, as in Chambers and Dunstan a very small risk of occurrence. For that of the Southampton Statistical Sciences (1986); and the model-assisted calibration reason it is difficult to obtain relevant Research Institute from 1995 to 2005. approach, as recently suggested by Harms data by following a cohort over time, and Ray recently gave similar lectures on this and Duchesne (2004). After describing one must sample cases disproportionately topic in Texas and Brazil. these two approaches, Ray produced results of a Monte Carlo study of various just to get enough of them. Niels then The focus of Ray’s talk was estimation estimators, using data from the 2002 UK discussed case-only studies, including of the distribution of hourly pay rate, New Earnings Survey (NES), as provided case-crossover studies, and their wide denoted Y, for employees in the United by the UK Office of National Statistics range of applications. Difficulties with Kingdom. This problem is of interest (ONS). In terms of bias, Ray found the case-crossover studies include choosing because: employment law in the UK estimator based on NNI to be the best a control period, possible bias if there dictates that all employees must be paid amongst those considered. However, this is a trend in exposure over time, and a an hourly wage greater than or equal potential wastage of information. approach was quite inefficient from a to a set minimum value; because this mean squared error (MSE) perspective, One of the studies on which Niels has ‘minimum wage’ is subject to change with a nonparametric version of the worked involves all cases of VTE (venous over time due to inflation; because the model-based prediction approach thromboembolism; which includes government must be able to assess the producing the best MSE performance. DVT) in the Western Australia hospital impact of proposed changes to the This finding was under the fairly system from 1981 to 1999. During this minimal wage on the national wage bill; reasonable assumption that the missing period there were 16,205 admissions by and, in turn, because such assessment Y values are missing at random (MAR). 13,184 distinct patients. These records requires knowledge of the distribution of Ray concluded by discussing variance and were linked to international travel wage rates. confidence interval estimation. This is data from DIMIA (the Department Unfortunately, Y cannot usually be difficult in the very unbalanced situations of Immigration and Multicultural and obtained for all sampled employees, explored in the simulation study, and Ray Indigenous Affairs) in the form of 2,217 because many are not paid hourly. In recommended it as a worthy topic for cases of VTE with a flying history among such cases it is possible to calculate a further research. holders of an Australian passport and derived hourly rate, denoted X, based on Borek Puza 2,979 cases with a flying history with total earnings and hours worked, which non-Australian passports. The resulting are available for all employees. However, dataset was used to address the question X and Y are typically not the same, of how the outcome variable – defined as even when both are available. There are MASA the time from the start of the observation at least three options for dealing with period until hospitalisation due to VTE missing Y values. The first option is to Model Assisted Statistics – is related to exposure to international ignore those values, meaning to base flight, to calendar time, and to personal and Applications inference only on the data for which An International Journal characteristics such as sex and age. Y is available. This leads to what Ray Several models were fitted to the data calls the ‘response-based estimator’ of the ISSN: 1574-1699 and it was found, for example, that sex distribution function. The second option and age are not significant predictors is to take the missing Y values as exactly www.iospress.com of the outcome variable. One of the equal to the available X values so as to Editor-in-Chief: Sarjinder Singh main conclusions of the study is that construct the ‘substitution estimator’ of Managing Editor: Stephen Horn Treasurer: Sylvia R. Valdes on an international flight a passenger is this function. And the third option is about 30 times more likely to initiate a to construct an ‘imputation estimator’,

16 SSAI Newsletter – June 2006