
OLYMPIC GAMES MEDAL COUNT ANALYSIS

SUMMER AND WINTER

A Thesis

Presented to the

Faculty of

California State Polytechnic University, Pomona

In Partial Fulfillment

Of the Requirements for the Degree

Master of Science

In

Mathematics

By

Jiaxin Si

2018

SIGNATURE PAGE

THESIS: OLYMPIC GAMES MEDAL COUNT ANALYSIS SUMMER AND WINTER

AUTHOR: Jiaxin Si

DATE SUBMITTED: Fall 2018

Department of Mathematics and Statistics

Dr. Adam King Thesis Committee Chair Mathematics & Statistics

Dr. Hoon Kim Mathematics & Statistics

Dr. Alan Krinik Mathematics & Statistics

ACKNOWLEDGMENTS

First and foremost, I would like to show my deepest gratitude to my supervisor, Dr. Adam King, a respectable, responsible and resourceful scholar, who has provided me with valuable guidance in every stage of the writing of this thesis. Without his enlightening instruction, impressive kindness and patience, I could not have completed my thesis. His keen and vigorous academic observation enlightens me not only in this thesis but also in my future study.

I shall extend my thanks to Mr. King for all his kindness and help. I would also like to thank all my teachers who have helped me to develop the fundamental and essential academic competence. My sincere appreciation also goes to the teachers and students from Cal Poly Pomona who participated in this study with great cooperation.

ABSTRACT

More than 35,000 medals have been awarded at the Olympics since 1896. The IOC (International Olympic Committee) retrospectively awarded gold, silver, and bronze medals to athletes based on their rankings. The dataset we used covers the Summer Olympics from 1896 to 2012 and the Winter Olympics from 1924 to 2014; it includes a row for every Olympic athlete who has won a medal since the first Games. The dataset also contains each IOC country's population and GDP for 2012 to 2014.

This report has four main parts. The first part introduces background information about the Olympics. In the second part, we carry out a basic analysis of the Summer and Winter Olympics separately. The third part consists of a joint analysis of the Summer and Winter Olympic Games. The fourth part models the number of medals won in the Summer Olympics and the Winter Olympics. To reveal the relationship between the number of medals and the basic characteristics of each country, average high temperature in winter and GDP per capita were introduced for 2012 and 2014. The average high temperature in winter and GDP per capita are both related to the number of medals a country obtains.

Contents

1 Introduction
1.1 The Olympic Game and Olympics Spirit
1.1.1 Olympics Game
1.1.2 Olympics Spirit
1.2 The History of The Olympic Games
1.3 The Importance of Olympic Games Statistics
1.4 Economic and Social Impact on Olympic Games

2 Exploratory Data Analysis
2.1 The Basic Statistical Analysis of Summer
2.1.1 Statistics of Each Sport Event
2.1.2 Statistics of Medals
2.1.3 Statistics of the Historical Hosting Cities
2.1.4 Statistical Analysis of the Characteristics of Winners
2.1.5 Comparison of Medals on the Same Gender in Different Countries
2.2 The Basic Statistical Analysis of Winter
2.2.1 Statistics of Each Sport Event
2.2.2 Statistics of Medals
2.2.3 Statistics of the Historical Hosting Cities
2.2.4 Statistical Analysis of the Characteristics of Winners
2.2.5 Comparison of Medals on the Same Gender in Different Countries
2.3 The Joint Analysis of the Olympic Games
2.3.1 Examining the Same Countries’ Summer and Winter Olympic Performance
2.3.2 Strong Repeat Performances by Certain Countries in Certain Sports
2.3.3 Comparing Performance of Men and Women within the Same Country

3 Main Data Analysis
3.1 Data Set
3.1.1 Data Description
3.1.2 Method for Making Model
3.2 Statistical Model for Analysis in in 2012
3.2.1 Relationships of Single Variables with Medal Count in 2012
3.2.2 Medal Count Prediction Using GDP and Average High Winter Temperature
3.3 Statistical Model for Analysis in Winter Olympic Games in 2014
3.3.1 Relationships Between Medal Count and Single Variables in 2014
3.3.2 Predicting Winter 2014 Medal Counts using GDP and Average High Temperature in Winter
3.3.3 Summary of Winter Olympic Games Analysis

4 Conclusion

List of Figures

2.1 The total number of events each summer year.
2.2 The total number of disciplines each summer year.
2.3 The total number of medals of top five countries in each year.
2.4 The sex difference between men and women.
2.5 The number of winner about men and women in the same Summer Olympic Games.
2.6 The total number of disciplines increase year by year in winning games.
2.7 The total number of medals of top five countries in each winter year.
2.8 The top five countries from 1924 to 2014 each medal compared.
2.9 The sex difference between men and women.
2.10 The number of winning men and women in the same Summer Olympic Games.
2.11 The number medal of men and women in same country.
2.12 The number medal of men subtract the number of medals women in the same country.
2.13 Each Country Medals of Ranking about Summer and Winter Different
2.14 The winter sports event which the USA has advantages.
2.15 The summer sports event which the USA has advantages.
2.16 The winter sports event which the RUS has advantages.
2.17 The summer sports event which the RUS has advantages.
2.18 The winter sports event which the GER has advantages.
2.19 The summer sports event which the GER has advantages.
3.1 The relationship each country log of base two GDP of billion dollar match with the number of the medal in 2012.
3.2 The relationship between country log of base two with GDP and number of the medal in 2012.
3.3 The relationship between each country average high temperature Winter and number of the medal in 2012.
3.4 The relationship between the log of base GDP in billion and average high temperature Winter and number of the medal in 2012.
3.5 Each country an actual number of medals and predicted number of medals in 2012.
3.6 Each country actual number of medals and predicted number of medals 2012.
3.7 Diagnostic plots for the regression of the number of medals on the log of base two GDP in billion dollar GDP and square of the log of base two GDP in billion dollar GDP and cube of the log of base two GDP in billion dollar and temperature
3.8 Each country actual number of medals and predicted number of medals 2012.
3.9 Each country an actual number of medals and predicted the number of medals in 2012.
3.10 Diagnostic plots for the regression of number of medals on square of the log of base two GDP billion U.S. dollars and temperature.
3.11 Fitted using Smoothing Splines the residual values of the log base two GDP billion U.S. dollars in 2012.
3.12 Fitted using Smoothing Splines the residual values of the temperature in 2012.
3.13 Each country actual number of medals and predicted number of medals 2012.
3.14 Each country an actual number of the medals and predicted number of medals 2012.
3.15 The relationship each country log of base two GDP of billion dollar match with the number of medal in 2014.
3.16 The relationship between country log of base two with GDP and number of medal in 2014.
3.17 The relationship between each country average high temperature Winter and number of medal in 2012.
3.18 The relationship between log base 2 of GDP in billions and average high temperature in Winter and number of medals in 2012.
3.19 Each country actual number of medals and predicted number of medals 2014.
3.20 Diagnostic plots for the regression of the number of medals on the log of base two GDP in billion dollar in 2014.
3.21 Each country actual number of medals and predicted number of medals 2014.
3.22 Diagnostic plots for the regression of number of medals on square of log of base two GDP billion U.S. dollars and temperature.
3.23 Fitted using Smoothing Splines the residual values of the log base two GDP billion U.S. dollars in 2014.
3.24 Fitted using Smoothing Splines the residual values of the temperature in 2014.
3.25 Each country actual number of medals and predicted number of medals 2014.

List of Tables

2.1 2012 Compare to 1896 Adding 28 Disciplines
2.2 Each year the number of Bronze Medals, Gold Medals and Silver Medals.
2.3 List each country with the total number of medals.
2.4 List each year Summer Olympics in which city.
2.5 List three different kinds of medals total number of Summer Olympics in men and women.
2.6 38 disciplines with total 430 events for women in 2012.
2.7 2014 compared to 1924, it has been inherited 10 events of disciplines
2.8 2014 compared to 1924, it has been added 6 event of disciplines
2.9 2014 compared to 1924, it has been disappeared 4 event of disciplines
2.10 NOR as the top one of won medals and USA as the top two of won medals.
2.11 List each year Winter Olympics in which city.
2.12 7 disciplines with a total 83 events for women in 2014.
2.13 There are 28 countries participated in both the Summer and Winter Olympic Games from 2000 to 2014.
2.14 There are top three countries which won the medals in both Summer and Winter Olympic Games from 2000 to 2014.

Chapter 1

Introduction

1.1 The Olympic Game and Olympics Spirit

1.1.1 Olympics Game

The modern Olympic Games, or Olympics, are the leading international sporting events, featuring summer and winter sports competitions in which thousands of athletes from around the world participate [Wikipedia, 2018c]. The Olympic Games are considered the world’s foremost sports competition, with more than 200 nations participating, and each edition lasts no more than 16 days. It is the most influential sports event in the world.

1.1.2 Olympics Spirit

The Olympic Games are the biggest event for athletes because they are held only every four years. No matter what the result is, standing on the biggest stage and competing with the greatest rivals makes the athletes feel proud. They strive to be higher, stronger and faster, and this is the spirit of the Olympic Games. The audience, in turn, sees the athletes break records and challenge the limits of human ability.

1.2 The History of The Olympic Games

The Olympics began in Greece more than 2700 years ago. The first recorded Olympic competition was held in 776 B.C. It was held in an outdoor stadium, and about forty thousand people watched the event. The first thirteen Olympics consisted of only one event, a running race. The games were held regularly for about 1200 years. Then, in the year 397, the Olympics were prohibited by the Roman Emperor.

It was not until 1896 that the first Olympics of modern times were held in Athens. Since then, the Games have been held every four years, and the Summer and Winter Games now alternate, each occurring every four years but two years apart. The Olympics have become the world’s most important athletic event and a symbol of sporting friendship among all the people of the world.

1.3 The Importance of Olympic Games Statistics

The Olympic Games are divided into many sports, such as swimming, athletics, and all kinds of ball games. Each of these sports is divided into a men’s group and a women’s group. In every Olympic event, rankings are determined by comparing measured results. For example, in jumping the athlete who jumps farthest wins; in running the athlete who takes the least time wins; and diving and synchronized swimming are scored by comparing degree of difficulty and quality of completion. These comparisons ultimately decide who wins.

1.4 Economic and Social Impact on Olympic Games

Hosting the Olympic Games involves a great deal of money, personnel and supplies. Although the Olympic Games bring in huge revenues, the expenditures for hosting them are also very large: a host city may spend as much as 1 to 10 billion US dollars. The development of the Olympic economy is a diversified and comprehensive phenomenon, and its international development highlights its importance and forward-looking character.

The Olympic Movement has developed into the largest human activity ever undertaken for peaceful purposes. The Olympic Games have become global, sustainable and comprehensive; they are very large in scale, rich in cultural meaning, long in preparation and huge in investment, with numerous participants and a high level of competition. They have become an eye-catching activity with enormous social, economic, political, and cultural effects, and they receive extensive attention from governments, people, media, and business groups in countries around the world. The impact of the Olympic Games has surpassed that of any other social or cultural activity today.

On the positive side, the direct effects include the growth of the host city’s GDP, the growth of employment, the development of related industries, and the improvement of the investment environment; the indirect effects include the improvement of human capital and the promotion of regional economic development. On the other hand, the negative effects include sports facilities left idle after the Games and increased social spending.

To sum up, holding grand ceremonies has both positive and negative sides. What we need to do is find a better balance, for example by reaching the same effect at lower cost. It is necessary to hold the ceremony, but it should not come at the cost of the people’s benefit.

Chapter 2

Exploratory Data Analysis

2.1 The Basic Statistical Analysis of Summer

2.1.1 Statistics of Each Sport Event

We analyzed the sports and events held at the Summer Olympics from 1896 to 2012. The Summer Olympic program consists of a number of sports, each divided into several disciplines, and each discipline is in turn divided into many events [Wikipedia, 2018a].

Analyzing all sports at the Summer Olympics from 1896 to 2012 gives the following results. The first Summer Olympic Games in 1896 had only nine major sports, the lowest number of any Games, and the number of major sports increased over time. The highest number of sports occurred between 1996 and 2008: as Figure 2.1 shows, the number of major sports at the Summer Olympics stabilized at twenty-eight over this period. At the 30th Olympic Games in 2012, the number of sports was reduced by two.

Figure 2.1: The total number of events each summer year.

Next we analyze the disciplines at the Summer Olympics. As Figure 2.2 shows, the total number of disciplines increased year by year after 1980. As the Summer Olympic Games have matured, more and more disciplines have been added. Some disciplines have developed as traditional parts of the Summer Olympics, some have gradually disappeared for various reasons, and some emerging disciplines have gradually gained recognition and become part of the Summer Olympic Games.

To observe the changes in disciplines more clearly, we compare the disciplines held at different Summer Olympics. The number of disciplines held at the 2012 Summer Olympics is much higher than at the 1896 Summer Games.

Figure 2.2: The total number of disciplines each summer year.

Therefore, we use the 1896 Summer Olympic Games as a reference and compare them with 2012. Eight disciplines were inherited: Athletics, Cycling Road, Cycling Track, Fencing, Shooting, Swimming, Tennis and Weightlifting. Two disciplines disappeared: Artistic G and Wrestling Gre-R. Table 2.1 lists the 28 disciplines that have been added.

Table 2.1: 2012 Compare to 1896 Adding 28 Disciplines

Archery Badminton Basketball

Beach Volleyball Boxing Canoe Slalom

Canoe Sprint Cycling BMX Diving

Dressage Eventing Football

Gymnastics Artistic Gymnastics Rhythmic Handball

Hockey Judo Jumping

Marathon swimming Modern Pentathlon Mountain Bike

Rowing Sailing Synchronized Swimming

Table Tennis Taekwondo Trampoline

Triathlon
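The three-way split between 1896 and 2012 (inherited, disappeared, added) corresponds to plain set operations on the two discipline lists. The following is a minimal Python sketch, assuming the 1896 and 2012 discipline sets are rebuilt from the names given in the text and in Table 2.1; the variable names are illustrative and are not taken from the thesis.

```python
# Discipline names come from the text and Table 2.1; all variable names are hypothetical.
listed_inherited = {
    "Athletics", "Cycling Road", "Cycling Track", "Fencing",
    "Shooting", "Swimming", "Tennis", "Weightlifting",
}
listed_disappeared = {"Artistic G", "Wrestling Gre-R"}
listed_added = {
    "Archery", "Badminton", "Basketball", "Beach Volleyball", "Boxing",
    "Canoe Slalom", "Canoe Sprint", "Cycling BMX", "Diving", "Dressage",
    "Eventing", "Football", "Gymnastics Artistic", "Gymnastics Rhythmic",
    "Handball", "Hockey", "Judo", "Jumping", "Marathon swimming",
    "Modern Pentathlon", "Mountain Bike", "Rowing", "Sailing",
    "Synchronized Swimming", "Table Tennis", "Taekwondo", "Trampoline",
    "Triathlon",
}

# Rebuild the two yearly sets, then recover the three groups by set algebra.
disciplines_1896 = listed_inherited | listed_disappeared
disciplines_2012 = listed_inherited | listed_added

inherited = disciplines_1896 & disciplines_2012    # held in both years
disappeared = disciplines_1896 - disciplines_2012  # held in 1896 only
added = disciplines_2012 - disciplines_1896        # held in 2012 only

print(len(inherited), len(disappeared), len(added))
```

With real data the two yearly sets would be read from the dataset rather than rebuilt from the lists, and the same three expressions would apply unchanged.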

2.1.2 Statistics of Medals

Table 2.2 shows the number of medals of each kind awarded at each Olympic Games. We also count the number of medals won by each country in each year. To compare the participating countries in the Summer Olympic Games, Figure 2.3 shows the five countries with the most medals in each year. The total number of medals clearly increases from year to year.
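The counts behind Table 2.2 and Figure 2.3 can be sketched as two group-and-count steps. The Python below is only an illustration: the `rows` list is hypothetical sample data in an assumed (year, country, medal) layout, and the function names are not from the thesis.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (year, country code, medal type).
rows = [
    (1896, "GRE", "Gold"),
    (1896, "USA", "Gold"),
    (1896, "USA", "Silver"),
    (1896, "GER", "Bronze"),
    (1900, "FRA", "Gold"),
    (1900, "FRA", "Silver"),
    (1900, "USA", "Bronze"),
]

def medals_by_year(rows):
    """Count medals of each type per year (the layout of Table 2.2)."""
    counts = defaultdict(Counter)
    for year, _country, medal in rows:
        counts[year][medal] += 1
    return counts

def top_countries_by_year(rows, n=5):
    """Return the n countries with the most medals in each year."""
    counts = defaultdict(Counter)
    for year, country, _medal in rows:
        counts[year][country] += 1
    return {year: c.most_common(n) for year, c in counts.items()}

per_year = medals_by_year(rows)
leaders = top_countries_by_year(rows, n=5)
print(dict(per_year[1896]))
print(leaders[1900])
```

`Counter.most_common` breaks ties by insertion order, so a real analysis may want an explicit tie-breaking rule before plotting the top five.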

Table 2.2: Each year the number of Bronze Medals, Gold Medals and Silver Medals.

Year Bronze Gold Silver Year Bronze Gold Silver

1924 18 17 17 1928 16 15 13

1932 15 15 15 1936 18 18 18

1948 22 23 24 1952 24 23 23

1956 25 25 24 1960 28 29 27

1964 32 35 40 1968 33 36 38

1972 36 37 35 1976 39 39 39

1980 40 40 41 1984 41 41 41

1988 48 48 48 1992 58 59 60

1994 63 63 63 1998 70 71 70

2002 80 82 77 2006 86 86 86

2010 87 88 89 2014 104 104 102

Figure 2.3: The total number of medals of top five countries in each year.

We also compute the cumulative number of medals won by each country, as shown in Table 2.3. By the end of 2012, the United States of America had won a total of 2,386 medals, ranking first in the medals list. The Soviet Union was second with 981 medals, and Germany was third with 508 medals.

Table 2.3: List each country with the total number of medals.

Country Total Number of Medal Country Total Number of Medal

USA 2386 URS 981

GER 508 GBR 453

GDR 409 CHN 392

RUS 381 SWE 299

FRA 264 AUS 252

FRG 204 HUN 196

ITA 148 ITA 148

FIN 132 EUA 118

EUN 111 BUL 75

CAN 64 POL 58

BEL 51 ROU 51

GRE 45 CUB 39

JPN 27 ZZX 11
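Read as a ranking, the first rows of Table 2.3 can be sorted directly. A short sketch, assuming the totals are copied from the table into a dictionary (only the first six entries are used here, and the names are illustrative):

```python
# Totals copied from the first rows of Table 2.3 (IOC country codes).
total_medals = {
    "USA": 2386,
    "URS": 981,
    "GER": 508,
    "GBR": 453,
    "GDR": 409,
    "CHN": 392,
}

# Sort by medal total, largest first, and keep the top three.
ranking = sorted(total_medals.items(), key=lambda item: item[1], reverse=True)
top_three = ranking[:3]
print(top_three)
```

In the table's own codes, the third place with 508 medals belongs to GER.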

2.1.3 Statistics of the Historical Hosting Cities

Holding grand opening and closing ceremonies has long been an Olympic tradition. With the improvement of people’s living standards, the scale of the ceremonies has become greater and greater. The tradition has lasted for good reasons: the grander the ceremony, the more attention it attracts. In addition, a grand ceremony encourages the participants, which makes the event more interesting, and it is a good way to spread the spirit of the Games.

Table 2.4 lists the host cities of each Summer Olympic Games. As of 2012, London has hosted the Games three times. Athens, Los Angeles, and Paris have each hosted twice, and every other city has hosted only once.

Table 2.4: List each year Summer Olympics in which city.

City                    Year    City           Year
Athens                  1896    Paris          1900
St Louis                1904    London         1908
Stockholm               1912    Antwerp        1920
Paris                   1924    Amsterdam      1928
Los Angeles             1932    Berlin         1936
London                  1948    Helsinki       1952
Melbourne / Stockholm   1956    Rome           1960
Tokyo                   1964    Mexico         1968
Munich                  1972    Montreal       1976
Moscow                  1980    Los Angeles    1984
Seoul                   1988    Barcelona      1992
Atlanta                 1996    Sydney         2000
Athens                  2004    Beijing        2008
London                  2012
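The repeat-host counts can be checked with a simple tally. This is a minimal sketch, assuming the host list follows Table 2.4 with Paris (1900) and Beijing (2008) as the hosts of those two Games; the variable names are illustrative.

```python
from collections import Counter

# Host cities as in Table 2.4; the 1956 entry keeps the table's combined label.
hosts = [
    ("Athens", 1896), ("Paris", 1900), ("St Louis", 1904), ("London", 1908),
    ("Stockholm", 1912), ("Antwerp", 1920), ("Paris", 1924),
    ("Amsterdam", 1928), ("Los Angeles", 1932), ("Berlin", 1936),
    ("London", 1948), ("Helsinki", 1952), ("Melbourne / Stockholm", 1956),
    ("Rome", 1960), ("Tokyo", 1964), ("Mexico", 1968), ("Munich", 1972),
    ("Montreal", 1976), ("Moscow", 1980), ("Los Angeles", 1984),
    ("Seoul", 1988), ("Barcelona", 1992), ("Atlanta", 1996),
    ("Sydney", 2000), ("Athens", 2004), ("Beijing", 2008), ("London", 2012),
]

# Tally how many times each city label appears, then keep the repeat hosts.
times_hosted = Counter(city for city, _year in hosts)
repeat_hosts = {city: n for city, n in times_hosted.items() if n > 1}
print(repeat_hosts)
```

Because the 1956 Games are stored under the combined label "Melbourne / Stockholm", Stockholm is counted once (1912) in this tally.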

2.1.4 Statistical Analysis of the Characteristics of Winners

In addition, to examine the total number of medals won by men and by women from 1896 to 2012, we count Olympic medals by gender. As Table 2.5 shows, there is a significant difference between men and women. One possible reason is the way the sports program of the Olympic Games was set up.

In addition, we need to explore whether the gender difference has existed since the start of the Summer Olympics, or whether it became prominent at a certain time. Therefore, we analyze the medals won by men and by women at each Olympic Games.

Table 2.5: List three different kinds of medals total number of Summer Olympics in men and women.

Gender   Total Bronze   Total Gold   Total Silver
Men      3798           3564         3528
Women    1328           1261         1274
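The size of the difference in Table 2.5 can be made concrete by summing each row. A small Python sketch, assuming the six counts are copied from the table (the names are illustrative):

```python
# Medal totals by gender, copied from Table 2.5.
table_2_5 = {
    "Men": {"Bronze": 3798, "Gold": 3564, "Silver": 3528},
    "Women": {"Bronze": 1328, "Gold": 1261, "Silver": 1274},
}

men_total = sum(table_2_5["Men"].values())
women_total = sum(table_2_5["Women"].values())
women_share = women_total / (men_total + women_total)

print(men_total, women_total, round(women_share, 3))
```

On these figures, women account for roughly a quarter of all medals in the table.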

From the statistical results, we can see that there were no women’s sports at the first Summer Olympics in 1896 and that women’s sports were gradually introduced from the second Summer Olympics onward. The medal counts suggest that, for historical reasons, women still take part in fewer events. We chose to compare the female winners of the second Summer Olympic Games in 1900 with the female winners of the Summer Olympics in 2012. In 1900 there were two women’s disciplines, Tennis with 3 events and Golf with 7 events, whereas in 2012 there were thirty-eight women’s disciplines, as listed in Table 2.6.

Table 2.6: 38 disciplines with total 430 events for women in 2012.

Total Number of Discipline and Total Number of Event

Archery 6 Athletics 69 Badminton 8

Basketball 3 Beach Volleyball 3 Boxing 12

Canoe Slalom 3 Canoe Sprint 12 Cycling BMX 3

Cycling Road 6 Cycling Track 15 Diving 12

Dressage 6 Eventing 5 Fencing 15

Football 3 Gymnastics Artistic 18 Gymnastics Rhythmic 6

Handball 3 Hockey 3 Judo 28

Marathon swimming 3 Modern Pentathlon 3 Mountain Bike 3

Rowing 18 Sailing 12 Shooting 18

Swimming 48 Synchronized Swimming 6 Table Tennis 6

Taekwondo 16 Tennis 9 Trampoline 3

Triathlon 3 Volleyball 3 Water Polo 3

Weightlifting 21 Wrestling Freestyle 16
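The caption totals of Table 2.6 can be verified by summing the entries. A minimal Python check, assuming the values are copied from the table as printed (the dictionary name is illustrative):

```python
# Entries of Table 2.6 for women in 2012, keyed by discipline.
women_2012 = {
    "Archery": 6, "Athletics": 69, "Badminton": 8,
    "Basketball": 3, "Beach Volleyball": 3, "Boxing": 12,
    "Canoe Slalom": 3, "Canoe Sprint": 12, "Cycling BMX": 3,
    "Cycling Road": 6, "Cycling Track": 15, "Diving": 12,
    "Dressage": 6, "Eventing": 5, "Fencing": 15,
    "Football": 3, "Gymnastics Artistic": 18, "Gymnastics Rhythmic": 6,
    "Handball": 3, "Hockey": 3, "Judo": 28,
    "Marathon swimming": 3, "Modern Pentathlon": 3, "Mountain Bike": 3,
    "Rowing": 18, "Sailing": 12, "Shooting": 18,
    "Swimming": 48, "Synchronized Swimming": 6, "Table Tennis": 6,
    "Taekwondo": 16, "Tennis": 9, "Trampoline": 3,
    "Triathlon": 3, "Volleyball": 3, "Water Polo": 3,
    "Weightlifting": 21, "Wrestling Freestyle": 16,
}

n_disciplines = len(women_2012)
total = sum(women_2012.values())
print(n_disciplines, total)
```

Both numbers agree with the caption: 38 disciplines and a total of 430.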

In addition, the difference between the number of medals won by men and by women first showed an increasing trend, with the greatest difference, 380, occurring in 1920. The difference then gradually decreased, and by 2012 it had fallen to 101, as shown in Figure 2.4 and Figure 2.5. Various factors contributed to this, such as the constant improvement of the event program, the improvement of training levels, and growing acceptance of equality between men and women. Women’s participation in the Olympic Games is gradually gaining recognition in the medal counts.
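The yearly difference described here is the men's medal count minus the women's medal count for each Games. The following Python sketch shows one way to compute it; the `rows` sample is hypothetical, the (year, gender) layout is an assumption, and the function name is not from the thesis.

```python
from collections import Counter

def gender_gap_by_year(rows):
    """Return {year: men's medals minus women's medals}, ordered by year."""
    men, women = Counter(), Counter()
    for year, gender in rows:
        if gender == "Men":
            men[year] += 1
        elif gender == "Women":
            women[year] += 1
    years = sorted(set(men) | set(women))
    return {year: men[year] - women[year] for year in years}

# Hypothetical sample: one (year, gender) pair per medal won.
rows = (
    [(1896, "Men")] * 3
    + [(1900, "Men")] * 5 + [(1900, "Women")] * 1
    + [(2012, "Men")] * 3 + [(2012, "Women")] * 2
)

gaps = gender_gap_by_year(rows)
widest_year = max(gaps, key=gaps.get)  # year with the largest difference
print(gaps, widest_year)
```

Applied to the full dataset, the same two steps would give the per-year differences and the year in which the difference peaks.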

Figure 2.4: The sex difference between men and women.

Figure 2.5: The number of winner about men and women in the same Summer Olympic Games.

2.1.5 Comparison of Medals on the Same Gender in Different Countries

In addition, to explore how countries differ within the same gender, we count the medals by country and gender and examine which gender each country is stronger in. The results show that some countries have significant advantages in women’s events and some have significant advantages in men’s events. To measure this, for each country we subtract the number of medals won by women from the number won by men. We can see that some countries have advantages in the women’s program, such as , and some have advantages in the men’s program, such as the United States. For some countries, however, neither advantage is obvious.

2.2 The Basic Statistical Analysis of Winter

Winter sports are an important part of modern sports. They are conducted under harsh natural conditions of ice and snow, and they play a role in the training of young people that the summer sports program cannot [Wikipedia, 2018b].

The first Winter Olympics in 1924 made up for this insufficiency of the Summer Olympics. Like the Summer Games, the Winter Olympics are held every four years. They were originally held in the same year as the Summer Olympic Games; the 1992 Winter Olympics were the last to be so. Since then the two Games have been staggered, with the Winter Olympics held two years after the Summer Olympics.

Figure 2.6: The total number of disciplines increase year by year in winning games.

2.2.1 Statistics of Each Sport Event

We analyzed the sports and events held at the Winter Olympics from 1924 to 2014. As in the Summer Olympic Games, each sport is divided into several disciplines, and each discipline is divided into many events.

Analyzing all sports from 1924 to 2014 gives the following results. The first Winter Olympic Games in 1924 had only six major sports: Biathlon, Bobsleigh, Curling, Ice Hockey, Skating and Skiing. The six Winter Olympic Games from 1928 to 1956 were held without Biathlon, while the traditional winter sports of Bobsleigh, Hockey, Skating and Skiing have continued through 2014.

Next we analyze the disciplines of winter sports. There were 15 disciplines at the 1924 Winter Olympic Games, and this had expanded to 65 disciplines by the 2014 Games, as Figure 2.6 shows. Next, we compare the disciplines held at different Winter Olympics. The number of disciplines held in 2014 is much higher than at the 1924 Winter Olympic Games. Therefore, we use the 1924 Winter Olympic Games as a reference and compare them with 2014; Table 2.7, Table 2.8 and Table 2.9 show which events have been inherited, added and dropped.

Table 2.7: 2014 compared to 1924, it has been inherited 10 events of disciplines

Event of Disciplines

10000M of Skating 1500M of Skating

5000M of Skating 500M of Skating

50KM of Skiing Curling of Curling

Four-Man of Bobsleigh Ice Hockey of Ice Hockey

Individual of Skating Pairs of Skating

Table 2.8: 2014 compared to 1924, it has been added 6 event of disciplines

Event of Disciplines

1000M of Skating 10KM of Biathlon

10KM of Skiing 10KM Pursuit of Biathlon

12.5KM Mass Start of Biathlon 12.5Km Pursuit of Biathlon

2.2.2 Statistics of Medals

We count the number of medals won by each country at each Winter Olympic Games; Figure 2.7 compares the participating countries. The top five countries by total number of medals are Norway, the United States, Austria, Germany and Finland.

Table 2.9: 2014 compared to 1924, it has been disappeared 4 event of disciplines

Event of Disciplines
18KM of Skiing                    Combined (4 Events) of Skating
K90 Individual (70M) of Skiing    Military Patrol of Biathlon

Figure 2.7: The total number of medals of top five countries in each winter year.

To identify more objectively the advantages of different countries in different competitions, we counted the medals won by each country in each major event and selected the top five countries for analysis by medal type, as shown in Figure 2.8.

Figure 2.8: The top five countries from 1924 to 2014 each medal compared.

The results show that Norway won the most gold medals in the Winter Olympics, 73, followed by the United States with 49. For this reason, Table 2.10 breaks down the three kinds of medals won by Norway and the United States by event. Norway, first in medals won, is strong in the 1500M and 10KM Pursuit events, and the United States, second in medals won, is strong in the 1000M and 1500M events.

Table 2.10: NOR as the top one of won medals and USA as the top two of won medals.

                     Medals of NOR           Medals of USA
Event                Bronze  Gold  Silver    Bronze  Gold  Silver
10000M               6       4     7         0       2     1
1000M                2       0     1         9       8     6
10KM                 6       5     5         0       0     0
10KM Pursuit         0       5     1         0       0     0
12.5KM Mass Start    1       0     0         0       0     0
12.5Km Pursuit       0       1     1         0       0     0
1500M                8       8     7         4       5     5
15KM                 2       4     4

2.2.3 Statistics of the Historical Hosting Cities

All Winter Olympic events take place on ice or snow, so host cities must meet certain weather requirements. Hosting the Winter Olympic Games lets a city show a new image to the world, so countries compete for the right to host, and winning that right is a great honor. We also examine whether any city has hosted more than once. As of 2014, Innsbruck, Lake Placid and St. Moritz have each hosted twice. The other cities on the list have hosted only once, as Table 2.11 shows.

Table 2.11: List each year Winter Olympics in which city.

City                 Year    City                      Year
Chamonix             1924    St. Moritz                1928
Lake Placid          1932    Garmisch-Partenkirchen    1936
St. Moritz           1948    Oslo                      1952
Cortina d’Ampezzo    1956    Squaw Valley              1960
Innsbruck            1964    Grenoble                  1968
Sapporo              1972    Innsbruck                 1976
Lake Placid          1980    Sarajevo                  1984
Calgary              1988    Albertville               1992
Lillehammer          1994    Nagano                    1998
Salt Lake City       2002    Turin                     2006
Vancouver            2010    Sochi                     2014

2.2.4 Statistical Analysis of the Characteristics of Winners

The numbers of men and women winning medals at the Winter Olympics have both increased gradually since 1924, but the number of women has increased faster, as Figure 2.9 shows. The greatest difference, 138, occurred in 1988. The difference in the number of medals between men and women then gradually decreased, and by 2010 it had fallen to 63, as shown in Figure 2.10.

23 ,,

Gender !n f.l n ~ • V.or.,:r i ,. . •

1\i.!' ·1 192 ' 8 ' iiSl 'h~. ' 11M8 ' Hk' ' 11!.' ·;i 1ijtil. ' ' l:lt.i'l 1iJS 11/ '' l '9i6 1i:ll:J 111:11 111~ ' Ii~ ' lllM H,.'~l: 2U02 21Jtl(i 2✓'1 l. :!1.11 ' • Th:...,ord'A1t1:,:1;."(),-~o

Figure 2.9: The sex difference between men and women.

,

,,'->-- •'

...., t: iF.:"l,1P.r ~ ; ~) - ., Ur. · ' wonen •~ .~,,... ---.. --- .- \ , ~ D ·-. ___ ' --· ,.

, .. ,,.., ..J ·,v~, oi.,,,.,,.

Figure 2.10: The number of winning men and women in the same Summer Olympic

Games.

In addition, to better identify the disciplines open to women in the Winter Olympic Games, we compare the women’s disciplines and events in 1924 and 2014 to find which disciplines are traditional and which are newly added. In the 1924 Winter Olympic Games women took part in only one discipline, Skating, which was divided into an Individual event and a Pairs event. In 2014 there were seven women’s disciplines divided into eighty-three events, as shown in Table 2.12.

Table 2.12: 7 disciplines with a total 83 events for women in 2014.

Discipline and Event

Biathlon 18 Bobsleigh 6 Curling 3

Ice Hockey 3 6 Skating 42

Skiing 66

2.2.5 Comparison of Medals on the Same Gender in Different Countries

Next, we explore how countries differ within the same gender.

As Figure 2.11 shows, some countries have significant advantages in women's sports and others in men's sports. To quantify this, we subtract the number of medals won by women from the number of medals won by men in each country; the result is shown in Figure 2.12. Germany, China, and Ukraine have an advantage in the women's program, while Norway, Finland, Austria, and other countries have an advantage in the men's program. For some countries, including Spain and Hungary, neither advantage is obvious.

Figure 2.11: The number of medals won by men and women in the same country.

Figure 2.12: The number of medals won by men minus the number of medals won by women in the same country.

2.3 The Joint Analysis of the Olympic Games

Through independent analysis, we have a preliminary understanding of the Summer Olympic Games and the Winter Olympic Games. This section mainly analyzes the differences between the same attributes of the Summer Olympic Games and the Winter Olympic Games through joint analysis.

2.3.1 Examining the Same Countries’ Summer and Winter Olympic Performance

Many countries participate in both the Summer Olympic Games and the Winter Olympic Games. Here, we analyze how the same country performs, in terms of medals won, at the Winter Olympics and at the Summer Olympics.

We take as examples the countries that participated in both the Summer and Winter Olympic Games from 2000 onward. The Summer Olympic Games were held in 2000, 2004, 2008 and 2012; the Winter Olympic Games were held in 2002, 2006, 2010 and 2014. First, we identify the countries that participated in both the Summer and Winter Olympic Games from 2000 to 2014, as shown in Table 2.13.

We then count the medals won by these countries between 2000 and 2014 at the Winter and Summer Olympic Games, as shown in Figure 2.13.

Table 2.13: The 28 countries that participated in both the Summer and Winter Olympic Games from 2000 to 2014.

Country

AUS AUT BLR BUL CAN CHN CRO CZE EST FIN

FRA GBR GER ITA JPN KAZ KOR LAT NED NOR

POL RUS SLO SUI SVK SWE UKR USA

(a) The same countries at the Winter Olympics. (b) The same countries at the Summer Olympics.

Figure 2.13: Medal ranking of each country at the Summer and Winter Olympics.

We now compare the medals won by the same country at the Winter Olympics and at the Summer Olympics. In terms of total medal counts, the difference between the two Games is largest for the United States, Russia, and Australia. Because the Summer Olympics have more disciplines than the Winter Olympics, we also compare each country's percentage of the total medals awarded. In addition, we analyze the gold, silver and bronze counts country by country. The largest percentage differences in gold are for China and Canada, the largest percentage difference in silver is for Australia, and the largest percentage difference in bronze is for Norway.
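The share-based comparison can be sketched as follows. This is a minimal illustration in Python, not the code used for the thesis. The USA counts are taken from Table 2.14, while the totals of all medals awarded (`TOTAL_WINTER`, `TOTAL_SUMMER`) are placeholder values assumed only for this example.

```python
# Sketch: compare a country's Winter and Summer results by its share of all
# medals awarded, rather than by raw counts.
# USA counts come from Table 2.14; the two totals are ASSUMED placeholders.
TOTAL_WINTER = 2000  # hypothetical total of Winter medals, 2000-2014
TOTAL_SUMMER = 8000  # hypothetical total of Summer medals, 2000-2014


def medal_share(country_medals, total_medals):
    """Return the country's medals as a percentage of all medals awarded."""
    if total_medals <= 0:
        raise ValueError("total_medals must be positive")
    return 100.0 * country_medals / total_medals


usa_winter_share = medal_share(300, TOTAL_WINTER)
usa_summer_share = medal_share(1077, TOTAL_SUMMER)
share_gap = usa_summer_share - usa_winter_share

print(f"Winter share: {usa_winter_share:.2f}%")
print(f"Summer share: {usa_summer_share:.2f}%")
print(f"Summer minus Winter: {share_gap:.2f} percentage points")
```

Working with shares removes the effect of the Summer Games simply awarding more medals, which is the reason given above for comparing percentages.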

2.3.2 Strong Repeat Performances by Certain Countries in Certain Sports

Building on the analysis in Section 2.3.1 of the countries that participated in both the Summer and Winter Olympic Games, this section uses the medal rankings of the Winter and Summer Olympic Games to focus on the countries that have advantages in both types of Games. Following the ranking, we selected the three countries with the highest number of medals across the two types of Olympic Games; as Table 2.14 shows, they are the United States, Russia and Germany.

Advantages of the United States by sport in the Summer and Winter Olympic Games.

From the statistical results for the United States' Winter Olympic medals, shown in Figure 2.14, the United States has an absolute advantage in Snowboard at the Winter Olympics, followed by Short Track Speed Skating.

Figure 2.14: The winter sports events in which the USA has advantages.

Figure 2.15: The summer sports events in which the USA has advantages.

Table 2.14: The top three countries by medals won in both the Summer and Winter Olympic Games from 2000 to 2014.

Top Three Countries by Total Medals Won

Country    Sum of Winter    Sum of Summer    Sum of Two Games
USA        300              1077             1377
RUS        174              653              827
GER        205              463              668

Statistics on the advantages of the United States in the Summer Olympic Games are also available. From the statistical data shown in Figure 2.15, the United States has absolute advantages in Swimming and Athletics.

Advantages of Russia by sport in the Summer and Winter Olympic Games.

Since we analysis Russia of the medals in the Winter Olympics, from the statistical re- sults as the Figure 2.16 shown, Russia has an absolute advantage and

Cross Country Skiing at the Winter Olympics.

Statistics on the advantages of Russia in the Summer Olympic Games are also available. From the statistical data shown in Figure 2.17, Russia has an absolute advantage in Athletics.

Advantages of Germany by sport in the Summer and Winter Olympic Games.

From the statistical results for Germany's Winter Olympic medals, shown in Figure 2.18, Germany has an absolute advantage in Biathlon and Luge at the Winter Olympics.

Figure 2.16: The winter sports events in which RUS has advantages.

Figure 2.17: The summer sports events in which RUS has advantages.

Figure 2.18: The winter sports events in which GER has advantages.

Figure 2.19: The summer sports events in which GER has advantages.

Statistics on the advantages of Germany in the Summer Olympic Games are also available. From the statistical data shown in Figure 2.19, Germany has absolute advantages in Boxing and Athletics. From the analysis in this section, we can see that the top medal-winning countries are all very good at Athletics in the Summer Olympics. At the Winter Olympics, the top-ranked United States has an absolute advantage in Snowboard, while Russia and Germany have advantages in Skiing and Skating.

2.3.3 Comparing Performance of Men and Women within the Same Country

We analyze the total number of medals each country won at the Winter and Summer Olympics, separately by gender. First, we analyze gold medals across the 20 countries by gender. Among men, the countries with the greatest differences in gold medals are the United Kingdom, China, and Croatia. Among women, the countries with the greatest differences are Croatia, Belarus, and the Slovak Republic. Combining the two results, Croatia shows the greatest difference in gold medals between the two Olympics.

Next, we analyze silver medals across the 20 countries by gender. Among men, the countries with the greatest differences in silver medals are the United Kingdom and Croatia. Among women, the countries with the greatest differences are Latvia and Kazakhstan.

Lastly, we analyze bronze medals across the 20 countries by gender. Among men, the countries with the greatest differences in bronze medals are Kazakhstan and Ukraine. Among women, the countries with the greatest differences are Latvia and Croatia. From this analysis, we find that women won more medals in the Winter than in the Summer Olympics, whereas the medals won by men were roughly the same across the Winter and Summer Olympics.

Chapter 3

Main Data Analysis

In this part, we look for models of the relationship between the Number of Medals and Total GDP, Population Density, and the Average High Temperature in Winter.

3.1 Data Set

3.1.1 Data Description

The dataset relates each country's number of medals at the 2012 Summer and 2014 Winter Olympics to country-level variables. There are 7 variables in total: Country Name, Population (one unit is one thousand people), GDP per Capita, Total GDP (one unit is one thousand dollars), Area, Pop Density (one unit is one thousand people per square kilometer), and the average high temperature in winter in each country's capital. Winter is taken as January for northern-hemisphere countries such as the United States and the United Kingdom, and as July for southern-hemisphere countries such as Australia, because the seasons in the southern hemisphere are opposite to those in the northern hemisphere: when the northern hemisphere is tilted toward the sun in its summer, the southern hemisphere is tilted away from it and receives less sunlight. Two relationships hold among these variables:

TotalGDP = Population × GDPperCapita

PopulationDensity = Population ÷CountryArea
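As a small worked illustration of these two relationships, the sketch below computes both derived variables in Python. The function names and the input values are hypothetical and only show how the units combine: a population in thousands multiplied by GDP per capita in dollars gives total GDP in thousands of dollars.

```python
# Sketch of the two derived variables; all inputs below are made-up values.
def total_gdp(population_thousands, gdp_per_capita):
    """Total GDP in thousands of dollars: Population x GDP per Capita."""
    return population_thousands * gdp_per_capita


def population_density(population_thousands, area_sq_km):
    """Thousands of people per square kilometer: Population / Area."""
    if area_sq_km <= 0:
        raise ValueError("area_sq_km must be positive")
    return population_thousands / area_sq_km


# Hypothetical country: 1,000 thousand people, $50,000 per capita, 500 km^2.
print(total_gdp(1000, 50000))
print(population_density(1000, 500))
```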

3.1.2 Method for Making Model

Linear regression is a linear approach to modeling the relationship between a dependent variable and one or more independent variables. The case of one explanatory variable is called simple linear regression; we can use this model to predict Y when X is known. The model can be written as Equation 3.1:

$$Y = \beta_0 + \beta_1 X + \varepsilon \tag{3.1}$$

where β0 is the intercept and β1 is the slope; together they are called the regression coefficients. ε is the error term, the part of Y that the regression model is unable to explain. The case of several explanatory variables is called multiple linear regression, and the model can be written as Equation 3.2:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_p X_p + \varepsilon \tag{3.2}$$
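To make Equation 3.1 concrete, the following sketch estimates the intercept and slope by ordinary least squares using the closed-form formulas. It is a minimal Python illustration with made-up data, not the R code used for the analysis, and the function name `fit_simple_linear` is an assumption of this example.

```python
# Sketch: closed-form least-squares estimates for Y = b0 + b1*X + error.
def fit_simple_linear(xs, ys):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data only (not from the thesis dataset).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
b0, b1 = fit_simple_linear(xs, ys)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

The multiple regression of Equation 3.2 follows the same least-squares principle, with the closed form replaced by the solution of the normal equations.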

LOESS is one of many “modern” modeling methods that build on classical methods, such as linear and nonlinear least squares regression [Jin and Lin, 2012]. LOESS combines much of the simplicity of linear least squares regression with the flexibility of nonlinear regression. It does this by fitting simple models to localized subsets of the data to build up a function that describes the deterministic part of the variation in the data, point by point [Baumer et al., 2017].

Poisson regression is a generalized linear model form of regression analysis used to model count data and contingency tables [RamachandranA, 2017]. Poisson regression assumes that the response variable Y has a Poisson distribution and that the logarithm of its expected value can be modeled by a linear combination of unknown parameters.

A GAM (generalized additive model) is a class of statistical models in which the usual linear relationship between the response and the predictors is replaced by several nonlinear smooth functions that capture the nonlinearities in the data [RamachandranA, 2017]. GAMs are a flexible and smooth technique for fitting models in which the response may depend either linearly or nonlinearly on several predictors [James et al., 2013]. In other words, GAMs generalize linear models by letting each predictor enter through its own smooth function. This modified regression function fits the data smoothly and flexibly without adding much complexity. The model can be written as Equation 3.3, where f_1, ..., f_p are different nonlinear functions of the variables X:

$$Y = \alpha + f_1(X_{i1}) + f_2(X_{i2}) + f_3(X_{i3}) + \cdots + f_p(X_{ip}) + \varepsilon \tag{3.3}$$
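The Poisson regression described above can be sketched for a single predictor. The Python example below fits log E[Y] = b0 + b1·x by Newton-Raphson on the Poisson log-likelihood. It is a simplified illustration under stated assumptions (one predictor, made-up counts, a hypothetical function name `fit_poisson`), not the fitting routine used in the thesis.

```python
import math


def fit_poisson(xs, ys, iterations=50, tol=1e-10):
    """Fit log E[Y] = b0 + b1*x by Newton-Raphson on the Poisson likelihood."""
    # Start from the intercept-only fit: b0 = log(mean(y)), b1 = 0.
    b0 = math.log(sum(ys) / len(ys))
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Information matrix [[h00, h01], [h01, h11]].
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        if det == 0:
            raise ValueError("singular information matrix")
        # Newton step: solve the 2x2 system H * step = g.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break
    return b0, b1


# Illustrative counts only (not from the thesis dataset).
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, 3, 4, 6, 12]
b0, b1 = fit_poisson(xs, ys)
print(f"log E[Y] = {b0:.4f} + {b1:.4f} * x")
```

Because the model is linear on the log scale, exp(b1) is the factor by which the expected count changes when x increases by one unit.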

3.2 Statistical Model for Analysis in Summer Olympic Games in 2012

3.2.1 Relationships of Single Variables with Medal Count in 2012

The Pearson correlation coefficient indicates the strength of a linear relationship between two variables. When we examined the correlation between the medal count and each single variable, only the log base two of GDP (in billions of dollars) and the average high temperature in winter showed a linear relationship with the Number of Medals. So we focus on these two variables.

• Number of Medals and Log Base 2 of GDP in Billion Dollar

As the linear regression line in the plot of Number of Medals = f(log base two of GDP in billions) in Figure 3.1 shows, the number of medals has a certain linear correlation with the log base two of GDP: the greater the GDP, the greater the medal count. On closer study of the plot, however, some countries, such as China, Russia and the United Kingdom, win more medals than their GDP alone would suggest, while in other countries the number of medals falls short of what GDP would suggest.

For Russia, the United Kingdom and other countries that win many medals relative to their GDP, we believe the reasons are the following: (1) The perception of sports throughout the country. These countries place more value on Olympic medals, so they are more willing to invest people, money and materials despite limited resources, which makes them more competitive in sports. (2) The stability of the national social environment. A stable social environment is the foundation for deep roots in the field of sports. (3) The population of the country. (4) Whether or not the country is the host country.

[Scatter plot of countries labeled by IOC code. Horizontal axis: Log of Base GDP in Billion Dollar; vertical axis: Number of Medals.]

Figure 3.1: The relationship between each country's log base two of GDP in billions of dollars and its number of medals in 2012.

Next, we compute the Pearson correlation, which measures the linear dependence between two variables. Between the log base two of GDP and the number of medals it is 0.6297164, with a P-value of 3.914e-10. Since the P-value is smaller than 0.05, the result is significant and indicates a strong positive linear relationship. The Pearson test is also known as a parametric correlation test because it depends on the distribution of the data: it can be used only when the log base two of GDP and the number of medals are normally distributed. The smooth curve in the plot of Number of Medals = f(log base two of GDP) in Figure 3.2 shows the relationship between these two variables more clearly.
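For reference, the Pearson correlation and the t statistic on which its significance test is based can be computed as in the sketch below. This is a plain Python illustration with made-up data. The function names are assumptions of this example, and the P-value itself (which needs the t distribution with n - 2 degrees of freedom) is not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("each sample must have nonzero variance")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Illustrative data only (not from the thesis dataset).
log_gdp = [4.0, 6.5, 7.0, 9.2, 10.1, 12.3]
medals = [1, 2, 6, 10, 18, 40]
r = pearson_r(log_gdp, medals)
print(f"r = {r:.4f}, t = {t_statistic(r, len(medals)):.4f}")
```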

[Scatter plot with smooth curve. Horizontal axis: Log of Base GDP in Billion Dollar; vertical axis: Number of Medals.]

Figure 3.2: The relationship between each country's log base two of GDP and its number of medals in 2012.

• Number of Medals and Average High Temperature in Winter

Studies have shown that ethnic groups living in different climate zones have their own strengths in sports. People who live in monsoon regions, with distinct seasons, generally have better flexibility, and their advantages in table tennis, diving, and other technical events are obvious. Other people have lived for a long time in hot, tropical regions where the average annual temperature is over 20 degrees Celsius. They have strong heat resistance and temperature regulation and are suited to explosive and endurance events such as jumping. Still others live in cold regions, where summers are not hot, and have tall, strong builds adapted to the cold climate, which gives them advantages in athletics, ball games, swimming and other power events.

Based on this, we expect a relationship between temperature and the number of medals. The number of medals has a linear correlation with temperature, and both very high and very low temperatures reduce the medal count. The Pearson correlation between temperature and the number of medals is -0.2660228, with a P-value of 0.01708. Since the P-value is smaller than 0.05, the result is significant and indicates a negative linear relationship. Most medals are won by countries where the temperature is between 0 and 15 degrees Celsius, because good weather offers more opportunity for training and practicing Olympic sports.

[Scatter plot. Horizontal axis: Average High Temperature in Winter; vertical axis: Number of Medals.]

Figure 3.3: The relationship between each country's average high temperature in winter and its number of medals in 2012.

[Scatter plot with points colored by Temperature. Horizontal axis: Log of Base GDP in Billion Dollar; vertical axis: Number of Medals.]

Figure 3.4: The relationship among the log base two of GDP in billions, the average high temperature in winter, and the number of medals in 2012.

3.2.2 Medal Count Prediction Using GDP and Average High Winter Temperature

Both the Log of Base Two GDP in Billion Dollar and the Average High Temperature in Winter have a linear relationship with the Number of Medals. We therefore use linear regression to model the relationship between one dependent variable, the Number of Medals, and two independent variables, the Log of GDP in Billion Dollar and the Average High Temperature in Winter. First, we check the relationship between the Log of GDP in Billion Dollar and the Average High Temperature in Winter, shown in Figure 3.4.

The multiple linear regression model can be written as Equation 3.4:

$$Y = \beta_0 + \beta_1 \log_2(\mathrm{GDP}) + \beta_2\,\mathrm{Temperature} + \varepsilon \tag{3.4}$$

R's base installation provides numerous methods for evaluating the statistical assumptions in a regression analysis. Plotting the fitted model produces four graphs that are useful for evaluating the model fit. If the dependent variable is normally distributed for a fixed set of predictor values, then the residual values should be normally distributed. Checking this, we find two outliers, USA and CHN, which the outlier test in R confirms. Next, we look for another model that fits these outliers better than the first linear regression model. The new linear regression equation is shown in Equation 3.5:

$$Y = \beta_0 + \beta_1 \log_2(\mathrm{GDP}) + \beta_2\,\mathrm{Temperature} + \beta_3\,\mathrm{Temperature}^2 + \varepsilon \tag{3.5}$$

We again check whether the residual values are normally distributed, as they should be if the dependent variable is normally distributed for a fixed set of predictor values. The outlier test in R for this model shows that USA and CHN are still outliers.

We then look for a model with a stronger linear relationship and residuals closer to a normal distribution. We use linear regression to model the relationship between one dependent variable, the Log of Base Two Number of Medals, and two independent variables, the Log of Base Two GDP in Billion Dollar and the Average High Temperature in Winter. The new linear regression equation is shown in Equation 3.6:

$$\log_2(Y) = \beta_0 + \beta_1 \log_2(\mathrm{GDP}) + \beta_2\,\mathrm{Temperature} + \varepsilon \tag{3.6}$$

This gives a new prediction equation. Because a GDP of 0 is impossible, we do not try to give a physical interpretation to the intercept; it merely becomes an adjustment constant. From the P-value column, we see that the P-values of the regression coefficients are smaller than 0.05, so the coefficients are significant. Since the multiple R-squared of this model is a little low, we try another model. The new linear regression equation is shown in Equation 3.7:

$$Y = \beta_0 + \beta_1 \log_2(\mathrm{GDP}) + \beta_2 \log_2(\mathrm{GDP})^2 + \beta_3 \log_2(\mathrm{GDP})^3 + \beta_4\,\mathrm{Temperature} + \varepsilon \tag{3.7}$$

The fitted prediction equation is shown in Equation 3.8, where X denotes the Log of GDP:

$$Y = 1.26936 + 7.20333X - 1.90886X^2 + 0.13711X^3 - 0.20791\,\mathrm{Temperature} \tag{3.8}$$

Because a GDP of 0 is impossible, we do not try to give a physical interpretation to the intercept; it merely becomes an adjustment constant. From the P-value column, we see that the P-values of the regression coefficients are smaller than 0.05, so the coefficients are significant. The multiple R-squared indicates that the model accounts for 77.31% of the variance in the number of medals. This is the highest multiple R-squared we can get, so the correlation between the actual and predicted values has increased. Holding temperature fixed, the coefficient of the log base two of GDP in billions of dollars indicates an expected increase of 7.20333 medals for every increase of 1 in that term; the coefficient of the square of the log indicates an expected decrease of 1.90886 medals for every increase of 1 in that term; and the coefficient of the cube of the log indicates an expected increase of 0.13711 medals for every increase of 1 in that term. Holding GDP fixed, the coefficient of temperature indicates an expected decrease of 0.20791 medals for every increase of 1 degree. Figure 3.5 plots the actual number of medals against the predicted number of medals, and Figure 3.6 enlarges the lower range of the same plot. We then check whether the residual values are normally distributed, as they should be if the dependent variable is normally distributed for a fixed set of predictor values; the diagnostic plots are shown in Figure 3.7. The outlier test in R for this model identifies Russia and the United Kingdom as outliers.

[Scatter plot of countries labeled by IOC code. Horizontal axis: Predicted Number of Medals; vertical axis: Actual Number of Medals.]

Figure 3.5: Each country's actual number of medals and predicted number of medals in 2012.

[Enlarged scatter plot of countries labeled by IOC code. Horizontal axis: Predicted Number of Medals; vertical axis: Actual Number of Medals.]

Figure 3.6: Each country's actual number of medals and predicted number of medals in 2012.

[Four diagnostic panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage.]

Figure 3.7: Diagnostic plots for the regression of the number of medals on the log base two of GDP in billions of dollars, its square, its cube, and temperature.
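Equation 3.8 can be evaluated directly to obtain a predicted medal count. The sketch below is a minimal Python illustration: the coefficients are those of Equation 3.8, while the function name and the example inputs (a GDP of 1024 billion dollars and a winter high of 5 degrees Celsius) are hypothetical.

```python
import math


def predict_medals_eq38(log2_gdp_billion, temperature):
    """Predicted medal count from the cubic model of Equation 3.8."""
    x = log2_gdp_billion
    return (1.26936
            + 7.20333 * x
            - 1.90886 * x ** 2
            + 0.13711 * x ** 3
            - 0.20791 * temperature)


# Hypothetical country: GDP of 1024 billion dollars, winter high of 5 degrees.
x = math.log2(1024)  # log base two of GDP in billions
print(f"predicted medals: {predict_medals_eq38(x, 5.0):.2f}")
```

Because the GDP terms enter as a cubic polynomial, the effect of a one-unit change in X depends on the current value of X; only the temperature effect is a constant 0.20791 medals per degree.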

Next, we try Poisson regression. The Poisson model fitted to this dataset is overdispersed, so we instead fit a negative binomial model; the negative binomial distribution, as a gamma mixture of Poissons, can be used to model count data with overdispersion. When the overdispersion vanishes, the negative binomial distribution reduces to a Poisson distribution. The R output provides the deviances, regression parameters, and standard errors, and tests that these parameters are 0. Note that each of the predictor variables is significant at the 0.05 level.
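A quick way to see why overdispersion matters is to compare the variance of the counts with their mean, since a Poisson variable has variance equal to its mean. The Python sketch below is a crude, unconditional check with made-up counts and a hypothetical function name; a regression-based check would instead compare the residual deviance or Pearson chi-square with the residual degrees of freedom.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean of counts must be nonzero")
    return variance(counts) / m


# Illustrative medal-like counts (not from the thesis dataset): many small
# values and a few very large ones, a typical overdispersed pattern.
counts = [0, 1, 1, 2, 2, 3, 5, 8, 38, 88, 104]
print(f"variance / mean = {dispersion_ratio(counts):.2f}")
```

A ratio far above 1 suggests that a negative binomial model, which has an extra parameter for the variance, is a better choice than a plain Poisson model.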

The prediction equation for the negative binomial model is shown in Equation 3.9:

$$\log E[\mathrm{Medals}] = 1.011838 + 0.019127\,\log_2(\mathrm{GDP})^2 - 0.022455\,\mathrm{Temperature} \tag{3.9}$$

As in a Poisson regression, the dependent variable being modeled is the log of the conditional mean, log E[Medals]. The R-squared indicates that the model accounts for 78.33% of the variance in the number of medals. Holding temperature fixed, the regression coefficient of GDP is significant: for every increase of 1 in the square of the log base two of GDP in billions of dollars, the expected number of medals is multiplied by 1.0193110, an increase of about 1.9%. Holding GDP fixed, the regression coefficient of the winter temperature is also significant: for every increase of 1 degree, the expected number of medals is multiplied by 0.9777953, a decrease of about 2.2%. Figure 3.8 plots the actual number of medals against the predicted number of medals, and Figure 3.9 enlarges the lower range of the same plot. It is usually much easier to interpret the regression coefficients in the original scale of the dependent variable (number of medals, rather than log number of medals). To accomplish this, we exponentiate the coefficients, which gives Equation 3.10:

$$\mathrm{Medals} = 2.7506529 \times 1.0193110^{\log_2(\mathrm{GDP})^2} \times 0.9777953^{\mathrm{Temperature}} \tag{3.10}$$

We then check the residuals with the diagnostic plots shown in Figure 3.10. They show a stronger linear relationship, and the model fits the normality assumption better.
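The exponentiation step can be verified numerically. The short Python sketch below takes the three coefficients of Equation 3.9, exponentiates them, and evaluates the expected medal count on the original scale. The dictionary keys, the function name and the example inputs are assumptions made for illustration.

```python
import math

# Coefficients of Equation 3.9 (log scale).
coefficients = {
    "intercept": 1.011838,
    "log_gdp_squared": 0.019127,
    "temperature": -0.022455,
}

# Exponentiating turns additive effects on the log scale into
# multiplicative factors on the expected medal count.
multipliers = {name: math.exp(value) for name, value in coefficients.items()}


def expected_medals(log2_gdp_billion, temperature):
    """Expected medal count implied by Equation 3.9."""
    linear_predictor = (coefficients["intercept"]
                        + coefficients["log_gdp_squared"] * log2_gdp_billion ** 2
                        + coefficients["temperature"] * temperature)
    return math.exp(linear_predictor)


for name, factor in multipliers.items():
    print(f"{name}: {factor:.7f}")

# Hypothetical country: log base two of GDP equal to 10, winter high of 5.
print(f"expected medals: {expected_medals(10.0, 5.0):.2f}")
```

A factor above 1 (the GDP term) raises the expected count and a factor below 1 (temperature) lowers it, each by a fixed percentage per unit change.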

[Scatter plot of countries labeled by IOC code. Horizontal axis: Predicted Number of Medals; vertical axis: Actual Number of Medals.]

Figure 3.8: Each country's actual number of medals and predicted number of medals in 2012.

Finally, comparing the linear regression models with the negative binomial model, we pick the negative binomial model as our predictor of Summer Olympic medals. We can use this model to predict how many medals each country will win at the next Summer Olympics.

[Enlarged scatter plot of countries labeled by IOC code. Horizontal axis: Predicted Number of Medals; vertical axis: Actual Number of Medals.]

Figure 3.9: Each country's actual number of medals and predicted number of medals in 2012.

Now we fit a GAM that is nonlinear in the log of GDP and in Temperature, each with 3 degrees of freedom and each fitted using smoothing splines. The residual values should be normally distributed, and we see that this model shows a stronger relationship and fits the normality assumption better. Medals at first increase slowly with the log of GDP, and once the log of GDP reaches around 8 the number of medals increases quickly, as Figure 3.11 shows. For Temperature, the medals tend to decrease overall, although there appears to be a rapid increase in medals at around -10 to 0 degrees Celsius, as Figure 3.12 shows. The curvy shapes for the log of GDP and Temperature are due to the smoothing splines, which model the nonlinearities in the data. From the P-value column, we see that the P-values of the smooth terms are smaller than 0.05. The R-squared indicates that the model accounts for 77.60% of the variance in the number of medals. The model can be written as Equation 3.11, where the functions are different nonlinear functions of the variables X: f_1 is the smoothing spline fitted to the log base two of GDP in billions of U.S. dollars in 2012, and f_2 is the smoothing spline fitted to the temperature in 2012:

$$Y = \alpha + f_1(X_{i1}) + f_2(X_{i2}) + \varepsilon \tag{3.11}$$

Figure 3.13 plots the actual number of medals against the predicted number of medals, and Figure 3.14 enlarges the lower range of the same plot.

[Four diagnostic panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage.]

Figure 3.10: Diagnostic plots for the regression of the number of medals on the square of the log base two of GDP in billions of U.S. dollars and temperature.

[Smooth-term plot. Horizontal axis: log_GDP_brillion; vertical axis: s(log_GDP_brillion, 3).]

Figure 3.11: Smoothing spline fitted to the log base two of GDP in billions of U.S. dollars in 2012.

[Smooth-term plot. Horizontal axis: Temperature; vertical axis: s(Temperature, 3).]

Figure 3.12: Smoothing spline fitted to the temperature in 2012.

[Scatter plot of countries labeled by IOC code. Horizontal axis: Predicted Number of Medals; vertical axis: Actual Number of Medals.]

Figure 3.13: Each country's actual number of medals and predicted number of medals in 2012.

[Enlarged scatter plot of countries labeled by IOC code. Horizontal axis: Predicted Number of Medals; vertical axis: Actual Number of Medals.]

Figure 3.14: Each country's actual number of medals and predicted number of medals in 2012.

3.3 Statistical Model for Analysis in Winter Olympic Games in 2014

3.3.1 Relationships Between Medal Count and Single Variables in 2014

The Pearson correlation coefficient indicates the strength of a linear relationship between two variables. When we looked for correlations between the number of medals and single variables, only the log base two of GDP in billions of dollars and the average high temperature in winter were retained as candidates for a linear relationship with the number of medals. So we focus on these two variables.

• Number of Medals and Log Base 2 of GDP in Billion Dollar

Figure 3.15: The relationship between each country's log base two of GDP in billions of dollars and its number of medals in 2014 (x-axis: Log of Base GDP in Billion Dollar; y-axis: Number of Medals).

As the plot of Number of Medals = f(log base two of GDP in billions of dollars) in Figure 3.15 shows, the number of medals has a certain linear correlation with the log base two of GDP. A careful study of the plot, however, shows that some countries, such as CHN, RUS, NOR and NED, do not follow this trend: in some of them the number of medals is high although GDP is not, while in others the number of medals does not match GDP. Next we check the Pearson correlation, which measures the linear dependence between two variables. Between the log base two of GDP and the number of medals it is 0.4868872 with a P-value of 0.01165, which is smaller than 0.05, so the test result is significant and indicates a moderate positive linear relationship. The Pearson test is also known as a parametric correlation test because it depends on the distribution of the data; it can be used only when the log base two of GDP and the number of medals come from a normal distribution. The smooth curve for Number of Medals = f(log base two of GDP) in Figure 3.16 shows the relationship between these two variables more clearly.

Figure 3.16: The relationship between each country's log base two of GDP and its number of medals in 2014, with a smooth curve (x-axis: Log of Base GDP in Billion Dollar; y-axis: Number of Medals).

• Number of Medals and Average High Temperature in Winter

The plot in Figure 3.17 suggests that there is a relationship between temperature and the number of medals, but not a simple linear one: temperatures that are either too high or too low both reduce the medal count. Next, we check the Pearson correlation, which is -0.1688126 with a P-value of 0.4097. This is greater than 0.05, so the test result is not significant: the sample shows a weak negative linear relationship between temperature and the number of medals, but it cannot be distinguished from no linear relationship. Most medals are won by countries whose average high temperature in winter lies between -10 and -5 degrees Celsius. A likely reason is that good weather conditions give more opportunity for training and practicing the sports of the Olympic Games.

Figure 3.17: The relationship between each country's average high temperature in winter and its number of medals in 2014 (x-axis: Average High Temperature in Winter; y-axis: Number of Medals).
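The two correlation checks in this subsection can be outlined as follows. This is a minimal sketch in Python, not the R code used for the thesis: the toy data are hypothetical, and n = 26 is an assumption taken from the number of countries plotted in Figure 3.15. It computes the Pearson coefficient and the t statistic (with n - 2 degrees of freedom) on which the reported P-values are based.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 d.f.)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical toy data, NOT the thesis dataset.
toy_log_gdp = [6.0, 8.0, 9.0, 11.0, 14.0]
toy_medals = [2, 5, 4, 12, 20]
r_toy = pearson_r(toy_log_gdp, toy_medals)

# Correlations reported in the text; n = 26 is an assumption (see lead-in).
N_COUNTRIES = 26
t_gdp = t_statistic(0.4868872, N_COUNTRIES)
t_temperature = t_statistic(-0.1688126, N_COUNTRIES)

print(round(r_toy, 3), round(t_gdp, 2), round(t_temperature, 2))
```

Under that assumed sample size, the t statistic for GDP is about 2.7 and the one for temperature is about -0.8, which is in line with a significant result for GDP and a non-significant one for temperature.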

3.3.2 Predicting Winter 2014 Medal Counts using GDP and Average High Temperature in Winter

We used the log base two of GDP in billions of dollars and the average high temperature in winter in a linear model for the number of medals. Linear regression is a linear approach to modeling the relationship between a dependent variable, here the number of medals, and independent variables, here the log of GDP in billions of dollars and the average high temperature in winter. First we check the relationship among these three variables, shown in Figure 3.18. The multiple linear regression model can be written as Equation 3.12:

Y = β0 + β1 Log of GDP + β2 Temperature + ε (3.12)
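A model of the form of Equation 3.12 can be fitted by ordinary least squares. The sketch below is a pure-Python illustration that solves the normal equations; the thesis itself used R, and the data here are hypothetical values constructed so that the true coefficients are known. The function names are illustrative only.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_two_predictor_ols(x1, x2, y):
    """Return [b0, b1, b2] minimising sum (y - b0 - b1*x1 - b2*x2)^2."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data with known coefficients (1.0, 2.0, -0.5).
toy_log_gdp = [6.0, 8.0, 9.0, 11.0, 12.0, 14.0]
toy_temperature = [-10.0, 5.0, -5.0, 0.0, 8.0, -2.0]
toy_medals = [1.0 + 2.0 * g - 0.5 * t for g, t in zip(toy_log_gdp, toy_temperature)]

b0, b1, b2 = fit_two_predictor_ols(toy_log_gdp, toy_temperature, toy_medals)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the toy responses contain no noise, the fit recovers the coefficients used to generate them; with real data the estimates would also come with standard errors and P-values, which this sketch does not compute.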

Figure 3.18: The relationship between log base 2 of GDP in billions, average high temperature in winter, and number of medals in 2014 (x-axis: Log of Base GDP in Billion Dollar; y-axis: Number of Medals; legend: Temperature).

R's base installation provides numerous methods for evaluating the statistical assumptions in a regression analysis. Applying plot() to the fitted model produces four graphs that are useful for evaluating the model fit. If the dependent variable is normally distributed for a fixed set of predictor values, then the residual values should be normally distributed. In the fit of Equation 3.12, temperature is not significant as a linear predictor, so we drop it and obtain the new linear regression model shown in Equation 3.13:

Y = β0 + β1 Log of GDP + ε (3.13)

The resulting prediction equation is shown in Equation 3.14:

Y = −7.115 + 2.029 Log of GDP (3.14)

Because the intercept corresponds to a log base two GDP of 0, that is, a GDP of one billion dollars, which lies outside the range of the data, we do not try to give it a physical interpretation. It merely becomes an adjustment constant. From the P-value column, we see that the P-value of the regression coefficient is smaller than 0.05, so the coefficient is significant. The multiple R-squared indicates that the model accounts for 20.53% of the variance in the number of medals. This is the highest multiple R-squared we can get with this linear model, and the actual and predicted values are positively correlated. For every doubling of GDP, there is an expected increase of 2.029 medals. Figure 3.19 shows the actual number of medals against the predicted number of medals. If the dependent variable is normally distributed for a fixed set of predictor values, then the residual values should be normally distributed; this is checked in Figure 3.20. The outlier test in R identifies Russia as an outlier in this model. Next we will try Poisson regression.

Figure 3.19: Actual and predicted number of medals for each country in 2014 (x-axis: Predicted Number of Medals; y-axis: Actual Number of Medals).
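Equation 3.14 can be evaluated directly to see what "an expected increase of 2.029 medals per doubling of GDP" means. The Python sketch below uses the coefficients from Equation 3.14; the GDP values are hypothetical. It also squares the reported Pearson correlation and computes an adjusted R-squared, assuming n = 26 countries (the number plotted in Figure 3.19). The adjusted value comes out close to the 20.53% quoted above, which suggests, but does not confirm, that the quoted figure is an adjusted R-squared.

```python
import math

INTERCEPT = -7.115  # Equation 3.14
SLOPE = 2.029       # Equation 3.14, per unit of log2(GDP in billions)


def predicted_medals(gdp_billion):
    """Predicted medal count from Equation 3.14 for a GDP in billions of dollars."""
    if gdp_billion <= 0:
        raise ValueError("GDP must be positive")
    return INTERCEPT + SLOPE * math.log2(gdp_billion)


def adjusted_r_squared(r_squared, n, n_predictors):
    """Standard adjusted R-squared for n observations and n_predictors predictors."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - n_predictors - 1)


# Hypothetical GDP values: 1024 billion (log2 = 10) and its double.
base = predicted_medals(1024.0)
doubled = predicted_medals(2048.0)

# For a single predictor, R-squared is the square of the Pearson correlation.
r_squared = 0.4868872 ** 2
adjusted = adjusted_r_squared(r_squared, n=26, n_predictors=1)  # n is assumed

print(round(base, 3), round(doubled - base, 3), round(r_squared, 4), round(adjusted, 4))
```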

Figure 3.20: Diagnostic plots for the regression of the number of medals on the log base two of GDP in billions of dollars in 2014 (panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage).

The R output provides the deviances, regression parameters, and standard errors, and tests that these parameters are 0. Note that only the log base two of GDP in billions of dollars is a significant predictor at the 0.05 level. The prediction equation with the negative binomial distribution is then shown in Equation 3.15:

log E[Medals] = 0.44930 + 0.20548 Log of GDP (3.15)

In a Poisson regression, the dependent variable being modeled is the log of the conditional mean, log(E[Medals]). The multiple R-squared indicates that the model accounts for 18.53% of the variance in the number of medals. Holding temperature unchanged, the regression coefficient of GDP is significant. It is usually much easier to interpret the regression coefficients in the original scale of the dependent variable (number of medals, rather than log number of medals). To accomplish this, exponentiate the coefficients, as shown in Equation 3.16:

E[Medals] = 1.567219 × 1.228115^(Log of GDP) (3.16)

For every doubling of GDP, the expected number of medals is multiplied by 1.228115, that is, it increases by about 22.8%.
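The exponentiation step can be checked numerically. The short Python sketch below takes the coefficients of Equation 3.15 and shows that, on the original scale, a log-linear count model is multiplicative: each additional unit of log base two GDP (each doubling of GDP) multiplies the expected medal count by exp(0.20548). The log GDP values used are hypothetical.

```python
import math

B0 = 0.44930  # intercept of Equation 3.15 (log scale)
B1 = 0.20548  # coefficient of log2(GDP in billions) in Equation 3.15


def expected_medals(log2_gdp):
    """Expected medal count implied by Equation 3.15."""
    return math.exp(B0 + B1 * log2_gdp)


baseline = math.exp(B0)    # expected count when log2(GDP) = 0
rate_ratio = math.exp(B1)  # multiplicative change per doubling of GDP

# Hypothetical log2 GDP values one unit apart, i.e. one doubling of GDP.
low = expected_medals(10.0)
high = expected_medals(11.0)

print(round(baseline, 4), round(rate_ratio, 4), round(high / low, 4))
```

The ratio high / low equals the rate ratio regardless of the starting GDP, whereas the absolute difference high - low depends on where it is evaluated.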

Figure 3.21: Each country's actual number of medals and predicted number of medals in 2014 (x-axis: Predicted Number of Medals; y-axis: Actual Number of Medals).

Figure 3.21 shows the actual number of medals against the predicted number of medals. If the dependent variable is normally distributed for a fixed set of predictor values, then the residual values should be normally distributed; this is checked in Figure 3.22. We can see that the linear relationship is stronger and that the residuals of this model are closer to a normal distribution.

Figure 3.22: Diagnostic plots for the regression of number of medals on square of log of base two GDP billion U.S. dollars and temperature (panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage).

Now we fit a generalized additive model (GAM) that is nonlinear in the log of GDP and in temperature, each fitted with a smoothing spline with 3 degrees of freedom. Again the relationship is stronger and the residuals are closer to a normal distribution. As Figure 3.23 shows, the number of medals first increases slowly with the log of GDP and then increases quickly beyond a log of GDP of about 6. For the variable temperature, Figure 3.24 shows that the medals tend to decrease: a large number of medals are won at around -15 to -5 degrees Celsius, and as the temperature increases further the number of medals decreases. The curvy shapes for the log of GDP and temperature are due to the smoothing splines, which model the nonlinearities in the data. From the P-value column, we see that the P-values of the regression coefficients are smaller than 0.05. The multiple R-squared indicates that the model accounts for 50.97% of the variance in the number of medals. This is the highest multiple R-squared we can get. The model can be written as Equation 3.17, where f1 and f2 are nonlinear functions of the predictors X: f1 is the smoothing spline fitted to the log base two of GDP in billions of U.S. dollars in 2014, and f2 is the smoothing spline fitted to the temperature in 2014:

Y = α + f1(Xi1) + f2(Xi2) + ε (3.17)

Figure 3.25 shows the actual number of medals against the predicted number of medals.
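For reference, the additive model of Equation 3.17 and the smoothing spline behind each component can be written out in standard textbook notation. This is a sketch of the usual formulation, not a transcript of the fitting routine used for the thesis: the smoothing parameter λ and the smoother operators S1 and S2 are symbols introduced here, and it is an assumption that the R fit follows exactly this backfitting scheme.

```latex
% Additive model (Equation 3.17) written for observation i
y_i = \alpha + f_1(x_{i1}) + f_2(x_{i2}) + \varepsilon_i

% Smoothing spline for a single predictor: the function f that minimises
% a residual sum of squares plus a roughness penalty, with \lambda \ge 0
\hat{f} = \arg\min_{f} \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2
          + \lambda \int f''(t)^2 \, dt

% Backfitting: each component is re-estimated by smoothing the partial
% residuals of the other, and the two updates are repeated until they stabilise
\hat{f}_1 \leftarrow S_1\bigl( y - \hat{\alpha} - \hat{f}_2 \bigr), \qquad
\hat{f}_2 \leftarrow S_2\bigl( y - \hat{\alpha} - \hat{f}_1 \bigr)
```

A larger λ gives a smoother, more nearly linear component; the "3" in the axis labels s(log_GDP_brillion, 3) and s(Temperature, 3) corresponds to the 3 degrees of freedom of each spline mentioned in the text, which fixes λ indirectly.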

Figure 3.23: Fitted using Smoothing Splines the residual values of the log base two GDP billion U.S. dollars in 2014 (x-axis: log_GDP_brillion; y-axis: s(log_GDP_brillion, 3)).

Figure 3.24: Fitted using Smoothing Splines the residual values of the temperature in 2014 (x-axis: Temperature; y-axis: s(Temperature, 3)).

Figure 3.25: Each country's actual number of medals and predicted number of medals in 2014 (x-axis: Predicted Number of Medals; y-axis: Actual Number of Medals).

3.3.3 Summary of Winter Olympic Games Analysis

As the results show, the medal counts of the Winter Olympic Games are hard to analyze. We believe that the number of medals does not have a strong relationship with GDP and temperature for the following reasons: (1) The climate in many European countries is suitable for skiing, while some other countries have only a few areas with a climate suitable for skiing, and there is still a large gap between their ski resorts and those of developed countries in both number and quality. In addition, temperatures that are too low in winter in some areas can also affect athlete training. The level of these ski resorts is mixed: some have the conditions to train the national team, while others can only meet the skiing needs of the general public. Therefore, the development of snow sports inevitably has certain geographical restrictions. (2) The lack of athletes in the related events is one of the reasons for this phenomenon. In addition to the lack of athletes, professional talent such as coaches is also difficult to find. (3) The emphasis placed on winter and summer sports is different, and the development of ice and snow events is unbalanced. This situation has lasted for a short time. For most snow events, it is difficult for athletes to make up the gap in strength in a short period of time, and poor results in turn make the events harder to develop, attract less attention, and draw fewer participants.

Chapter 4

Conclusion

Having finished the analysis of this data, we can more clearly identify the dominant sports in each country and the gender advantages in each Games. For example, the top winning countries at the Olympic Games are both very good at Athletics in the Summer Olympics, and both also have advantages in Cross Country Skiing and Speed Skating at the Winter Olympics. In addition, we know that some countries show large differences between men and women in gold medals, and we find that women won more medals in the Winter than in the Summer Olympics. So analyzing the Olympics is important and useful. The Olympic movement has developed into the largest human activity ever undertaken for peaceful purposes. The Olympic Games have become global, sustainable, comprehensive and very large in scale, with rich cultural significance, long preparation times, huge investment, numerous participants and a high level of competition. They have become an eye-catching activity with enormous social, economic, political, and cultural effects, and they receive extensive attention from governments, people, media, and business groups in countries around the world. The impact of the Olympic Games has surpassed that of any other social and cultural activity today.

We have finished the main analysis of the relationship between the number of medals and GDP and temperature. We can find a very good model to fit the Summer Olympic Games, but the Winter Olympic Games are much harder to fit. There are many reasons for this. Some of the gold medal disciplines in the Winter Games, such as short track speed skating and figure skating, are not familiar to the public, and public participation in them is low. They are not easy to learn and are very limited by venues. According to the analysis, the area where skiing is possible in some countries is relatively narrow and mostly concentrated. On the other hand, skiing is already a very popular winter sport in Europe, America, Japan and elsewhere, especially in Europe and the United States, where almost everyone knows how to ski and it is suitable for all ages. However, because of misunderstandings, people generally regard skiing as a dangerous extreme sport. Some skiing events are indeed dangerous, but with protective measures and step-by-step learning, the risk can be greatly reduced.

Some countries may not be good at everything, but in the areas they are good at, they perform at their best level, which brings them closer to success. From the Olympic Games we learn that everyone has strengths and weaknesses: as long as we strive to make the most of our strengths and bring our weaknesses up to a normal level, we can be successful.
