
An Investigation of the Relationship Between Sci-Fi Imaginations and Academic Endeavors on Digital Technologies: Taking Chinese Sci Fi and Scholarly Computer Science Publications As An Example by

Lizao Wang

An Undergraduate Honors Thesis

Submitted to the Department of Media Arts and Sciences

Wellesley College

2020 May

TABLE OF CONTENTS

1. Purpose of the Project

2. Chapter 1: Data Processing

2.1 Selection of Data Source

2.2 Data Coding Procedure

2.2.1 CS Subcategory Codes

2.2.2 General Codes Specifically For Sci-Fi

2.2.3 Coding Procedure

3. Chapter 2: Understanding Results

3.1 Revisiting the Hypothesis

3.2 Possible Bias Introduced By the Data Collection Method

3.3 Possible Bias Caused by Social Context

3.4 Additional Results

4. Chapter 3: Data Representations

4.1 Intentions and Guidelines

4.2 Design Process

4.3 Design Testing and Evaluations

5. Final Note

6. Appendix A: List of Links to digital components of this thesis

7. Appendix B: List of Figures

8. Appendix C: Citations

1. PURPOSE OF THE PROJECT

The assumption that imaginations in novels influence actual digital technology development has been circulating for a while. One of the many who made the assumption popular is Martin Cooper, the inventor of the cell phone, who has openly acknowledged that his inspiration came from the handheld communicators in Star Trek (Blazeski, 2017).

Is there really a relationship between sci-fi imaginations and digital technology developments? Ed Finn from Arizona State University believes that sci-fi is a “laboratory to experiment intellectually”, where people test out the technical and social changes they envision (Eschrich, 2015). Brian David Johnson, Finn’s colleague from the Center for Science and the Imagination, has also done extensive work on the theory of ‘Science Fiction Prototyping’, which points to science fiction as a test ground for ongoing technology innovations (Johnson, 2011).

Beyond theorization, scholars have also provided some concrete answers. Philipp Jordan from the University of Hawaii conducted a data analysis counting how many times science fiction is mentioned in ACM-SIGCHI across the years, so as to show how the possible influence of sci-fi on HCI research has changed. The results of Jordan’s study show that mentions of science fiction in HCI works are increasing year by year (Philipp, 2018), indicating that sci-fi has had an increasing impact on computer science.

With this foundation laid out, this project aims to investigate a similar topic through a different lens: showing the relationship between sci-fi and technology innovation by exploring how the popularity of different CS topics changes across the years in these two quite distant genres. The hypothesis of the project is that if sci-fi is indeed the inspiration for technology innovations, then its trends in topic popularity might to some extent predict the trends in topic popularity in CS publications. On the other hand, if science fiction functions more as a prototyping playground for technologies already in development, then the topic popularity trends in the two categories might show similar rises and falls.

Moreover, current research primarily focuses on the relationship between American science fiction and technology development, while both fields are developing rapidly across the globe, so this seems an appropriate opportunity to introduce a comparative study of the topic. Therefore, this project collects its data on Chinese science fiction and CS publications from 2009 to 2013. Besides providing a comparative angle on the relationship between sci-fi and science, focusing on Chinese sci-fi during the selected period serves its own purpose: investigating the trends of topic popularity during the Chinese sci-fi golden age (Song, 2018), a contemporary literary movement that arguably has had an impact not only on culture, but also on science and even international relations (Huang, 2019).

This thesis includes three main chapters: the first describes the process of making this project, the second analyzes its results, and the last is the design document for the interactive web-based data presentation.

2. CHAPTER 1: DATA

2.1 Sample Selection

To investigate whether a relationship exists between science fiction imaginations about technology and actual technology developments, two sets of data are needed. The first is data on the occurrence of different themes in Chinese computer science publications from 2009 to 2013, and the second is the same data for Chinese science fiction.

The criteria for choosing the data sources are as follows. 1) Both the CS publication and science fiction sources should be credible and, if possible, leading publications in their fields. This condition tries to ensure that the data collected reflect the mainstream trends across time in the two fields. 2) Both sources should be relatively timely in publishing the work submitted to them, preferably monthly rather than quarterly or yearly. This condition might minimize the effect of differences in publishing speed on trends in popular topics. 3) Both sources should be general in terms of the paper or story topics they accept, as specialized publications cannot reflect trends across topics. 4) Both sources should focus mainly on local Chinese works rather than foreign ones.

The search for a source of Chinese science fiction was relatively simple, with the final selected publication being Science Fiction World (科幻世界). First, there are not many choices when it comes to Chinese science fiction magazines. Other than Science Fiction World, which started in 1979 under the names QiTan (奇谈) and KeHuanWenYi (科幻文艺), there are two other publications that roughly fulfill the criteria listed above. One of them is King of Science Fiction (科幻大王) and the other is World Science Fiction Expo (科幻世界博). The latter, World Science Fiction Expo, was ruled out as the data source of the project because of its suspension in 2008 and its relative lack of focus on local Chinese science fiction. While King of Science Fiction stood as a business competitor to Science Fiction World throughout the years that this project intended to cover, it was also ruled out because it is targeted more towards younger readers and focuses more on comics and western works than on local ones (Baike, 2015). Science Fiction World did not become the final choice only because the other options were ruled out, though. It fulfills all the criteria listed above. Its long history, its popularity and its best-selling status in 2000 (Sahuls, 2019) make it clear that it is the renowned and popular publication that the project is looking for. Moreover, the magazine focuses mainly on local Chinese sci-fi, with the annual ‘Galaxy Award’ (银河奖) attracting many Chinese science fiction writers to contribute (LiangTaide, 2019). There were concerns that the 2010 rebellion at the Science Fiction World publishing house affected the quality of work around that time. However, such concerns would not disqualify Science Fiction World from being the data source. How the event may have affected this project’s results will be discussed in a later chapter.

For topics in Chinese CS academic publications, this project uses secondary data instead of primary data. Meng Xiaofeng, a scholar at Renmin University of China, led a team of two to collect data on the citation frequency of Chinese CS publications across 60 years (Meng, Fan, Su, 2019). Their research draws its data from 10 major Chinese CS publications, including but not limited to general publications like China Computer Association Communications (中国计算机协会通信) and field-specific publications like China Journal of Image and Graphics (中国图象图形学报). Though their research is intended to investigate the citation situation of Chinese CS journals in ScholarSpace, their data on the number of citations of differently themed CS publications across the years proves to be very valuable for this project, as it reflects the change in popularity of different CS topics. Thus, data from Meng and his team’s research fulfills all the requirements for this project: it focuses on local Chinese CS journals, only includes data from renowned journals, and collects data from journals covering a wide range of topics.

One challenge posed by using Meng’s data, however, is that because of the large sample his research uses (10 journals), directly comparing the number of occurrences of certain-themed works between CS publications and sci-fi is not viable, as the sci-fi sample is merely one magazine instead of ten. The problem cannot be solved by enlarging the sci-fi sample because, as mentioned above, there are not enough Chinese sci-fi publications that fulfill the listed requirements. As will be seen in the project, the current solution is to multiply the data collected from the one sci-fi publication by ten, which makes the two sets of data comparable in terms of the number of occurrences. Overall, the process of selecting a sample, or an existing data set, for the project went relatively smoothly once the selection criteria were established.
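The scaling step described above can be sketched in a few lines of Python. This is only an illustration: the counts are invented and are not the thesis data, and the names `JOURNAL_COUNT` and `scale_scifi_counts` are hypothetical.

```python
# Illustrative sketch: scale per-topic counts from one sci-fi magazine so that
# they are comparable in magnitude to counts pooled from ten CS journals.
JOURNAL_COUNT = 10  # number of CS journals in Meng's data set, as stated in the text


def scale_scifi_counts(counts, factor=JOURNAL_COUNT):
    """Multiply every per-topic count by the same factor."""
    return {topic: n * factor for topic, n in counts.items()}


# Made-up yearly counts, NOT taken from the thesis data.
example_counts = {"CNC": 3, "TCS": 1}
scaled = scale_scifi_counts(example_counts)
print(scaled)
```

Because every count is multiplied by the same constant, the percentage share of each topic is unchanged; only the absolute numbers of occurrences move onto a comparable scale.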

2.2 Data Coding Procedure

2.2.1 CS subcategory codes

With the decision to employ an existing data set for Chinese CS publications, only the Chinese science fiction from Science Fiction World needs to be coded. After consideration, it was decided that the category coding used in Meng’s research on citation frequency should be employed in this project after minor adjustments. For one, using this coding ensures consistency across the two data sets. Moreover, Meng’s categories are both systematic in dividing subcategories and comprehensive in covering computer-science-related subcategories. Therefore, the CS subcategory codes used in the project are:

a) R&AI&C: Robotics, AI and Cyborgs (人工智能与机器人技术): Cyborgs are not included in Meng’s original category code, but are added to the coding system used with science fiction, as they are a popular sci-fi topic closely related to robots and AI.

b) CNC: Computer Networking and Communications (计算机网络与通信)

c) ISPP: Information Security and Privacy Protection (信息安全与隐私保护)

d) NLPIT: Natural Language Processing and Information Retrieval (中文信息处理与信息检索): Meng’s original coding specifies language processing to be Chinese language processing, while the coding used for sci-fi is more general. The decision is made considering that a large percentage of Chinese CS scholarship on language processing concerns the Chinese language (Meng, 2019), which is not the case for sci-fi. Since the popularity of a CS topic in each publication type is what is being measured here, ensuring an equal percentage of work is considered more important.

e) TCS: Theoretical Computer Science (计算机理论)

f) CGV&HCI: Computer Graphics and Vision and Human Computer Interaction (图形图像与人机交互)

g) SESD: Software Engineering, System Software and System Design (软件工程,系统软件与程序设计)

h) DBDM: Database and Data Mining (数据库系统与数据挖掘)

i) CAHPC: Computer Architecture and High Performance Computing (体系结构与高性能计算)

2.2.2 General Codes Specifically for Science Fiction

Unlike scholarly journals which provide data that is by default CS- related, science fictions provide a much wider range of topics including many not necessarily related to computer science. In the very beginning of the project, no attempt was made to code non-CS related science fiction pieces. However, it was soon discovered that the change in popularity of Computer Science topics in comparison to other sci-fi themes like time and space is another type of valuable information to have. Moreover, having a double-layered coding system where whether a piece is CS related is evaluated first is advantageous in terms of ensuring the efficiency and accuracy of coding.

The process of coming up with the general codes is different from that of deciding on the CS subcategory codes. With subcategory codes, existing codes are evaluated as to whether they are suitable to be used with the existing sample, while with establishing general codes, a small sample was taken from science fiction stories and codes were devised based on what would best indicate the topic in the sampled stories. Afterwards, whenever existing general codes could not summarize the theme in one piece correctly, a new general code was created. Below are all general codes used and their explanations.

a) CS: Computer Science

b) T: Time: Topics that would be categorized under this code include but are not limited to time travel and time travel related paradoxes.

c) E: Environment/ Energy/ Resources: Environmentalism has become an increasingly popular topic for sci-fi. Imaginations of the future world or universe order after resource exhaustion, of new energy, of a new nature pyramid etc. are included in this category.

d) S: Space: This category includes topics like space travel, alien encounters, space colonization etc.

e) A: Apocalypse: This code often occurs together with E, but not necessarily. Any piece of work that centers around the destruction of Earth or an imminent global disaster is coded with this category.

f) R: Reality/Virtuality: Dreams, virtual experiences created by technology or medication etc. all belong to this category for the purpose of this project.

g) L: Life form: Imaginations about aliens, unknown or manmade species, mutations etc. are all coded with this category.

h) M: Manufacturing: This category mainly aims to capture technical discussions of how things are manufactured. For example, a detailed description of how a robot is built, rather than how it functions, belongs to this category.

i) Culture: This category is created for pieces in Science Fiction World that comment on sci-fi movies etc. It is not a category that is counted in the analysis, but it is helpful in the process of coding.

2.2.3 Coding Procedure

Coding data for all Chinese sci-fi in this project was collected manually instead of by automatic or semi-automatic means such as keyword search algorithms. This decision, together with the set timeline, limits the size of the sample from which data can be collected. The reasons for selecting this coding procedure are given below.

As mentioned above, Philipp Jordan and his team conducted data-based research on how frequently sci-fi is mentioned in HCI publications across the years. Their data collection method was to set up an algorithm that searches the ACM-SIGCHI database for the keyword “sci-fi” and its variations (such as science fiction, Sci Fi etc.) (Philipp, 2018).

Consideration was given to whether a similar data collection method could be applied to this project. However, automatic search does not seem viable. This is mainly because a topic or theme is very likely to be discussed without its name being mentioned. For example, Our Earth (我们的地球) (Yao, 2012) builds its story around “Liuguang” (流光), a virtual system aiding human survival in a post-apocalyptic world, and should accordingly be coded with C (computer science) with subcategory CGV&HCI (Human Computer Interaction), and with R (reality). However, the entire piece does not include any keyword such as ‘interaction’ that would enable it to be coded correctly if automatic coding methods were used. Moreover, when the coding aims to capture the theme of the whole piece, the mention of a certain keyword often does not mean that the corresponding code should be given. For example, the word “computer” could be mentioned in a story with any theme, which perhaps would not be coded with “C” manually. As a result, automatic or semi-automatic coding using keywords is ruled out for this project.
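The two pitfalls of keyword-based coding described above can be illustrated with a deliberately naive sketch. The keyword lists and sentences below are invented for illustration only; they are not taken from the sampled stories, and this is not the algorithm used in Jordan's study.

```python
# Hypothetical keyword lists, invented for this sketch.
KEYWORDS = {
    "CGV&HCI": ["interaction", "interface"],
    "C": ["computer"],
}


def keyword_codes(text):
    """Return every code whose keyword list has at least one hit in the text."""
    lowered = text.lower()
    return {
        code
        for code, words in KEYWORDS.items()
        if any(word in lowered for word in words)
    }


# Invented sentences, not quotations from any story.
about_virtual_system = (
    "Survivors live inside a simulated city that answers their every gesture."
)
passing_mention = "She left her computer at home and boarded the ship to Mars."

# The first sentence is about a virtual system but contains no listed keyword.
print(keyword_codes(about_virtual_system))
# The second sentence mentions a computer only in passing, yet it gets a code.
print(keyword_codes(passing_mention))
```

The first call misses a theme that a human reader would recognize, and the second assigns a code on the strength of a passing mention, which are the two failure modes that motivated manual coding.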

The consideration that went into the coding method selection also becomes helpful in establishing rules for later manual coding, as possible pitfalls like attributing a code to a piece of work falsely just for the appearance of a keyword are already pointed out. Below are the rules for manual coding used for this project.

a) Each work of science fiction should have at least one general code, and can have multiple general codes.

b) Every work should have the general code C before having any specific CS subcategory code.

c) There is no maximum to how many general codes or CS-specific codes a piece of work can have.

d) A code should only be given when the corresponding topic is not only mentioned but also explained or discussed in at least two sentences.

e) Codes should be given as the piece of work is being read, rather than at the end of reading. Whenever the piece of work qualifies for a code (has two sentences on that topic), the code should be given at that moment.
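Rules a) and b) lend themselves to a mechanical consistency check on the finished coding sheet. The sketch below is one hypothetical way to do this; the function name and data layout are assumptions, the general computer science code is written `C` as in the rules, and rules c) to e) are not checked because they concern the reading process rather than the recorded codes.

```python
# Code sets as listed in Sections 2.2.1 and 2.2.2 (general CS code written "C").
GENERAL_CODES = {"C", "T", "E", "S", "A", "R", "L", "M", "Culture"}
CS_SUBCODES = {
    "R&AI&C", "CNC", "ISPP", "NLPIT", "TCS",
    "CGV&HCI", "SESD", "DBDM", "CAHPC",
}


def check_coding(general, subcodes):
    """Return a list of rule violations for one coded work (empty if none)."""
    problems = []
    unknown = (set(general) - GENERAL_CODES) | (set(subcodes) - CS_SUBCODES)
    if unknown:
        problems.append("unknown codes: " + ", ".join(sorted(unknown)))
    if not general:
        problems.append("rule a: at least one general code is required")
    if subcodes and "C" not in general:
        problems.append("rule b: CS subcategory codes require general code C")
    return problems


# A work coded C and R with the CGV&HCI subcategory passes both checks.
print(check_coding({"C", "R"}, {"CGV&HCI"}))
# A work with a CS subcategory but no general code C violates rule b.
print(check_coding({"S"}, {"CNC"}))
```

A check of this kind does not replace human judgment about themes; it only catches bookkeeping slips in the recorded codes.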

Overall, coding is done in a rather old-fashioned way, but following clearly defined rules. It is believed that since themes and topics are hard to determine mechanically without reading and human understanding, manual coding can best capture the actual themes of pieces. Moreover, the established rules about coding are expected to ensure that the same standard is applied to all works and to add objectivity and consistency to manual coding.

3. CHAPTER 2: UNDERSTANDING RESULTS

3.1 Revisiting the Hypothesis

As introduced in the opening chapter, “Purpose of the Project”, this project wishes to investigate the relationship between science fiction’s imaginations about technology and actual technology innovations by looking at the trends of popular topics in the two fields’ publications. The hypothesis is that if the trends of topic popularity in sci-fi across the years are reflected in those of CS publications in later years, then it is possible that science fiction is inspirational to actual computer science advancement. Related theories like sci-fi prototyping may gain evidence as well. On the other hand, if trends in topic popularity in sci-fi and in CS publications are relatively concurrent, it might be possible to deduce that the two are mutually influential or that they are influenced by similar outside factors.

Below are two graphs that summarize the results of this project:

Figure 1. Trends of topic in Chinese CS publications

Figure 2. Trends of topic in Chinese CS Themed Sci-fi (Note: these percentages are calculated within works already coded ‘C’)
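The note on Figure 2 can be made concrete with a short sketch of how such percentages might be computed. It assumes that the denominator is the number of works carrying the general code C and that a work with several subcategory codes counts once towards each; the works listed are invented and are not the thesis data.

```python
from collections import Counter


def subcategory_shares(works):
    """Percentage of C-coded works that carry each CS subcategory code.

    `works` is a list of (general_codes, cs_subcodes) pairs, one per work.
    """
    cs_works = [set(subcodes) for general, subcodes in works if "C" in general]
    if not cs_works:
        return {}
    counts = Counter(code for subcodes in cs_works for code in subcodes)
    return {code: 100 * n / len(cs_works) for code, n in counts.items()}


# Invented example: two C-coded works and one work about space only.
example_works = [
    ({"C", "R"}, {"CGV&HCI", "R&AI&C"}),
    ({"C"}, {"R&AI&C"}),
    ({"S"}, set()),
]
shares = subcategory_shares(example_works)
print(shares)
```

Under these assumptions the shares of all subcategories can sum to more than 100%, because one work may contribute to several subcategories.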

Looking at the two graphs, it is possible to conclude that they show no general relationship between science fiction’s imaginations of technology and scholarly CS advancement. First of all, trends in Chinese sci-fi topics fluctuate, while the lines representing topic trends in CS publications stay relatively flat. This seems to suggest that trends of popular topics in sci-fi have no, or very little, impact on what gets attention in CS academic journals. Moreover, as can be observed, the ranking of topic popularity in CS publications remains relatively stable, with works on Computer Networking and Communications occupying the highest percentage and those on Information Security and Privacy Protection being the least popular throughout the years in focus. The case is very different for data collected from CS-related sci-fi pieces, where the popularity of different topics fluctuates across the years. The most noticeable case is works on Software Engineering and Software Design, which is the third most popular topic in 2010 but the least popular in three other years.

Other than the trends having different levels of fluctuation, another source of evidence for the lack of a relationship between topic popularity in sci-fi and in CS publications is the different popularity rankings of the subcategories. A general popularity ranking of topics in sci-fi is given below (Note: ranking calculated by average percentage occupied throughout the years, and summarized in Figure 3):

a) AI&R&C (57.4%)

b) CGV&HCI (30.6%)

c) CNC (6%)

d) CAHPC (5.4%)

e) DBDM (5%)

f) SESD (3.6%)

g) ISPP (2.2%)

h) LPIR (1.8%)

i) TCS (1.4%)

The popularity ranking of topics in Chinese CS publications from 2009 to 2013 is shown below:

a) CNC(25.54%)

b) CGV& HCI(22.22%)

c) SESD(9.86%)

d) TCS(8.5%)

e) AI& R & C(7.9%)

f) LPIR(7.4%)

g) CAHPC(5.64%)

h) DBDM(4.96%)

i) ISPP(2.88%)


Figure 3. Average Topic Popularity Ranking

As can be observed, other than the category CGV&HCI, each topic’s level of popularity varies considerably across the two fields. The most noticeable example is perhaps the percentage that AI&R&C occupies in the two categories. While more than half of the sci-fi works investigated during the project have the AI&R&C code, this subcategory occupies only 7.9% on average in scholarly CS works.

This imbalance seems to contradict the hypothesis that science fiction reflects the social and technological happenings of its time. Instead, it seems that sci-fi writers have their favorite topics for exploration and imagination. Chinese sci-fi writers’ favorite topics, as shown by the results of this project, also seem to support the argument that AI-related topics, as well as those about human-computer interaction, are among the most common sci-fi themes (The Encyclopedia of Science Fiction).
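One way to put a number on how far the two rankings differ is a Spearman rank correlation over the average percentages listed above. The sketch below is an illustrative check and not part of the original analysis; it uses the textbook formula for data without ties, and the function names are hypothetical.

```python
# Average percentages as listed in the two rankings above.
SCIFI = {"AI&R&C": 57.4, "CGV&HCI": 30.6, "CNC": 6.0, "CAHPC": 5.4, "DBDM": 5.0,
         "SESD": 3.6, "ISPP": 2.2, "LPIR": 1.8, "TCS": 1.4}
CS_PUBS = {"CNC": 25.54, "CGV&HCI": 22.22, "SESD": 9.86, "TCS": 8.5, "AI&R&C": 7.9,
           "LPIR": 7.4, "CAHPC": 5.64, "DBDM": 4.96, "ISPP": 2.88}


def ranks(values):
    """Map each topic to its rank (1 = largest value); assumes no ties."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {topic: position for position, topic in enumerate(ordered, start=1)}


def spearman_rho(a, b):
    """Spearman rank correlation for two dicts with the same keys and no ties."""
    rank_a, rank_b = ranks(a), ranks(b)
    n = len(rank_a)
    d_squared = sum((rank_a[t] - rank_b[t]) ** 2 for t in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(SCIFI, CS_PUBS)
print(round(rho, 2))
```

A value close to 1 would indicate nearly identical orderings and a value close to 0 largely unrelated ones. With only nine categories, any such figure should be read as descriptive rather than as a significance test.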

Furthermore, when looking at the popularity trends of each individual subcategory in the two fields, it is possible to find that the popularity of a topic in sci-fi neither predicts its future popularity in CS publications nor suggests similar popularity in CS publications at that time. For example, below is a graph of the popularity trends of AI&R&C in both sci-fi and CS scholarly works (in percentages). As can be seen, despite the fluctuations, the popularity of the topic is decreasing in sci-fi while increasing in CS publications. A similar situation can be found in other subcategories, including Computer Networking and Communications and Computer Architecture and High Performance Computing.

Figure 4. Trends of CNC popularity in Sci-Fi and CS journals

A possible argument against the above conclusion is that it may take time for scientists to start writing about what sci-fi writers were imagining. With the short period that this research project covers, conclusive arguments about the lack of an inspirational relationship between the two therefore cannot be reached. However, as the data on topic popularity in Chinese CS publications is taken from Meng’s existing study, data from more recent years is actually available. Below is an image taken directly from Meng’s study, showing the topic popularity trends in Chinese CS publications over 3 decades. Looking at topic popularity in CS scholarly works after 2013 in relation to that in CS-themed sci-fi from 2009 to 2013 might shed some light on whether imaginations in sci-fi are inspirational in the long run.


Figure 5. Legend Translation: Light Blue- Computer Network and Communication; Blue- Information Security and Privacy Protection; Pink- Chinese Information Processing and Information Retrieval; Red- Theoretical Computer Science; Light Green- Computer Graphics, Computer Vision and Human Computer Interaction; Green- Software Engineering, System Software and Software Design; Light Orange- Computer Architecture and High Performance Computer; Light Purple- AI and Robotics; Purple- combined, interdisciplinary (Meng, 2019)

Focusing on the subcategory AI&R&C first, its decreasing popularity in Chinese sci-fi is reflected neither in CS publications from 2009 to 2013 nor in those after 2013. As can be seen in Figure 5, the percentage that the subcategory AI&R&C occupies has been increasing gradually since around 2005.

However, the contrary situation is shown with some other subcategories. For example, the popularity of CAHPC and TCS shows decreasing trends in both Chinese sci-fi (2009-2013) and Chinese CS publications (after 2013). Moreover, there are also cases where the popularity in CS publications remains relatively stable after 2013 regardless of the changes of popularity in sci-fi during 2009-2013. Overall, comparing topic popularity trends in sci-fi from an earlier period to those in CS publications from a later period shows no conclusive results. It cannot counter the previous argument that there is no shown relationship between technology imaginations in sci-fi and scholarly CS research, nor can it show a predictive relationship between the two.
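The earlier-versus-later comparison discussed above can be framed as a lagged correlation: sci-fi shares in year t are paired with CS publication shares in year t + lag. The sketch below is a hypothetical illustration with invented series, not the thesis data, and it is not the method used in this project, which compares the trends visually.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (undefined if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def lagged_correlation(scifi, cs, lag):
    """Correlate sci-fi shares in year t with CS shares in year t + lag.

    Both lists must cover the same years in the same order.
    """
    if lag < 0 or lag > len(scifi) - 2:
        raise ValueError("lag must leave at least two paired years")
    return pearson(scifi[: len(scifi) - lag], cs[lag:])


# Invented yearly shares (percent) in which CS simply repeats sci-fi one year later.
scifi_share = [10, 20, 15, 30, 25]
cs_share = [5, 10, 20, 15, 30]
print(lagged_correlation(scifi_share, cs_share, lag=1))
```

With only five yearly points per topic, each added year of lag removes one pair from the comparison, so such a statistic becomes fragile quickly.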

In conclusion, by looking at the results of this project, no supporting evidence can be found for a predictive or mutually influential relationship between technological imaginations in sci-fi and CS explorations in academic publications. However, it is believed that there are multiple factors that could have affected the results of this project, and with further studies, a different conclusion is possible. The sections below will therefore explain and explore these external factors.

3.2 Possible Bias Introduced by the Data Collection Method

In the above section, it is discussed that topic popularity trends in Chinese sci-fi fluctuate considerably while those in CS publications are relatively stable. Though this difference is used as support for the lack of a relationship between technological explorations in the two fields, the claim could be defused by attributing the different levels of fluctuation to the different sizes of the samples from which the data is harvested. As introduced in Chapter 1, data on Chinese sci-fi are collected from one publication, Science Fiction World, which features 10-20 stories or articles each month. Meng’s team, however, collected their data on Chinese CS publications from 10 major Chinese CS journals. It would thus be unfair to overlook the difference in sample size. With the smaller sample, fluctuations in the sci-fi topic popularity trends can be caused by much smaller changes in the number of works with a certain theme. It follows that the different levels of fluctuation in the results may be caused by the different sample sizes, and are thus not fully capable of showing the lack of a relationship between topic popularity in the two genres.
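The effect of sample size on fluctuation can be illustrated with the standard error of an observed share. The sketch below rests on a simplifying assumption that each work is coded independently with the same underlying probability, and the sample sizes are hypothetical rather than the actual yearly counts.

```python
from math import sqrt


def share_standard_error(p, n):
    """Standard error, in percentage points, of a share p observed in n works.

    Assumes each work independently carries the code with probability p.
    """
    return 100 * sqrt(p * (1 - p) / n)


# Hypothetical yearly sample sizes: one magazine versus a pool ten times larger.
small_sample = share_standard_error(0.06, 50)
large_sample = share_standard_error(0.06, 500)
print(round(small_sample, 2), round(large_sample, 2))
```

Under this assumption, a tenfold increase in the number of works shrinks the expected year-to-year noise by a factor of the square root of ten, so a flatter line for the larger sample does not by itself show that the underlying popularity is more stable.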

It may also be argued that because of the small sample size, trends in Chinese sci-fi might not accurately reflect the actual level of popularity of different topics in sci-fi, since a small change in the number of occurrences might alter the trend. This concern is valid. However, it is worth pointing out that despite its small size, the sample is chosen to best represent Chinese sci-fi writing, and is already the largest possible. As introduced in the sample selection section, there are three main sci-fi publications in China, and other than Science Fiction World, which provides the sample for this project, the other two were both suspended in the last 10 years. World Science Fiction Expo’s lack of focus on local sci-fi writing and King of Science Fiction’s particular target audience, the youth, make both unsuitable as sample sources for reflecting topic popularity in Chinese sci-fi. Overall, the results produced in this project, though easily influenced by small changes and therefore not entirely reliable for comparison, are accurate to the best possible extent in reflecting topic trends in Chinese sci-fi.

If further studies on this topic are carried out, it is recommended either to shift the focus to other regions that may provide more data on sci-fi, or to combine data from other regions with those obtained from Chinese sci-fi, as both appear to be viable ways to increase the sample size of this study. However, studying the relationship between sci-fi imaginations and CS scholarly works across different regions may also introduce new problems into the equation, as a different social context might influence the popularity of technology topics in both sci-fi and CS academia. The social influence over data will be further discussed in the later sections.

Another possible criticism of the results of this project is the limited range of years (2009-2013) for which data is collected. Though admittedly more conclusions might be drawn if data from a longer period were collected, it is believed that the current conclusion would stay largely the same. For one, the relationship between sci-fi imaginations about technology and CS scholarly explorations during the same period is adequately shown with five years’ worth of data. Moreover, with the data on CS publications available all the way to 2017, the hypothesized predictive relationship between the two trends is in effect tested over a 10-year timeframe. If further data become available, it will be possible to test whether there is a predictive relationship between technology in sci-fi and that in academic publications over a longer time frame. Whether such a test is meaningful is debatable: the acknowledged cases happened within a shorter time frame (Blazeski, 2017), and it is questionable whether scientists read sci-fi more than 10 years old that is not wildly popular.

Another bias that might have influenced the results of this project lies in the data collection method used. Meng and his team did not mention in their paper the exact methods they used for their data collection. It is unclear whether limits were applied to their research that were not applied to data collection from Chinese sci-fi. For example, if Meng’s team restricted the maximum number of codes that one piece of work may have while data collection from Chinese sci-fi did not, this might result in an overall higher percentage for all subcategories in Chinese sci-fi. Such a difference in method might even influence the ranking of topic popularity between the two genres of publications, since subcategories that have a higher chance of occurring even when they are not the center of discussion (e.g. AI, HCI) would have a higher percentage in the sci-fi data than in the CS publications data. Another concern with the data collection method is that Meng and his team very possibly used automatic coding, which might apply a different coding standard from the manual coding used for sci-fi. However, this concern is minor considering that academic publications have a much more comprehensive and systematic indexing system already in place. Even if Meng’s team used an automatic coding method, the indexes they would be using are still created based on human understanding instead of arbitrary keyword searches.
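The concern about a possible cap on the number of codes per work can be illustrated with a small sketch. It assumes, hypothetically, that codes are listed with the central topic first and that a cap keeps only the first few; the works are invented, and nothing here describes what Meng's team actually did.

```python
from collections import Counter


def capped_shares(works, max_codes=None):
    """Percentage of works carrying each code, optionally keeping only the
    first `max_codes` codes of every work."""
    counts = Counter()
    for codes in works:
        kept = codes if max_codes is None else codes[:max_codes]
        counts.update(set(kept))
    return {code: 100 * n / len(works) for code, n in counts.items()}


# Invented works; each list puts the central topic first.
example = [["CNC", "R&AI&C"], ["TCS", "CGV&HCI"], ["R&AI&C"]]
uncapped = capped_shares(example)
capped = capped_shares(example, max_codes=1)
print(uncapped["R&AI&C"], capped["R&AI&C"])
```

In this toy example the share of a code that often appears as a secondary topic drops once a cap is applied, which is the mechanism by which different coding limits could shift both percentages and rankings.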

Overall, this section discussed three possible sources of biased data: 1) the different sample sizes, specifically the small sample size for Chinese sci-fi, 2) the small number of years from which data is collected, especially for Chinese sci-fi, and 3) a possibly different method of data collection. It is argued that although more conclusions might be drawn, and different conclusions might be drawn if a different set of data (not Chinese sci-fi and CS publications) were selected, these concerns do not pose serious challenges to the standing conclusions. Recommendations for overcoming these possible sources of bias are also given.

3.3 Possible Bias Caused by Social Context

In the previous section, several impacts of having a relatively small sample size for Chinese Sci-fi were discussed. Beyond those already mentioned, having only one publication as the data source also opens up the possibility of the results being affected by social context and other subjective factors. For example, works in Science Fiction World reflect not only what Chinese science fiction writers were writing about during the period in focus, but also what the editors working in the publishing house were looking for (LiangTaiDeXiaoKeAi).

This project’s period of focus, 2009 to 2013, was selected based on whether the data source (i.e. Science Fiction World magazines) was available, and whether data on CS publications after that period was available for investigating possible predictive relationships. However, although the period from 2009 to 2013 is well suited time-wise to the purpose of this research, it does present a certain challenge to the results, as it was a turbulent time for the publishing house of Science Fiction World.

After the widely acknowledged ‘Golden Era’ of the magazine Science Fiction World from around 1999 to 2006 (Song, 2018), when the publication was led by accomplished writers like A Lai (LiangTaiDeXiaoKeAi), the magazine entered a period of instability. In 2010, which falls within the period from which data for this project was collected, the infamous DaoShe Strike (倒社风波) happened at the magazine (Zhang, 2010). In brief, all employees of the magazine threatened to resign in a public letter to its governing department in an effort to have the head of the magazine at that time removed (Zhang, 2010). The strike itself provides little reason for concern for this project. However, the employees went on strike because their leader at that time was allegedly substituting science-related columns with advertisements and encouraging editors to write instead of paying writers who produce great works, so as to save expenses (Zhang, 2010). The behaviors alleged in the strike clearly affect the magazine’s ability to fairly represent what was popular in the field at the time.

The new president who came to Science Fiction World after the DaoShe Strike also faced similar accusations. Such a context thus gives rise to the question of whether data collected from Science Fiction World from 2009 to 2013 can accurately capture and reflect trends in Chinese science fiction, as talented authors and excellent works may have been missed by the publishing house because of the instability itself.

Another possible bias in the data that could have been introduced by social context relates to society in the larger sense, rather than to the publishing house. It is noticeable that although the subcategory ISPP has a relatively small percentage in Sci-fi, its level of popularity is not much different from that of other topics with similarly low percentages (e.g. LPIT at 1.8%, TCS at 1.4%). However, the level of popularity of ISPP (2.88%) in CS publications is not only the lowest, but also much lower than that of the two subcategories immediately above it in the ranking (CHPC at 5.64%, DBDM at 4.96%). Although this might be used as evidence to suggest the lack of a relationship between technological imaginations in Sci-fi and scholarly CS research, it calls for consideration of other factors that could have caused this phenomenon.

As the Sci-fi writer Han Song stated, reality, especially reality in China, is more like Sci-fi than Sci-fi itself (Luo, Gao, 2016). Thus, though it is only a deduced possibility, the difference in ISPP’s popularity levels between the two genres could have been caused by varied levels of censorship in different fields.

Overall, the possible bias introduced to the study by social and community contexts may remind us that even if there is a relationship between imaginations in Sci-fi and Computer Science scholarly research, the relationship is far from straightforward. In fact, a number of factors come into play to determine what gets written, published and distributed. With this in mind, if there is indeed a relationship between sci-fi imaginations and scientific discoveries, it is perhaps far from something that can be measured systematically. Rather, if further studies wish to investigate this possible relationship, approaching the problem from a more individual point of view (e.g. counting the number of computer scientists who give sci-fi credit for their research) might be more accurate.

2.4 Other Possible Conclusions

Beyond the hypothesis that this study set out to investigate, other relationships within the collected data are also explored to add to the understanding of Chinese sci-fi, specifically of CS in Chinese sci-fi. The first question for investigation is the popularity of computer science as a general category in Chinese sci-fi among its varied themes. The answer to this question not only shows the level of popularity of CS as a whole, but also provides a general idea of the portion that different subcategories occupy in the entire genre. Below is a summary of the percentage that each general category occupies.

Figure 6. Percentage of CS subcategories in all Sci Fi works across years

It is worth noticing that while the lines representing the other general categories’ percentages either fluctuate or remain relatively stable, the line representing the percentage that CS topics occupy grows rather steadily. This first suggests a rise in the popularity of CS topics in Chinese Sci-fi, which might be related to the years 2009 to 2013 coinciding with the boom of Chinese Internet companies (Internet Society of China, 2014). However, this is merely a deduction based on the hypothesis that this study failed to fully support with the data discussed above.

More importantly, though, this data indicates that the trends analyzed in the last section on the popularity of varied subcategories may look different if the percentages are calculated based on the entirety of Chinese sci-fi in the selected magazine rather than on only those works already coded as CS related. The increase in the number of CS-themed sci-fi works overall means that the same percentage would actually represent more pieces of work. With this realization, the graph below reflects the levels of popularity of different subcategories in the entirety of the Sci-fi published by Science Fiction World.

Figure 7. Percentage that varied subcategories occupy in all sci-fi works looked at

Figure 2. (repeated from the section above for comparison)

As can be observed from the above two graphs, certain CS-related sci-fi topics appear to change direction once they are looked at within the entirety of sci-fi. For example, instead of the relatively stable line representing it in Figure 2, the topic CNC has a slightly increasing trend in Figure 7. This change brings the trend of CNC topic popularity in Sci-fi in line with that in CS publications, as both present a slight rise. However, this raises the question of which of the above graphs, based on two different calculations, should be used in comparison with data from CS publications. As Figure 2 was already used for discussion in the previous section, this project’s decision is clear. Percentages calculated over all Sci-fi works already determined to be CS related are used, since the restriction to ‘sci-fi pieces already deemed CS related’ ensures that changes in other topics irrelevant to the issue being investigated do not affect the results.

In other words, if calculations based on the entirety of the Sci-fi works in the chosen magazine were used, then many more irrelevant factors, such as a recent rocket launch giving the Space theme a great boost, might affect the results. Nevertheless, Figure 7 still provides useful information despite introducing more interfering factors into the investigation than needed.
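The difference between the two calculations can be sketched in a few lines of Python. The counts are invented for illustration only and do not come from this project's dataset; the names `share` and `toy_counts` are hypothetical.

```python
def share(topic_count, denominator):
    """Percentage of works on a topic relative to the chosen denominator."""
    return 100.0 * topic_count / denominator


# Invented counts, NOT the thesis data:
# (year, works on one subcategory, CS-related works, all sci-fi works)
toy_counts = [
    (2009, 2, 20, 100),
    (2011, 3, 30, 100),
    (2013, 4, 40, 100),
]

# Denominator 1: only works already coded as CS related.
within_cs = [share(topic, cs) for _, topic, cs, _ in toy_counts]

# Denominator 2: every sci-fi work in the sample.
within_all = [share(topic, total) for _, topic, _, total in toy_counts]

for (year, *_), cs_pct, all_pct in zip(toy_counts, within_cs, within_all):
    print(f"{year}: {cs_pct:.1f}% of CS-related works, {all_pct:.1f}% of all works")
```

In this toy case the subcategory looks flat when measured within CS-related works but rising when measured within all works, simply because the number of CS-related works itself grows. Restricting the denominator to CS-related works keeps such shifts in the overall theme mix out of the comparison.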

To maintain the focus of this project, other possible conclusions, or perhaps less academically relevant facts, that could be found with the dataset made available by this study are not pursued here. For example, one could draw conclusions about different Chinese Sci-fi authors’ favorite topics using the dataset. The initial intention behind making interactive data representations is exactly to enable more to be done with the collected dataset. The next chapter explains the other part of this study: the web data presentation.

CHAPTER 3: DATA REPRESENTATION

3.1 Intentions and Guidelines

From the very beginning, this project was intended to be a digital humanities project with a focus on representing the collected data digitally. Lisa Spiro states in This is Why We Fight that “the digital humanities seeks to push the humanities into new territory by promoting collaboration, openness, and experimentation” (Spiro, P23), and this study was intended to be a digital humanities project exactly for the three qualities of work that Spiro argues DH strives for. As mentioned in the previous section, many other possible conclusions might be drawn from the dataset used for this project, hence the decision to make the entire dataset public. Overall, the hypothesis and conclusions presented in this thesis are only one possibility realized from the published dataset.

Besides making the entire dataset available, a digital experience was also designed and implemented in an effort to enable viewers to come up with their own conclusions, based on the available data, about the hypothesized relationship between sci-fi imaginations and actual computer science research. It is believed that a more accessible experience may enable wider reading of the results. In addition, the personal bias of the researcher in interpreting the data could be avoided to some extent if viewers are able to see certain summaries and come up with their own varied interpretations. During the UX design process, the design guidelines below were followed to support the design goals of simplicity, objectivity, and a gamified and fun sci-fi vibe:

a) Employ as little interpretational text as possible, but provide all available information on how a graph is produced technically;

b) Use fewer words and more visual elements;

c) Explore Sci-fi illustrations to select the color scheme and art style for the online data representation;

d) Give the web presentation a clear structure, so that viewers always know what they are seeing;

e) Provide ample means to connect to the entire database or this write-up report in case viewers need more information;

f) Keep the interface simple and intuitive, aiming for high learnability and memorability.

3.2 Design Process

The first decision to be made about the web experience was the information architecture. By the time the design phase started, it was already clear that the data to be used for the online experience would include, but not necessarily be limited to: the comparison between each category’s popularity trends in Sci-fi and in CS publications; topic popularity ranking in Chinese sci-fi; topic popularity ranking in Chinese CS scholarly journals; general topics’ popularity ranking in Sci-fi; and the comparison of each category’s percentage in different publications.

Therefore, based on the known data, two information architecture options were drafted, as seen below.

Figure 8. Information Architecture plan A

Figure 9. Information Architecture plan B

The information architecture presented in Figure 8 was eventually employed for the web presentation. The decision was made to maximize the simplicity of each individual page. Although the architecture presented in Figure 9 provides a simple home page, and can perhaps provide a clear structure by separating Sci-fi and CS publications, each page on its second level would have to be packed with a large amount of information. With the architecture in Figure 8, though there are more options that users need to choose from, the structure enables them to focus on the data of one subcategory first and possibly synthesize everything in the final “summary” section.

Some limited prototyping was carried out after preliminary design and after project implementation. The purpose of prototyping was mainly to explore different layout possibilities rather than to test functions or button locations, since no nontraditional functions are employed and both the header and footer bars are modeled after established practices. The layout prototypes are shown below.

Figure 10. Homepage Prototype

Figure 11. Individual subcategory page prototype

Several design features became clear during the prototyping phase. The first decision was to employ dark blue as the overall background, which is a relatively safe choice for web design in general. However, it was chosen for this project not for being conventional, but rather for the associations that the color might have: deep oceans, outer space and the great unknown that those spaces represent. Dark blue was thus chosen to give the digital experience an ambience that may help bring viewers into the world of Sci-fi and of technological exploration. To go along with the main color choice, pastel colors with neon/luminous outlines were chosen to represent each of the subcategories. The reasons behind this color choice include enabling the icons representing subcategories to stand out, as well as letting neon/luminous colors, which are commonly associated with sci-fi (cyberpunk, to be exact), further enhance the ambience of the digital experience.

Not many major changes were made from these prototypes during project implementation. However, the background and text colors in the individual subcategory pages were reversed. The main reason behind this change is to improve readability, which is especially important when the function of these pages is to provide viewers with informative texts and graphs. As a result, a transition animation was implemented for the change from the homepage to any individual subcategory page. Both the transition animation and the more conventional black-text-on-white-background design were made in the hope of providing viewers with a smooth and intuitive experience.

3.3 Design Testing and Evaluations

With the limitations imposed by COVID-19, user testing for the digital experience was conducted with only two participants, with whom I share my living space. Ideally, the digital experience would be tested further with a more varied participant group. Nevertheless, below is a summary of the available user testing results.

User Information:

User A: Female, age 23, major in Jazz music. Only really goes online with mobile devices. Computers and tablets are generally used once per week or less. A lot of mobile experiences though. No prior exposure to digital humanities projects, sci-fi, CS related materials or data interpretation projects.

User B: Female, age 23, major in Math and Business. Online experiences include mobile and desktop; digital games player. Certain level of prior exposure to sci-fi. Certain level of training in Computer Science, especially in modeling using digital software. Experiences in both conducting and reviewing data-based projects.

User Testing Task Menu:

a) Please find out what this website is trying to do. What is this?

b) Please find out why this website is created.

c) Please find out how popular the sci-fi themes space, time and CS were throughout the years 2009 to 2013.

d) Please find out if works about AI & Robots & Cyborgs are more popular in Sci-fi or in CS publications.

e) Please find out if there’s a relationship between the popularity of the topic Computer network in CS publications and in Sci-fi.

f) What can you say about the subcategory of CS: Information Security and Privacy Protection?

Below are the transcripts from the user testing:

User A:

1) Time used: 1min. Result: gave up (Note: language issue, not necessarily to do with the design)

2) Time used: 10s. Result: gave up (same as above)

3) Time used: 1min. Result: correct answers given. Note: help was provided when the participant was trying to figure out legends

4) Time used: 1min 20s. Result: correct answer given. Note: help provided for legend understanding.

5) Time used: 30s. Result: “no relationship” (translated answer given here)

6) Time used: 30s. Result: “not popular in neither category” (translated answer given here)

User B

1) Time used: 1min 2s. Result: “find relationship between sci-fi and computer science.”

2) Time used: 30s. Result: “Because it’s important to know for both of the fields? Or there are not a lot of available studies yet.”

3) Time used: 28s. Result: correct answers given.

4) Time used: 30s. Result: correct answer given.

5) Time used: 30s. Result: “Can’t be concluded? Oh, only thing definite is that it’s popular in one genre but not in the other. But can’t see relationship between the two.”

6) Time used: 10s. Result: “It’s very not popular in CS publications”

Though an accurate evaluation of the digital experience is hard to derive from only two user tests, if a conclusion has to be drawn from the above transcripts, I would suggest that the website serves its original purpose, as the answers given were reasonably close to the expected ones. Considering also that one intention behind the digital experience is to enable different and open-ended interpretations, the varied answers collected could suggest possible success in achieving that goal. Moreover, the “gave up” results, which in normal situations would suggest low understandability or usability of the product, were in this case caused by the test participant’s limited mastery of English. Participant A was successful in navigating to the pages explaining the project’s purposes and motivations, which suggests usability, but was unable to interpret all of the text. This setback with user A actually provided inspiration for future work: offering bilingual options for the explanations, or substituting those paragraphs with more graphic representations. Furthermore, the fact that user A failed to understand the text-heavy explanatory pages but succeeded in interpreting the data seems to suggest that the goals of simplicity, being graphic-heavy and presenting no unnecessary text were achieved.

FINAL NOTES

In conclusion, this thesis walks its readers through the entire process of making the thesis project: from the study of related theoretical backgrounds at the very beginning, to identifying the hypothesis and methods, then the data source and data collection methods, then the analysis and discussion of results, and finally the envisioning, design and implementation of the data visualization. The hope in completing this project is, as mentioned at the very beginning and in the introduction of the digital experience, to facilitate understanding of the relationship between imaginations and science, and to promote open, collaborative and easily accessible digital humanities studies.

Truthfully, there are many areas in which this thesis project could be improved. Possible improvements include, but are not limited to, collecting data on Chinese Sci-fi with a larger sample that covers a larger number of years, completing more iterations of the web-based data representation based on more comprehensive user testing, and providing more data interpretations, such as the mentioned relationship between author and theme. Under the right circumstances, it would be great to improve the project in these areas. However, evaluating the project as it is now, it seems to have provided useful information that leads to several conclusions about the hypothesis as well as additional possible conclusions. Based on the analysis provided in the previous chapters, these conclusions also carry a certain degree of reliability.

Overall, this project shows the lack of a clearly visible relationship between topic popularity levels in Chinese science fiction and in Chinese CS scholarly publications. This conclusion suggests that imaginations in Sci-fi cannot be proved to have a predictive impact on actual technology innovations, and that the two (sci-fi and CS academia) might be less mutually influential than some scholars working on imaginations and science have thought. However, although the possibility of sci-fi having large-scale influence over CS academic explorations is not supported by this study, cases where sci-fi has immense influence over a small number of CS researchers can still perfectly well exist and establish a relationship between imaginations and science.

APPENDIX A: LIST OF LINKS TO DIGITAL COMPONENT OF THE PROJECT

1. Link to the web-based interactive data representation: http://lizao-wang.com/trendsearch/

2. Link to shared data set: https://drive.google.com/drive/folders/1auKRRrpNSExofAQuwj4Mp0OWlnSMaWGy?usp=sharing

APPENDIX B: LIST OF FIGURES

Figure 1. Topic popularity/occurrence in Chinese CS scholarly publications

Figure 2. Topic popularity/occurrence in Chinese CS-themed Sci-fi

Figure 3. Average topic popularity ranking in Sci-fi and CS publications

Figure 4. CNC category popularity in Chinese sci-fi and CS Journals

Figure 5. Topic citation frequency in Chinese CS Journals

Figure 6. Sci-fi general topic popularity in all Chinese sci-fi

Figure 7. CS related subcategory topic popularity in all Chinese sci-fi

Figure 8. Information architecture plan A

Figure 9. Information architecture plan B

Figure 10. Homepage prototype

Figure 11. Individual page prototype

APPENDIX C: CITATIONS

Primary Sources

Magazines

He, Shibo; Han, Song; An, Long; Ye, Xingyu; Zhai, Ren; Jin, Mai; Ye, Xingxi; Jiang, Bo etc. Science Fiction World (科幻世界), Science Fiction World Magazine (科幻世界杂志社), 2009, 8-12

Jiang, Bo; Yang, Er; Yin, Tao; Zhang, Guoxin; Qiu, Daoyu etc. Science Fiction World (科幻世界), Science Fiction World Magazine (科幻世界杂志社), 2010, 1-12

Wang, Jinkang; Ye, Xingxi; Han, Bing; Hao, Jingfang etc. Science Fiction World (科幻世界), Science Fiction World Magazine (科幻世界杂志社), 2011, 1-12

Cheng, Jingbo; Mi, Ze; Liu, Weijia etc. Science Fiction World (科幻世界), Science Fiction World Magazine (科幻世界杂志社), 2012, 1-12

Zhang, Ran; Xie, Yunning; DianZiQiShi etc. Science Fiction World (科幻世界), Science Fiction World Magazine (科幻世界杂志社), 2013, 1-12

Yao Mo (杳漠), Our Earth (我们的地球), Science Fiction World (科幻世界), 2012-1

News Articles, Blogs and Encyclopedia Entries

Blazeski, Goran, Inspired by Star Trek: Martin Cooper invented the mobile phone in 1973, The Vintage News, 2017, https://www.thevintagenews.com/2017/02/25/priority-martin-cooper-invented-mobile-phone-1973-says-inspired-captain-kirks-gold-flip-top-communicator-star-trek-2/

Eschrich, Joey, Shaping the Future Through Sci-fi at ASU, Arizona State University Center for Science and Imagination, 2014, https://csi.asu.edu/press/news/shaping-the-future-through-sci-fi-at-asu/

Luo, Xin, Gao, Yang, Chinese Science Literature: Reality being more sci-fi than sci-fi(中国科幻文学:现实比科幻更“科幻”), PengPai News(澎湃新闻), 2016

Zhang, Shuzhou, President of Science Fiction World Forced To Resign (科幻世界社长被免职), Southern Metropolitan Daily (南方都市报), 2010, https://web.archive.org/web/20131216070510/http://gcontent.oeeee.com/f/8a/f8a7e9f5efd72a91/Blog/da9/6b993f.html

The Encyclopedia of Science Fiction, http://www.sf-encyclopedia.com/category/themes

LongXingZhiMengDeMeng (龙星之梦的梦), King of Science Fiction (科幻大王), Baike (百科), https://baike.baidu.com/item/%E7%A7%91%E5%B9%BB%E5%A4%A7%E7%8E%8B/566821?fr=aladdin

LiangTaiDeXiaoKeAi (凉太的小可爱), Science Fiction World (科幻世界), Baike (百科), https://baike.baidu.com/item/%E9%93%B6%E6%B2%B3%E5%A5%96

Sahuls, World’s Science Fiction Expo (世界科幻博览), Baike (百科), https://baike.baidu.com/item/%E7%A7%91%E5%B9%BB%E4%B8%96%E7%95%8C/298990?fr=aladdin

Secondary Sources

Reviews

Huang, Yingying, The Reincarnated Giant: An Anthology of Twenty-First-Century Chinese Science Fiction, Chinese Literature in Review, Taylor & Francis Ltd. 2019

Emerging Technology from the arXiv, When Science Fiction Inspires Real Technology, MIT Technology Review, 2018

White Paper

Internet Society of China (中国互联网协会), China Network Information Center, Report on Internet Development in China (中国互联网发展报告), 2014

Journal Articles

Johnson, Brian David, Science Fiction Prototyping: Designing the future with Science Fiction, Synthesis Lectures on Computer Science, Morgan & Claypool Publishers, 2011

Philipp Jordan, Omar Mubin etc. Exploring the Referral and Usage of Science Fiction in HCI Literature, arXiv: 1803.08395, 2018

Meng Xiaofeng (孟小峰), Fan Zhuoya (范卓雅), Su Hanting (粟寒婷), Analysis on the Citation of Chinese CS Publications (基于学术空间的计算机中文期刊引文分析), China Computer Federation Communications (中国计算机学会通信), 2019

Song, Mingwei, “INTRODUCTION: Does Science Fiction Dream of a Chinese New Wave?”, The Reincarnated Giant: An Anthology of Twenty-First-Century Chinese Science Fiction, 2018

Books

Spiro, Lisa, This is Why We Fight, Debates in Digital Humanities, Edited by Matthew K. Gold, University of Minnesota Press, 2012, P23