Visualising Gender Inequality in the Hollywood Film Industry


Sophie Ensing 10751297

Bachelor Thesis University of Amsterdam 17/07/2017

Supervised by Frank Nack

Abstract

This study looks at the multifaceted gender inequality in the Hollywood film industry and how to visualise it in a stimulating way. A literature review describes the multifaceted gender inequality in the film industry and the importance of interaction in visualisations, and introduces the parallel coordinates technique. This technique was used to create an interactive visualisation. The visualisation was evaluated with three participants from the study fields of Media & Culture and Gender Studies. The evaluation showed that the parallel coordinates technique is a stimulating way to visualise the gender inequality in the Hollywood film industry. Future work should be directed at developing an adaptable version of the visualisation.

Contents

1 Introduction

2 Literature review
2.1 Gender inequality
2.1.1 Representation of women on-screen
2.1.2 Representation of women off-screen
2.1.3 Box office and critical acclaim
2.1.4 Conclusion
2.2 Information visualisation
2.2.1 Interaction
2.2.2 Visualisation techniques
2.2.3 Parallel coordinates
2.2.4 Conclusion

3 Methodology
3.1 Data collection
3.1.1 Start data
3.1.2 OMDb
3.1.3 IMDb
3.1.4 Gender classification
3.2 Visualisation
3.2.1 Design of the visualisation
3.2.2 Prototype testing
3.2.3 The visualisation in D3

4 Evaluation
4.1 Tasks
4.2 Participants
4.3 Experiment set-up
4.4 Results/Discussion
4.4.1 Identify tasks
4.4.2 Locate tasks
4.4.3 Determine tasks
4.4.4 Compare tasks
4.4.5 Infer tasks
4.5 Limitations

5 Conclusion
5.1 Future Work

6 Appendix

1 Introduction

Gender inequality has been an important topic in Western society for decades. Unfortunately, gender equality has not yet been achieved in many areas, and the film industry is one of them. In 1985 Alison Bechdel described a test in one of her comic strips. The Bechdel test is still used today to classify films. The test is fairly simple and has three conditions: a film needs at least two named women in it (1), who talk to each other (2) about something besides a man (3). The test has been embraced by some researchers as a detector of male bias in films (Agarwal et al., 2015). The test seems easy to pass, but nearly 50% of the films on bechdeltest.com, a crowdsourcing site for Bechdel test results, do not pass it (Hickey, 2014). Not passing the test is a sign of inequality. If the first condition is not met, the cast is not equally made up of men and women. Failing the second condition means the women in the film do not talk to each other, so there is never a scene with interaction between women. The last condition is also important, because a film that fails it contains conversations that are strongly male-oriented. Not meeting these criteria is a sign of inequality, because roughly half of the world population is female and women make up half of the film audience, yet they are not equally represented in this important medium (Smith et al., 2016).

In 2016 the Media, Diversity & Social Change Initiative published an Inequality Report in which 800 popular films from 2007 to 2015 were analysed (Smith et al., 2016). In these films the representation of gender, ethnicity and the Lesbian, Gay, Bisexual and Transgender (LGBT) community was measured. The overall conclusion is that films are not representative of all genders and ethnicities. Both this research and research by the Geena Davis Institute on Gender in Media show that men get twice as much screen and speaking time as women (Narayanan and Heldman, 2017).
The inequality also applies to the film industry off-screen. Of the people working behind the scenes, only 19% are female. Among the directors alone, only 4.1% were female. Although these studies are among the largest on this subject, they do not provide an overview of the multifaceted gender inequality in the film industry. They do contain large sets of graphics and tables, but these figures are not connected in a comprehensive way. The data is too fragmented to draw conclusions from the visualisations or to connect variables.

The goal of this research is to provide an overall view of the multifaceted inequality in the film industry. This will be presented in the form of a visualisation that is stimulating for the user. Stimulating in this context means that the user reacts to the visualisation and is stimulated by the interaction with it. The target group for the visualisation is people from the fields of Media & Culture and Gender Studies. These fields are related to gender inequality and the film industry, so a tool like this could provide insights for their research from actual data on the topics. A data set of 2000 films from research on the gender divide in Hollywood by The Pudding, a weekly journal of visual essays, in combination with data from the Open Movie Database (OMDb) and the Internet Movie Database (IMDb), will be used to create the visualisation.

The research question to be answered is:

What is a stimulating way to visualise the multifaceted gender inequality in the Hollywood film industry?

To provide an answer to the research question, two sub questions will be answered:

1. What are the effects of interaction in a visualisation?
2. What is a good visualisation technique for complex data?

In chapter 2, related work will be presented to provide an overview of the gender inequality in the film industry. The scope of the data for the visualisation will be based on the related work on gender inequality. There will also be a literature review on visualisation techniques and user interaction that will help to answer the sub questions. The methodology chapter will describe the data set and the visualisation techniques. This section will also contain a description of how user tests were performed on the first prototype. Chapter 4 will describe the evaluation process and there will be a discussion of the results. The final chapter presents the conclusion and proposes directions for future research.

2 Literature review

2.1 Gender inequality

2.1.1 Representation of women on-screen

As mentioned, many films do not pass the Bechdel test. Research on 1,600 films shows that 53% of the films pass (Hickey, 2014). About 10% of the films fail the first criterion, which means there are fewer than two named women in the film. The largest group of failing films, almost 30% of all films, fails the second criterion, and 10% fail the third, meaning the women in the film mainly talk about men. The second and third criteria say something about the amount of dialogue between women and its subject. Researchers at The Pudding, a journal of visual essays, responded to the Bechdel test with a more in-depth analysis of the dialogue in films (Anderson and Daniels, 2016). The study was inspired by critique of the Bechdel test, because the test has limitations. The film Gravity (2013), for example, does not pass the Bechdel test, because there is only one named woman in the film. It does not seem fair for this film to fail, because it only has three named characters and Sandra Bullock has the leading role. This example shows the scaling problem of the Bechdel test: with so few characters, where the character with the most screen time is a woman, the Bechdel test does not work. The research on film dialogue by The Pudding shows that 1500 films contain 60% or more male dialogue and 170 films contain 60% or more female dialogue. This distribution is not equal and shows gender inequality better than the Bechdel test, which 53% of the films still pass.

Another measure of equality is the Geena Davis Inclusion Quotient. The GD-IQ is a software tool developed by the Geena Davis Institute on Gender in Media for analysing audio and video content (Narayanan and Heldman, 2017). The GD-IQ, introduced in a report by the Geena Davis Institute, is able to measure screen and speaking time by incorporating Google’s machine learning technology and audio-visual processing technology of the University of Southern California. The GD-IQ was used to look at gender representation by analysing screen and speaking time (Narayanan and Heldman, 2017). Most research on the subject of gender representation in film had been done manually until then, but this tool automated the process. In the report, the top 100 grossing films of 2014 and 2015 were analysed. The conclusion was that female characters are still underrepresented in popular films. The female characters that are present have less screen and speaking time than the male characters in the film (Narayanan and Heldman, 2017).

2.1.2 Representation of women off-screen

The gender inequality reaches further than the actors and actresses. The research by the Media, Diversity, & Social Change Initiative shows an imbalance behind the scenes of films, where only 4.1% of the directors of the analysed films are women (Smith et al., 2016). Other studies show similar numbers about the inequality within the industry. The ratio of men to women working on films is five to one, and in the top 250 films of 2012 only 9% of the directors were women (New York Film Academy, 2013). The percentages in the two studies differ slightly because the populations differ, but both numbers show inequality. The greatest imbalance is among cinematographers, of whom only 2% are women (Lauzen, 2016). It is interesting to note that when a woman is directing, the share of female characters increases by 10.6% (New York Film Academy, 2013).

2.1.3 Box office and critical acclaim

One measure of success for a film is the box office. An interesting question to ask is whether there is a difference in box office between films that offer an equal representation of gender on- and off-screen and films that do not. There appears to be a link between films with a strong female presence and a lower box office (Lindner et al., 2015). According to Lindner et al., this is not caused by a so-called downstream effect, which would mean that the public is less interested in films featuring women. The cause seems to be an upstream effect. The problem starts with the budget of a film, which is one of the best predictors of its box office (Lindner et al., 2015). The research shows that films that pass the Bechdel test and have a strong female presence have a lower budget than films that do not pass the test and have less of a female presence. Another study shows that films that pass the Bechdel test have a 35% lower budget than films that do not (Hickey, 2014). To find out why films with more female representation were getting less funding, Hickey reached out to several producers, journalists and entrepreneurs in Hollywood. The most common belief was that films featuring women do not do well internationally, which is an important factor for foreign pre-sales. They also pointed to the assumption that American audiences prefer films with a male leading role.

Other measures of success are ratings by the public and critics. The Internet Movie Database (IMDb) is a large database of films, with their cast and crew. Users can also write reviews and rate films on the website. For this research, the assumption will be made that the IMDb rating reflects the public opinion. Critics rate films in several other places. Almost all reviews by critics from different papers, websites and other sources are grouped and combined into one rating between 0 and 100 on the website Metacritic¹. This rating is also shown on the IMDb page of a film. Critics have several ways of making judgements about a film. They can either influence or predict (Lindner and Schulting, 2017). The influencer role means they look at the artistic aspects of a film, like the technique, meaning and performance of the actors. The predictor role is more about commercial aspects that are likely to make a film successful or popular, like popular actors or the familiarity of the plot. In both roles the critics have influence on the audience, but they do not appear to have a gender bias (Lindner and Schulting, 2017). To check this and compare the ratings on Metacritic to the IMDb rating, both variables will be added to the data set.

2.1.4 Conclusion

Before choosing a fitting visualisation method, it is important to determine the scope of the data. The data that will be included in the visualisation is based on the related work on gender inequality in the film industry (Hickey, 2014; Narayanan and Heldman, 2017; Smith et al., 2016; Lauzen, 2016; New York Film Academy, 2013; Lindner et al., 2015; Lindner and Schulting, 2017). To give more context about the film, the year and genre will also be added to the data set. The variables in the data set will be:

• Percentage of female directors
• Percentage of female producers
• Percentage of female writers
• Percentage of female cast
• Percentage of female words
• IMDb rating
• Metacritic rating
• Box Office
• Year of the film
• Genre of the film

¹ Metacritic rating. Available at: http://www.metacritic.com/

2.2 Information visualisation

2.2.1 Interaction

Information visualisation is a way to communicate abstract data through a visual interface. In the field of information visualisation, interaction is a crucial part. An information visualisation without interaction is like a static image or an automated animated image, because the user cannot manipulate the visualisation (Yi et al., 2007). Static images can have analytic value, but their value and usefulness are limited. Defining what interaction means in the field of information visualisation is still a hard task. According to Yi et al. (2007), interaction techniques in information visualisation can be viewed as features that enable users to manipulate and interpret the representations. A static image, according to this point of view, does not enable this interaction.

Research has shown that interaction is an important part of the process of gaining insight with the use of a visualisation (Yi et al., 2008). The ability to filter lets the user explore a large amount of data and leave out data they might not need. Adjusting and detecting patterns are ways to formulate a hypothesis and test expectations through interaction with the data. The research also shows that users enjoy the ability to interact with the data.

Ben Shneiderman has defined seven tasks the user should be able to perform in a visualisation (Shneiderman, 1996). The seven tasks are: overview, zoom, filter, details-on-demand, relate, history and extract. The user should be able to get an overview of the entire data set, but also be able to zoom in on certain items. It should be possible to filter out data and get details on an item or a group on demand. History is important to keep track of the actions and allow the user to go back to the original state of the visualisation. The user should also be able to relate, that is, to find relationships among items. The last task is extraction, which allows the user to extract sub-collections of the data.

2.2.2 Visualisation techniques

There is no consensus on which information visualisation technique is the best solution for visualising a data set with several dimensions (Siirtola and Räihä, 2006). A survey of powerful visualisation techniques by Heer et al. describes various visualisation techniques and their best practices (Heer et al., 2010). The research describes several visualisation techniques for different types of data. The best fitting category for the data set in this study is the statistical distributions category. This category presents visualisation techniques best suited for exploratory data analysis. The goal of exploratory data analysis is gaining insight into the distribution of the data. Most of the presented techniques, like stem-and-leaf plots, Q-Q plots and scatter plots, are able to visualise two dimensions in one graph (Heer et al., 2010). Another technique described in the research is the parallel coordinates technique developed in 1990, shown in Figure 1 (Inselberg and Dimsdale, 1990). This technique is widely used to visualise multidimensional information and has the ability to show more than two dimensions in one graph. This technique allows all variables of the data set in this study to be visualised in one visualisation, instead of just two of the variables.

2.2.3 Parallel coordinates

Visualisations using parallel coordinates can look overwhelming at first, with a lot of overlapping lines. When used correctly, the technique can bring meaningful patterns to light through interaction with the visualisation (Few, 2006). Parallel coordinates look quite similar to normal line graphs, but should be read differently. The lines do not display a change through time, as in a line graph, but connect a series of values. These values belong to different variables, which could represent multiple aspects of an entity, like a film. With this technique, the graphs can become cluttered and harder to read when the data set is very large. It also takes practice to use the technique properly, and it might be best suited for experts (Shneiderman, 1996). According to Siirtola and Räihä, the problem is not the complexity of the graph, but the lack of interaction (Siirtola and Räihä, 2006). They discuss the possibilities of parallel coordinates and how to interact with them. The initial aim of the technique was to represent numerical data with equal scales on all axes. It was later also used with different scales on the axes to represent ordinal and even nominal attributes in a similar way. Figure 1 shows an example of a parallel coordinates graph with several attributes of a car, like the power and the production year. The axes all have different scales and should be read differently. Without interaction it is not an easy task to discover clusters or connections between variables.

Figure 1: Example Parallel coordinates (Siirtola and Räihä, 2006)
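Because every axis in a graph like Figure 1 has its own scale, each value has to be placed relative to the range of its own axis before the line of an item can be drawn. The following is a minimal sketch of that idea in Python, not the implementation behind the figure; the function names, the attribute names and the sample values are hypothetical.

```python
def axis_position(value, lo, hi):
    """Map a value to a relative position in [0, 1] on an axis spanning lo..hi."""
    if hi == lo:  # degenerate axis: every item has the same value
        return 0.5
    return (value - lo) / (hi - lo)


def to_polyline(item, axes, extents):
    """Return (axis index, relative height) points for one item."""
    return [
        (i, axis_position(item[name], *extents[name]))
        for i, name in enumerate(axes)
    ]


# Hypothetical sample data: two cars with attributes on very different scales.
cars = [
    {"power": 100, "year": 1975, "weight": 2000},
    {"power": 200, "year": 1980, "weight": 3000},
]
axes = ["power", "year", "weight"]
extents = {
    name: (min(c[name] for c in cars), max(c[name] for c in cars))
    for name in axes
}

polylines = [to_polyline(car, axes, extents) for car in cars]
```

Each polyline holds one point per axis, so an item with ten variables becomes a line with ten vertices, regardless of the units of the individual variables.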

A useful interaction technique to make sense of the data in a parallel coordinates graph like Figure 1 is brushing. This technique enables the user to zoom in on items and relate them to each other, which are important interactions (Shneiderman, 1996). Figure 2 shows a common brushing technique, where a part of an axis is selected to make a selection of certain lines/items. The lines that are not selected turn into a different colour to distinguish the lines that are selected from the lines that are not. It is also possible to select a single line, which gives the user details-on-demand. It gets interesting when multiple brushes, or multiple selections on axes, are used at once. This interaction with the data makes it possible to discover clusters and leave irrelevant data behind (Siirtola and Räihä, 2006).

Figure 2: Brushing example (Siirtola and Räihä, 2006)
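The selection logic behind brushing can be illustrated with a short Python sketch. It assumes that a brush is an inclusive value range on one axis and that an item stays selected only if it lies inside every active brush; the function `brushed`, the axis names and the sample items are hypothetical and only serve as an illustration.

```python
def brushed(items, brushes):
    """Return the items whose values fall inside every active brush.

    `brushes` maps an axis name to an inclusive (low, high) range.
    """
    return [
        item for item in items
        if all(low <= item[axis] <= high for axis, (low, high) in brushes.items())
    ]


# Hypothetical items; the axis names are illustrative only.
items = [
    {"name": "A", "power": 90, "year": 1972},
    {"name": "B", "power": 150, "year": 1979},
    {"name": "C", "power": 160, "year": 1971},
]

# One brush on a single axis.
one_brush = brushed(items, {"power": (140, 200)})

# Two brushes at once: an item must satisfy both ranges.
two_brushes = brushed(items, {"power": (140, 200), "year": (1975, 1982)})
selected_names = [item["name"] for item in two_brushes]
```

With one brush, items B and C remain; adding the second brush narrows the selection to B, which is how combined brushes help to isolate clusters.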

2.2.4 Conclusion

Two sub-questions were formulated to support the research question. Research shows that interaction is crucial in information visualisations and that interactive visualisations have more analytic value than a static image (Yi et al., 2007). Users also enjoy interacting with data, and interaction is necessary in the process of gaining insight (Yi et al., 2008). There are several tasks a user should be able to perform: overview, zoom, filter, details-on-demand, relate, history and extract (Shneiderman, 1996). These tasks support the interactive process and will be used to create the visualisation. To answer the second question, it is important to mention that there is no single best visualisation technique (Siirtola and Räihä, 2006). One useful technique for exploratory data analysis, however, is the parallel coordinates technique first introduced by Inselberg (Heer et al., 2010; Inselberg and Dimsdale, 1990). The technique can be complex and needs some practice, but with the right interactions, like brushing, it allows the user to perform some of the important tasks formulated by Shneiderman (Shneiderman, 1996). The interactions make it possible to discover clusters and leave data behind (Siirtola and Räihä, 2006). For these reasons the parallel coordinates technique was chosen to visualise the multifaceted gender inequality in the film industry, because it makes it possible to combine all variables from the data set in one visualisation.

3 Methodology

Chapter 2 described the importance of interaction and showed that the parallel coordinates technique is a fitting visualisation technique for complex data. To answer the main research question, an interactive parallel coordinates visualisation will be made and evaluated. The participants for the evaluation process were students from the study fields of Media & Culture and Gender Studies, because they are the target group of the visualisation. This chapter describes how the data was collected, how the prototype was made and tested, and how the final visualisation was made.

3.1 Data collection

3.1.1 Start data

The Media, Diversity & Social Change Initiative and the Geena Davis Institute on Gender in Media were contacted for this research to get access to a large data set to use for the visualisation (Narayanan and Heldman, 2017; Smith et al., 2016). Their data sets were considered because they contained many of the variables needed for the visualisation. It was not possible to obtain this data, but the data set from the research on dialogue in films by The Pudding was available on GitHub² and suitable for this research (Anderson and Daniels, 2016). This data set consisted of information about 2000 films. The information about one film was divided over three csv-files. The first file contained the film, the IMDb-id, the box office and the production year of the film. The second file was a list of film characters with their gender and their number of words in the script. This list only contained characters with at least one hundred words. The third file was used to map the characters to the id of the actor on IMDb and the film they appeared in. The three csv-files were read with Python³ and then transformed into pandas data frames. Pandas⁴ is a library for analysing and manipulating data structures in the programming language Python.

The data set did not contain all the data necessary to make the visualisation. The Metacritic rating, the IMDb rating and the names of the actors, directors, writers and producers with their gender had to be added from other sources. Most of the data was obtained through the Open Movie Database (OMDb) and the rest was scraped from IMDb. IMDb was chosen as a source because of its reliability. The information on IMDb is verified by studios, filmmakers and other people in the industry. IMDb uses as many sources as possible and its data goes through consistency checks. Not all data was available for every film. Sometimes there was no Metacritic score or the box office was unknown.
When one or more of the variables of a film was missing, this film was dropped from the dataset. This resulted in a dataset of 1285 films.
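The loading, joining and filtering steps described above can be sketched with pandas as follows. This is an illustration under stated assumptions rather than the script used in this study: the column names, the sample rows and the second table (standing in for ratings added from another source) are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-ins for csv-files; the real column names may differ.
films_csv = io.StringIO(
    "imdb_id,title,year,box_office\n"
    "tt0000001,Film A,2010,100\n"
    "tt0000002,Film B,2012,\n"
    "tt0000003,Film C,2014,250\n"
)
ratings_csv = io.StringIO(
    "imdb_id,imdb_rating,metascore\n"
    "tt0000001,7.5,80\n"
    "tt0000002,6.1,55\n"
    "tt0000003,8.0,\n"
)

films = pd.read_csv(films_csv)
ratings = pd.read_csv(ratings_csv)

# Join on the IMDb-id so every value ends up with the right film.
merged = films.merge(ratings, on="imdb_id", how="left")

# Drop every film for which at least one variable is missing.
complete = merged.dropna().reset_index(drop=True)
```

In this sample, Film B lacks a box office value and Film C lacks a Metacritic score, so only Film A remains in `complete`.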

3.1.2 OMDb

OMDb has an API to obtain data on any film available on IMDb. This API was preferred over scraping everything from IMDb, because it was very fast. It returns a JSON object with the title, ratings, director, writer, genre and more. Films can be looked up using an API key and the IMDb-id of a film. The IMDb-ids were available in the data set of The Pudding. These ids were used to get the genre, Metacritic rating, IMDb rating, directors and writers for every film. This information was added to the data frame of films, using the IMDb-id to match the data to the right film.
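A lookup of this kind can be sketched as below. The sketch only builds the request URL and parses a response, so it runs without network access; the helper names, the placeholder key and the shortened sample response are hypothetical, and the response keys shown are assumed to match those returned by OMDb.

```python
import json
from urllib.parse import urlencode

OMDB_ENDPOINT = "http://www.omdbapi.com/"


def build_omdb_url(imdb_id, api_key):
    """Build the request URL for one film, looked up by its IMDb-id."""
    return OMDB_ENDPOINT + "?" + urlencode({"i": imdb_id, "apikey": api_key})


def extract_fields(payload):
    """Keep only the fields that are added to the data frame."""
    data = json.loads(payload)
    return {
        "genre": data["Genre"].split(", "),
        "metascore": data["Metascore"],
        "imdb_rating": data["imdbRating"],
        "directors": data["Director"].split(", "),
        "writers": data["Writer"].split(", "),
    }


# Hypothetical, shortened response; a real response contains more keys.
sample = (
    '{"Title": "Film A", "Genre": "Drama, Romance", "Director": "Jane Doe", '
    '"Writer": "Jane Doe, John Roe", "Metascore": "80", "imdbRating": "7.5"}'
)

url = build_omdb_url("tt0000001", "YOUR_KEY")
fields = extract_fields(sample)
```

The ratings arrive as strings in this sample, so they would still need to be converted to numbers before being used as axes.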

² Film Dialogue Data set. Available at https://github.com/matthewfdaniels/scripts/
³ Python Software Foundation, version 2.7. Available at http://www.python.org
⁴ Python Data Analysis Library, version 0.20. Available at http://pandas.pydata.org

3.1.3 IMDb

The data set of The Pudding did not contain the names of the actors and producers of the films. This data was not available through OMDb, so it was scraped from IMDb using the Python library BeautifulSoup⁵. The IMDb pages were accessed through the IMDb-id of the films. BeautifulSoup parsed the HTML code of each page, and the elements containing the names were extracted from it. These names were added to the data frame.
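The extraction step can be illustrated with a small BeautifulSoup sketch. The markup below is hypothetical: the structure and class names of real IMDb pages differ and change over time, so the selector is an assumption that would have to be adapted to the actual page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded cast page.
html = """
<table class="cast">
  <tr><td class="name"><a href="/name/nm1">Actor One</a></td></tr>
  <tr><td class="name"><a href="/name/nm2">Actor Two</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Select the link inside every name cell and keep its text.
names = [a.get_text(strip=True) for a in soup.select("td.name a")]
```

The resulting list of names can then be stored in the row of the film with the matching IMDb-id.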

3.1.4 Gender classification

The genders of all characters were available through the data set of The Pudding. The assumption was made that this gender corresponds with the gender of the actor playing the character. The genders of the directors, writers and producers had to be determined using a gender detector in Python. The library classifies a person's gender by their first name. If the name is ambiguous, it is classified as unknown. The gender detector classified roughly 75% of the names correctly and the rest was classified as unknown. The gender of unclassified people was determined by hand to complete the data set. With this methodology it was not possible to take genders other than male or female into account for any of the categories. An example of the gender detector is outlined in Figure 3.

from gender_detector import GenderDetector

detector = GenderDetector('us')  # language
detector.guess('Amy')  # => 'female'

Figure 3: Python gender detector.

After all genders were registered in the data set, the percentage of women was calculated for every film for several categories. The percentages were calculated for the categories of cast, directors, writers and producers. The amount of dialogue for every character was represented in the data set of the Pudding by the number of words. To compare all films, this category was also transformed to a number that represents the percentage of spoken words by women. After classifying all genders and calculating the percentages, the data set was ready to use for the parallel coordinates visualisation. The final data frame consisted of seventeen columns. The ten columns shown in the example row in Figure 4 are the columns that were needed to create all the lines in the parallel coordinates plot. The other columns were year, names of the directors, names of the writers, names of the producers and the names of the cast. These columns were used to give context about the film and this information only appears after selecting a single film in the visualisation.
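The two kinds of percentages can be computed per film as in the sketch below. It is a minimal illustration with pandas, assuming one row per character; the column names, gender codes and sample values are hypothetical.

```python
import pandas as pd

# Hypothetical character rows; column names are illustrative only.
characters = pd.DataFrame({
    "film": ["Film A", "Film A", "Film A", "Film B", "Film B"],
    "gender": ["f", "m", "m", "f", "f"],
    "words": [300, 500, 200, 400, 600],
})

is_female = characters["gender"] == "f"

# Percentage of female cast: share of characters per film that are women.
pct_female_cast = is_female.groupby(characters["film"]).mean() * 100

# Percentage of female words: women's words divided by all words per film.
female_words = (
    characters["words"].where(is_female, 0).groupby(characters["film"]).sum()
)
total_words = characters.groupby("film")["words"].sum()
pct_female_words = female_words / total_words * 100
```

In this sample, Film A has one woman among three characters who speaks 300 of the 1000 words, so the two percentages differ, which is why both are kept as separate variables.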

⁵ Python Screen-scraping Library, version 4-4.6. Available at https://pypi.python.org/pypi/beautifulsoup4

Figure 4: Data frame row.

3.2 Visualisation

3.2.1 Design of the visualisation

In the visualisation, the parallel coordinates graph is the most important component. There also needed to be space for an introduction about the visualisation, details-on-demand and filtering to include the important tasks formulated by Shneiderman. The visualisation was divided into five blocks (Figure 5). The introduction in block 1 gives the user information about the gender inequality in the film industry and the use of the visualisation. Block 2 contains the parallel coordinates graph, where the y-axes are the columns from the data frame in Figure 4. If a line in the graph is selected, information about the film appears in block 3. In block 4 the user can select one or more genres to filter the lines in the graph. The last block contains a search bar with suggestions to look for a specific film.

Figure 5: Visualisation design.

The blocks were chosen with the seven tasks of Shneiderman in mind (Shneiderman, 1996). When the visualisation is opened, an overview of all the data is provided. To zoom in on specific data, the user can select a single line in the graph or search for a film through the search bar. This also provides details-on-demand, because more information about the film appears after selecting a line. The lines in the graph can be filtered by genre to reduce the amount of data and to compare films from different genres. To relate items to each other, the brushing function can be used. History was also implemented. After making a selection, a button with an ’x’ removes the selection, and a brush is removed by clicking the y-axis. These actions return the visualisation to its original state. The design in Figure 5 was translated to a prototype for testing.

3.2.2 Prototype testing

After choosing the visualisation method of parallel coordinates, a prototype was made in the prototyping tool Just in Mind⁶ (Figure 6). This prototype contained some of the basic interactions like filtering, the search function and clickable lines to get more information. Figure 6 shows the visualisation when a single film is selected in the parallel coordinates graph and additional information about the film is displayed at the bottom-left. To test the functionality of brushing, another prototype was made that was linked to the data (Figure 7). This could not be done in the prototyping tool, so it had to be tested separately. The second prototype was made with the JavaScript library D3.js⁷.

Figure 6: Prototype 1.

⁶ Just in Mind Prototyping Tool. Available at https://www.justinmind.com
⁷ Data-Driven Documents, version 4.9.1. Available at https://d3js.org

Figure 7: Prototype 2.

To evaluate the two prototypes, three students from the fields of Media & Culture and Gender Studies took part in testing. According to research by Nielsen & Landauer, the optimal number of participants for a usability evaluation is between three and five (Nielsen and Landauer, 1993). This number is based on an optimal benefit/cost ratio and on how many of the usability problems were found in the test. Based on this theory, three people, one man and two women, were chosen for the test. The prototypes were used for a short usability test with the three students. In a session of 10 minutes, the participants were asked eight short questions/tasks:

• What does a line in the visualisation represent?
• Can you read the information from the line?
• What do you expect to happen after clicking a line?
• Click on a line. Is this different from your expectations?
• Look up the film Her.
• What do you expect to happen when changing the genre selection?
• Select the films with the highest Metascore.
• Is there any information missing for you?
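The reasoning behind testing with three to five participants can be made concrete with the model 1 - (1 - L)^n for the expected share of usability problems found by n participants, which is commonly attributed to Nielsen and Landauer. The sketch below is only an illustration: the per-participant detection rate L = 0.31 is an often-quoted average and is treated here as an assumption, not as a value measured in this study.

```python
def share_found(participants, detection_rate=0.31):
    """Expected share of usability problems found by a number of participants.

    Uses the model 1 - (1 - L)^n, where L is the chance that a single
    participant reveals a given problem. The default of 0.31 is an assumption.
    """
    return 1 - (1 - detection_rate) ** participants


# Expected share of problems found for one, three and five participants.
shares = {n: round(share_found(n), 2) for n in (1, 3, 5)}
```

Under this assumption, three participants are expected to reveal roughly two thirds of the problems, and each additional participant adds less than the previous one.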

In these usability tests the interactions that would be implemented in the visualisation tool were tested. This way the interactions were tested early in the process of making the visualisation and could be improved. The goal of the tests was to find any interaction or usability problems, so they would not influence the evaluation of the end product. The tests uncovered a few problems. When clicking a line in the prototype, all users expected to get an overview of the exact numbers where the line crosses the y-axes. The numbers only appeared next to the line in the graph, which was not very clear. Two users missed the ’x’ button on the bottom-left to remove the information about the selected film. Another point of feedback was the amount of information that appeared after clicking a line. According to two of the users there was a little too much information all at once. They said it would be better to show fewer names. The representation of a film as a single line in the parallel coordinates graph was clear to all users and they were able to read the graph without much effort. One of the users immediately linked all the variables together and formed the hypothesis that the graph would probably look very different if only films in the genre romance were selected. Prototype 2 was then used to test the brushing functionality. Users were asked to select the films with the highest Metascore. They all succeeded quickly and thought the brushing effect was a good feature. The only problem here was the overlapping of the lines. Because the lines all had exactly the same colour, it was hard to distinguish a single line. The lines were also too thin to be clicked on.

3.2.3 The visualisation in D3

After the prototype testing, the visualisation was built using D3.js⁸ (Figures 8-11; see Appendices A-D for larger versions). Based on the prototype testing, a few things were changed. The ’x’ button was removed; in the final visualisation the user presses the escape key to deselect lines and hide the extra information. The information about the selected film that appeared at the bottom left was also altered: the exact numbers where the line crosses the axes are now shown and only the directors, writers and cast are listed. The large number of overlapping lines was a problem in the prototype, so the lines were given various colours, based on the first listed genre of the film. The last problem that arose from testing was the difficulty of selecting a line. The lines were made thicker, so that clicking a single line became easier.

Figure 8 shows the whole visualisation. The blocks that were defined in Figure 5 remain the same here. The user can brush along a y-axis to make a selection. Figure 9 shows a selection of films with a score of 80 or higher on Metacritic. In block 1, the introduction block, the counter shows how many films are selected. In Figure 10 all films in the genre Action are shown and the film Inception is selected. Information about the film appears in block 3 and the selected line is thicker and brighter than the other lines so it can be seen clearly. Finally, Figure 11 shows the use of the search bar. The film Her is entered and the line of the film is automatically selected. The information about the film also appears automatically.
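The brushing, counting and colouring behaviour described above can be summarised in a short sketch. The following Python fragment is not the D3 code of the visualisation; it is a minimal illustration that assumes hypothetical field names (`metascore`, `genres`), a made-up colour palette and two invented sample records. A brush is modelled as a closed range on an axis, a film stays selected only if it falls inside every active brush, and the line colour is taken from the first listed genre.

```python
from typing import Dict, List, Tuple

Film = Dict[str, object]
Brushes = Dict[str, Tuple[float, float]]  # axis name -> (low, high)

# Hypothetical palette; the colours used in the thesis are not specified.
GENRE_COLOURS = {"Action": "#d62728", "Drama": "#1f77b4", "Comedy": "#2ca02c"}
DEFAULT_COLOUR = "#7f7f7f"


def brushed(films: List[Film], brushes: Brushes) -> List[Film]:
    """Return the films that lie inside every active brush range."""
    return [
        film for film in films
        if all(low <= film[axis] <= high for axis, (low, high) in brushes.items())
    ]


def line_colour(film: Film) -> str:
    """Colour a line by the first listed genre of the film."""
    genres = film["genres"]
    if not genres:
        return DEFAULT_COLOUR
    return GENRE_COLOURS.get(genres[0], DEFAULT_COLOUR)


# Invented sample records, for illustration only.
films = [
    {"title": "Film A", "metascore": 85, "genres": ["Drama", "Romance"]},
    {"title": "Film B", "metascore": 62, "genres": ["Action", "Drama"]},
]
selection = brushed(films, {"metascore": (80, 100)})
counter = len(selection)  # the number the counter in block 1 would display
```

With no active brushes every film remains selected, which corresponds to the start screen; adding a second brush on another axis narrows the selection further.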

⁸ Gender inequality in the film industry. Available at https://sophieensing.github.io/

Figure 8: Start screen

Figure 9: Brushing function

Figure 10: Genre and line selection

Figure 11: Search function and line selection

4 Evaluation

To test the effectiveness of the visualisation, it needs to be evaluated. Usability testing is an approach to evaluating a system (Rogers et al., 2011). Characteristic of this approach is a controlled setting, where the evaluator controls the test and gives tasks to the participant. For the evaluation, a qualitative testing method was used with three participants from the study fields of Media & Culture and Gender Studies. The number of participants was again based on the theory of Nielsen and Landauer (1993). Ten tasks (described in section 4.1), based on a taxonomy of tasks for evaluation by Valiati et al. (2006), were used to evaluate the visualisation. Both the usability of the visualisation and the stimulation were measured with these tasks. If one of the tasks cannot be completed, this is a usability problem. The stimulation was measured by the reactions of the participants to the visualisation and the way the tasks were carried out.

The participants were asked to think aloud while carrying out the tasks. The screen and audio were recorded. The screen recording was used to match what the participants said on the audio recording to what they were doing on screen. The think-aloud process was transcribed and then coded for analysis.

4.1 Tasks

To evaluate a multidimensional visualisation technique such as parallel coordinates, an approach introduced by Valiati et al. (2006) was used. Their paper proposes a taxonomy of visualisation tasks for evaluating a visualisation. This taxonomy was used to create specific tasks for evaluating the effectiveness of the visualisation in this study. The taxonomy consists of seven task types: identify, determine, visualise, compare, infer, configure and locate. Table 1 lists each task type with an explanation.

Task type | Explanation
Identify | A task with the goal of finding, discovering or estimating some new information regarding the data.
Determine | A task of calculation, defining or precisely indicating a value.
Visualise | A task where the user manipulates parameters to change the visualisation.
Compare | An analytic task where the user compares data items, values or clusters in the visualisation.
Infer | The task of defining hypotheses, trends or characteristics of cause and effect. Does not happen at once, but is part of the process. The task ends when the user gains insight.
Configure | A task of user interaction with the system to select the desired options for the visualisation, like filtering and re-ordering dimensions.
Locate | A task related to information already visualised, identified or determined. The particular task is spotting the precise position of the desired information.

Table 1: Seven tasks

Most task types are combined with another task type when a particular task is carried out. Very often one is a sub-task of the other, or several tasks need to be performed in sequence to get the desired result. Two of the task types, visualise and configure, support the other analytical tasks. They are not given a particular task in the evaluation because they are sub-tasks of all the other tasks.

Two tasks were formulated for each of the task types identify, locate, determine, compare and infer (Table 2). The tasks do not tell the participants how they should get to the result or which parameters should be taken into account. This gives the participants more freedom. Analysing the think-aloud process makes it possible to uncover whether the visualisation is stimulating for the participants.

Tasks 1 and 2 are about identifying relationships and which variables influence each other. The identify tasks force the participant to think about what a good gender representation in the film industry is and which variables play a role here. It is up to the user to form an opinion on this and then apply it to the visualisation. Tasks 3 and 4 are about locating particular data points. Task 3b is more analytic, because the participant has to interpret the data and describe what they see. The determine tasks (5 and 6) are about indicating a value with precision. The compare tasks (7 and 8) let the participant compare different categories by filtering the data. The answers here depend on the interpretation of the participants. The last two tasks are about forming a hypothesis and testing it with the data in the visualisation. It is important that the hypothesis is formed first with task 9a and then tested with task 9b. Task 10 is used to uncover any insights gained by the participants and to ask for their opinion.

Task type | Task/question
Identify | 1. Find the films with a good gender representation.
Identify | 2. Identify the relationship between a female director and box office.
Locate | 3a. Find the film Interstellar.
Locate | 3b. Describe the gender representation in the film.
Locate | 4. Locate the highest rated films directed by women.
Determine | 5. How many films have a score of 100 on Metacritic?
Determine | 6. How many films have a bad gender representation?
Compare | 7a. Describe the films in the genre action.
Compare | 7b. Describe the films in the genre drama.
Compare | 8. Compare the films in the genres action and drama.
Infer | 9a. Do you expect a connection between gender representation and the rating of a film? If yes, what connection?
Infer | 9b. Research this hypothesis with the data.
Infer | 10. What is your conclusion about gender inequality in the film industry? Is this different from what you thought before?

Table 2: Tasks for evaluation by task type

4.2 Participants

The three participants in the evaluation were different from the participants in the prototype tests, but they had a similar background. Participant 1 was a woman with a bachelor’s and a master’s degree in Media & Culture. Participant 2 was a woman with a bachelor’s degree in Media & Culture and a master’s degree in Gender Studies. Participant 3 was a woman with two bachelor’s degrees, in Media & Culture and in Information Studies.

4.3 Experiment set-up

The first two tests were carried out at a quiet coffeehouse and the third was carried out in the university library. At the beginning of the test the participants were given a short introduction to the subject and to the procedure of the test. I explained the think-aloud method and asked them for permission to record the screen and audio. It was also stressed that the participants should not feel any pressure, because the purpose was not to test them but to evaluate the visualisation. After the introduction I started the recording and gave them the tasks.

4.4 Results/Discussion

The audio recordings of the tests were transcribed, and events were matched with the screen recordings when something was unclear. The transcriptions were then coded. Based on a first round of coding everything that was said, the following categorisation scheme was formed:

• Approach classifies the parts where the participants talk about their approach to carry out the task. This may contain what variables they use or want to manipulate or if they want to use the filter function.
• Expectations was used when the participant talked about their expectations. This could be a hypothesis or a reaction to the data showing something that was different from their expectations.
• Analysis of data was used when the participants described the data in the visualisation or the results they saw after manipulating the data.
• Conclusion describes the part where the participant gained insight or formed a conclusion.
• Stimulation was used to describe events where the participant was stimulated in any way by the visualisation. This happened mainly in two ways. The participant either got off track during the task and wanted to discover other parts of the data or they had a strong reaction to what they were seeing. Both of these events were categorised as stimulating, because the visualisation had an effect on the participant.
• Usability described everything regarding the usability of the visualisation. The statements in this category were about improvements as well as things that were appreciated.

The results are discussed per task type in the following subsections. From this point on, the participants are referred to as P1, P2 and P3.

4.4.1 Identify tasks

The first task was to find films with a good gender representation. All participants completed the task in a slightly different way, because they had different views of what a good gender representation is.

P1 made a selection of the films with 80% or more female cast, directors, writers, producers and words. She expected to see a list of the remaining films, but could only see that 28 films were selected. She looked at some of the films individually and then changed her mind about the selection she had made. She changed it to include all films with 50% or more female cast, directors, writers, producers and words. After looking at a few films, she concluded that films within these margins had a good gender representation.

P2 looked at the same variables as P1, but made a different selection. First she removed the genres action, crime, horror and sport from the dataset, because her assumption was that these films do not have a good gender representation. Then she selected a margin of 50% for all variables. Only two films remained with this selection. She reacted strongly to this and wanted to look at the films more closely. She clicked the lines and analysed the extra content. Both films were familiar to her and she said they had a good gender representation. She did not find it surprising that both films were of the genres comedy and drama, because in her view these genres offer quite good gender representation.

P3 only made a selection on the percentage of female cast, because according to her this best showed the gender representation. She selected a high margin of 80% or more, like P1. She concluded these films had a good representation of gender.

All participants were able to make a selection of films that had a good gender representation according to their own views. They were able to manipulate the variables they thought were important and they managed to do this without exact instructions. They all started with a hypothesis of what they thought was a good gender representation. P1 uncovered a usability problem, because she was not able to see at once which films were selected. During this task both P1 and P2 were stimulated. P1 changed her approach in reaction to the data. P2 was stimulated because she analysed the films on her own initiative, which was not necessary for the task. It is important to mention that P2 was the only one to remove certain genres from the visualisation, because her assumption was that they would not have a good gender representation.

The second task was to identify a relationship between a female director and the box office of the films. All three participants were unsure what to look for and took a different approach to identifying the relationship.

P1 started by looking at a few individual films with the highest box office. The five highest films were directed by men. She then wanted to see films with a low box office. She observed that most of these films were also made by men, and found it hard to draw a conclusion from this data. Her conclusion was that male directors have made a lot of films with a low box office, but are also responsible for the most successful films.

P2 made a selection of all films directed by a woman. She expressed that she enjoyed using the visualisation. She immediately concluded that most of the selected films had a low box office. A few lines with a very high box office caught her attention and she looked at the highest one, which did not have a female director. She was not pleased with the results and had a strong reaction. She then looked, per genre, at the films with the highest box office that were directed by a woman. She expected to see only comedy and drama films; most films were indeed comedy or drama, but there were a few exceptions. While doing this she again articulated how shocking she found the results. Her conclusion was that films directed by women have a low box office. She then noticed another relationship, between female directors and other people behind the scenes: she concluded that female directors work a lot with female writers, but not necessarily with female producers.

P3 started with a selection on the box office axis. She moved this selection from top to bottom. As the selection went lower, more films appeared. She said that films with a low box office correlate with a higher percentage of female directors. She also noticed that the films with the highest box office are all directed by men. Her conclusion was that films directed by women have a lower box office.

P2 and P3 came to a similar conclusion. P1 approached the task more analytically and concluded that the films directed by men and the films directed by women were very hard to compare because of the large difference in the size of the groups.

4.4.2 Locate tasks

The goal of the locate tasks was to find specific data points in the visualisation. Task three was divided into two parts. First the participants had to find the film Interstellar⁹ and then they had to talk about the gender representation in this film.

P1 analysed all the different variables. There were no female directors or writers, but because Christopher Nolan was the only person in both roles she did not consider this a very big deal. There were 60% female producers, which she thought was very good. A female cast of only 20% was not that good, but the women did have 20% of the dialogue. This relationship was very positive according to P1, because in a lot of films women have less dialogue than men.

P2 had a similar method of analysis, but she was also able to incorporate her own analysis of the film. She was not surprised that the film was written by a man, because she already had that feeling while watching it. She was surprised by the number of female producers, but disappointed by the number of women in the cast. She concluded that the film does not offer a diverse representation of gender.

P3 also analysed the different variables, but instead of looking at the line like P1 and P2, she looked at the exact variable values in the extra information about the film. She was surprised by the numbers, even though she had seen the film. She said she did not like the low percentage of women in the film. The percentage of female producers, however, was a positive point.

All three participants were able to find the film without struggling. P3 again had a different way of analysing, because she did not only look at the line. P2 and P3 were able to match the data to their own opinion of the film and they were stimulated by the data.

⁹ Interstellar (2014): http://www.imdb.com/title/tt0816692

Task four was to locate the highest rated films directed by a woman. All three participants first selected the films by a female director and then looked at the films with the highest IMDb and/or Metacritic rating. It is interesting to mention that both P2 and P3 were very surprised to see that the highest rated film was The Matrix¹⁰, because they did not think it was directed by a woman. The Matrix was directed by the Wachowskis, two siblings who were credited as men when the film was released. After The Matrix, the Wachowskis announced publicly that they identify as women.

4.4.3 Determine tasks

Task five was easy for all participants. They had to select the films with a score of 100 from the critics on Metacritic. They all managed to make this selection and give the right answer: two films, Boyhood¹¹ and The Godfather¹².

There was more variation with task six. The task was to find how many films have a bad gender representation. The answer depended on what the participant considered a bad gender representation. For P1, the films with a margin of 0-30% female cast, directors, producers, writers and words had a bad gender representation. This resulted in 363 films. She then lowered the upper limit to 20%, but concluded that 30% was a better limit. P2 and P3 only looked at the percentage of female cast. P2 chose 25% as a limit and P3 chose 10%. They were both shocked at the number of films that remained: 470 films with a maximum of 25% female cast and 94 films with a limit of 10%. P2 also looked at the 10% margin, and P3 also looked at films with no women at all in the cast. P3 reacted quite strongly to seeing two films that fit this condition. P2 and P3 both concluded that a lot of films have a bad gender representation.

The difference between P1 and the other two participants was quite interesting. P1 used different variables and did not react as strongly to the number of films with a bad gender representation as P2 and P3. P2 and P3 both looked at the representation on screen and were surprised by the results.
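The differing counts follow directly from the differing definitions. As a minimal sketch (not the actual code of the visualisation), the Python function below counts the films for which every chosen variable is at or below a limit. The field names and the three sample records are hypothetical and only serve to show how the choice of variables and the limit change the outcome.

```python
from typing import Dict, Iterable, List


def count_at_or_below(films: List[Dict[str, float]],
                      variables: Iterable[str],
                      limit: float) -> int:
    """Count films whose value is <= limit on every listed variable."""
    names = list(variables)
    return sum(
        1 for film in films
        if all(film[name] <= limit for name in names)
    )


# Invented percentages of women per film, for illustration only.
films = [
    {"cast": 8.0, "directors": 0.0, "writers": 0.0, "producers": 40.0, "words": 5.0},
    {"cast": 22.0, "directors": 0.0, "writers": 25.0, "producers": 20.0, "words": 18.0},
    {"cast": 55.0, "directors": 100.0, "writers": 50.0, "producers": 60.0, "words": 52.0},
]
all_variables = ["cast", "directors", "writers", "producers", "words"]

p1_style = count_at_or_below(films, all_variables, 30.0)  # all five variables
p2_style = count_at_or_below(films, ["cast"], 25.0)       # cast only
p3_style = count_at_or_below(films, ["cast"], 10.0)       # cast only, stricter
```

In this invented sample, requiring all five variables to be low excludes a film that has few women on screen but many female producers, which is one reason why a five-variable definition and a cast-only definition can produce very different counts.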

¹⁰ The Matrix (1999): http://www.imdb.com/title/tt0133093
¹¹ Boyhood (2014): http://www.imdb.com/title/tt1065073
¹² The Godfather (1972): http://www.imdb.com/title/tt0068646

4.4.4 Compare tasks

Task seven consisted of two sub-tasks: the participants were asked to look at the genre action and at the genre drama. Task eight was to compare the two genres.

P1 started by looking at the box office of the action films. She concluded that most films do not have a really high box office, and a single film caught her attention. This film had a very low box office and she was curious to see which film it was. She then analysed drama and looked at the different variables. Her overall conclusion when comparing both genres was that the IMDb and Metacritic ratings were higher for drama films and that drama had more female directors and writers than action. While comparing, she looked at both genres separately.

P2 analysed action per variable and concluded that there were not a lot of women working on these films. She stated this was as she expected, and she was even surprised to see a few female directors, even though one of the films was The Matrix. She found the drama films harder to analyse, because there was a lot of fluctuation in the variables. She was surprised that even among drama films there were still more male directors. She had assumed that drama would be more female-driven because it is usually seen as a more feminine genre, so the number of female directors did not really match her expectations. While comparing the genres she also looked at them separately. She concluded that the action films follow a certain pattern and the drama films do not. Action matched her expectations, but she was surprised by the drama films.

P3 also analysed the genres per variable. She was surprised to see only two female directors in the genre action, but she also concluded it was typical. She thought the female representation was quite low. In the genre drama she thought the representation was better, because there were more women involved in making the films and there was more female dialogue. While comparing the genres she looked at them separately, but she also analysed them together in the visualisation. Her conclusion was that action had a lot less female dialogue and female representation overall.

With these tasks it was interesting to see how the participants were surprised or not surprised by the results in the visualisation, depending on their own expectations or assumptions. P3 was the only one to look at both genres together in the visualisation, while P1 and P2 preferred to look at them separately. They were all able to compare groups of films and analyse several variables simultaneously.

4.4.5 Infer tasks

The infer tasks were used to let the participants form a hypothesis and test it with the visualisation. They were asked whether they thought there was a connection between gender representation and the rating of a film. Then they were asked to test their hypothesis.

P1 thought there would not be a connection. To test her hypothesis she looked at the cast first. She concluded hesitantly that films with a mainly male cast had a higher rating than films with a mainly female cast. She did not really want to make this statement, because there were far fewer films with a predominantly female cast. Her analysis was similar for the directors. She saw higher ratings with male directors. She was careful in drawing a conclusion because there were only 58 films directed by women and more than 1200 directed by men. Her overall conclusion was that she saw some effect, but could not really conclude that there was a strong relationship because of the different sizes of the groups she was comparing.

P2 expected there to be a connection. She thought that a film directed by a woman would probably be looked at in a different way, but she was not sure how this would affect the ratings. To test her hypothesis she looked at the films directed by women. She also selected a high margin of female producers. Only five films remained. The films were rated between 6 and 7 on IMDb, which was not too bad according to P2. She then looked at the same margins for male directors and producers. A lot more films appeared, and the ratings fluctuated a lot. This made sense because of the different sizes of the two groups. She then looked at films with an equal male/female ratio. Only nine films met the criteria. These films all had average to good ratings. She was again very disappointed to see that there were very few films mainly made by women, but a lot of films mainly made by men. Her conclusion was that there were a few relationships, but that it was too hard to compare because of the different sizes of the groups.

P3 did not expect a connection. She had a similar approach to P1. She was quite surprised to see there were no films that had 90% or more female directors, producers, cast, writers and words. Her conclusion was similar to those of P1 and P2.

With this task all three participants formed their own hypothesis and took a slightly different approach to testing it. They carried out their own small piece of research and used analytical thinking to come to a well-considered conclusion. The groups were hard to compare because there are a lot more films created by men than films created by women. They all saw some connections and relationships, but were cautious about drawing any conclusions.
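The participants' caution can be made concrete with a small calculation. This is an illustrative sketch and not part of the analysis in this study: it assumes, purely for the sake of the example, that the ratings in both groups have the same spread, and it uses the group sizes mentioned above (58 films directed by women, with 1200 taken as a lower bound for the films directed by men). Under that assumption the uncertainty of a group's average rating shrinks with the square root of the group size.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of a group mean: spread divided by sqrt(group size)."""
    return std_dev / math.sqrt(n)


ASSUMED_STD_DEV = 1.0   # hypothetical spread of ratings, equal in both groups
FILMS_BY_WOMEN = 58     # group size mentioned by P1
FILMS_BY_MEN = 1200     # "more than 1200" in the text; used as a lower bound

ratio = (standard_error(ASSUMED_STD_DEV, FILMS_BY_WOMEN)
         / standard_error(ASSUMED_STD_DEV, FILMS_BY_MEN))
# ratio equals sqrt(1200 / 58): under the equal-spread assumption, the mean
# of the smaller group is between four and five times less certain.
```

A difference in average rating between the two groups therefore has to be read against the much larger uncertainty of the smaller group, which matches the hesitation the participants expressed.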

The last task was to form a conclusion on gender (in)equality in the film industry based on the visualisation. The participants were also asked whether this conclusion was different from what they thought before using the visualisation. All three participants said that what they had seen matched their expectations, with a few exceptions. P1 mentioned it was interesting to see that a lot of women were writing drama films. P2 was quite surprised by the genres comedy and drama, because her expectation was that these genres were dominated by women behind the scenes and on screen. She also said she really liked using the visualisation and thought it was very clear. She thought it would work really well for people in the field of gender studies and people who have some knowledge of the topic. Her conclusion was that even if you know something about the inequality, this visualisation shows how bad it really is. P3 said she was surprised and disappointed to see the differences between the genres and that a lot of films do not have a good gender representation.

4.5 Limitations

There are a few limitations to keep in mind with this research. First of all, there were only three participants each for the prototype testing and for the evaluation of the visualisation. Even though three people is in line with the theory of Nielsen and Landauer (1993), the visualisation should be tested with more people to draw firmer conclusions. It could also be beneficial to test the visualisation with a more diverse group of people. The people who took part in the evaluation were all of a similar age, background and gender. All three participants were women. In the study fields of Media & Culture and Gender Studies there are a lot more women than men, and unfortunately no men were available for the tests.

Another limitation is the data set that was used for the visualisation. There could have been mistakes in the classification of the genders, which would mean that the percentages are not always accurate. These mistakes probably do not affect the visualisation as a whole, but are important to mention.

The genres in the data set were extracted from IMDb. All films have one to three genres, so a film could be both drama and action. In the visualisation a film like this appears in both categories. The problem this could cause is that the categories are hard to compare, because a category may contain a lot of films that do not really represent the genre. A ranking system for the genres could be a way to fix this, where the most representative genre of the film would get the highest ranking. The colour of the line of the film would then be the colour of the most representative genre.
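One way to picture the proposed ranking is sketched below. This is a hypothetical illustration, since the study does not define how representativeness would be measured: it assumes that each genre of a film carries a relevance score (invented here), ranks the genres by that score, and assigns the film to its top-ranked genre only, both for filtering and for the line colour. The palette is also made up.

```python
from typing import Dict, List

# Hypothetical palette and relevance scores; neither comes from the data set.
GENRE_COLOURS = {"Action": "#d62728", "Drama": "#1f77b4", "Comedy": "#2ca02c"}


def ranked_genres(relevance: Dict[str, float]) -> List[str]:
    """Order a film's genres from most to least representative."""
    return sorted(relevance, key=lambda genre: relevance[genre], reverse=True)


def primary_genre(relevance: Dict[str, float]) -> str:
    """Return the most representative genre of a film."""
    return ranked_genres(relevance)[0]


def films_in_genre(films: List[dict], genre: str) -> List[str]:
    """List the films whose primary genre matches, so each film appears once."""
    return [f["title"] for f in films if primary_genre(f["relevance"]) == genre]


# Invented sample records, for illustration only.
films = [
    {"title": "Film A", "relevance": {"Drama": 0.7, "Action": 0.3}},
    {"title": "Film B", "relevance": {"Action": 0.8, "Drama": 0.1, "Comedy": 0.1}},
]
colour_of_a = GENRE_COLOURS[primary_genre(films[0]["relevance"])]
```

With this scheme a film that lists both drama and action would appear in only one category, which makes the categories disjoint and easier to compare; the open question remains where a trustworthy relevance score would come from.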

5 Conclusion

The question to be answered in this research is: What is a stimulating way to visualise the multifaceted gender inequality in the Hollywood film industry? Chapter 2 started with a literature review of gender inequality in the film industry. The research on this topic determined the scope of the data to be visualised. The studies reviewed showed where inequality plays a part and which variables influence each other.

To help answer the main question, two sub-questions were formulated, both of which were answered in chapter 2. The first question was: What is the purpose of interaction in a visualisation? The conclusion was that interaction is necessary, because a visualisation with interaction has more analytic value than one without. Interaction also provides users with insight and makes a visualisation enjoyable to use. The second question was: What is the best visualisation technique for complex data? The visualisation technique that seemed most suitable for the data set was parallel coordinates. This technique allows many variables to be combined in one visualisation. With the right interactions, this technique enables the user to perform the tasks formulated by Shneiderman.

Based on chapter 2, the parallel coordinates technique would be a stimulating way to visualise the multifaceted gender inequality in the Hollywood film industry. To test this theory, the visualisation was evaluated. The results of the evaluation showed a few important findings. The most important conclusion was that all participants were able to carry out the tasks successfully.

While carrying out the tasks, all participants followed a certain pattern. This pattern consisted of three steps, with a few variations. The first step was to think of what approach they were going to use. This could be about the variables they wanted to include or how to get the data they needed. Their approach could have been influenced by the available parameters in the visualisation, but the parameters were not restrictive. The participants often had a different approach and used different parameters to carry out a task, which shows they were not restricted. This aspect of the visualisation also has a negative side, because it can support a bias of the user. P2 had a feminist point of view and with some tasks she selected certain parameters to support her view. In one of the tasks, for example, she removed data from the visualisation based on an assumption she had. This supports her bias, because the removed data is not even shown in the visualisation. The second step was to analyse what they saw. All participants were able to interpret the data in the visualisation. The third step was forming a conclusion. This was not the case for all tasks, because for some tasks they just had to look up specific data points in the visualisation. When they gained insight or saw connections between variables, they articulated this in the form of a conclusion.

There were exceptions to this pattern. The participants also talked about their expectations and the usability of the visualisation. Another exception was when they were stimulated by the visualisation. In some cases, the participants said what they expected to see. They either formed a hypothesis or were surprised by parts of the data, because their expectations were different.

There were a few small usability problems. Most of these problems occurred at the beginning of the evaluation. The visualisation offers a lot of options and at the beginning the participants had to get familiar with it. There was one usability problem that could be addressed in the visualisation: P1 expected to see the films that were selected by a brush in a list somewhere on the screen. This is easy to add to the existing visualisation.

Finally, all three participants were stimulated by the visualisation during the test. They reacted to the visualisation either verbally or with their actions. The verbal reactions were mostly of surprise or disappointment about the inequality. These are signs of stimulation because the participants had strong reactions to what they saw on screen. The actions that showed they were stimulated were actions they were not asked to do. This happened when something in the visualisation that was not necessarily related to the task caught their attention and they took a closer look at it.

The results of this study indicate that the parallel coordinates technique is indeed a stimulating way to visualise the gender inequality in the Hollywood film industry, even though there might be other fitting techniques. The participants were able to carry out important tasks and were stimulated by the visualisation. They could also apply their own views to the data with individual settings and they were able to test their expectations. Finally, all participants thought the visualisation was able to show the gender inequality and, according to P2, how bad the situation really is.

5.1 Future Work

There are two directions for future research: the visualisation could be used by researchers, for example in the study field of Gender Studies, or the visualisation itself could be improved.

The visualisation could bring to light connections that could be researched further by people in the study field of Gender Studies. The variables are only represented by numbers in this visualisation; the amount of female dialogue and the female cast are examples of this. It would be very interesting to look at the dialogue semantically and analyse how women are actually represented on screen.

There are also several things that could be done with the visualisation itself. The visualisation takes a fixed set of variables into account. It would be interesting to research an adaptable version of the visualisation, in which the user can choose which variables are shown. As mentioned in the limitations section, genre is one of the variables that could be used in several ways. Genre is currently the only categorical variable used for filtering, but more variables are suitable for this. Several versions of the visualisation should be developed and tested to find an optimal visualisation of the gender inequality in the Hollywood film industry.

References

Agarwal, A., Zheng, J., Kamath, S., Balasubramanian, S., and Dey, S. A. (2015). Key female characters in film have more to talk about besides men: Automating the Bechdel test. In HLT-NAACL, pages 830–840.

Anderson, H. and Daniels, M. (2016). The largest analysis of film dialogue by gender, ever. Retrieved June 2, 2017, from https://pudding.cool/2017/03/film-dialogue/.

Few, S. (2006). Multivariate analysis using parallel coordinates. Perceptual Edge, pages 1–9.

Heer, J., Bostock, M., and Ogievetsky, V. (2010). A tour through the visualization zoo. Commun. ACM, 53(6):59–67.

Hickey, W. (2014). The dollar-and-cents case against Hollywood’s exclusion of women. FiveThirtyEight. Retrieved April, 1:2014.

Inselberg, A. and Dimsdale, B. (1990). Parallel coordinates: a tool for visualizing multi-dimensional geometry. In Proceedings of the 1st conference on Visualization’90, pages 361–378. IEEE Computer Society Press.

Lauzen, M. M. (2016). The celluloid ceiling: Behind-the-scenes employment of women on the top 100, 250, and 500 films of 2015. Center for the Study of Women in Television and Film. Retrieved from http://womenintvfilm.sdsu.edu/files/2015_Celluloid_Ceiling_Report.pdf.

Lindner, A. M., Lindquist, M., and Arnold, J. (2015). Million dollar maybe? The effect of female presence in movies on box office returns. Sociological Inquiry, 85(3):407–428.

Lindner, A. M. and Schulting, Z. (2017). How movies with a female presence fare with critics. Retrieved June 17, 2017, from https://osf.io/preprints/socarxiv/dzcq2.

Narayanan, S. and Heldman, C. (2017). The Reel Truth: Women Aren’t Seen or Heard. Geena Davis Institute on Gender in Media. Retrieved May 1, 2017, from https://seejane.org/research-informs-empowers/data/.

New York Film Academy (2013). Gender inequality in film - an infographic. Retrieved May 1, 2017, from https://www.nyfa.edu/film-school-blog/gender-inequality-in-film/.

Nielsen, J. and Landauer, T. K. (1993). A mathematical model of the finding of usability problems. In Proceedings of the INTERACT ’93 and CHI ’93 Conference on Human Factors in Computing Systems, CHI ’93, pages 206–213, New York, NY, USA. ACM.

Rogers, Y., Sharp, H., and Preece, J. (2011). Interaction design: beyond human-computer interaction. John Wiley & Sons.

Shneiderman, B. (1996). The eyes have it: A task by data type taxonomy for information visualizations. In Visual Languages, 1996. Proceedings., IEEE Symposium on, pages 336–343. IEEE.

Siirtola, H. and Räihä, K.-J. (2006). Interacting with parallel coordinates. Interacting with Computers, 18(6):1278–1309.

Smith, S. L., Choueiti, M., and Pieper, K. (2016). Inequality in 800 Popular Films: Examining Portrayals of Gender, Race/Ethnicity, LGBT, and Disability from 2007-2015. Media, Diversity, & Social Change Initiative.

Valiati, E. R., Pimenta, M. S., and Freitas, C. M. (2006). A taxonomy of tasks for guiding the evaluation of multidimensional visualizations. In Proceedings of the 2006 AVI workshop on Beyond time and errors: novel evaluation methods for information visualization, pages 1–6. ACM.

Yi, J. S., Kang, Y.-a., and Stasko, J. T. (2007). Toward a deeper understanding of the role of interaction in information visualization. IEEE transactions on visualization and computer graphics, 13(6):1224–1231.

Yi, J. S., Kang, Y.-a., Stasko, J. T., and Jacko, J. A. (2008). Understanding and characterizing insights: how do people gain insights using information visualization? In Proceedings of the 2008 Workshop on BEyond time and errors: novel evaLuation methods for Information Visualization, page 4. ACM.

6 Appendix

Appendix A: Start screen

Appendix B: Brushing function

Appendix C: Genre and line selection

Appendix D: Search function and line selection