CMSC 838B Information Visualization

Visualizing Mailbox

Yoo Ah Kim, Min-ho Shin
Department of Computer Science, University of Maryland

Abstract

Electronic mail is one of the most popular computer applications. As the number of emails we exchange increases at a high rate, it becomes more and more important to manage a huge volume of electronic messages. In addition, email data patterns may give us useful information, including the user's personal history. We propose two visualizations of an email dataset: a time-based view and a thread-based view. The time-based view displays messages in a two-dimensional table in which rows are people and columns are received/sent times. To scale to a large volume of data, we use dynamic queries and zooming. The thread-based view shows emails that belong to the same thread. It shows all senders who participated in a thread and the messages in order of time, with the relations among those messages.

Keywords: Electronic mails, time-based view, thread-based view, scalability

Introduction

Nowadays electronic mail is one of the most popular computer applications. As the number of emails we exchange increases at a high rate, it becomes more and more important to manage a huge volume of electronic messages. Although email was invented for asynchronous communication, it is also used for other purposes such as task management and personal archiving. In addition, email data patterns may give us useful information, including the user's personal history. However, there is no proper visualization that serves these purposes.

In this paper, we propose two visualizations of an email dataset to help users perform these tasks: a time-based view and a thread-based view. The time-based view displays messages in a two-dimensional table in which rows are people and columns are received/sent times. To scale to a large volume of data, we use dynamic queries and zooming. It also has sort, filter, and aggregate functions to help users find the information they need. The thread-based view shows emails that belong to the same thread. Threads are created when users send mail using the "reply" menu. The thread-based view shows all senders who participated in a thread and the messages in order of time, with the relationships among those messages.

Design Goals

. View sent/received email patterns

With an email dataset, users may want to see mail patterns over time. Interesting questions are who sent the most emails in a certain period, or when a person sent emails most frequently. To see patterns in a large volume of messages, the scalability problem must be solved. We used dynamic queries, zooming, aggregation, filtering, and gradation to cope with this problem.

. Find people and emails related to each other

Emails can be threaded using "reply", and several users may participate in a mail thread. It would be useful to see all participating users, and who sent or received emails in the thread, together with the relations among them.

. Search information in the mailbox

Emails are used as personal archives for finding information in the future. Several studies [8] showed that semantic hierarchies using folders, currently the most predominant scheme, are not suitable for this task because it is difficult for users to organize mail folders properly and to figure out which mail folder holds the mail they need. Because people can easily recall the sender and the approximate sent/received time of a message, the time-based view can help users find a mail they need. The thread-based view also makes it easy to extract related information by providing all messages in the same thread.

Related Work

Timestore [1] [9] organizes messages by time and sender in a two-dimensional grid, as shown in Figure 1. Messages are displayed as dots whose size encodes the number of messages. It allows narrowing of the search space using full-text searching. The authors also merged it with a task and calendar management system. Timestore focused on time-based archiving and retrieval of emails.

Figure 1. Timestore Time-based Visualization

Outlook 2000 also has a time-based view (Figure 2). It displays all messages with their subjects at the received time, without aggregating by date or considering senders. Because it uses a fixed width for a day and shows every message with its subject, the view can become cluttered and hard to understand when there are many messages. When many emails arrive within a short period, it expands the y-axis to list them.

Figure 2. Outlook 2000

Threading is necessary to help manage conversation history and track the status of conversations in email [8]. Many systems have been developed to visualize conversations in chat programs and instant messaging services [2][3][4][5][7]. Netscan thread trees display conversation threads for newsgroups. But visualizing email threads is more difficult because both senders and receivers are important and, unlike in newsgroups, there are two kinds of messages: incoming and outgoing.

Figure 3. Netscan Thread Tree

Time-based Visualization

. Features

In this view, we display messages in a two-dimensional grid in which each row is the email address of a person and each column is a date, as shown in Figure 4. Each cell holds the messages that the corresponding person sent or received at the given time. We encode the number of messages as the height of a bar in a bar chart or as the gradation of a spot.

The first section shows the email addresses of people who sent or received mail. The second section shows the total number of mails the person sent or received, using a bar chart. Users can choose whether to see incoming mail, outgoing mail, or both.

Users can choose the date level (date, month, or year) by which messages are aggregated. When messages are aggregated by date, vertical lines appear at each week to help users see weekly patterns.

Rows can be sorted by email address, domain name, or message count. The view can filter people whose email address contains a certain substring; filtering by domain name is an especially interesting query. It is also possible to search messages by email address or subject.

. Scalability

- Bar chart vs. Gradation

To see the number of messages in each cell more accurately and compare it with others, a bar chart may be more helpful. But if there are many people on the screen and the period is very long, it is difficult to show the patterns using a bar chart. For the case of many people and a long period, we have another view using gradation. Each cell has a spot, and the gradation of the spot represents the number of messages. This view gives a good overview of messages in terms of people and date. While incoming and outgoing messages can be shown simultaneously in the bar chart using color coding, spots show only the total number of messages for the chosen option. Figure 5 shows the views using bar chart.

- Dynamic Query

To manage a large dataset, we also used the dynamic query method for people and date. This dynamically filters and zooms the range of data so that users can easily find the data they want to see (Figure 6). If users change a range, the data in the range is fitted to the screen and data outside the range is hidden. By moving the slider bar, users can see the hidden data, too. Labels such as addresses and dates adapt dynamically to the chosen range, displaying more detailed information as the view is zoomed in.

. Message Selection

When the mouse is placed over a cell, the cell's information (person and date) is shown. Users can see the details by clicking the right mouse button on the cell. A pop-up window appears with a list of the messages in the cell. Each message shows its subject and the number of messages in the thread to which it belongs. To see the thread view related to a message, users choose an individual message in the list. Figure 7 shows the pop-up window for message selection.

Thread-based Visualization

The thread view shows the relations among messages, as shown in Figure 8. For a chosen message, we find all messages that are related to it and display them with all the people who participated in the thread. The rows are people, and messages are listed in order of received/sent time. Note that, unlike newsgroup data, both senders and receivers are important.

We represent senders as big red rectangles and receivers as small blue circles. Arrows connect the sender and the receivers of the same mail. If a mail is a reply to another mail, another kind of link connects the two mails; these are the thick red lines in Figure 8. We divided the time axis by date to help users understand the time information of messages.

Problems in Visualization

For outgoing mail, the receivers matter because the sender is always the mailbox owner. A message may have more than one receiver, so the same message may appear several times in the time-based view. This may show more messages in the visualization than really exist. In some sense, however, we can think of it as several messages with the same contents being sent to different receivers.

Threads can be detected only if users write messages using "reply", which adds reply information to the email headers. But sometimes users send emails without using it even though they are replies to other mails. In this case, we would have to consider subjects, contents, and receiver/sender groups, but it is much more difficult to find the correct information this way. "Forward" information could also be useful for constructing threads, but it is not available in our implementation because it is not a part of the standard email headers.

If the same person uses several email addresses, we cannot detect it. In particular, if users are on a mailing list, we cannot determine this from the mailbox alone. In this case, users should be able to specify which email addresses actually belong to the same person and merge the data related to them.

Future Work

In our visualization, users can see data in many ways using filter, sort, search, etc. But they may also want to edit or annotate messages for future use. This function can be especially useful for an email dataset. For example, users may want to mark a message as needing a reply or as a reminder for a future task.

Search is currently available only for subject and sender/receivers. It would also be useful to search contents. Specifically, we might want to find a message that has a URL, an email address, or attached files.

In the time-based visualization, we can aggregate or filter people based on the domain name of their email addresses. Other aggregation and filtering can be done if we define groups of people in various ways. For example, we can make a group based on a thread, or users may define groups such as family, friends, and colleagues. More generally, it would be good to connect this visualization with databases that hold information about people, and to filter/aggregate people based on the database.

We can think of another useful view of emails: group-based visualization. Email exchange patterns give useful information about relations between people. We may group people based on how frequently they were in the same thread and visualize those groups as graphs.

Conclusion

We proposed two visualizations of an email dataset: a time-based view and a thread-based view. The time-based view displays messages in a two-dimensional table in which rows are people and columns are received/sent times, and each cell has a list of messages for the person and the time. To manage a large volume of data, we used dynamic queries, zooming, and gradation in this view. This view gives users the temporal email exchange patterns of their correspondents. The thread-based view shows emails exchanged using "reply". It displays all senders who participated in the thread and the messages in order of time, with the relations among those messages. This view is helpful for viewing the history and tracking the status of a conversation about the same topic.

Acknowledgements

We would like to thank Jihwang Yeo and Hyunmo Kang for their valuable comments.

References

[1] Baecker, R., Booth, K., Jovicic, S., McGrenere, J., Moore, G., "Reducing the Gap Between What Users Know and What They Need to Know"

[2] Donath, J., Karahalios, K., and Viegas, F., "Visualizing conversations", In Proceedings of HICSS 32, January 5-8, 1999

[3] Rodenstein, R. and Donath, J. S., "Talking in Circles: Designing A Spatially-Grounded AudioConferencing Environment", In Proceedings of CHI 2000, pp. 81-88

[4] Smith, M. A., Cadiz, J. J., and Burkhalter, B., "Conversation Trees and Threaded Chats", In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work

[5] Smith, M. A. and Fiore, A., "Visualization Components for Persistent Conversations", ACM SIG CHI 2001

[6] Shneiderman, B., "Dynamic Queries for Visual Information Seeking", IEEE Software, 11(6), 70-77

[7] Viegas, F. B. and Donath, J. S., "Chat Circles", In Proceedings of CHI '99, 1999

[8] Whittaker, S. and Sidner, C., "Email overload: exploring personal information management of email", In Proceedings of the Conference on Human Factors in Computing Systems '96

[9] Yiu, K., Baecker, R.M., Silver, N., and Long, B., "A Time-based Interface for Electronic Mail and Task Management," In Design of Computing Systems: Proceedings of HCI International '97, Volume 2, Elsevier, 1997, 19-22.
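
Illustrative sketch

As an illustration only, the two data structures behind the views can be sketched in a few lines of Python. This is not the implementation described in the paper. The record fields (`id`, `reply_to`, `sender`, `receivers`, `time`), the sample addresses, and the function names `time_grid` and `build_threads` are all assumptions made for this sketch; `reply_to` stands for reply information that has already been extracted from the email headers. The first function counts messages per cell (person, date level) as in the time-based view, including the repeated counting of multi-receiver outgoing mail noted under Problems in Visualization. The second groups messages into threads by following reply links.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical sample mailbox; field names and addresses are assumptions.
MESSAGES = [
    {"id": "<1>", "reply_to": None, "sender": "ann@umd.edu",
     "receivers": ["me@umd.edu"], "time": datetime(2001, 3, 5, 9, 0)},
    {"id": "<2>", "reply_to": "<1>", "sender": "me@umd.edu",
     "receivers": ["ann@umd.edu", "bob@cs.example"],
     "time": datetime(2001, 3, 5, 10, 0)},
    {"id": "<3>", "reply_to": "<2>", "sender": "bob@cs.example",
     "receivers": ["me@umd.edu"], "time": datetime(2001, 3, 6, 8, 30)},
    {"id": "<4>", "reply_to": None, "sender": "ann@umd.edu",
     "receivers": ["me@umd.edu"], "time": datetime(2001, 4, 1, 12, 0)},
]

# Date levels by which cells are aggregated.
LEVEL_FORMATS = {"date": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}


def time_grid(messages, owner, level="date"):
    """Count messages per (correspondent, time bucket) cell."""
    fmt = LEVEL_FORMATS[level]
    grid = Counter()
    for m in messages:
        bucket = m["time"].strftime(fmt)
        if m["sender"] == owner:
            # Outgoing mail: one cell per receiver, so a message with
            # several receivers is counted several times.
            people = m["receivers"]
        else:
            # Incoming mail: the cell belongs to the sender.
            people = [m["sender"]]
        for person in people:
            grid[(person, bucket)] += 1
    return grid


def build_threads(messages):
    """Group message ids into threads by following reply links."""
    by_id = {m["id"]: m for m in messages}

    def root(m):
        seen = set()  # guards against malformed reply cycles
        while m["reply_to"] in by_id and m["id"] not in seen:
            seen.add(m["id"])
            m = by_id[m["reply_to"]]
        return m["id"]

    threads = defaultdict(list)
    for m in sorted(messages, key=lambda m: m["time"]):
        threads[root(m)].append(m["id"])  # ids in order of time
    return dict(threads)


if __name__ == "__main__":
    print(time_grid(MESSAGES, "me@umd.edu", level="month"))
    print(build_threads(MESSAGES))
```

Messages whose reply information is missing end up as single-message threads here, which mirrors the limitation that replies written without "reply" cannot be detected from headers alone.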