A New Data Model: Persistent Attribute-Centric Objects*
Total Page:16
File Type:pdf, Size:1020Kb
A New Data Mo del PersistentAttributeCentric Ob jects Terry Jones Ricardo A BaezaYates Distributed Cognition HCI Lab Dept of Computer Science Cognitive Science Dept University of Chile Univ of California Blanco Encalada San Diego La Jolla Santiago Chile CA USA rbaezadccuchilecl terryclisucsdedu Gregory J E Raw lins Dept of Computer Science Indiana University Blo omington IN USA rawlinscsindianaedu June Abstract Trying to nd information on the Web is like trying to nd something at a jumble sale its fun and you can make serendipitous discoveries but for directed search its b etter to go to a department store there someone has already done most of the arranging for you Unfortunately the Webs continuing explosion in size its enormous diversity of topics and its great volatility make unaided human indexing imp ossible This problem is just a sp ecial case of the general problem of organizing information to create knowledge A similar problem arises on the desktop when dealing with le systems where users must searchby name and often they do not rememb er the les name or lo cation File names are artifacts of current op erating systems but human understanding neither requires ob jects to b e named nor do es it have problems with multiple ob jects sharing prop ertiesnames for instance The limitations mentioned ab ove result b ecause to days computer systems do not analyze the les they are asked to store Instead they note only simple attributes like creation date and le typ e and leave the bulk of organizing naming annotating and nding those les to their users This mayhave made sense in the s when computers were slow and exp ensive but it makes no sense to day We argue for a new data mo del to information representation based on the use of p ersis tent ob jects with dynamic attributes and search op erations over them This representation is The work of the rst author was supp orted by Chile Fondecyt Grant and the second author byDARPA Contract NC and by a grantfromIntel organizationneutral thereby giving a exible substrate for anyone to build multiple simultane ous organizations In addition it is uniform allows easy sharing of ob jects and can b e simply extended to the Web We present an initial prototyp e of this idea AVS and to showitspo tential two system prototyp es that have as their main goal to organize p ersonal information DomainView and KnownSpace Intro duction Storing and organizing information is the kernel of any computer application The widespread use and exp onential growth of the World Wide Web as well as other data sources has had a crucial impact on the problem of managing p ersonal information Currently the abstraction of les and folders is ubiquitous as the main way to organize and store do cuments This abstraction however arose when computer resources were exp ensive which is not true nowadays The real limitations of any information system are not storage space or computing p ower They are the narrow communication pip elines b etween the computer and the human user typically through a screen and b etween the users eyes and brain In that sense any information system should not dep end on the abilities of the user it should try to represent information as closely as p ossible to the users mo del of it and anyinternal representation should b e transparenttothe user However the paradigms that are used to day do not satisfy any of these goals Why One main reason is that the historical developmentofsoftware has forced the user to b e aware of many unnecessary things The question is how to organize p ersonal information and how to present it to the user in a meaningful way through a simple but eective user interface Any system for managing information implemented either on top of a le system or a database is limited by the underlying data mo del of the storage mechanism On the other hand if we write down as the goals of the system what functionality the underlying data mo del should provide we do not necessarily obtain a known data mo del We can think of information as organized data which has to b e comprehended and managed So there are two main problems to b e solved how to organize data and how to communicate to a p erson that data through a user interface In this pap er we present a new approach for the rst problem a new mo del of the storage representation and organization of information based on what we call persistent attributecentric objects PACO PACO is based on storing data in ob jects having attributes and values in a uniform way eachhaving a dynamic structure b eing able to organize ob jects in collections which are also dynamic and indep endentofthe storage mechanism through p owerful search capabilities We present our vision of how to manage information relative to our data mo del in That pap er also discusses a few user interfaces This pap er is organized as follows We rst present some of the motivations b ehind this pap er which are the basic functions needed to build ecient user interfaces for managing information together with a list of the main limitations of to days le systems and databases Next we present the main concepts b ehind the new data mo del and how they can b e used to manage information clearly separating the storage representation and organization layers of the data b eing handled We also compare our data mo del with previous work and discuss some of the main dierences and consequences of it Next we show a sp ecic generic prototyp e of this data mo del AVS the Attribute Value System which maintains a collection of ob jects comp osed solely of attributevalue pairs and which provides facilities for creating altering and lo cating these ob jects This simple substrate provides exibility in the representation of information emphasizes the role of search generalizes hierarchical le systems provides for the dynamic construction of arbitrary data structures and interob ject relationships emphasizes the distinction b etween informational ob jects and structures that contain and organize them facilitates multiple views of the same information and provides a novel form of ob ject ownership To show the p otential of our mo del we summarize the functionalityoftwo systems to manage p ersonal information DomainView and KnownSpace Although they were develop ed with out using the data mo del prop osed their essence is based on it In fact the main ideas of this data mo del were develop ed indep endently by the three authors and later were unied into a single view We nish by presenting work in progress of the authors and the consequences of our vision of organizing information with resp ect to programming user interfaces and the Web Since we change basic assumptions wehave to start from basic principles and levels whichmayseemnaive for some readers or very radical for others Motivation Organizing Information Web search engine queries now often return millions of irrelevant pages Those pages are not spatially arranged collected by topic or distinguished in anyway other than by their titles so wehave no idea of the relevance of any page b efore reading it The same is true for mail and news After wesavewebpages mail messages news articles ftp pages or any pages pro duced with an editor or any other application those pages then b ecome lost on our desktops They are not analyzed in anyway group ed according to our interests or laid out spatially to show their similarities to other pages already there Even after we organize the pages on our desktops by hand there is no automation to help us reorganize them search them navigate through them or nd more pages like them In short our computers dont help us manage our own data and as we discuss in the next section wemust do that task ourselves by organizing information in a le system Hierarchical le systems are valid as an internal storage mechanism for op erating systems but the user need not b e aware of them Many users do not even understand this concept and typically use applications that each put all their les in one directory a at hierarchy Not all information can b e classied as a tree there are many other ways to classify a group of ob jects and many hierarchical ways to classify the same information Let us start over without assuming anything ab out computer resources What mightbethe rightway to store and organize information Which functionality should a data mo del provide to b est help users manage their information Wegive one p ossible answer to this question To help make our claims concrete we present them in the context of aiding the developmentof two prototyp es that organize p ersonal information They are DomainView DV and KnownSpace KS DomainView is a desktop user interface that organizes information in domains chosen by the user using a uniform and simple interface The interface is based on retrieval of do cuments by attribute values including the content in a at and dynamic universe of domains The desktop is tailored to and by a user who creates his or her own do cumentdriven and knowledgedomain world KnownSpace is an adaptive visual and autonomous information manager of all of a users information whether that information is data or programs and whether it originates on the Web via mail news ftp an editor or any other application It is not a search engine browser desktop or op erating system although it shares elements of all four programs Both prototyp es manage information using collections of ob jects eachobjecthaving attributes with values In our systems ob jects are also called entities pages or do cuments and collections