202-11 Project 1
Gabriel Beeler, Eric Bryan, Louise Heim, Heather Fucinari
1. User Model-Statement of Purpose
Potential Users:
1) Halloween candy collectors who want to organize their collection and see what
they are missing.
2) Conscientious parents who have children with allergies would use this database to
screen for potentially harmful ingredients in their children’s Halloween candy. They
would most likely be interested in our filling field and also in the name of the
candy, so that a particular item could be avoided.
3) Party planners who want novelty candy to go with the theme of their Halloween
parties would be another target user.
4) Businesses and shops that are looking for something new and exciting to attract
customers to the candy section. They could match their stock records with the
database and find new themed candy items for the Halloween season.
Possible Queries:
1) Halloween collectors will search to see if they already have an item or to see if an
item exists. An example would be a search for Pez dispensers under
“brand” and “classification”. Keywords would be: Witch, Pez dispenser.
2) Conscientious parents would search under the “Candy Filling” field. Ex:
the keyword “peanut” for that specific allergen.
3) Party planners would search “Candy Character” for novelty characters to add that
special touch to their parties. Keywords would be “bugs” or “ghost”.
4) Businesses would be interested in searching by Candy “weight” for items in their
bulk bins and also by Candy “Name” and “Brand” to find new items to stock.
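The queries above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory list of records rather than the actual textbase, and the `search` helper is an illustrative name, not a feature of the database software. The field names and values are taken from three of our records.

```python
# Hypothetical in-memory stand-in for three records; field names follow the textbase.
records = [
    {"Candy Name": "Pez", "Candy Character": ["witch"],
     "Candy Packaging": ["dispenser"], "Candy Filling": ["none"]},
    {"Candy Name": "Snickers", "Candy Character": ["pumpkin"],
     "Candy Packaging": ["foil"], "Candy Filling": ["peanut", "caramel"]},
    {"Candy Name": "Peanut butter ghost", "Candy Character": ["ghost"],
     "Candy Packaging": ["foil"], "Candy Filling": ["peanut butter"]},
]

def search(records, field, keyword):
    """Return the names of records whose `field` contains `keyword` as a whole word."""
    keyword = keyword.lower()
    hits = []
    for rec in records:
        # Split every entry of the field into lowercase words (a rough word index).
        words = {w for entry in rec[field] for w in entry.lower().split()}
        if keyword in words:
            hits.append(rec["Candy Name"])
    return hits

# A parent screening for a peanut allergen:
peanut_hits = search(records, "Candy Filling", "peanut")
# A collector looking for witch-themed items:
witch_hits = search(records, "Candy Character", "witch")
```

Under these assumptions, the parent's query returns both Snickers and the Peanut butter ghost, because the word "peanut" appears in each filling entry.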
2. Description of database
Based on our user model, the group decided to treat each individual candy as an item.
Parents, for example, need a query on the keyword “peanuts” to return an individual brand name such as
“Snickers” so that they can avoid those candies. The decision to record specific characteristics was also made with the user groups of candy collectors and party planners in mind. These user groups often search on a theme. Perhaps they have a penchant for ghosts. Such a query will return items packaged in containers with the likeness of a ghost, or candies in the shape of a ghost.
Although a query may return similar items, a user can easily differentiate the candies by a number of fields. Perhaps they want a ghost, but not chocolate. The “Candy
Character” field designates the item as a ghost and the “Candy Packaging” field defines it as a container, while the “Candy Type” field designates the actual candy as gum. Some of these fields use predefined lists to prompt the indexer to enter an acceptable term such as
“cellophane” instead of “plastic”, which can be broad.
Our team strove for the best combination of variables and indexing to make this database both useful for our intended user model and well defined for the potential database builder.
3. Data Structure
Textbase Structure
Textbase: C:\Documents and Settings\gabriel\Desktop\202\202 project\candy\Candy
Created: 9/24/2006 1:47:46 PM Modified: 9/27/2006 7:12:59 PM
Field Summary:
1. Candy Number: Automatic Number(next avail=11, increm=1), Term
2. Candy Brand: Text, Term & Word
Validation: required
3. Candy Name: Text, Term & Word
Validation: required
4. Candy Character: Text, Term & Word
Validation: required
5. Candy Type: Text, Term & Word
Validation: required
6. Candy Packaging: Text, Term & Word
Validation: required, valid-list
7. Candy Weight - grams: Number, Term
8. Candy Flavor: Text, Term & Word
Validation: required
9. Candy Hardness: Text, Term
Validation: valid-list
10. Candy Filling: Text, Term & Word
Validation: required
Log file enabled, showing 'Candy Number'
Leading articles: a an the
Stop words: a an and by for from in of the to
Textbase Defaults:
Default indexing mode: SHARED IMMEDIATE
Default sort order:
Textbase passwords:
Master password = ''
0 Access passwords:
No Silent password
Validation list for Candy Packaging: cellophane dispenser foil other
Validation list for Candy Hardness: hard medium soft
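As an illustration only, the validation settings in this structure can be mirrored in a short sketch. The required fields and the two validation lists come from the field summary above; the `validate` function, the dictionary representation of a record, and the case-insensitive comparison are assumptions made for the example and do not describe how the textbase software works internally.

```python
# Required fields and validation lists as given in the field summary.
REQUIRED = ["Candy Brand", "Candy Name", "Candy Character", "Candy Type",
            "Candy Packaging", "Candy Flavor", "Candy Filling"]
VALID_LISTS = {
    "Candy Packaging": {"cellophane", "dispenser", "foil", "other"},
    "Candy Hardness": {"hard", "medium", "soft"},
}

def validate(record):
    """Return a list of problems found in `record`; an empty list means it passes."""
    problems = []
    for field in REQUIRED:
        if not str(record.get(field, "")).strip():
            problems.append(f"{field}: required")
    for field, allowed in VALID_LISTS.items():
        value = str(record.get(field, "")).strip()
        # Assumption: list terms are compared without regard to case.
        if value and value.lower() not in allowed:
            problems.append(f"{field}: '{value}' is not in the validation list")
    return problems

# Our record 2 (no hardness entered, which is allowed because hardness is not required).
pez = {"Candy Brand": "Pez Candy, Inc.", "Candy Name": "Pez",
       "Candy Character": "witch", "Candy Type": "other",
       "Candy Packaging": "Dispenser", "Candy Flavor": "assorted",
       "Candy Filling": "none"}
pez_problems = validate(pez)

# A hypothetical bad entry: packaging outside the list and no brand.
bad = dict(pez, **{"Candy Packaging": "plastic", "Candy Brand": ""})
bad_problems = validate(bad)
```

In this sketch the Pez record passes, while the hypothetical entry is flagged twice: once for the missing brand and once for "plastic".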
4. Set of Rules for Indexing
The unit of description for this database is a package of candy. In some cases, two or more relatively small packets, rolls, tubes, or boxes of candy may be assembled within a larger container (e.g. a wrapping, bag, etc.). In such cases, the term “package” refers to the larger, exterior container.
The record for each package should contain the following information:
Candy number (numeric field). This field is fairly self-explanatory. However,
since the records are being numbered automatically, please be sure that if you
make a mistake in entering a record, you edit that record rather than deleting it
and starting over. If you delete the record, you will not be able to reuse the
number.
Candy brand (text field). This is the name of the candy manufacturer (e.g. Nestlé),
analogous to the make of a car. If the brand is not known, please enter “other.”
Candy name (text field). This is the trade name of a specific type of candy (e.g.
Crunch bar), analogous to the model of a car. If the name is not known, please enter
“other.”
Candy character (text field). If either a candy or its package (or dispenser) is
shaped or otherwise decorated to resemble a Halloween-themed character or
object (e.g. “ghost” or “witch”), please enter a word or term describing that
character or object. If not, enter “none.”
Candy type (text field). Enter the generic name for this candy (e.g. licorice,
chocolate, taffy, etc.). If the candy has no commonly known generic name, enter
“other.” If you are aware of more than one name for a candy (e.g. “lollipop” and
“sucker”), please enter each name, separating them by pressing the F7 key.
Candy packaging (text field). From the validation list, please select the term that
most satisfactorily describes the outermost packaging of the candy. If none of the
terms are satisfactory, select “other.”
Weight (numeric field). Enter the weight found on the package of candy, in
grams, not ounces. Round up to the nearest tenth. For example, 26.25 g should
be entered as 26.3 g. If the weight isn’t marked, leave this field empty.
Candy flavor (text field). Enter a word or term describing the flavor of the candy
(e.g. “strawberry”). If you don’t know the flavor, enter “other.” If there is more
than one flavor of candy within a package (see above for definition of
“package”), enter “assorted.” If there is more than one flavor (e.g. a layer of
kiwi and a layer of strawberry) in an individual piece of candy, then enter each
flavor, separating them by pressing the F7 key. In some cases, the candy flavor
will be the same as the candy type (e.g. chocolate or licorice). This is o.k.
Candy hardness (text field). Please select “soft” from the validation list if the
candy is creamy, malleable, and/or easy to chew. Select “hard” if the candy is
brittle (i.e. you could break your teeth on it). If the candy is neither clearly hard
nor clearly soft, select “medium.”
Candy filling (text field). Enter a word or term describing the candy’s filling, if
any. For the purposes of this database, a “filling” is any edible substance or
object fully or partially encircled, circumscribed, supported, or surrounded by the
candy. If the candy has no filling, enter “none.”
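The weight rule above (26.25 g entered as 26.3 g) can be sketched as follows. This assumes that "round up" means that halves round upward to the nearest tenth, which matches the example given; the `round_weight` name is hypothetical. The sketch uses decimal arithmetic because Python's built-in `round(26.25, 1)` rounds halves to the even digit and gives 26.2, which would not match the rule.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_weight(grams):
    """Round a printed weight in grams to one decimal place, with halves rounded up."""
    # Going through str() avoids binary floating-point surprises for typed-in values.
    return Decimal(str(grams)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

example = round_weight("26.25")   # the example from the weight rule
unchanged = round_weight("42.5")  # already at one decimal place
```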
5. Our Records
Candy Number 1
Candy Brand Wrigley's
Candy Name Hubba Bubba
Candy Character pumpkin
Candy Type gum
Candy Packaging Dispenser
Candy Weight - grams 42.5
Candy Flavor assorted
Candy Hardness Soft
Candy Filling no filling
Candy Number 2
Candy Brand Pez Candy, Inc.
Candy Name Pez
Candy Character witch
Candy Type other
Candy Packaging Dispenser
Candy Weight - grams 16.4
Candy Flavor assorted
Candy Filling none
Candy Number 3
Candy Brand Oddzon, Inc
Candy Name Bug Factor
Candy Character insect
Candy Type lollipop
Candy Packaging Cellophane
Candy Weight - grams 22
Candy Flavor orange
Candy Hardness Hard
Candy Filling none
Candy Number 4
Candy Brand Wrigley's
Candy Name Hubba Bubba
Candy Character ghost
Candy Type gum
Candy Packaging Dispenser
Candy Weight - grams 42.5
Candy Flavor assorted
Candy Hardness Soft
Candy Filling none
Candy Number 5
Candy Brand Tootsie
Candy Name Caramel Apple Pops
Candy Character none
Candy Type lollipop
Candy Packaging Cellophane
Candy Weight - grams 17.7
Candy Flavor caramel
apple
Candy Hardness Hard
Candy Filling none
Candy Number 6
Candy Brand Kencraft
Candy Name Halloween shaped pops
Candy Character ghost
Candy Type lollipop
Candy Packaging cellophane
Candy Weight - grams 28
Candy Flavor marshmallow
Candy Hardness Hard
Candy Filling none
Candy Number 7
Candy Brand Russel Stover
Candy Name Peanut butter ghost
Candy Character ghost
Candy Type chocolate
Candy Packaging foil
Candy Weight - grams 21
Candy Flavor chocolate
peanut butter
Candy Hardness Soft
Candy Filling peanut butter
Candy Number 8
Candy Brand Mars, Inc
Candy Name Snickers
Candy Character pumpkin
Candy Type chocolate
Candy Packaging foil
Candy Weight - grams 34
Candy Flavor chocolate
peanut
Candy Hardness Soft
Candy Filling peanut
caramel
Candy Number 9
Candy Brand Russel Stover
Candy Name Buzzard Nest
Candy Character nest
Candy Type chocolate
Candy Packaging foil
Candy Weight - grams 28
Candy Flavor chocolate
coconut
Candy Hardness Medium
Candy Filling jelly beans
coconut
Candy Number 10
Candy Brand Kencraft
Candy Name Halloween Shaped Pops
Candy Character pumpkin
Candy Type lollipop
Candy Packaging cellophane
Candy Weight - grams 28
Candy Flavor orange
Candy Hardness Hard
Candy Filling none
6. Exchange Team’s Records
Candy Number 1
Candy Brand Hubba Bubba
Candy Name Twist'n Pour
Candy Character Pumpkin
Candy Type Buble Gum
Candy Packaging dispenser
Candy Weight - grams 42.5
Candy Flavor Bubble Gum
Candy Hardness medium
Candy Filling gum
Candy Number 2
Candy Brand Pez
Candy Name Pez
Candy Character Witch
Candy Type Other
Candy Packaging cellophane
Candy Weight - grams 16.4
Candy Flavor orange
strawberry
Candy Hardness medium
Candy Filling none
Candy Number 3
Candy Brand other
Candy Name Bug Factor Lollipop
Candy Character bug
Candy Type lollipop
Candy Packaging cellophane
Candy Weight - grams 22
Candy Flavor orange
Candy Hardness hard
Candy Filling none
Candy Number 4
Candy Brand Hubba Bubba
Candy Name Twist'n Pour
Candy Character Ghost
Candy Type gum
Candy Packaging dispenser
Candy Weight - grams 42.5
Candy Flavor bubble gum
Candy Hardness medium
Candy Filling gum
Candy Number 5
Candy Brand other
Candy Name Caramel Apple Pops
Candy Character none
Candy Type lollipop
Candy Packaging cellophane
Candy Weight - grams 17.7
Candy Flavor carmel apple
Candy Hardness hard
Candy Filling none
Candy Number 6
Candy Brand Kencraft
Candy Name Halloween Shaped Pops
Candy Character ghost
Candy Type lollipop
Candy Packaging cellophane
Candy Weight - grams 28
Candy Flavor other
Candy Hardness hard
Candy Filling none
Candy Number 7
Candy Brand Russell Stover
Candy Name Peanut Butter Ghost
Candy Character ghost
Candy Type peanut butter
chocolate
Candy Packaging foil
Candy Weight - grams 21
Candy Flavor peanut butter
chocolate
Candy Hardness soft
Candy Filling peanut butter
Candy Number 8
Candy Brand Mars Inc.
Candy Name Snickers
Candy Character pumpkin
Candy Type chocolate
Candy Packaging foil
Candy Weight - grams 34
Candy Flavor chocolate
carmel
Candy Hardness medium
Candy Filling carmel
nougat
peanuts
Candy Number 9
Candy Brand Russell Stover
Candy Name Buzzard Nest
Candy Character none
Candy Type chocolate
Candy Packaging foil
Candy Weight - grams 28
Candy Flavor chocolate
coconut
Candy Hardness medium
Candy Filling coconut
Candy Number 10
Candy Brand Kencraft
Candy Name Halloween Shaped Pops
Candy Character Pupkin
Candy Type lollipop
Candy Packaging cellophane
Candy Weight - grams 28
Candy Flavor orange
Candy Hardness hard
Candy Filling none
202-11 Project 1 Gabriel Beeler Part B
EVALUATION
Our group put a lot of thought into developing a user model for our system. We asked ourselves who might be interested in consulting a Halloween candy database, and what kind of information they would seek, and chose our fields accordingly. For instance, at one point we considered including “candy color” as one of our fields. We found color was an easily discernible attribute that distinguishes one piece of candy from another. We decided against this, however, since we did not feel that either shoppers or collectors would choose Halloween candy because of its color.
In retrospect, the one field we chose that does not seem particularly relevant to users is “packaging,” as most people probably do not care whether their candy is wrapped in cellophane or foil. Nonetheless, it was useful to think about how the candy was packaged because it helped us define and clarify our unit of analysis. A particular concern was the Pez package, consisting of one dispenser, a roll of orange-flavored Pez candies, and a roll of strawberry-flavored Pez candies, all together in one cellophane wrapper. Would the indexers from Orange County Group #1 describe the flavor as
“strawberry” and “orange”, or as “assorted”? Moreover, would they interpret “weight” to mean the weight of each individual roll of candies, or the combined weight of the dispenser and the two rolls together? We tried to eliminate any uncertainty by writing our indexing instructions as
unambiguously as possible. On the surface, it seemed that we succeeded in doing this
since the feedback from our indexers was quite positive. They stated that the instructions
were “very clear,” and said that they were able to complete the records easily in less than
30 minutes. One indexer even commented that she felt “shortchanged,” since the
experience of working with unclear instructions would have helped her “grow” more as a
database designer!
In spite (or possibly because) of our supposedly clear instructions, there were a
surprising number of discrepancies between the records we made ourselves and those
made by our indexers, as shown in the table below, where big discrepancies between
entries are identified as “major errors” and smaller discrepancies (spelling mistakes, etc.)
are identified as “minor errors”:
Field          #   Brand   Name   Character   Type   Packaging   Weight   Flavor   Hardness   Filling
major errors   0   4       2      1           0      1           0        4        2          3
minor errors   0   2       2      2           2      0           0        2        0          2
From the table, we can see that our two most problematic fields were “brand” and
“flavor,” each with 4 major errors and 2 minor errors. This could result in unacceptably
low recall for certain searches. For instance, a user searching O.C. group #1’s records for
candies made by Wrigley’s would get zero hits, when in fact there were actually two
Wrigley’s candies in the collection. This would amount to a recall of 0%! Similarly, a searcher looking for caramel-flavored candies would also get a recall of 0%, this time because of a spelling error (“carmel”).
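The recall figure for the Wrigley’s example can be worked through in a short sketch. Recall is taken here in its standard sense, the share of relevant records that a search actually retrieves. The brand values are those entered by the exchange team; the `recall` function and the simple substring match are assumptions made for illustration.

```python
def recall(retrieved, relevant):
    """Recall = relevant records retrieved / all relevant records in the collection."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant records")
    return len(set(retrieved) & relevant) / len(relevant)

# Candy Brand as entered by the exchange team, keyed by Candy Number.
exchange_brands = {
    1: "Hubba Bubba", 2: "Pez", 3: "other", 4: "Hubba Bubba", 5: "other",
    6: "Kencraft", 7: "Russell Stover", 8: "Mars Inc.", 9: "Russell Stover",
    10: "Kencraft",
}

# Records 1 and 4 are the two Wrigley's candies according to our own records.
relevant = [1, 4]
retrieved = [n for n, brand in exchange_brands.items() if "wrigley" in brand.lower()]
wrigley_recall = recall(retrieved, relevant)  # no hits, so recall is 0.0
```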
The errors in brand name were unexpected since the brand names are printed on every package. However, in many cases, the manufacturer’s name is printed in tiny letters in an obscure location (near the ingredient list or copyright notice, for instance). In addition, in some cases, the candy name has several parts (e.g. Hubba Bubba
Twist’n’Pour), where the first part of the name could easily be misconstrued as the brand name. This problem could probably not be eliminated altogether, but it could be lessened if our instructions included tips on where to find the brand name and/or encouraged the indexers to look carefully at the entire package, including the fine print, before making this entry.
We had anticipated that there would be errors in the flavor field, since flavor is somewhat subjective and is not always marked on the package. In addition, some candies or packages of candies contain multiple flavors. We tried to avoid problems in this area by providing explicit instructions on when to enter “other” in the flavor field, when to enter “assorted,” and when to enter several flavors individually. Unfortunately, we were not successful in this, since our indexers identified the Pez flavors individually rather than as “assorted” as we intended. In addition, they did not identify “marshmallow” as the flavor of the marshmallow-coated ghost lollipops as we did. On the other hand, they identified the Hubba Bubba bubble gum as “bubble gum” flavored, which could be a viable name for a variety of bubble gum flavors, as opposed to the word “assorted”, which we chose. To reduce the error potential of this field, we could instruct indexers to label assorted bubble gum flavors simply as “bubble gum.” Providing as many options as possible is key for this field, as the variety of candy flavors is extensive. Rather than trying to address each flavor individually, a broader umbrella term would be most helpful and user-friendly, so long as the proper flavor can be found in the end. Perhaps multiple flavor search fields would be the answer, with the initial field being very broad so as to eliminate the possibility of user error in attempting to define a specific flavor, and later fields gradually becoming more specific (walking the user through the identification of the proper flavor).
The next most problematic field was “filling”, with three major errors and two minor ones. Once again, we anticipated problems and tried to prevent them by giving a detailed definition of the term “filling” in our instructions, but we were not completely successful. We thought that we made it clear that the bug in the Bug Factor lollipop should be considered a filling, but the indexers did not interpret it this way. We also wanted the indexers to record the jelly bean eggs in the Russell Stover Buzzard Nest as a filling, but they did not. In the latter case, the problem may have been a failure to see the jelly beans listed on the ingredient list, rather than a misinterpretation of our instructions.
If we had given more specific instructions on how certain fillings should be handled (jelly beans, for example), we could probably have avoided the error. One of the biggest problems was finding the relevant information on the packaging, which often does not give clear details about flavor, filling, brand, name, etc. More detailed instructions, possibly even with diagrams showing how to identify the proper information for a given field, would help with this issue. In some respects, the minor errors in this field were more consequential than the major ones. After all, the number of users searching for “insect” or “jelly bean” fillings would probably be very small. The number of people searching for “caramel” or “peanut” would be greater, and the fact that people would miss records because of spelling errors
(or an added letter “s”) is bothersome. In addition, people searching for “caramel” or
“peanut” because of allergy concerns might have more at stake in the search than people simply looking for novelty candies to add to their collections. Fortunately, these minor errors could have been eliminated very easily if we had instructed the indexers (and ourselves) to use the Textbase spellchecker. In addition, it would be a good idea to instruct our end users to use truncation (e.g. peanut* rather than peanuts) in their queries so they would find what they wanted regardless of how the records were entered.
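The effect of truncation can be sketched with Python's standard `fnmatch` module, in which `*` matches any run of characters. The `truncation_search` helper and the short list of filling values (drawn from both sets of records) are assumptions for illustration; the actual truncation syntax of the database software may differ.

```python
from fnmatch import fnmatch

def truncation_search(values, pattern):
    """Return the values in which at least one word matches `pattern` (* = any ending)."""
    pattern = pattern.lower()
    return [v for v in values
            if any(fnmatch(word, pattern) for word in v.lower().split())]

# Filling entries as they appear across the two sets of records.
fillings = ["peanut", "peanuts", "peanut butter", "carmel", "none"]

exact = truncation_search(fillings, "peanut")       # misses the plural entry
truncated = truncation_search(fillings, "peanut*")  # also finds "peanuts"
misspelled = truncation_search(fillings, "caramel*")
```

In this sketch truncation recovers the plural entry, but it does not rescue the misspelled “carmel”, which is why the spellchecker would still be needed.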
The results for the remainder of our fields were satisfactory, with two or fewer major errors in each, as follows:
Name: The errors here were a direct result of errors in the “brand” field, so by
fixing the latter, we would eliminate the former.
Hardness: These errors probably could not have been avoided since this is a
subjective field, but once again, there is some consolation in the fact that
searchers would probably not use this field as a point of entry.
Packaging: This is rather interesting, since the single error here occurred because
we did not follow our own instructions (or wrote instructions that did not
match our own records), rather than because of a misinterpretation on the indexers’ side. OC
group #1 was correct in choosing “cellophane” for the packaging rather than
“dispenser,” since that is what a literal reading of the instructions called for.
However, “dispenser” seems like a more useful term to have in the records, since
that is what most collectors would actually be interested in. Thus, it might be a good
idea to amend the instructions for this field to say something like:
If the candy is packaged in or with a dispenser, please
select “dispenser” from the validation list. Otherwise,
please select the term that most satisfactorily describes the
outermost packaging of the candy. If none of the terms is
satisfactory, select “other.”
Type: Although there were no major errors here, it is interesting to note that once
again we did not follow our own instructions, which indicated that both “lollipop”
and “sucker” should be entered for the lollipops. Most likely this is because
“sucker” isn’t commonly used here in Southern California, although it is more
common in certain parts of the U.S. Probably rather than instructing the indexers
to list all synonymous terms they could think of, it would have been better to
provide a substitution list or thesaurus.
Character: The only major error here happened because the indexers did not
consider a buzzard nest to be a Halloween character, which is understandable.
Perhaps we could remedy this by including a definition of “character” in the
instructions (something like “any shape that resembles an object other than
candy”, or “any non-standard shape”). A minor error was that the indexers only
entered the term “bug” for the Bug Factor lollipop, where we would have liked
them to enter both “bug” and “insect”. An error that could have happened, but
didn’t, would have been if the indexers had used the term “jack-o’-lantern” rather
than “pumpkin.” Again, the best way around this would have been to provide a
substitution list or thesaurus.
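A substitution list of the kind suggested for the Type and Character fields can be sketched as a simple lookup. The choice of preferred terms below is hypothetical (we did not define one in the project), and `normalize` is an illustrative name rather than a feature of the database software.

```python
# Hypothetical substitution list: each variant term maps to one preferred term.
SUBSTITUTIONS = {
    "sucker": "lollipop",
    "insect": "bug",
    "jack-o'-lantern": "pumpkin",
}

def normalize(term):
    """Map a term to its preferred form; terms not in the list pass through unchanged."""
    term = term.strip().lower()
    return SUBSTITUTIONS.get(term, term)

# Both indexers' choices end up under the same index term.
ours = normalize("insect")
theirs = normalize("bug")
```

With such a list applied at entry time or at search time, it would no longer matter which synonym the indexer or the searcher happened to choose.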
In addition, candy weight proved to be a challenging field, particularly the question of how to
measure it properly.
We took the easy route for this field by copying the weight in grams from the candy package. This would have been a problem only if a piece of candy had no weight information on it. In our instructions, we said to leave the field blank if this happened. A more inclusive way to measure weight would have been to use a food scale, which would have eliminated the blank-label problem. The drawback of that approach is that it would have left more room for indexer error and produced more varied results.
Overall, we left a bit too much up to our indexing instructions and expected our indexers to follow them flawlessly. Another validation list would have increased the chances that the other team entered the values we desired. Our project included only two validation lists; with more controlled vocabulary, we could have gained more consistency and data integrity. A good field for another list would have been the Candy Character field, as it would have resolved the jack-o’-lantern vs. pumpkin problem. As we found out, it is a great deal harder to attain uniform data entry using detailed instructions than using validation lists. No matter how detailed the instructions, the indexer can break a rule and the system has no way of rejecting the entry. We stumbled upon many design issues because of the level of difficulty of our design, but we learned more because of these obstacles.