Naming and Labeling Data

Note: The more recently updated IPA Stata 102 training covers data cleaning, including naming and labeling. Also see the guide to checking incoming data or the manual on high-frequency checks, which apply to both CAI and paper-and-pencil projects.

Variable names and labels are largely a matter of personal preference, and different PIs have different preferences, so there is no formal convention. “Check with your PI” is usually good advice, but a very busy PI might not respond, so here are some guidelines:

1. All variables should have labels, and all multiple-choice variables should have value labels.
2. The labeling system should be internally consistent.
3. It should be easy to connect the variable in the dataset with the question on the questionnaire. Most analysis is done with the questionnaire in hand.

To learn the code to create labels in Stata, go here.

The most common way to name variables is to use the question number from the questionnaire as the variable name and provide a descriptive label. The basic format here is:

    Variable name: question_number
    Variable label: descriptive label

So if you had questions 101 through 103 from a questionnaire called “QA,” the names and labels might be:

    label var qa_101 "Has children under 15"
    label var qa_102a "Number boys under 15"
    label var qa_102b "Number boys in school"
    label var qa_103a "Number girls under 15"
    label var qa_103b "Number girls in school"

A second good way is to use a descriptive variable name, then put the question number in the label.
The basic format here is:

    Variable name: descriptive_name
    Variable label: [question_number] descriptive label

There is an example of this in the document library that uses a style similar to:

    label var child15 "[QA.101] Has children under 15"
    label var child15B "[QA.102a] Number boys under 15"
    label var child15BS "[QA.102b] Number boys in school"
    label var child15G "[QA.103a] Number girls under 15"
    label var child15GS "[QA.103b] Number girls in school"

A practical tip on creating value labels: it can be useful to change the delimiter to a semicolon so that a single command can span several lines in your text editor, making it easier to read. See help delimit to learn about delimiters in Stata. An example would be:

    #delimit ;
    label def sex
        0 "Male 0"
        1 "Female 1"
    ;
    label def reg
        1 "Northern 1"
        2 "Southern 2"
        3 "Western 3"
        4 "Eastern 4"
        5 "Central 5"
    ;
    #delimit cr
    label values female sex
    label values region reg

Note how each value label includes the number. This is not strictly necessary, but it can be very useful if you want to refer to specific values.
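Guideline 1 (every variable labeled, every multiple-choice variable with value labels) can be checked mechanically before sharing a dataset. The following is only a minimal sketch, not an official IPA check: the toy data, the yesno label, and the macro names nolabel and novallab are assumptions made for illustration. Stata cannot know which variables are multiple-choice, so the second list is just a starting point for manual review.

```stata
* Toy data for illustration only; names follow the qa_ example above
clear
set obs 4
generate qa_101 = mod(_n, 2)
generate qa_102a = _n
label var qa_101 "Has children under 15"
label def yesno 0 "No 0" 1 "Yes 1"
label values qa_101 yesno

* Collect variables missing a variable label or an attached value label
local nolabel
local novallab
foreach v of varlist _all {
    local varlab : variable label `v'
    local vallab : value label `v'
    if `"`varlab'"' == "" {
        local nolabel `nolabel' `v'
    }
    if "`vallab'" == "" {
        local novallab `novallab' `v'
    }
}

display "No variable label: `nolabel'"
display "No value label attached (review multiple-choice items): `novallab'"
```

To run it on a real dataset, drop the toy-data lines and load your own data first. Continuous variables such as counts will legitimately appear in the second list.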
