Stata for Students: Downloading Data from Qualtrics and Importing it into Stata

This article is part of the Stata for Students series. If you are new to Stata we strongly recommend reading all the articles in the Stata Basics section.

This article will teach you how to download data from a survey you've created using Qualtrics and import it into Stata. We'll assume you've already learned to use Qualtrics, created a survey, and collected data using it.

Downloading the Data

Begin by logging in to Qualtrics and opening your survey. Then go to Data & Analysis, click the Export & Import button, and choose Export Data...

Dialog for downloading data from Qualtrics.

Choose CSV, select Use numeric values, and then click Download. A CSV (Comma-Separated Values) file is a text file containing data with one observation per line and commas separating the values. Many computers treat CSV files like Excel files, but they're really just text.

Note: for work that is not a class assignment, it's usually easier to download an SPSS file and convert it to Stata format using Stat/Transfer, as it will come with variable and value labels already defined. If your instructor told you to download a CSV file (as is typical in Soc 357), it's to give you an opportunity to practice preparing a data set.

Qualtrics will give you a zip file containing the CSV file you actually want to work with. Usually if you tell your browser to open the zip file you'll be able to see the CSV file.

Your CSV file, seen inside the zip file.

You'll need to put the file in a permanent location, most likely wherever you normally put files associated with your class. On the SSCC network, the U: drive is a good choice. One way to do that is to right-click on the file, choose Copy, go to the permanent location, and choose Paste.
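If you would rather have your do file extract the CSV file than copy it by hand, Stata's unzipfile command can do that. The following is a minimal sketch and not part of the instructions above. The file names example_export.csv and example_download.zip are made up, and the first few lines only build a stand-in zip file so the sketch runs on its own. With a real download you would skip those lines, use cd to move to your permanent location, and give unzipfile the name of the zip file Qualtrics gave you.

```stata
* Stand-in for a Qualtrics download (hypothetical file names).
* Skip this part when you have a real zip file.
tempname fh
file open `fh' using "example_export.csv", write text replace
file write `fh' "Q1,Q2" _n
file write `fh' "1,2" _n
file close `fh'
zipfile "example_export.csv", saving("example_download.zip", replace)
erase "example_export.csv"

* The step you would keep: extract the zip file's contents
* into the current working directory.
unzipfile "example_download.zip", replace

* Stop with an error if the CSV file did not appear.
confirm file "example_export.csv"
```

Because unzipfile extracts into the current working directory, run it after you have changed to the folder where the CSV file should live.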

Create a do file in that location and double-click on it to start Stata. Have your do file create a log file and set up the Stata environment as usual. When it comes time to load your data, you'll need to import it, and Qualtrics files need a bit of special attention.

Importing the Data

The easy way to import a CSV file is to click File, Import, Text Data. This will open an import window with the various settings you can choose and a preview of how the data will be interpreted with the current settings. Thus you can tweak the settings until the preview makes sense and then click OK to actually import the data. Just be sure to copy the resulting command into your do file so you don't have to go through that process every time. You'll start with something like the following:

Importing the data. At first there will be rows and columns you don't need.

The CSV file Qualtrics gives you contains some variables you almost certainly don't care about, like the date and time the respondent started the survey, and some rows that don't actually represent observations. Instead, those rows contain "metadata," or information about the data. Most of that metadata cannot be used by Stata, and if you tell Stata to treat those rows as observations, the text in them will make Stata read your numeric variables as text. The key to importing Qualtrics data into Stata is to use the Set range button to only import the part of the data that you want and can use.

Scroll right until you find the actual questions in your survey. In this example, they start in column 18:

Importing the data 2: finding the actual data.

Note how the actual responses begin in row 5. Now you're ready to click Set range:

The dialog for setting the rows and columns to be read.

Under Rows, check First and set it to 5. Under Columns, check First and set it to 18 (or the corresponding numbers for your data set). When you click OK, the preview window will look quite different.

With the metadata rows gone, Stata should recognize that the file contains variable names (q1, q2, etc.) and use them automatically. It will also recognize that the variables are numeric rather than text, so they'll be black instead of red. Click OK and you'll get a usable data set. Be sure to copy the command Stata runs into your do file so that in the future this whole process will happen automatically. It will look something like:

import delimited "U:\357\357 Example_March 7, 2018_09.38.csv", rowrange(5) colrange(18)
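To see what rowrange() and colrange() do without opening the import window, you can try them on a small made-up file. The sketch below is an illustration under stated assumptions. The file mock_qualtrics.csv and its contents are invented: three rows that are not observations, responses starting in row 4, and the survey questions starting in column 3. It also uses varnames(nonames), an option the command above does not use, so that the row numbers refer plainly to lines of the file. As a result the variables get generic names here rather than q1 and q2. Your own file will have different row and column numbers, so use the ones you found in the preview.

```stata
* Write a tiny made-up file shaped roughly like a Qualtrics export.
tempname fh
file open `fh' using "mock_qualtrics.csv", write text replace
file write `fh' "StartDate,Status,Q1,Q2" _n
file write `fh' "Start Date,Response Type,Stata was difficult,Ever programmed" _n
file write `fh' "meta1,meta2,meta3,meta4" _n
file write `fh' "2018-03-07,0,1,2" _n
file write `fh' "2018-03-07,0,4,1" _n
file close `fh'

* First look: read every row as text, with no variable names,
* so you can see which rows and columns hold the real data.
import delimited using "mock_qualtrics.csv", varnames(nonames) stringcols(_all) clear
list

* Range import: keep rows 4 onward and columns 3 onward.
import delimited using "mock_qualtrics.csv", varnames(nonames) rowrange(4) colrange(3) clear
describe
list
```

Comparing the two listings shows which rows and columns the range options dropped.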

Preparing the Data

While the resulting data set can be used as-is, the variable names are not very informative, and variable and value labels would be very helpful (and possibly required for your class). Variable labels can only be 80 characters long, so you may need to abbreviate your questions. You may also want to create indicator variables for yes/no questions rather than using Qualtrics's default 1/2 coding. Instructions for doing all these things can be found in Stata for Students: Creating Variables and Labels. We won't repeat them here, but the following do file is an example of their use. You may need to refer to your survey in Qualtrics to get details about the questions and possible responses.

capture log close
log using prepdata.log, replace

clear all
set more off

import delimited "U:\357\357 Example_March 7, 2018_09.38.csv", rowrange(5) colrange(18)

rename q1 difficult
label variable difficult "Using Stata for assignments in this class was difficult for me."
label define agree 1 "Strongly Agree" 2 "Somewhat Agree" 3 "Neither Agree Nor Disagree" 4 "Somewhat Disagree" 5 "Strongly Disagree"
label values difficult agree

gen everProgram=(q2==1) if q2<.
label variable everProgram "Before this class, had you ever written a computer program?"

gen class=(q3==1) if q3<.
label variable class "Before this class, had you ever taken a programming class at the high school or college level?"

gen taught=(q4==1) if q4<.
label variable taught "Were you ever taught to program before high school?"

label define yn 1 "Yes" 0 "No"
label values everProgram class taught yn

drop q2 q3 q4

rename q5 device
label variable device "What kind of computing device have you spent the most time using over the course of your life?"
// Note how the variables were not numbered sequentially by Qualtrics. Watch for that.
rename q7 entDevice
label variable entDevice "What kind of computing device have you spent the most time using for entertainment over the course of your life?"

label define device 1 "Windows Computer" 2 "Apple Computer" 3 "Smartphone or Tablet" 4 "Gaming Console" 5 "Other Device / Does not apply"
label values device entDevice device

save projectdata, replace
log close
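
The indicator variables in the do file above are created with a condition like if q2<. so that respondents who skipped a question end up with a missing value rather than being counted as a No. The following minimal sketch shows the difference on made-up data: three invented responses to q2, typed in with input rather than imported. The variable noCondition is hypothetical and is only there for comparison.

```stata
clear
* Three made-up responses: 1 = Yes, 2 = No, . = skipped the question.
input q2
1
2
.
end

* Without the condition, the skipped response is coded 0 ("No").
gen noCondition = (q2==1)

* With the condition, the skipped response stays missing.
gen everProgram = (q2==1) if q2<.

label define yn 1 "Yes" 0 "No"
label values noCondition everProgram yn
list
```

In the listing, the third observation shows No for noCondition but a missing value for everProgram, which is what you want for someone who never answered.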


Last Revised: 3/8/2018