The Basics
File Types:
There are three file types associated with SPSS: the data
file (.sav), the output file (.spo), and the syntax file (.sps).
Tabs:
There are two tabs in the SPSS data editor,
Data View and Variable View.
Data View (adapted from SPSS 12.0 Help Menu)
Many of the features of the Data View are similar to
those found in spreadsheet applications such as Excel. There are, however,
several important distinctions:
- Rows are cases, subjects, or participants. Each
row represents a single case, person, or observation. For example,
each individual respondent to a questionnaire is a case.
In a repeated measures design, all of a participant's data are entered into
the same row.
- Columns are the different variables. Each column
represents a variable or characteristic being measured. For example,
each item on a questionnaire is a variable.
- The individual cells contain the values, or hard
data. Each cell contains a single value for a specific variable for an
individual case. The cell is the intersection of the case (row) and the
variable (column). Cells contain only data values. Unlike spreadsheet
programs, cells in the Data Editor cannot contain formulas; to compute a
value, you need to create a new variable.
- The data file is rectangular. The dimensions of
the data file are determined by the number of cases and variables. You
can enter data in any cell. If you enter data in a cell outside the
boundaries of the defined data file, the data rectangle is extended to
include any rows and/or columns between that cell and the file
boundaries. There are no "empty" cells within the boundaries of the
data file. For numeric variables, blank cells are converted to the
system-missing value. For string variables, a blank is considered a
valid value.
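The rectangular behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not how SPSS is implemented: the function name set_cell, the use of None as a stand-in for the system-missing value, and the choice to treat newly created columns as numeric are all assumptions made for the example.

```python
# Illustrative model only; this is not SPSS code.
SYSMIS = None  # stand-in for the numeric system-missing value


def set_cell(rows, col_types, row, col, value):
    """Set a cell, extending the rectangle so that no cell inside it is undefined.

    rows      -- list of lists, one inner list per case
    col_types -- list of "numeric" or "string", one entry per variable
    New columns are treated as numeric (an assumption of this sketch).
    """
    # Add any columns between the current boundary and the target column.
    while len(col_types) <= col:
        col_types.append("numeric")
    # Add any rows between the current boundary and the target row.
    while len(rows) <= row:
        rows.append([])
    # Pad every row to full width: blank numeric cells become system-missing,
    # blank string cells stay as an empty (valid) value.
    for r in rows:
        while len(r) < len(col_types):
            r.append(SYSMIS if col_types[len(r)] == "numeric" else "")
    rows[row][col] = value
    return rows


data = [[1, "a"], [2, "b"]]       # two cases, two variables
types = ["numeric", "string"]
set_cell(data, types, 3, 3, 9.5)  # enter a value outside the boundaries
```

After the call, data has four rows and four columns, and every cell that was never typed into holds either the system-missing stand-in or an empty string.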
Variable View (adapted from SPSS 12.0 Help Menu)
The Variable view contains descriptions of the
attributes of each variable in the data file. In the Variable view:
- Rows are variables (unlike Data View, where rows
are cases).
- Columns are variable attributes.
You can add or delete variables and modify attributes
of variables, including:
- Variable name. Enter the name you want for your
variable; most of the time this is a shortened form or
abbreviation.
- Data type is the type of variable you are entering
in the data set; the various forms include:
- Numeric. A
variable whose values are numbers. Values are displayed in standard
numeric format. The Data Editor accepts numeric values in standard
format or in scientific notation.
- Comma. A
numeric variable whose values are displayed with commas delimiting
every three places, and with the period as a decimal delimiter. The
Data Editor accepts numeric values for comma variables with or without
commas; or in scientific notation.
- Dot. A
numeric variable whose values are displayed with periods delimiting
every three places, and with the comma as a decimal delimiter. The Data
Editor accepts numeric values for dot variables with or without dots;
or in scientific notation.
- Scientific notation.
A numeric variable whose values are displayed with an embedded E and a
signed power-of-ten exponent. The Data Editor accepts numeric values
for such variables with or without an exponent. The exponent can be
preceded either by E or D with an optional sign, or by the sign
alone--for example, 123, 1.23E2, 1.23D2, 1.23E+2, and even 1.23+2.
- Date. A
numeric variable whose values are displayed in one of several
calendar-date or clock-time formats. Select a format from the list. You
can enter dates with slashes, hyphens, periods, commas, or blank spaces
as delimiters. The century range for 2-digit year values is determined
by your Options settings (from the Edit menu, choose Options and click the Data
tab).
- Custom currency.
A numeric variable whose values are displayed in one of the custom
currency formats that you have defined in the Currency tab of the
Options dialog box. Defined custom currency characters cannot be used
in data entry but are displayed in the Data Editor.
- String. Values
of a string variable are not numeric, and hence not used in
calculations. They can contain any characters up to the defined length.
Uppercase and lowercase letters are considered distinct. Also known as
an alphanumeric variable.
- Number of digits or characters allowed in
the cell
- Number of decimal places that each value will have
- Descriptive variable and value labels. This is
where you label exactly what the variable, or each coded value,
means.
- User-defined missing values:
Missing Values defines specified data values as user-missing. It is often useful to know
why information is missing. For example, you might want to distinguish
between data missing because a respondent refused to answer and data
missing because the question didn't apply to that respondent. Data
values specified as user-missing are flagged for special treatment and
are excluded from most calculations.
- You can enter up to three discrete (individual) missing
values, a range of missing values, or a range plus one discrete value.
- Ranges can be specified only for numeric
variables.
- You cannot define missing values for long string
variables (string variables longer than eight characters).
Missing values for string
variables. All string values, including null or blank values,
are considered valid values unless you explicitly define them as
missing. To
define null or blank values as missing for a string variable, enter a
single space in one of the fields for Discrete
missing values.
- Column width. This defines how wide the column is displayed.
- Measurement level. There are three types:
- Scale. Data values are
numeric values on an interval or ratio scale--for example, age or
income. Scale variables must be numeric.
- Ordinal. Data values
represent categories with some intrinsic order (for example, low,
medium, high; strongly agree, agree, disagree, strongly disagree).
Ordinal variables can be either string (alphanumeric) or numeric values
that represent distinct categories (for example, 1 = low, 2 = medium, 3
= high). Note: For ordinal string
variables, the alphabetic order of string values is assumed to reflect
the true order of the categories. For example, for a string variable
with the values of low, medium, high, the order of the categories is
interpreted as high, low, medium--which is not the correct order. In
general, it is more reliable to use numeric codes to represent ordinal
data.
- Nominal. Data values
represent categories with no intrinsic order--for example, job category
or company division. Nominal variables can be either string
(alphanumeric) or numeric values that represent distinct
categories--for example, 1 = Male, 2 = Female.
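The note on ordinal string variables can be checked with a short Python sketch. It is not SPSS syntax; it only shows that alphabetical sorting puts low, medium, high in the wrong order, while numeric codes keep the intended order. The variable names and the codes dictionary are hypothetical and follow the 1 = low, 2 = medium, 3 = high example from the text.

```python
levels = ["low", "medium", "high"]  # intended order, lowest to highest

# Alphabetical order, which is how ordinal string values are interpreted.
alphabetical = sorted(levels)

# Numeric codes that carry the intended order (1 = low, 2 = medium, 3 = high).
codes = {"low": 1, "medium": 2, "high": 3}
by_code = sorted(levels, key=codes.get)

print(alphabetical)  # ['high', 'low', 'medium']
print(by_code)       # ['low', 'medium', 'high']
```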
In addition to defining variable properties in the
Variable view, there are two other methods for defining variable
properties:
- The Copy Data Properties wizard provides
the ability to use an external SPSS data file as a template for
defining file and variable properties in the working data file. You can
also use variables in the working data file as templates for other
variables in the working data file. Copy Data Properties is available
on the Data menu in the Data Editor window.
- Define Variable Properties (also available
on the Data menu in the Data Editor window) scans your data and lists
all unique data values for any selected variables, identifies unlabeled
values, and provides an auto-label feature. This is particularly useful
for categorical variables that use numeric codes to represent
categories--for example, 1 = Male, 2 = Female.
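As a rough picture of what scanning for unlabeled values involves, the Python sketch below collects the unique values of one variable and reports those with no value label. It is not the Define Variable Properties feature itself; the function name unlabeled_values and the sample data are made up for illustration.

```python
def unlabeled_values(values, value_labels):
    """Return the sorted unique values that do not have a label yet."""
    return sorted(v for v in set(values) if v not in value_labels)


gender = [1, 2, 2, 1, 9, 2]        # made-up data for one variable
labels = {1: "Male", 2: "Female"}  # value labels from the example in the text
missing_labels = unlabeled_values(gender, labels)
print(missing_labels)  # [9]
```

Here the code 9 appears in the data but has no label, so it is the kind of value the scan would flag for labeling.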
Sample Files:
SPSS comes with sample files that can be found on
every machine where it is installed. These files can be located by going to C: > Program Files > SPSS.
Page created by Ryan
Pohlig