# Calculating descriptives

## Calculating descriptives

One of the first steps in any data analysis is to thoroughly investigate each of your variables one at a time, that is, to get univariate analyses. I've already described one procedure for getting univariate information, the Frequencies procedure, and that works for both categorical and scale variables. Another important option for univariate statistics is the Descriptives command. This command and the Frequencies command do a lot of the same things, but there are some important differences. The most significant is that Frequencies can work with categorical variables and scale variables, but Descriptives works only with scale variables.

In this movie, I will highlight the similarities as well as point out some of the unique advantages of the Descriptives command. For this example I will be using the same data set that I used in the last one. That's information about the stocks in the NASDAQ index, NASDAQ.sav. To get the descriptives, I go up to Analyze, to Descriptive Statistics, to Descriptives. From here, I select the variables that I want. You will notice it doesn't list all of the variables. It only lists the ones that are numeric.

The symbol and the name variables, as well as industry, are text variables and they are categorical, so it simply doesn't list them here. So I am going to take the two that I used in the last example. That's LastSale, so I am just going to click to move that over to the right, and MarketCap, which I am also going to move over to the right. Then I can click Options, where I select the statistics that I want. Now, by default, Descriptives gives me the mean, the standard deviation, the minimum and the maximum, and that is a good list.
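To make those four default statistics concrete, here is a minimal sketch in Python of what they amount to for a single scale variable. It is only an illustration, not how SPSS computes them internally; the `describe` helper and the price values are made up for the example, and the standard deviation uses the sample (n - 1) formula.

```python
import statistics


def describe(values):
    """Return N, mean, sample standard deviation, minimum and maximum."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "minimum": min(values),
        "maximum": max(values),
    }


# Hypothetical last-sale prices, not taken from NASDAQ.sav.
last_sale = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = describe(last_sale)

for name, value in summary.items():
    print(f"{name}: {value}")
```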

I can also get Kurtosis and Skewness if I want. What's important, though, is that I cannot get the quartiles. I can't get the 1st quartile, or 25th percentile score, I can't get the 3rd quartile, or 75th percentile score, and I can't get the median, and for a skewed distribution those are important statistics. So that is one reason to sometimes use the Frequencies command over Descriptives: you need the median and the quartiles. But I will just click Continue. Now, I have another option here. You will see at the bottom-left it says Save standardized values as variables.

This is one of the big perks of the descriptives command. If you want to take a variable that is in some metric like dollars or an arbitrary metric that may be a foreign currency you're not familiar with, sometimes you want to save things as standardized variables. That makes it so that the mean is 0 and the standard deviation is 1, and the individual cases get scores that indicate how many standard deviations above or below the mean they are. These are also called Z scores. I have seen people demonstrate how to do this manually by calculating everything.
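As a rough sketch of what that manual calculation involves, the Python below subtracts the mean from each value and divides by the standard deviation. The `standardize` helper and the prices are hypothetical, and the sample (n - 1) standard deviation is assumed.

```python
import statistics


def standardize(values):
    """Convert raw values to Z scores: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [(v - mean) / sd for v in values]


# Hypothetical prices with one value far above the rest.
prices = [10.0, 12.0, 14.0, 16.0, 98.0]
z_scores = standardize(prices)

for price, z in zip(prices, z_scores):
    print(f"{price:>6.2f} -> z = {z:+.2f}")
```

The standardized scores have a mean of 0 and a standard deviation of 1, and the largest price gets the largest positive Z score. Doing this by hand means repeating the same steps for every variable you want to standardize.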

That's very tedious. Descriptives gives you a one-stop way of doing this: you simply click the box and it will add standardized values for these variables. And we can see how many standard deviations above or below the mean some of the companies are on last sale and market capitalization. Now, there is also an option here for Bootstrap. I am not going to get into that one because the Bootstrap is an add-in feature that you pay extra for in SPSS. I am just going to deal with the ones that come standard. So now I can press OK and what I get is a small table. In the Frequencies command, the variables were listed as columns across the top and the statistics were listed as rows down the side, but in Descriptives it's flipped around.

But what I have here is the number of cases that I have information on. So for last sale I have 2817 companies with information on that. The minimum value is \$00.1, the maximum value is \$1,132, the mean is 18.7, and the standard deviation is 34.65, and then you have similar statistics for market capitalization. Now, an interesting trick: if we go back to the data set, you see that we have two new columns here at the end. The first is ZLastSale, for the Z score or standardized value, where you can see that most of the scores are close to 0 or 1. We do have a major outlier at 9.99.

That's nearly 10 standard deviations above the mean. That's Apple, whose stock costs about 10 times as much as most others. And then we have the Z score for market capitalization. That's, again, how many standard deviations above or below the mean each company is, and Apple is again an outlier, 32 standard deviations above the mean on this particular one. Hopefully, from all of this you can see that the Descriptives command is a really useful way of getting a variety of univariate statistics for your data. Like the Frequencies command, it can give the mean, the standard deviation, minimum, maximum, and other statistics.

It can give you the standardized scores, which the Frequencies command can't do. On the other hand, Frequencies can give the percentile statistics like the quartiles and the median. It can give the mode, it can give frequency tables and charts, and it can work with string variables and categorical variables. Now, for these reasons I generally prefer to use the Frequencies command, but either one will get you a very long way towards a sound understanding of your data and a solid foundation for further analysis.
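To illustrate why the median and quartiles matter for a skewed distribution, here is a small Python sketch with made-up prices that include one extreme value. The data are hypothetical, and quartile conventions differ between tools, so other software may report slightly different cut points than `statistics.quantiles` does with its default method.

```python
import statistics

# Hypothetical, strongly skewed prices: one extreme value among small ones.
prices = [3.0, 5.0, 6.0, 8.0, 9.0, 11.0, 12.0, 400.0]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)

# Three cut points that split the data into four groups (quartiles).
q1, q2, q3 = statistics.quantiles(prices, n=4)

print(f"mean   = {mean_price}")
print(f"median = {median_price}")
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
```

In this example the single extreme value pulls the mean above the third quartile, while the median stays near the bulk of the data.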
