1、ch03ExcelManualChapter 3Descriptive StatisticsChapter 2 presented graphical techniques for organizing and displaying data. Even though such graphical techniques allow the researcher to make some general observations about the shape and spread of the data, a more complete understanding of the data ca
2、n be attained by summarizing the data using statistics. This chapter presents such statistical measures, including measures of central tendency, measures of variability, and measures of shape. The computation of these measures is different for ungrouped and grouped data. 3.1 Measures of Central Tend
3、ency: Ungrouped DataOne type of measure that is used to describe a set of data is the measure of central tendency. Measures of central tendency yield information about the center, or middle part, of a group of numbers. Measures of central tendency do not focus on the span of the data set or how far
4、values are from the middle numbers. The measures of central tendency presented here for ungrouped data are the mode, the median, the mean, percentiles, and quartiles.Mode, Median, and MeanThe mode is the most frequently occurring value in a set of data. The median is the middle value in an ordered a
5、rray of numbers. For an array with an odd number of terms, the median is the middle number. For an array with an even number of terms, the median is the average of the two middle numbers. The mean is the average of a group of numbers and is computed by summing all numbers and dividing by the number
6、of numbers.Demonstration Problem 3.1Shown below is a list of the 11 largest motor vehicle producers in the world and the number of vehicles produced by each in 2009 (cited in text).Auto Manufacturer Production (millions)Toyota Motor 7.2General Motors 6.5Volkswagen Group 6.1Ford Motor 4.7Hyundai 4.6P
7、SA Peugeot Citron 3.0Honda 3.0Nissan 2.7Fiat 2.5Suzuki 2.4Renault 2.31. Input the data into Excel. Save as Demo_3.12. Click on the Data tab and Data Analysis (if you dont see this option, see Chapter 1 to install the Analysis Tookpak).3. Select Descriptive Statistics and then select into the Input R
8、ange box, select a cell to the right of the data for the Output Range and select Summary statistics. If you input and selected the label, select Labels in first row (make sure it is in the cell directly above the data).4. Widen the column with the text to see all of the text (click and drag or doubl
9、e-click the line between the column letters).5. The mean uses all the data, and each data item influences the mean. It is also a disadvantage because extremely large or small values can cause the mean to be pulled toward the extreme value.RemarksThe mean uses all the data, and each data item influen
10、ces the mean. It is also a disadvantage because extremely large or small values can cause the mean to be pulled toward the extreme value. In this data set, the mean value is 4.09 and the median is 3 showing that the large values pull the mean to a higher value whereas a typical value would be more l
11、ike 3. The mode or most common value is also 3.PercentilesPercentiles are measures of central tendency that divide a group of data into 100 parts. There are 99 percentiles because it takes 99 dividers to separate a group of data into 100 parts. Lets use our data set to find specific percentiles usin
12、g an Excel function.Demonstration Problem 3.21. Input the following data into Excel in a column: 14, 12, 19, 23, 5, 13, 28, 17.2. Click in a cell to the right of the data and input the function (=PERCENTILE (Select range of data,0.3). The 30th percentile is represented by 0.3.3. The answer is 13.1 a
13、nd the whole number would be 13. A percentile may or may not be one of the data values.Note: The Rank and Percentile feature of the Data Analysis tool of Excel has the capability of ordering the data, assigning ranks to the data, and yielding the percentiles of the data. To access this command, clic
14、k on Data Analysis and select Rank and Percentile from the menu. In the Rank and Percentile dialog box, enter the location of the data to be analyzed in Input Range. For this data set, the output looks like the output on the right:QuartilesQuartiles are measures of central tendency that divide a gro
15、up of data into four subgroups or parts. If the observations are ordered from smallest to largest, each quartile represents 25% of the observations. The first quartile (Q1) represents the median of the observations ordered from the minimum to the overall median M. The second quartile is the overall
16、median M and represents 50% of all observations. The third quartile represents the median of the upper 50% of the observations. A five-number summary gives a complete description of the distribution, including the minimum number, Q1, M (median), Q3, and the maximum number. A boxplot is a graph of th
17、e five-number summary. Side-by-side boxplots are useful to compare several distributions.Demonstration Problem 3.31. Open the Demo_3.3 file from the folder titled Demonstration Problem Data Sets on the student companion site located at 2. Below the data, input the following labels and formulas accor
18、ding to the instructions in the Chapter 1 using the Function Wizard or by inputting the formulas manually. You could also use the Quartile function for all of the values by inserting 0,1,2,3,4 for the second argument. For example, Maximum would be =QUARTILE(B2:B17,4).3. To view the functions used on
19、 a worksheet, there is an option in Excel to display all equations. Select File Options Advanced. Scroll down to Display options for this worksheet and select Show formulas in cells instead of their calculated results. Click OK and you will be able to view all of the formulas used on the current wor
20、ksheet. Reverse the selection when you want to see only the results. You can always click on a cell and see the formula that was input in the Formula Bar.4. The resulting values calculated are shown as follows:Note: These values are not quite the same as the values calculated in the textbook. That i
21、s because the quartiles are calculated by a different algorithm in different software programs. If you use the Min, Max, and Median functions, those values will be the same. The values that differ are Q1 and Q3.3.1 Measures of Variability: Ungrouped DataBusiness researchers can use another group of
22、analytic tools, measures of variability, to describe the spread or the dispersion of a set of data. Using measures of variability in conjunction with measures of central tendency makes possible a more complete numerical description of the data.Methods of computing measures of variability differ for
23、ungrouped data and grouped data. This section focuses on seven measures of variability for ungrouped data: range, interquartile range, mean absolute deviation, variance, standard deviation, z scores, and coefficient of variation.RangeThe range is the difference between the largest value of a data se
24、t and the smallest value of a set. Although it is usually a single numeric value, some business researchers define the range of data as the ordered pair of smallest and largest numbers (smallest, largest). It is a crude measure of variability, describing the distance to the outer bounds of the data
25、set. An advantage of the range is its ease of computation. A disadvantage of the range is that, because it is computed with the values that are on the extremes of the data, it is affected by extreme values, and its application as a measure of variability is limited.Interquartile RangeAnother measure
26、 of variability is the interquartile range. The interquartile range is the range of values between the first and third quartile. Essentially, it is the range of the middle 50% of the data and is determined by computing the value of Q3 - Q1. The interquartile range is especially useful in situations
27、where data users are more interested in values toward the middle and less interested in extremes. In addition, the interquartile range is used in the construction of box-and-whisker plots.The interquartile range value can differ slightly when using different software programs due to the underlying a
28、lgorithms defining the quartiles. Demonstration Problem 3.3 cont.1. Open the Demo_3.3_Results file from the folder titled Demonstration Problem Data Sets on the student companion site located at or use the results calculated in the previous exercise on quartiles.2. To calculate the range in Excel, u
29、se a simple subtraction formula. Click on a cell below the quartile calculations and input = and then click on the computed maximum value, type a - and click on the computed minimum value. For our example, it should look like this: The result: 3. To calculate the interquartile range, use a simple su
30、btraction formula with Q1 and Q3. Click on a cell below the range calculation and input = and then click on the computed Q3 value, type a - and click on the computed Q1 value. For our example, it should look like this: The result:Mean Absolute Deviation, Variance, and Standard DeviationThree other m
31、easures of variability are the variance, the standard deviation, and the mean absolute deviation. The variance and standard deviation are widely used in statistics. Although the standard deviation has some stand-alone potential, the importance of variance and standard deviation lies mainly in their
32、role as tools used in conjunction with other statistical devices.Mean Absolute DeviationThe mean absolute deviation (MAD) is the average of the absolute values of the deviations around the mean for a set of numbers. There is no functions of this in Excel but you can set up a table and use formulas to calculate.MAD Problem 1. A small company started a production line to build computers. Durin
copyright@ 2008-2022 冰豆网网站版权所有
经营许可证编号:鄂ICP备2022015515号-1