Colorado State University Data Values Discussion
The coefficient of variation and the standard deviation are two measures of variability or dispersion among data values. Provide two different sets of ten data points each as examples. Read both samples and explain your takeaway from each example. Please add information if you can. Your document should be a minimum of 300 words. Please include credible references.

Example 1:

The data I collected represent the heights of the tallest buildings: data set #1 is the heights of the 10 tallest buildings in the U.S., and data set #2 is the heights of the 10 tallest buildings in the world. Included are Excel calculations for the mean, standard deviation, coefficient of variation (CV), Q1, and Q3.

The standard deviation for set 1 is 165.37 versus 308.95 for set 2. The CV for set 1 is 11.7% versus 16.1% for set 2. The standard deviation gives one view of variability that is easy to understand and visualize: the values in data set #2 are spread almost twice as far from their mean as the values in data set #1 are from theirs (308.95 vs 165.37). The CV, however, shows that the variability relative to the mean is small in both sets, with set #1 at 11.7% and set #2 at 16.1%, a difference of only 4.4 percentage points. In this example, the CV gives a truer understanding of the variability of the data.

Interesting information: Only one U.S. building is in the world’s top 10 in height, One World Trade Center at 1776 feet (coming in at #6). Of the remaining 9, 5 are in China, and one each in UAE, Saudi Arabia, South Korea, and Taiwan. While most of the technology used for creating these engineering marvels still comes from the U.S., rising construction costs ($375/ft in New York vs $75/ft in Beijing), expansive building codes, and changes in corporate priorities are the reasons for fewer of them being built domestically (Petras, Padilla, & Sergent, 2019).

The QUARTILE.EXC function in Excel quickly determines the first and third quartiles of a specified data set.
In Excel, the calculation can be done in one step using a formula such as =QUARTILE.EXC(A1:A10,1), which returns Q1 for 10 data points in cells A1 through A10. Performing this calculation manually, as in the text, takes several steps: the data need to be ordered, the median determined, and then the medians of the bottom half of the data (Q1) and the top half of the data (Q3) calculated, resulting in 4 equal divisions of the data points. These values can then also be used to determine the interquartile range (IQR), which is the difference between Q3 and Q1; it measures the spread of the middle half of the data and can be used to identify outliers.

References:

Microsoft. QUARTILE.EXC function. Microsoft Support. Retrieved from https://support.microsoft.com/en-us/office/quartile-exc-function-5a355b7a-840b-4a01-b0f1-f538c2864cad

Petras, G., Padilla, R., & Sergent, J. (2019, April 8). The world’s tallest buildings aren’t being built in the USA anymore. USA Today. Retrieved from https://www.usatoday.com/in-depth/news/2019/04/08/worlds-tallest-buildings-why-america-isnt-building-them/3160931002/

Example 2:

Data Sets    Values                                    Mean   SD         CV%
Data Set 1   65, 67, 69, 71, 75, 75, 78, 82, 89, 91    76.2   8.891944   11.66922
Data Set 2   72, 73, 75, 77, 77, 84, 87, 89, 92, 96    82.2   8.521867   10.36724

Provided above is the Excel sheet that I used to calculate my work for each data set. To calculate the standard deviation for each data set, I used the formula =STDEV followed by the cells in which the data set was located, for example =STDEV(B2:K2). To calculate the coefficient of variation for each data set, I first needed the values for both the standard deviation and the mean. I then divided the standard deviation by the mean and multiplied the result by 100. I did this for each data set. Data set 2 has a lower standard deviation and a lower coefficient of variation, meaning that its values are more closely concentrated around the mean and vary less within the set (OpenStax, 2019).
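The calculation described above can also be checked outside Excel. The following is a minimal sketch in Python, assuming the sample standard deviation (the n - 1 form, which is what Excel's STDEV uses); the function name coefficient_of_variation is a hypothetical helper introduced only for illustration.

```python
from statistics import mean, stdev

# The two data sets from the Example 2 table.
data_set_1 = [65, 67, 69, 71, 75, 75, 78, 82, 89, 91]
data_set_2 = [72, 73, 75, 77, 77, 84, 87, 89, 92, 96]


def coefficient_of_variation(values):
    """Return the CV as a percentage: sample SD divided by the mean, times 100."""
    return stdev(values) / mean(values) * 100


for name, values in (("Data Set 1", data_set_1), ("Data Set 2", data_set_2)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(
        f"{name}: mean={mean(values):.1f}, "
        f"SD={stdev(values):.6f}, "
        f"CV%={coefficient_of_variation(values):.5f}"
    )
```

Run against the table values, this should reproduce the Mean, SD, and CV% columns to the precision shown there.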
When comparing the two data sets, data set 1 has the higher standard deviation and the higher CV, so its values show more variability. Of the two measures, the coefficient of variation more accurately describes the variability of the data sets because calculating it requires both the mean and the standard deviation. It therefore expresses the spread relative to the mean and draws on more information than the standard deviation alone.

The Excel function “QUARTILE.EXC” for calculating quartiles for a given data set uses an exclusive (greater than) method to calculate the 25th percentile or Q1, the 50th percentile or the median, and the 75th percentile or Q3 of a data set (Bruns, 2020). Unlike the textbook and lecture methods, this method of calculation is limited to the first, second, and third quartiles only. This means that requesting quartile 0 or quartile 4 (the minimum or the maximum) gives the user a #NUM! error. The QUARTILE.INC function is inclusive (greater than or equal to), which allows all of the quartiles, including the minimum and maximum, to be calculated.

References:

Bruns, D. (2020). Excel QUARTILE.EXC function. Exceljet. Retrieved from https://exceljet.net/excel-functions/excel-quartil... function#:~:text=EXC%20function%20to%20get%20the,the%20in%20the%20table%20below.

OpenStax. (2019). Introductory statistics. Houston, TX: OpenStax College. CC BY-SA. Retrieved from https://cnx.org/contents/MBiUQmmY@23.21:kcV4GRqc@1...
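As a supplement to the exclusive versus inclusive quartile discussion in Example 2, here is a minimal sketch in Python using data set 1. It assumes that the "exclusive" and "inclusive" methods of Python's statistics.quantiles follow the same interpolation rules as Excel's QUARTILE.EXC and QUARTILE.INC; that correspondence is my understanding and is worth confirming against Excel before relying on it.

```python
from statistics import quantiles

# Data set 1 from Example 2.
data_set_1 = [65, 67, 69, 71, 75, 75, 78, 82, 89, 91]

# n=4 splits the data into quartiles and returns the three cut points Q1, Q2, Q3.
q_exclusive = quantiles(data_set_1, n=4, method="exclusive")
q_inclusive = quantiles(data_set_1, n=4, method="inclusive")

# Interquartile range: Q3 minus Q1.
iqr_exclusive = q_exclusive[2] - q_exclusive[0]
iqr_inclusive = q_inclusive[2] - q_inclusive[0]

print("Exclusive Q1, Q2, Q3:", q_exclusive, "IQR:", iqr_exclusive)
print("Inclusive Q1, Q2, Q3:", q_inclusive, "IQR:", iqr_inclusive)
```

Both methods agree on the median here but give different Q1 and Q3 values, so the IQR depends on which method is chosen. Neither has to match the manual "median of each half" approach exactly.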