Sarah Radcliff
SLCC e-Portfolio
On the first day of Statistics 1040, each student in my class was given a bag of Skittles. We were asked to count and record the number of each color of Skittle and share it with the class for our term project. My Skittles bag resulted in the following dataset:
[Table: frequency and relative frequency of each Skittle color in my bag]
Frequency is the number of times each color of Skittle occurs. Relative frequency is the proportion of the total that each frequency represents, that is, the frequency divided by the total count. Before looking at the class data, each student was asked to predict the relative frequency of each color for the entire class. Since every Skittles package weighs the same, I figured that the Skittles company most likely measures out each color by weight. My assumption is that some sort of automated dispenser on the Skittles line releases a set weight of each color per bag. There are five colors of Skittles, so each would make up about 20% of the bag’s 2.17 oz, or roughly 0.43 oz per color. Thus my prediction for the relative frequency of each color of Skittle in each bag is as follows:
[Table: predicted relative frequency for each Skittle color]
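The relative frequency calculation and the equal-share prediction described above can be sketched in a few lines of Python. This is only an illustration: the counts below are made-up placeholder values, not the numbers from my bag, and the color names are assumed to be the standard five Skittles colors.

```python
# Hypothetical counts for one bag. These are placeholder values for
# illustration only, NOT the actual data recorded for this project.
example_counts = {"red": 12, "orange": 13, "yellow": 11, "green": 14, "purple": 10}


def relative_frequencies(counts):
    """Return each color's frequency divided by the total count."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


rel_freq = relative_frequencies(example_counts)

# Prediction from the text: five colors, each an equal share of the bag.
predicted = {color: 1 / len(example_counts) for color in example_counts}

# An equal share of the 2.17 oz bag weight for each color.
BAG_WEIGHT_OZ = 2.17
weight_per_color_oz = BAG_WEIGHT_OZ / len(example_counts)

for color in example_counts:
    print(f"{color:<7} count={example_counts[color]:>2}  "
          f"relative={rel_freq[color]:.3f}  predicted={predicted[color]:.2f}")
print(f"Predicted weight per color: {weight_per_color_oz:.3f} oz")
```

Because every relative frequency is a share of the same total, the values always add up to 1, which is a quick way to check a hand-built table.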
Below are a relative frequency chart and two visual aids, a Pareto/bar graph and a pie chart, that show the relative frequencies of the actual class data:
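[Table: relative frequency of each Skittle color for the class]
[Figure: Pareto/bar graph of class relative frequencies by color]
[Figure: pie chart of class relative frequencies by color]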
The data here represent the relative frequencies recorded for each color from every bag of Skittles handed out in our class. We can look at this data in a few different ways to interpret its meaning. First, we want to ask ourselves, “Who is the group that is being studied?” and “What are we going to do with this data?”
An entire group that is being studied is called a population. We could define the population of this study very broadly, as all bags of Skittles in the entire world; or we could think smaller and define it as all of the bags of Skittles that were handed out to our class. If we are thinking in broad terms, the Skittles that we observed in this study would be considered a sample. This particular sample would most likely be a convenience sample. Convenience sampling is a non-random sampling method in which the sample is made up of whatever is easiest to obtain. In our case, it was very easy to acquire our data because our teacher handed each of us a bag of Skittles. Because the bags were not chosen at random from all bags of Skittles, this type of sample is usually subject to bias. Bias means that the results of a sample may differ systematically from the population, so they aren’t always representative of the population as a whole.
Now, to compare the data! Below is a frequency chart of the actual frequency and relative frequency of the class compared to the relative frequency predictions that I made:
[Table: class frequency and relative frequency for each color compared with my predicted relative frequency]
As you can see from the table above, my predictions for the class data were pretty close! The graphs and the relative frequencies were roughly what I expected to see: a proportion close to .20 for each color of Skittle. This turned out to be almost true, although I was surprised to see that green Skittles had a higher frequency. There aren’t any outliers in the class data as a whole. Several individual bags did contain outliers, but once those bags were added into the class total, their effect averaged out and did not distort the overall distribution. My data and the class data are slightly different. The relative frequencies for my bag of Skittles aren’t as consistently close to .20 in each category as the class data. They were, however, close enough for me to make a reasonable prediction about how the colors of candies would be distributed.
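One way to make the comparison above concrete is to subtract the predicted share of .20 from each observed relative frequency and see which color is furthest off. The sketch below does this in Python. The class totals in it are hypothetical placeholder numbers chosen only to show the method (with green set slightly high to mirror the observation above); they are not the class data, and the color names are again assumed.

```python
PREDICTED_SHARE = 0.20  # the equal-share prediction for each of five colors

# Hypothetical class totals for illustration only, NOT the real class data.
class_counts = {"red": 290, "orange": 300, "yellow": 295, "green": 325, "purple": 290}


def deviations_from_prediction(counts, predicted_share):
    """Return observed relative frequency minus the predicted share, per color."""
    total = sum(counts.values())
    return {color: n / total - predicted_share for color, n in counts.items()}


devs = deviations_from_prediction(class_counts, PREDICTED_SHARE)

# The color whose observed share is furthest from the prediction.
largest = max(devs, key=lambda color: abs(devs[color]))

for color, d in devs.items():
    print(f"{color:<7} deviation={d:+.4f}")
print(f"Furthest from prediction: {largest}")
```

A positive deviation means the color showed up more often than predicted, and a negative one means it showed up less often. Since the five predicted shares sum to 1, the deviations always cancel out to zero overall.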



