CPA Tutoring

Regression Analysis

Another important type of analysis is examining the relationship between different items. For example, consider a restaurant located near a sports stadium. You might want to look at the relationship between the sales it generates on days when there are games at the stadium versus days when there are not.

The need to analyze the relationship between different items is why we have what is called regression analysis. Regression analysis looks at the relationship between two different variables. When we use regression analysis, we’re observing how the independent variable affects the dependent variable.

The dependent variable is determined by the independent variable, but not the other way around: changes in the dependent variable do not influence the independent variable.

Now let’s use regression analysis to determine our total production costs. If we’re calculating our cost of production, then the number of units we produce will be the independent variable (X), and our total cost will be the dependent variable (Y) because the cost is determined by how many units we produce. The slope of the line, which we will call M, is the variable cost per unit.

Essentially, we’re trying to figure out how much our total costs (the dependent variable) change based on changes in the number of units we produce (the independent variable).

Let’s look at the formula for how we analyze this: Y = MX + B.

Here, Y is what we’re trying to figure out; it’s the dependent variable. In our example, it’s the total costs. We’re trying to determine what our total costs are. Then, X is the independent variable; in our example, it’s the number of units that we’re producing.

Next, we have M, which is the slope. M represents how much the dependent variable (Y) changes for every one-unit increase in the independent variable (X). With the example of solving for total costs, M will be the variable cost per unit.

Lastly, B represents the Y-intercept. In accounting terms, this refers to fixed costs, which stay the same no matter how many units we produce. Thus, we don’t multiply them by anything in this formula.

Study Tip 😀

The regression analysis formula is Y = MX + B.

Example – Regression Analysis

Let’s set up an example of this regression formula with some numbers. This month, Company A produced 1,000 widgets. The fixed costs are $2,000, and the variable cost per unit is $3. What were Company A’s total costs?

We have the equation Y = MX + B. To plug these numbers into our formula, Y represents the total costs, which is the dependent variable and what we’re trying to figure out. X, the number of units we’re producing (1,000), is the independent variable. M, the slope, is the variable cost per unit ($3). Finally, B is the Y-intercept, or the fixed costs ($2,000).

Y = ($3 × 1,000) + $2,000 = $5,000
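To double-check the arithmetic, here is a minimal Python sketch of the same calculation. The function name total_cost and its parameter names are our own, chosen for illustration; they are not part of any standard library.

```python
def total_cost(units, variable_cost_per_unit, fixed_costs):
    """Return Y = M * X + B for a simple linear cost model."""
    return variable_cost_per_unit * units + fixed_costs


# Company A: X = 1,000 widgets, M = $3 per unit, B = $2,000 fixed costs.
company_a_total = total_cost(units=1_000, variable_cost_per_unit=3, fixed_costs=2_000)
print(company_a_total)  # 5000
```

Changing the units argument shows how total costs move with production while the fixed costs stay put.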

Coefficient of Determination

We’ve been examining two variables: the independent variable and the dependent variable. When analyzing them, we need to question the strength of the relationship between the two. How closely does one influence the other?

This is why we introduce a term called the coefficient of determination, represented as R^2. R^2 expresses what proportion of the changes in Y (the dependent variable) is explained by X (the independent variable).

R^2 can range from 0 to 1. Zero indicates that none of the changes in Y can be explained by X, the independent variable. Conversely, a value of one signifies that all the changes in Y can be explained by X.

Study Tip 😀

The coefficient of determination is measured by R^2 and can range from 0 to 1.

The coefficient of determination can also be understood as the “goodness of fit”, suggesting we’re trying to demonstrate the fit between X and Y. If there’s a good closeness of fit, then R^2 would be closer to one. That’s because the closer it is to one, the more the changes in X explain the changes in Y.
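To make “goodness of fit” concrete, here is a rough sketch of how R^2 could be computed by hand. It assumes the line is fitted with ordinary least squares, and the monthly figures are made up purely for illustration; the helper names fit_line and r_squared are our own.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = M*X + B; returns (M, B)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variation in ys that the fitted line explains."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total


# Made-up data, for illustration only.
units = [800, 1000, 1200]
costs_exact = [4400, 5000, 5600]  # exactly 3 * units + 2000
costs_noisy = [4500, 4900, 5700]  # roughly linear, with some scatter

print(round(r_squared(units, costs_exact), 4))  # 1.0
print(round(r_squared(units, costs_noisy), 4))  # 0.9643
```

When the costs fall exactly on the line, R^2 is one. Once the costs scatter around the line, R^2 drops below one.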

In practice questions, if you are given two possible regression formulas to choose from, pick the one with the higher R^2 value.

Coefficient of Correlation

Having discussed the coefficient of determination, let’s talk about another important coefficient, the coefficient of correlation. The coefficient of correlation is used to describe how related two items are. It doesn’t necessarily imply that one is influenced by the other; they just happen to be related. The coefficient of correlation can be represented by R (not R^2).

The coefficient of correlation (R) can range from –1 to +1. We don’t typically use a formula for correlation; it simply aids us in seeing how related two items are. For example, consider a coffee company trying to determine the relationship between how much it rains and the number of cups of coffee sold.

Study Tip 😀

The coefficient of correlation is measured by R and can range from –1 to +1.

Through analysis, the company discovers that on days when it rains 10% more than usual, it sells 10% more cups of coffee than normal. If sales always rise in exact step with rainfall like this, R is +1, indicating a perfect positive correlation. Conversely, suppose that on days when it rains 10% more, the company sells 10% fewer cups of coffee. In this case, R would be –1, signifying a perfect negative correlation.
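Although we don’t typically need a formula here, it can help to see the two extremes computed. The sketch below uses the standard Pearson correlation formula, which is our addition rather than something this lesson requires, and the rainfall and sales figures are made up for illustration.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient R between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / math.sqrt(spread_x * spread_y)


# Hypothetical daily figures: percent change versus a normal day.
rain_change = [0, 10, 20, 30]
cups_rising = [0, 10, 20, 30]      # sales rise in step with rain
cups_falling = [0, -10, -20, -30]  # sales fall in step with rain

print(round(correlation(rain_change, cups_rising), 4))   # 1.0
print(round(correlation(rain_change, cups_falling), 4))  # -1.0
```

Real data would rarely line up this neatly, so R would usually land somewhere between the two extremes.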

High-Low Method

Now let’s turn to the high-low method. Sometimes we’re not directly given the amount of fixed costs or the variable cost per unit. In that case, we use what is called the high-low method to figure them out.

Study Tip 😀

The high-low method involves taking the difference between the month with the highest activity and the month with the lowest activity. 

Let’s look at how we performed over the past three months. In June, we sold 700 units and our total costs were $6,000. In July, we sold 750 units and our total costs were $6,200. In August, we sold 800 units and our total costs were $6,500.

We call this the high-low method because we’re going to choose the month with the highest sales and the month with the lowest sales. We’ll look at June and August. There was a 100-unit difference between what we sold in June and August (800 units – 700 units), and the cost changed over that time by $500 ($6,500 – $6,000). 

We divide our $500 change in costs by 100 units, showing us that the variable cost is $5 per unit.

For June, if variable costs are $5 per unit, then our total variable costs were $3,500 (700 units × $5 variable cost per unit). This means that our total fixed costs were $2,500 ($6,000 total costs – $3,500 variable costs).
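The same steps can be sketched in a few lines of Python. The function name high_low and the (units, total cost) layout of the data are our own choices for illustration.

```python
def high_low(observations):
    """Estimate (variable cost per unit, fixed costs) from (units, total_cost) pairs."""
    high = max(observations, key=lambda obs: obs[0])  # period with the most units
    low = min(observations, key=lambda obs: obs[0])   # period with the fewest units
    variable_cost_per_unit = (high[1] - low[1]) / (high[0] - low[0])
    fixed_costs = low[1] - variable_cost_per_unit * low[0]
    return variable_cost_per_unit, fixed_costs


# June, July, and August from the example above.
months = [(700, 6_000), (750, 6_200), (800, 6_500)]
variable_cost, fixed_costs = high_low(months)
print(variable_cost, fixed_costs)  # 5.0 2500.0
```

Note that July is ignored entirely: the high-low method only uses the two extreme months.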