Regression Analysis is one of many ways in which forecasting and prediction can be done. This article presents a brief step-by-step approach that uses observed incident counts to predict future incident trends, helping to build a strong business case to executives for investing in further actions and improvements.
Regression Analysis is a statistical approach that can be used to predict future values from a time series of observations (here, monthly incident counts observed over time). It allows forecasting by generally accepted, industry-standard statistical means. Regression attempts to find a "best fitting" straight line between data points (plotted X and Y coordinates on a graph) such that the line can then be extended to determine future points on that graph. This approach can be used with some confidence when the following criteria are met:
- It makes sense to predict the future behavior of the incidents being analyzed based on past performance. Trending makes sense to show what might happen if no actions are taken.
- The past data on which the analysis is based represents a true trend and does not vary widely because of major changes in business or IT activities (like a big merger, an acquisition, or the deployment of a major new application).
- Enough data is available to trend (e.g. a recommended minimum of 6 months of incidents, although 12 months or more gives more reliable trends).
- Incident data has no pronounced outliers (e.g. each month has a steady stream of incidents, with no single month unusually high or low; if one month is, you could remove the data for that month from the analysis).
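The data-volume and outlier criteria can be checked mechanically before fitting anything. The sketch below is one possible check, not part of the original method: it uses the monthly counts from the example table later in this article and flags months that fall outside the common 1.5 × IQR fences. The IQR rule, the `flag_outliers` helper, and the `MIN_MONTHS` name are assumptions made for illustration.

```python
from statistics import quantiles

# Monthly incident counts (Sep through Aug) from this article's example table.
months = ["Sep", "Oct", "Nov", "Dec", "Jan", "Feb",
          "Mar", "Apr", "May", "Jun", "Jul", "Aug"]
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]

MIN_MONTHS = 6  # recommended minimum from the criteria above


def flag_outliers(labels, values, k=1.5):
    """Return (label, value) pairs outside the k * IQR fences (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(m, v) for m, v in zip(labels, values) if v < low or v > high]


enough_data = len(counts) >= MIN_MONTHS
outliers = flag_outliers(months, counts)
print("Enough data:", enough_data)
print("Possible outlier months:", outliers)
```

Under this rule only May is flagged. Whether to drop a flagged month remains a judgment call; the worked example in this article keeps all 12 months.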
The approach presented here for Regression Analysis uses a statistical regression technique called the Least Squares Method. Least Squares attempts to determine the best fitting straight line between data points of X and Y coordinates on a graph. Here is a picture of what we’re trying to achieve:
In the above, we are plotting monthly incident counts over time. The red line shows the calculated trend. The blue line shows the actual observed monthly incident counts. The trend predicts how incident counts will rise if nothing is done.
Using the above example, a monthly incident count represents the Y value, and the month in which it is observed represents the X value. The goal is to predict future values of Y (incident counts) for X values (months) in which there are currently no observations. Each Y value on the Least Squares line is calculated as follows:
Y = a + bX
The Y and X values represent plot points: Y is the predicted monthly incident count and X is the month in which it occurs. The a and b values are calculated only once, using the regression method on the observed values (to be described later). The above formula is then evaluated for each X value to determine the plot point locations for the "best fitting" trend line.
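For reference, the standard closed-form Least Squares expressions for b and a over n observed (X, Y) pairs are given below. The numbered steps later in this article compute these same quantities by hand; the summation notation is added here and is not part of the original text.

```latex
b = \frac{n \sum XY - \sum X \sum Y}{n \sum X^{2} - \left(\sum X\right)^{2}},
\qquad
a = \frac{\sum Y}{n} - b \, \frac{\sum X}{n}
```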
Working our example backwards, the following table shows the data points for the actual incident counts followed by trend line incident counts:
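| Month (Qualitative) | Month (Quantitative) | Incident Count (Actual) | Incident Count (Trended) |
|---|---|---|---|
| Sep | 1 | 7966 | 7283.69 |
| Oct | 2 | 7497 | 7566.25 |
| Nov | 3 | 6699 | 7848.80 |
| Dec | 4 | 5828 | 8131.36 |
| Jan | 5 | 7289 | 8413.92 |
| Feb | 6 | 7187 | 8696.47 |
| Mar | 7 | 8363 | 8979.03 |
| Apr | 8 | 10164 | 9261.58 |
| May | 9 | 26029 | 9544.14 |
| Jun | 10 | 6260 | 9826.70 |
| Jul | 11 | 6152 | 10109.25 |
| Aug | 12 | 6619 | 10391.81 |
| Sep | 13 | | 10674.36 |
| Oct | 14 | | 10956.92 |
| Nov | 15 | | 11239.48 |
| Dec | 16 | | 11522.03 |
| Jan | 17 | | 11804.59 |
| Feb | 18 | | 12087.14 |
| Mar | 19 | | 12369.70 |
| Apr | 20 | | 12652.26 |
| May | 21 | | 12934.81 |
| Jun | 22 | | 13217.37 |
| Jul | 23 | | 13499.92 |
| Aug | 24 | | 13782.48 |
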
Rows without an observed incident count (months 13 through 24) show only the forecasted incident value, based on the trend of the actual incidents that came before them.
How did we calculate the incident trend count values? Here are the steps:
1. First calculate the b value from observed data.
2. Then calculate the a value from observed data.
3. Calculate the trend value for each row in your data table.
Step 1 – Calculating The “b” Value
Note: these steps only apply to data rows that have Incident Count Actual values. In our example, this would be the 12 months where data was observed (rows 1-12 in the above table).
1. Count the rows that have Incident Count Actual values (e.g. 12).
2. Sum the Month Quantitative values (e.g. 1 + 2 + 3 + 4, etc., or 78).
3. Sum the Incident Count Actual values (e.g. 106,053).
4. Square each Month Quantitative value (e.g. row 1 would be 1, row 2 would be 4, row 3 would be 9, etc.).
5. Sum the squares derived in Step 4 (e.g. you should get 650).
6. Multiply each Month Quantitative value by its corresponding Incident Count Actual value (e.g. row 1 would be 1 * 7,966, row 2 would be 2 * 7,497, row 3 would be 3 * 6,699, etc.).
7. Sum the values derived in Step 6 (e.g. you should get 729,750).
8. Multiply the value in Step 7 by Step 1 (e.g. 729,750 * 12 or 8,757,000).
9. Subtract Step 7 from Step 8 (e.g. 8,757,000 – 729,750 or 8,027,250). Note that this value is not used in any of the steps that follow.
10. Multiply Step 2 by Step 3 (e.g. 78 * 106,053 or 8,272,134).
11. Subtract Step 10 from Step 8 (e.g. 8,757,000 – 8,272,134 or 484,866).
12. Multiply Step 1 by Step 5 (e.g. 12 * 650 or 7,800).
13. Square Step 2 (e.g. 78 * 78 or 6,084).
14. Subtract Step 13 from Step 12 (e.g. 7,800 – 6,084 or 1,716).
15. Divide Step 11 by Step 14 (e.g. 484,866 / 1,716 or 282.5559).
Your "b" value is Step 15, or 282.5559.
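The slope calculation above can be sketched in a few lines of Python. This is a minimal illustration using the 12 observed counts from the table earlier in this article; the variable names are made up for the example, and the comments map each line back to the numbered steps.

```python
# Observed incident counts for months 1-12 (from the table earlier in the article).
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]
months = list(range(1, len(counts) + 1))

n = len(counts)                                      # Step 1: 12
sum_x = sum(months)                                  # Step 2: 78
sum_y = sum(counts)                                  # Step 3: 106,053
sum_x2 = sum(x * x for x in months)                  # Steps 4-5: 650
sum_xy = sum(x * y for x, y in zip(months, counts))  # Steps 6-7: 729,750

numerator = n * sum_xy - sum_x * sum_y               # Steps 8, 10, 11: 484,866
denominator = n * sum_x2 - sum_x ** 2                # Steps 12-14: 1,716
b = numerator / denominator                          # Step 15

print(round(b, 4))  # 282.5559
```

The unrounded result, 282.5559441, matches the bX value shown for month 1 in the Step 3 table.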
Step 2 – Calculating The “a” Value
Note: these steps only apply to data rows that have Incident Count Actual values. In our example, this would be the 12 months where data was observed (rows 1-12 in the previous table).
Continuing the steps from above:
16. Divide Step 3 by Step 1 (e.g. 106,053 / 12 or 8,837.75).
17. Multiply Step 2 by Step 15 (your “b” number; e.g. 282.5559 * 78 or 22,039.36).
18. Divide Step 17 by Step 1 (e.g. 22,039.36 / 12 or 1,836.6134).
19. Subtract Step 18 from Step 16 (e.g. 8,837.75 – 1,836.6134 or 7,001.1367).
Your “a” number is Step 19, or 7,001.1367.
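Steps 16 to 19 can be sketched the same way. The snippet below recomputes the slope so that it runs on its own; the variable names are illustrative only. Because it carries b at full precision rather than rounding it to four decimals first, the intercept it prints differs from the article's figure in the fourth decimal place.

```python
# Observed incident counts for months 1-12 (from the table earlier in the article).
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]
months = list(range(1, len(counts) + 1))
n = len(counts)

# Slope from Step 1, recomputed here so the snippet is self-contained.
sum_x, sum_y = sum(months), sum(counts)
sum_x2 = sum(x * x for x in months)
sum_xy = sum(x * y for x, y in zip(months, counts))
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

mean_y = sum_y / n        # Step 16: 8,837.75
b_sum_x = b * sum_x       # Step 17: b times the sum of the month numbers
b_mean_x = b_sum_x / n    # Step 18
a = mean_y - b_mean_x     # Step 19: the intercept

print(round(a, 4))  # 7001.1364
```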
Step 3 – Calculate The Trend Value For Each Data Row In Your Table
Now that values have been determined for a and b based on the observed (actual) incident counts, the forecast analysis can be run. The formula presented again is:
Y = a + bX
This can now be run for each observed and non-observed row in your table. Note that X = the month quantitative value and Y = the forecasted incident count.
Using our example:
| Month Quantitative Value (X) | a Value (from steps above) | b Value (from steps above) | bX Value (1st column * 3rd column) | Incident Count (Forecasted), or Y Value (2nd column + 4th column) |
|---|---|---|---|---|
| 1 | 7001.1367 | 282.5559 | 282.5559441 | 7283.69 |
| 2 | 7001.1367 | 282.5559 | 565.1118881 | 7566.25 |
| 3 | 7001.1367 | 282.5559 | 847.6678322 | 7848.80 |
| 4 | 7001.1367 | 282.5559 | 1130.223776 | 8131.36 |

Etc. Repeat for every row in your table; you can go out into the future as many months as you desire.

You can now plot the Incident Count Y values as your trend line similar to the graph example shown earlier.
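Putting the three steps together, the whole trend column can be generated in one pass. The sketch below is an illustration under the same assumptions as the worked example (months numbered from 1, fit on the 12 observed counts); `fit_line` is a hypothetical helper name, not something defined in the original text.

```python
def fit_line(xs, ys):
    """Least Squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


# Observed incident counts for months 1-12 (from the table earlier in the article).
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]
months = list(range(1, len(counts) + 1))

a, b = fit_line(months, counts)

# Trend values for the 12 observed months plus 12 months into the future.
trend = {x: round(a + b * x, 2) for x in range(1, 25)}
for x, y in trend.items():
    print(x, y)
```

The resulting values can be passed to any charting tool to draw the trend line next to the actual counts.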
Other Powerful Uses Of This Forecasting Approach
The results as shown here make a business case for how incidents will trend if no actions are taken to improve anything. They show how high the incident counts can go and when those impacts may occur.
Management may sometimes push back on this. After all, it's just numbers. Who knows? In that case, simply track the incidents that actually take place in succeeding months and compare them to your trend line. In some cases, the actual counts were seen to go above the trend line (“hey, it’s worse than you thought – convinced yet?”).
Another approach is to tie financials into this. For example, if your hourly labor cost to deal with an incident is $62 and incidents average 2 hours (or $124) to deal with, then multiply the forecasted numbers by that cost. Now management can see a financial penalty for taking no action. These can add up to some big numbers that get attention.
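As a rough illustration of the financial view, the sketch below multiplies each forecasted month by the $124 per-incident cost from the example. The cost figures come from the paragraph above; the helper and variable names are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Least Squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    return sum_y / n - b * sum_x / n, b


# Observed incident counts for months 1-12 (from the table earlier in the article).
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]
a, b = fit_line(list(range(1, 13)), counts)

HOURLY_LABOR_COST = 62   # dollars per hour, from the example above
HOURS_PER_INCIDENT = 2   # average handling time, from the example above
COST_PER_INCIDENT = HOURLY_LABOR_COST * HOURS_PER_INCIDENT  # $124

# Forecasted handling cost for each future month (13-24) and for the whole year.
monthly_cost = {x: (a + b * x) * COST_PER_INCIDENT for x in range(13, 25)}
total_cost = sum(monthly_cost.values())

for x, cost in monthly_cost.items():
    print(f"Month {x}: ${cost:,.2f}")
print(f"Next 12 months: ${total_cost:,.2f}")
```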
You can also apply a little bit of machine learning to this. Keep your forecasted results as an outcome model. For a training model, update a copy of the data with ongoing monthly observations and recalculate, to see whether that fine-tunes your initial forecasts.
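One way to read "updating with ongoing observations" is to refit the line each time a new month arrives and watch how the slope moves. The sketch below is an interpretation, not the author's stated procedure: it replays the article's own 12 observed months, refitting from month 6 onward, so no new data is invented. `fit_line` and `slopes` are illustrative names.

```python
def fit_line(xs, ys):
    """Least Squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    return sum_y / n - b * sum_x / n, b


# Observed incident counts for months 1-12 (from the table earlier in the article).
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]
months = list(range(1, len(counts) + 1))

# Refit using only the first n_obs months, as if the data arrived one month at a time.
slopes = {}
for n_obs in range(6, len(counts) + 1):
    _, b = fit_line(months[:n_obs], counts[:n_obs])
    slopes[n_obs] = b
    print(f"After {n_obs} months: slope = {b:.2f} incidents per month")
```

In this replay the slope is negative after six months and positive after twelve, which shows how sensitive a short series can be to each added observation.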
You can also use this to estimate the impact of improvement initiatives. For each initiative, estimate the reduction in incident counts (for example, fixing XYZ will lower monthly incident rates by 5%). Apply that as a factor to your forecasted rates and show how it changes the trend line.
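A minimal sketch of that adjustment follows, assuming the 5% figure from the example applies uniformly to every forecasted month. The uniform application and the names used are assumptions; a real initiative might phase in over time.

```python
def fit_line(xs, ys):
    """Least Squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    return sum_y / n - b * sum_x / n, b


# Observed incident counts for months 1-12 (from the table earlier in the article).
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]
a, b = fit_line(list(range(1, 13)), counts)

REDUCTION = 0.05  # estimated 5% drop from fixing "XYZ", per the example above

baseline = {x: a + b * x for x in range(13, 25)}
adjusted = {x: y * (1 - REDUCTION) for x, y in baseline.items()}

for x in baseline:
    print(f"Month {x}: baseline {baseline[x]:.2f}, with initiative {adjusted[x]:.2f}")
```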
To be even more adventurous, you can also show the impact of the forecasted rates. For example, would more support staff have to be hired when incidents get over a certain level? Can the Service Desk handle call volumes or will it break at some point?
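A question such as "when does the Service Desk break?" can be framed as finding the first forecasted month that exceeds a capacity limit. In the sketch below the limit of 12,000 incidents per month is purely hypothetical, chosen only to illustrate the lookup; substitute whatever threshold fits your own staffing model.

```python
def fit_line(xs, ys):
    """Least Squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    return sum_y / n - b * sum_x / n, b


# Observed incident counts for months 1-12 (from the table earlier in the article).
counts = [7966, 7497, 6699, 5828, 7289, 7187,
          8363, 10164, 26029, 6260, 6152, 6619]
a, b = fit_line(list(range(1, 13)), counts)

MONTHLY_CAPACITY = 12_000  # hypothetical limit, not a figure from the article

# First future month (13-24) whose forecast exceeds the capacity, or None.
breach_month = next(
    (x for x in range(13, 25) if a + b * x > MONTHLY_CAPACITY), None
)
print("Capacity first exceeded in month:", breach_month)
```

With this assumed limit the forecast first crosses capacity in month 18, consistent with the trended values in the earlier table (11804.59 for month 17, 12087.14 for month 18).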