I am studying some capacity planning metrics, and I am curious how events in a time period should be tracked for use in analysis.
Let's say I have the following:
Week #, Usage, Conversions, ST1 Starts, ST1 Ends, ST2 Starts, ST2 Ends
1,345,0,1,0,0,0
2,456,0,0,1,0,0
3,234,0,0,0,1,1 // ST2 starts and ends in the same week
4,567,1,0,0,0,0
5,673,2,0,0,0,0 // There are 2 conversions that week
6,879,0,1,0,1,0 // ST1 and ST2 have started
7,545,0,0,0,0,0 // ST1 and ST2 are still running
8,789,2,0,1,0,0 // 2 conversions, ST1 ends, ST2 still going
9,342,0,0,0,0,1 // ST2 ends
The data after Week # and Usage is the output of a crosstab table of a calendar which I massaged into a weekly sample (I only get weekly data even though I get the calendar stuff on specific days). As you can see they only have the start and end times indicated for the system tests. So this brings me to question #1:
For statistical analysis, should I mark a running task with a 1 in every period during which it is running:
e.g.
Week,ST1,ST2
1,1,0 // ST1 starts
2,1,0
3,1,1 // ST1 ends, ST2 starts
4,0,1
5,0,1 // ST2 ends
6,0,0
7,1,0 // ST1 starts and ends in this week
8,0,0
Or, for time series analysis, are only the starts and ends significant?
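For reference, here is a minimal Python sketch of how the "1 while running" layout could be derived from the start/end flags in my first table. It assumes a task counts as running in both its start week and its end week, and that each task starts or ends at most once per week. The names `events` and `running_indicator` are just made up for this illustration.

```python
# Weekly start/end flags taken from the first table:
# (week, st1_start, st1_end, st2_start, st2_end)
events = [
    (1, 1, 0, 0, 0),
    (2, 0, 1, 0, 0),
    (3, 0, 0, 1, 1),
    (4, 0, 0, 0, 0),
    (5, 0, 0, 0, 0),
    (6, 1, 0, 1, 0),
    (7, 0, 0, 0, 0),
    (8, 0, 1, 0, 0),
    (9, 0, 0, 0, 1),
]


def running_indicator(starts, ends):
    """Return 1 for each period in which the task is active.

    Assumption: the start week and the end week both count as active.
    """
    flags = []
    active = False
    for started, ended in zip(starts, ends):
        if started:
            active = True
        flags.append(1 if active else 0)
        if ended:
            # The task stops being active from the next period onward.
            active = False
    return flags


weeks = [row[0] for row in events]
st1 = running_indicator([row[1] for row in events], [row[2] for row in events])
st2 = running_indicator([row[3] for row in events], [row[4] for row in events])

print("Week,ST1,ST2")
for week, a, b in zip(weeks, st1, st2):
    print(f"{week},{a},{b}")
```

With this convention, a test that starts and ends in the same week (ST2 in week 3) still gets a 1 for that week, so no information from the start/end columns is lost in the conversion.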