Introduction and data loading

Adam Kucharski

2024-01-06

There are two approaches in project management to present multifunctional projects on a graph:

  1. Activity on Arc (AoA) notation, where each arrow (edge) means an activity. Nodes determine the start or end of one or more activities.
  2. Activity on Node (AoN) notation, in which the node denotes an activity. Arrows represent the relationships between given activity and the activities immediately preceding and immediately following it.

The current version of the package uses the AoA notation.

The Critical Path Method

In the CPM (Critical Path Method) we assume that the duration of each activity is exactly known (deterministic).

Let’s take the AoA notation. The activity starts at node number \(i\) and ends at node number \(j\). The whole project begins and ends in one node. Let’s look at the figure below. Starting and ending moments differs because activities may occur in parallel and serially within one project. Besides, activities will vary in duration.

Starting and ending moments
Starting and ending moments

\(t_{i}^{0}\) - the earliest possible occurrence of the event \(i\);

\(t_{i}^{1}\) - the latest possible occurrence of the event \(i\);

\(t_{j}^{0}\) - the earliest possible occurrence of the event \(j\);

\(t_{j}^{1}\) - the latest possible occurrence of the event \(j\).

The moment of completing the entire project is called the directive term. CPM analysis can be divided into the following stages:

  1. Setting the earliest times for tasks.
  2. Setting the latest times for tasks.
  3. Determining slacks for activities.
  4. Preparation of the project schedule.
  5. Determining the critical path of the project.

Additionally, we can make a Gantt chart and ASAP (As Soon As Possible) and ALAP (As Late As Possible) charts.

The preparation of a project schedule consists in determining all possible starting and completion times for the activities. In addition, we calculate a total float time (total slack) for each activity. During this time, the activity can be delayed without delaying the completion of the entire project.

Let’s denote:

\(ES_{ij}=t_{i}^{0}\) - the earliest possible start time of activity \((i,j)\);

\(LF_{ij}=t_{i}^{1}\) - the latest possible end time of activity \((i,j)\);

\(LS_{ij}=LF_{ij}-t_{ij}\) - the latest possible start time of activity \((i,j)\);

\(EF_{ij}=ES_{ij}+t_{ij}\) - the earliest possible end time of activity \((i,j)\);

\(TF_{ij}=LF_{ij}-ES_{ij}-t_{ij}\) - total float for activity \((i,j)\). A reserve of time that can be used to perform a given activity without affecting the completion time of the project.

\(FF_{ij}=t_{j}^{0}-t_{i}^{0}-t_{ij}\) - free float for activity \((i,j)\). Maximum amount of time by which an activity can be delayed without delaying the earliest possible start time of any following activity.

\(CF_{ij}=t_{j}^{1}-t_{i}^{1}-t_{ij}\) - conditional float for activity \((i,j)\). Maximum amount of time by which an activity can be delayed without delaying without affecting time slacks of activities preceding the activity (lying on the same path).

We assume that the directive deadline (DD) equals to the earliest time of the final event \(\left(DD=t_{n}^{0}\right)\) so the slacks for critical activities equal to zero. Extension of the duration of any critical activity by \(\tau\) time units will postpone the completion time of the entire project by \(\tau\) units.

If we assume that \(DD>t_{n}^{0}\) then extension of the duration of any critical activity by \(\tau\) time units will postpone the completion time of the entire project by \(\tau-\left(DD-t_{n}^{0}\right)\) units for \(\tau>\left(DD-t_{n}^{0}\right)\). Smaller delays do not affect the duration of the entire project.

The PERT method

The abbreviation PERT stands for Program Evaluation and Review Technique. This method is based on the following assumptions:

Experts estimate the durations of individual activities. As a result, we get:

The classical PERT time analysis begins with the introductory phase, which involves estimating the expected value of duration and duration variance for each activity.

Expected duration of the activity (): \[ m_{ij}=\frac{t_{ij}^{a}+4t_{ij}^{m}+t_{ij}^{b}}{6} \] Variance of activity () duration : \[ s_{ij}^{2}=\left(\frac{t_{ij}^{b}-t_{ij}^{a}}{6}\right)^{2} \]

However, the critpath package also offers other methods for estimating expected value and variance. For more information, see the help to the solve_pathAOA() function.

The calculations in the stochastic problem are analogous to the deterministic (CPM) except that we now use the expected durations instead of the fixed ones.

The classical PERT approach uses the central limit theorem. This theorem states that the sum (difference) of a large number of independent random variables with the same distribution has an asymptotically normal distribution.

The chance of meeting the expected directive deadline () equal to the earliest time of occurence for the last activity (\(t_{n}^{0}\)) is 50%.

In the PERT method, it is assumed that the should be selected in such a way that its probability would be between 30% and 60%:

\[ 0{,}3 \leqslant P\left(t_{n}<DD\right)\leqslant 0{,}6 \]

A schedule with less than 30% chance of realization is called the risk taker’s schedule, and the one with more than 60% chance is called the belayer’s schedule.

Two ways to load project data

The package in its current version adopts the concept of AoA. For this reason, there are two ways to enter the problem data. They differ in their approach to introducing the structure of the activity graph. The first one assumes that the user gives the start and end nodes of all activities, including the dummy ones. The second way is to provide the activities immediately preceding it for each activity. In this case, dummy activities are usually not introduced. The function that reads the data will try to determine these activities and add them to the list. For this purpose, the algorithm described in the work by Cohen & Sadeh in 2006 was used (see DECRIPTION file for the full bibliographic data).

Let us look at a small example of a project in which the durations of activities are known and predetermined. The table below provides information on the structure of the graph and the duration of activities in some example project. The same dataset is available in the critpath package as cpmexample1. As you can see, the start and end nodes are given for each activity. The data set should be prepared in the form of a data frame. This approach results from the use of the DiagrammeR package, which requires specifying the graph structure as pairs of nodes that start and end a given activity.

Tab. 1. Data loaded based on the start and end nodes
Start. node End. node Name Duration
1 2 A 3
1 3 B 2
1 4 C 5
2 4 D 1
3 4 E 1
4 5 F 3

Usually we know what the graph looks like before solving the project management problem but let’s assume that at the moment we don’t. With the help of the plot_graphAOA() function we will create a graph based on the data from Table 1. The function takes two arguments: data frame with … well, data and whether the durations are deterministic.

The data frame should have the same structure as in Table 1. The order of the columns is important, not their names. The first two columns contain the numbers of the starting and ending nodes. The next column contains the names of activities and the last one times of their duration. All package functions that use the data frame as an argument must keep this structure.

plot_graphAOA(cpmexample1)

Fig. 1. Graph for the cpmexample1 dataset

That was the first method of loading an AOA project. Now we’re going to use the list of predecessors. Let’s look at the next table. This example comes from a book (Miszczyńska D., Miszczyński M. 2000) unfortunately available only in Polish.

Tab. 2. Data loaded from the list of activity predecessors
Name Predecessors Duration
A NA 6
B NA 2
C A 4
D A 6
E A 3
F C,D 2
G D 5
H B,E 3
I H 2
J F,G,I 2

A very important difference is the appearance of the NA symbol, which in this case means lack of a immediately preceding activity. Activities A and B are not preceded by other activities, so the whole project starts from them. If the activity has several predecessors, their names should be separated by commas. Moreover, this example contains one dummy activity because C and D have the same start and end nodes. This dataset is available in the critpath package as cpmexample2.

plot_graphAOA(cpmexample2, predecessors = TRUE)

Fig. 2. Graph for the cpmexample2 dataset

Dummy activities are marked with a dashed line. The examples above assumed a deterministic durations of activities, but for durations as a random variables, the graph structure should also be given in one of the ways listed.

The list of immediately preceding activities must be transformed into a set of pairs of start and end nodes. This is the only way to create a graph in the DiagrammeR package. We used an algorithm by Cohen and Sadeh. This algorithm not only creates a from-to list for activities, but also allows dummy activities to be added to the graph. However, it has a disadvantage. After adding dummies, it is very likely that for some activities the start node number will be higher than the end node number. This can lead the critpath package to incorrect results, e.g. incorrect project completion date. Therefore, a function has been added that renumbers the nodes so that the starting node has a lower number than the end one.

Graphs are drawn by the render_graph() function from the DiagrammeR package using the Fruchterman-Reingold algorithm. It is to guarantee, among others not intersecting the edges, but this is not always possible. Besides, subsequent runs of this function for the same problem can produce (and often do) a graph that is placed on the plane in a slightly different way each time. However, this can be controlled using the seed parameter.