Natasha Nigar1* and Oki O2
1Department of Computer Science, University of Engineering and Technology (RCET), Pakistan
2Department of Computer Science, University of Engineering and Technology (RCET), Pakistan
*Corresponding author: Natasha Nigar, Department of Computer Science, University of Engineering and Technology (RCET), Pakistan
Submission: May 30, 2022; Published: August 10, 2022
ISSN: 2832-4463 Volume 2 Issue 3
Context: During project management, software project scheduling is performed to generate a schedule
for a system. It is one of the most important steps taken before project development begins.
Objective: This study aims to identify and analyze common trends, gaps, and directions for future studies in line with
the formulated research questions.
Method: Following the guidelines proposed in the evidence-based software engineering literature, a
state-of-the-art literature review has been conducted.
Result: We selected a total of 41 primary studies from the search process. This selection comprises 14
journal articles, 25 conference papers, 1 book chapter, and 1 article.
Conclusion: Project scheduling has been discussed extensively by researchers. The existing
scheduling techniques have a number of limitations, which include: lack of scalability, inclusion of
few dynamic events, inefficiency in the presence of more dynamics, and handling of a low number of project
scenarios. The existing techniques have also been applied to complex and real settings only to a limited extent.
Keywords: Dynamic software project scheduling; Search-based optimization; Mathematical model
Due to exponentially increasing trends in the software market, the success of software projects depends heavily on efficient project plans and reduced development costs. This gives rise to a Scheduling Problem (SP), in which decisions are made about who does what during the project life cycle [1]. Software Project Management (SPM) is a complex task due to its unpredictable dimensions. Software project scheduling is a main domain of SPM, and it involves the management of large teams in a dynamic environment with uncertain parameters [2]. Many tools exist to support software project management, such as MS Project [3], but they fail in a dynamic environment when dealing with uncertainties. Search-Based Software Engineering (SBSE) is an emerging field that addresses such complex software engineering problems. Search-based optimization techniques have been applied to a wide variety of software engineering problems [1], and this research focuses on software project scheduling.
SBSE search algorithms can be categorized into three main groups, as listed below:
A. Exact optimization methods: The Branch and Bound algorithm and Integer Linear
Programming are examples of such methods, and they guarantee to find an optimal
solution [4].
B. Heuristic algorithms: These algorithms, such as greedy algorithms, find a ‘good’ or
near-optimal solution.
C. Metaheuristics: These are the most preferred approach in the SBSE
field [5], and they continue the search beyond the first encountered
local optimum. Examples are Genetic Algorithms (GA), Particle
Swarm Optimization (PSO) and Ant Colony Optimization (ACO).
Some hybrid approaches are also used that combine techniques from these three main groups. The study aims to chronologically review and select the published literature and present an overview of existing software project scheduling problems and techniques used to solve those problems.
Existing published literature reviews can be classified as
follows:
A. Traditional Literature Review (TLR): A TLR attempts to
establish current research trends. A critical summary is written
by examining the body of published literature.
B. Systematic Literature Review (SLR): An SLR uses a structured
methodology to answer a precise set of formulated
research questions and identifies gaps, contradictions, and
inconsistencies in the literature.
To the best of the authors’ knowledge, no SLR exists that focuses on software project scheduling taxonomies, techniques, targeted software models for scheduling, data sets used, and limitations/benefits of existing scheduling techniques. The essence of this SLR is to present the available evidence regarding (1) the existing scheduling taxonomies, (2) techniques, (3) software models used, (4) types of data sets used, and (5) their limitations/benefits. Therefore, this SLR will provide insight for both researchers and project managers in academia and industry to create more efficient project plans. This article is structured as follows. The research method used in this study is described in section 2. Section 3 presents the results and discussions. Research findings are described in section 4. This paper concludes in section 6.
We have adopted the approach proposed by [6] in performing this SLR (Figure 1). Referring to Figure 1, the review process consists of six phases. In the first phase, a set of research questions was formulated. Search strategies were designed in phase 2 in accordance with the formulated research questions. In this phase, search terms were identified and the choice of literature resources was made. Data were extracted from the literature resources in phase 3. Phase 4 concentrated on refining the extracted data by scrutinizing titles. Quality assessment criteria were applied in the fifth phase to further evaluate the data. In the last phase, studies for analysis were selected and subsequent actions were performed.
Figure 1: Phases of review process.
Research questions
The basic purpose of this study is to summarize the scheduling
scenarios and techniques and to identify areas for further research. Five
Research Questions (RQs) were formulated, as presented below:
A. RQ1: What are the taxonomies of scheduling?
B. RQ2: What software models have been targeted
for scheduling?
C. RQ3: What existing optimization techniques are used for
the scheduling problem?
D. RQ4: What kinds of data sets have been used in the existing proposed
techniques?
E. RQ5: What are the limitations/benefits of existing scheduling
techniques?
PICOC [6] is described below, based on the above research
questions:
Population (P): Scheduling for software project development
methods (medium to large scale).
Intervention (I): Scheduling methods/techniques.
Comparison (C): No comparison intervention.
Outcomes (O): An accurate scheduler for software projects in a
dynamic environment.
Context (C): Software project scheduling.
Search strategy
The search strategy consists of identification of search terms, data sources and search process as explained below:
Search strings: According to the guidelines in [6], search terms are built as presented below:
a) Major terms are derived from the research questions.
b) Synonyms relevant to the major terms are identified.
c) Keywords are identified in relevant research papers or
books.
d) The Boolean OR operator is used to incorporate alternative
spellings and synonyms.
e) The Boolean AND operator is used to link the major terms.
The resulting search terms are written as follows:
1. Software project scheduling OR (problems/OR techniques/)
2. (Dynamic/OR Optimization) AND Software project
scheduling
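Steps a) to e) above can be sketched in a few lines of code. The following is only an illustration, not the tooling used in this review: the function name build_query and the quoting of each term are assumptions, and the term groups are taken loosely from the second search string.

```python
def build_query(term_groups):
    """Combine search terms: OR within a group of synonyms, AND between groups."""
    clauses = []
    for synonyms in term_groups:
        quoted = [f'"{term}"' for term in synonyms]
        clause = " OR ".join(quoted)
        # Parenthesize multi-term groups so that OR is evaluated before AND.
        clauses.append(f"({clause})" if len(quoted) > 1 else clause)
    return " AND ".join(clauses)


# Hypothetical term groups based on the second search string in the text.
groups = [
    ["dynamic", "optimization"],
    ["software project scheduling"],
]
print(build_query(groups))
# ("dynamic" OR "optimization") AND "software project scheduling"
```

Each database has its own query syntax, so a string built this way would still need to be adapted per source.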
Data sources: We searched five electronic database resources and used the title, abstract, and index terms of journal papers, conference proceedings, workshops, symposiums, and book chapters to extract the primary data. These database resources are IEEE Xplore, ACM Digital Library, ScienceDirect, Springer, and Google Scholar.
Search process: The search process employed in this research consists of two stages as depicted in Figure 2.
Figure 2: Search and selection process.
Stage 1: Papers were collected by performing an extensive search on the five electronic database sources.
Stage 2: In this stage, the reference lists of all relevant papers (collected in stage 1) are screened to identify any additional relevant papers, which are then combined with the ones from stage 1.
Data extraction
In stage 1 of the search process, 4808 papers were collected from the five electronic databases. Thereafter, the titles of these studies were used to eliminate duplicate and irrelevant studies, which left 200 relevant studies. Next, the reference list of each of these 200 papers was screened to detect any relevant study that might have been missed during the initial search process. This effort led to the identification of 5 additional relevant papers, bringing the total number of selected relevant papers to 205. Table 1 describes the inclusion and exclusion criteria.
Table 1: Inclusion/Exclusion Criteria.
Quality assessment
Table 2: Quality assessment criteria.
The quality assessment (QA) checklist for the selected studies was customized based on the guidelines in [6]. We formulated the quality assessment questions (Table 2) to evaluate the credibility, completeness, and relevance of the selected studies. Each question has only three possible answers: ‘‘Yes’’, ‘‘Partly’’ or ‘‘No’’. These three answers are assigned scores as follows: ‘‘Yes’’ = 1, ‘‘Partly’’ = 0.5 and ‘‘No’’ = 0. Each study could therefore obtain 0-5 points. We used 2.5 points (50%) as the cut-off for including a study: a study that scored less than 2.5 was eliminated from our final list of primary studies. After applying the quality assessment criteria, 41 studies were selected.
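The scoring rule can be written down directly. The sketch below assumes five QA questions per study, which is what the 0-5 point range implies; the function names and the sample answers are hypothetical.

```python
# Scores per answer, as stated in the text.
SCORES = {"yes": 1.0, "partly": 0.5, "no": 0.0}
CUTOFF = 2.5  # 50% of the 5-point maximum


def qa_score(answers):
    """Sum the scores of one study's answers to the QA questions."""
    return sum(SCORES[answer.lower()] for answer in answers)


def is_included(answers):
    """A study is eliminated only if it scores less than the cut-off."""
    return qa_score(answers) >= CUTOFF


# Hypothetical answers for three studies (five questions each).
print(is_included(["Yes", "Partly", "No", "Yes", "Partly"]))  # True  (score 3.0)
print(is_included(["Yes", "Yes", "Partly", "No", "No"]))      # True  (score 2.5)
print(is_included(["Partly", "No", "No", "Yes", "Partly"]))   # False (score 2.0)
```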
Data synthesis
In this phase, the selected studies are summarized to address the research questions precisely. The aim is to synthesize the data to enhance clarity. To synthesize the data, the 41 selected studies were further processed based on the criteria defined in Table 3 to access the detailed contents of each study. The data extracted in this phase consist of both quantitative and qualitative data. For RQ1 and RQ2, the results are presented in tabular and text form to address the taxonomies of scheduling and the targeted software models, respectively. The results for RQ3 and RQ4 are organized using visualization tools such as pie charts to present the multi-objective techniques used for the scheduling problem and the types of data sets used, respectively. For RQ5, the limitations/benefits of existing scheduling techniques are described in tabular form.
Table 3: Data synthesis criteria.
This section presents the findings of this review in line with the formulated research questions. We selected 41 studies, of which 14 were journal papers, 25 were conference papers, 1 was a book chapter, and 1 was an article.
Taxonomies of scheduling (RQ1)
Scheduling has a wide range of applications. We divide the
taxonomies of scheduling into three different categories:
Project scheduling: It refers to arranging activities
in a sequence and allocating resources to these activities.
The Software Project Scheduling Problem (SPSP) is a variant of the project
scheduling problem [1]. We can define SPSP as allocating
software engineers (employees) to software tasks (activities) in
such a way that all tasks are covered to develop a software project.
It is an NP-hard problem. The main objective of project scheduling
is to minimize the project cost and duration and to maximize the quality
of the project. The most popular traditional methods used in project
scheduling are the Critical Path Method (CPM) [7] and the Program Evaluation
and Review Technique (PERT) [8]. Many SBSE approaches have
also been proposed for it, which will be discussed further in
detail.
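As a small illustration of CPM, the forward pass below computes the earliest finish time of each activity and the resulting project duration. It is a minimal sketch that ignores resources; the activity names, durations and precedence relations are hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def cpm_forward_pass(durations, predecessors):
    """Return the earliest finish time of each activity and the project duration."""
    # static_order() yields every activity after all of its predecessors.
    order = TopologicalSorter(predecessors).static_order()
    earliest_finish = {}
    for activity in order:
        preds = predecessors.get(activity, ())
        # An activity can start once its latest predecessor has finished.
        earliest_start = max((earliest_finish[p] for p in preds), default=0)
        earliest_finish[activity] = earliest_start + durations[activity]
    return earliest_finish, max(earliest_finish.values())


# Hypothetical four-activity project (durations in days).
durations = {"design": 3, "code": 5, "docs": 4, "test": 2}
predecessors = {"design": [], "code": ["design"], "docs": ["design"], "test": ["code"]}

finish, project_duration = cpm_forward_pass(durations, predecessors)
print(project_duration)  # 10, along the path design -> code -> test
```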
Machine scheduling: Machine scheduling deals with jobs and with machines as resources. It can be further divided into ‘Single machine scheduling’ and ‘Parallel machine scheduling’:
A. Single machine scheduling
The Single Machine Scheduling Problem (SMSP) constitutes the
foundation of scheduling theory. SMSP is the simplest form of
scheduling. All other machine scheduling problems build on it, and its role is vital
in both theory and practical application. In SMSP, multiple jobs
are sequenced on a single machine. It can be illustrated by
the running of multiple processes on a single CPU. In a single machine
environment, the makespan (total time) is independent of the job sequence
as long as the machine is never left idle.
The primary rules for solving SMSP are Shortest Processing Time
(SPT), Earliest Due Date (EDD), Minimum Slack Time (MST) and
Weighted Shortest Processing Time (WSPT). Several techniques,
such as Dynamic Programming (DP) and the branch and bound approach, have
been adopted to solve this problem.
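The dispatching rules above amount to sorting the jobs by a different key. The sketch below assumes that all jobs are available at time zero, that there are no setup times, and that the machine is never idle; the Job fields and the three sample jobs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    processing_time: int
    due_date: int
    weight: int = 1


def spt(jobs):
    """Shortest Processing Time first."""
    return sorted(jobs, key=lambda j: j.processing_time)


def edd(jobs):
    """Earliest Due Date first."""
    return sorted(jobs, key=lambda j: j.due_date)


def wspt(jobs):
    """Weighted SPT: smallest processing-time-to-weight ratio first."""
    return sorted(jobs, key=lambda j: j.processing_time / j.weight)


def evaluate(sequence):
    """Compute makespan, total completion time and maximum lateness."""
    clock, total_completion, max_lateness = 0, 0, float("-inf")
    for job in sequence:
        clock += job.processing_time  # completion time of this job
        total_completion += clock
        max_lateness = max(max_lateness, clock - job.due_date)
    return {"makespan": clock, "total_completion": total_completion,
            "max_lateness": max_lateness}


jobs = [Job("A", 4, 5), Job("B", 2, 12), Job("C", 6, 8)]
print(evaluate(spt(jobs)))  # makespan 12, total_completion 20, max_lateness 4
print(evaluate(edd(jobs)))  # makespan 12, total_completion 26, max_lateness 2
```

In this example both sequences have the same makespan of 12, while SPT gives the lower total completion time and EDD the lower maximum lateness.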
B. Parallel machine scheduling
The Parallel Machine Scheduling Problem (PMSP) is a generalization
of SMSP and is the first area reached when SMSP is extended. Assigning
processes on a multiprocessor computer is an example of PMSP.
Performance measures for PMSP are makespan, mean flow time,
weighted mean flow time, number of tardy jobs and maximum
lateness. The makespan is a meaningful
objective for PMSP. According to the types of machines, Brucker et
al. [9] have categorized PMSP into three classes.
a) Identical machines: In this class, the specifications of all
the machines are the same. There is no difference among machines
regarding the processing of jobs. All machines process the jobs
in the same way.
b) Uniform machines: Each machine has a different speed for
processing a job. In this class, the processing time of a job on a machine
is its processing requirement divided by the speed of that machine.
c) Unrelated machines: This class generalizes uniform
machines. The processing time of each job is different on
different machines.
In PMSP, jobs may have precedence constraints or may be independent of each other.
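The three machine classes differ only in how the processing-time matrix is built. The sketch below evaluates the makespan of a fixed job-to-machine assignment for each class, assuming independent jobs without precedence constraints; the helper names and all numbers are hypothetical.

```python
def makespan(assignment, proc_time):
    """assignment[j] is the machine of job j; proc_time[i][j] is the time of job j on machine i."""
    loads = [0.0] * len(proc_time)
    for job, machine in enumerate(assignment):
        loads[machine] += proc_time[machine][job]
    return max(loads)  # the most loaded machine finishes last


def identical(requirements, n_machines):
    """Identical machines: every machine needs the same time for a job."""
    return [list(requirements) for _ in range(n_machines)]


def uniform(requirements, speeds):
    """Uniform machines: time = processing requirement / machine speed."""
    return [[p / s for p in requirements] for s in speeds]


requirements = [4, 6, 2, 8]  # processing requirement of jobs 0..3
assignment = [0, 1, 0, 1]    # jobs 0 and 2 on machine 0, jobs 1 and 3 on machine 1

print(makespan(assignment, identical(requirements, 2)))       # 14.0
print(makespan(assignment, uniform(requirements, [1, 2])))    # 7.0
# Unrelated machines: an arbitrary time for every (machine, job) pair.
print(makespan(assignment, [[4, 6, 2, 8], [5, 3, 9, 1]]))     # 6.0
```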
Resource-constrained project scheduling problem: The Resource-Constrained Project Scheduling Problem (RCPSP) [10] seeks an optimal schedule that minimizes the project duration while meeting the precedence and resource requirements. RCPSP has several kinds of resources, and each activity requires different quantities of these resources. RCPSP is considered a general scheduling problem, and the open-shop, job-shop, and flow-shop scheduling problems are considered its special cases.
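To make the interplay of precedence and resource constraints concrete, the sketch below builds a feasible (not necessarily optimal) schedule with a serial schedule generation scheme, a common constructive heuristic for RCPSP that is not taken from the reviewed studies. It assumes a single renewable resource, integer durations, an activity order that respects precedence, and that no activity demands more than the capacity; all names and numbers are hypothetical.

```python
def serial_sgs(order, duration, demand, predecessors, capacity):
    """Start each activity at its earliest precedence- and resource-feasible time."""
    horizon = sum(duration.values())  # upper bound on the project duration
    usage = [0] * (horizon + 1)       # resource units in use in each period
    start = {}
    for act in order:
        preds = predecessors.get(act, ())
        # Earliest start allowed by precedence.
        t = max((start[p] + duration[p] for p in preds), default=0)
        # Delay the start until the resource is free for the whole duration.
        while any(usage[u] + demand[act] > capacity
                  for u in range(t, t + duration[act])):
            t += 1
        for u in range(t, t + duration[act]):
            usage[u] += demand[act]
        start[act] = t
    project_duration = max(start[a] + duration[a] for a in start)
    return start, project_duration


duration = {"A": 2, "B": 3, "C": 2, "D": 1}
demand = {"A": 2, "B": 2, "C": 1, "D": 2}
predecessors = {"C": ["A"], "D": ["B", "C"]}

start, total = serial_sgs(["A", "B", "C", "D"], duration, demand, predecessors, capacity=3)
print(start, total)  # {'A': 0, 'B': 2, 'C': 2, 'D': 5} 6
```

Here B has no predecessors but must wait until A releases the resource, which is the kind of conflict that distinguishes RCPSP from purely precedence-based scheduling.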
A. Flow-shop scheduling
Flow-shop is a special case of the job-shop scheduling problem in
which every job is processed on the set of machines (or other resources)
in the same given processing order. There are ‘m’ machines
and ‘n’ jobs, and no machine may perform more
than one operation at a time. Performance measures are
flowtime, makespan, and tardiness. The flow-shop scheduling problem
can be solved either by exact methods such as branch and bound
or by heuristic algorithms such as genetic algorithms. A special type
of flow-shop scheduling problem is called permutation flow-shop
scheduling, in which the jobs are processed in the same order on every machine.
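For the permutation flow shop, the makespan of a given job order follows from a simple recurrence: an operation starts when both its machine and the job's previous operation are finished. The sketch below evaluates that recurrence and, because the hypothetical instance is tiny, finds the best order by exhaustive enumeration rather than by branch and bound or a genetic algorithm.

```python
from itertools import permutations


def flow_shop_makespan(permutation, proc_time):
    """proc_time[j][k] is the time of job j on machine k; machines are visited in order 0..m-1."""
    n_machines = len(proc_time[0])
    finish = [0] * n_machines  # finish[k]: completion time of the latest job on machine k
    for job in permutation:
        for k in range(n_machines):
            # The job is ready for machine k once it has left machine k-1.
            ready = finish[k - 1] if k > 0 else 0
            finish[k] = max(finish[k], ready) + proc_time[job][k]
    return finish[-1]


# Hypothetical instance: 3 jobs, 2 machines.
proc_time = [[3, 2], [1, 4], [2, 1]]

print(flow_shop_makespan([0, 1, 2], proc_time))  # 10

best = min(permutations(range(3)), key=lambda p: flow_shop_makespan(p, proc_time))
print(best, flow_shop_makespan(best, proc_time))  # (1, 0, 2) 8
```

Enumeration grows factorially with the number of jobs, which is why exact methods with pruning or metaheuristics are used on realistic instances.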
B. Job-shop scheduling
The Job-Shop Scheduling Problem (JSP) is a generalization
of the flow-shop scheduling problem. In JSP, there are ‘n’ jobs and
‘m’ machines, and each job consists of a sequence of ‘o’ operations.
Each operation has an associated processing time, and precedence
constraints are defined between the operations of each job. The
objective is to find an optimal and feasible schedule. The main
difference between the job-shop and flow-shop scheduling problems is
that a flow shop follows a unidirectional sequence, while the workflow in a job shop is
not necessarily unidirectional. Therefore, the machine number must be
considered in the route of each job in job-shop scheduling.
C. Open-shop scheduling
This problem is related to the flow-shop scheduling problem, but
in an open shop there are no precedence constraints between
the operations of a job. There are ‘n’ jobs and ‘m’ machines, and
each job consists of a set of ‘o’ operations, each of which must
be processed on a specified machine. The main purpose is to find the job
sequences and machine sequences. A job sequence is the order of
the operations of the same job, and a machine sequence is the order
in which operations are performed on the same machine.
Software models for scheduling (RQ2)
For this research question, we identified how many studies clearly state which software process model they use in their software project scheduling problem. Through the detailed SLR, we found that, out of 41 studies, only five clearly state which software model they use. Four studies [11-14] use the waterfall model and one study [15] uses an Agile methodology. The remaining studies give the impression that they assume the waterfall model, but this is not stated explicitly.
Existing optimization techniques (RQ3)
Figure 3: Existing optimization techniques.
Of the 41 studies identified in this SLR on software project scheduling, 13 (32%) studies [1,11,13,16-25] used a genetic algorithm as the optimization technique. Ant colony optimization was used by 7 (17%) studies [26-32] to solve the SPS problem. Particle swarm optimization was used by only 3 (7%) studies [12,33,34]. Five (12%) studies [35-39] used their own proposed algorithm to solve this NP-hard problem. Hybrid algorithms were used by 6 (15%) studies [14,15,40-43]. In addition, 7 (17%) studies [44-50] compared different metaheuristic algorithms. Hence, we can conclude that the genetic algorithm has been the most widely used technique for solving the software project scheduling problem, followed by ACO. The pie chart in Figure 3 shows the percentage of each algorithm used.
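The percentages reported for RQ3 can be reproduced from the study counts. The short check below uses only the counts given in the text; the function name and the category labels are chosen for illustration.

```python
def percentage_shares(counts):
    """Convert study counts into whole-number percentages of the total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# Study counts per technique, as reported for RQ3.
counts = {
    "Genetic algorithm": 13,
    "Ant colony optimization": 7,
    "Particle swarm optimization": 3,
    "Own proposed algorithm": 5,
    "Hybrid algorithms": 6,
    "Comparison of metaheuristics": 7,
}

print(sum(counts.values()))       # 41
print(percentage_shares(counts))  # 32, 17, 7, 12, 15 and 17 percent
```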
Data sets used (RQ4)
Figure 4: Data used.
For this research question, we identified what kinds of data sets researchers have used to evaluate their proposed techniques. Through the detailed SLR, we found that, out of 41 studies, only 6 (15%) studies used real-world data [14,15,19,36,39,42], and 6 (15%) studies considered case studies [12,17,18,25,29,43] to evaluate their approach. Eleven (27%) studies [1,24,30,31,35,41,44-48] used the instances of Alba et al. [1] (the employee allocation model of SPSP was first designed by Alba et al. [1]). Nine (22%) studies demonstrated their approach using examples [13,16,20,32-34,38,40]. One study used a combination of real-world and hypothetical data [50]. One study used real-world and benchmark data (Alba et al. [1]) [36]. A combination of real-world and randomly generated instances was used by one study [27]. Two studies evaluated their approach using a hypothetical project [22,25]. One study used data from PSPLIB [33]. One study used data generated by a simulation tool [37]. Data generated by ProGen was used by one study [21]. Other real instances were considered by one study [50]. The largest group of studies (27%) used the Alba et al. [1] instances to evaluate their proposed approach, and only a few studies used real-world data. Hence, we can conclude that more real-world data should be used to evaluate an approach. The pie chart in Figure 4 shows the percentage of each type of data used.
Limitations and benefits of existing scheduling techniques (RQ5)
We identified the following limitations and benefits.
Limitations:
a) Most of the studies deal with only two objectives to be
optimized, usually time and cost.
b) Real-world data sets are rarely used to evaluate the
techniques. Most of the studies have used case studies or some
benchmark problems.
c) Most of the studies deal with static SPS, assuming that
no disruption occurs.
d) The inclusion of real-world and dynamic scenarios is
ignored in most of the studies.
e) Researchers do not clearly define which software model
they are using while dealing with the SPS problem.
f) In most of the studies, task efforts are assumed to be known in advance.
g) While proposing their technique, most of the studies do
not compare it with already existing techniques.
h) Several studies have used the same problem formulation
as defined by Alba et al. [1].
i) The problem of changing objectives has not been addressed
by any study as an additional dynamic event.
j) Most of the studies evaluate their proposed techniques on
small-scale projects.
k) Very few studies have dealt with Agile as the software model.
l) Developers’ skills and expertise, and communication overhead
models, are not considered.
Benefits:
a) Many studies have included ‘robustness’ as an objective
in their proposed technique, providing robust solutions in the
presence of uncertainty.
b) Some studies also allowed variation of human factors.
c) Instance generators have been used by researchers to
analyze different project scenarios.
d) Some models have also been proposed that consider
‘quality’ as an important objective.
e) Many studies compared different metaheuristic
techniques (using the same problem formulation of Alba et al. [1])
to identify which metaheuristic works best for SPSP.
f) Some studies have dealt with SPSP for global software
development.
Software project scheduling in a dynamic and uncertain
environment is a big challenge for the software engineering community.
From the SLR, the following findings were obtained.
A. The genetic algorithm is the most widely used technique for solving the
software project scheduling problem.
B. Most of the studies do not include dynamic scenarios.
C. Most studies do not evaluate their approach on real-world
data.
D. The Alba et al. [1] instances are the most commonly used data for evaluating
an approach.
E. Most of the studies use the same problem formulation as
defined by Alba et al. [1].
This work aims to examine and identify the status quo of dynamic software project scheduling [51,52]. The research method used was a systematic literature review. In this method, research questions are formulated, and the study revolves around identifying the answers to these questions. The essence of this study was to examine the software models that have been targeted for scheduling, the scheduling techniques, their taxonomies and their limitations/benefits, and ultimately to identify areas for future research.
© 2022 Natasha Nigar. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and build upon your work non-commercially.