crowdsourcing software markets

HOW PROJECT DESCRIPTION LENGTH AND EXPECTED DURATION AFFECT BIDDING AND PROJECT SUCCESS IN CROWDSOURCING SOFTWARE DEVELOPMENT

Made By: Rozmin Noorddin (13K-2082), Rahim Punjwani (13K-2070), Musab Mehboob (13K-2088), Saim Ali Baloch (13K-2076)

Section: A

BACKGROUND

• Crowdsourcing is a method of letting out work to many potential providers on the Internet by publishing the request for proposals (RFP) through an online marketplace.

• Crowdsourcing software development markets (CSM) provide a new way of outsourcing software to developers.

• The developers bid on the projects based on the RFP description and expected

completion time.

• CSMs give buyers the flexibility to get small-scale, less time-consuming projects done efficiently by developers at a competitive price.

BACKGROUND (CON’T)

• In a general software development environment, the risk lies with the person outsourcing the project.

• In a crowdsourcing environment, however, the bidders are the ones at risk, because the market lets buyers refuse a developed project on the grounds that it fails to meet the requirements.

• This issue has two main causes:

– Buyers not describing the project adequately.

– Bidders underestimating the time required.

BACKGROUND (CON’T)

• Because the crowdsourcing market is so different from the traditional software development market, it has its own risks.

• In this environment the bidder bears all the risk: the contract is fixed price, so a bidder who has misjudged the price at the time of signing could suffer a fatal loss.

• This research analyzes the risks that bidders face in a crowdsourcing environment by reversing Agency Theory.

AGENCY THEORY

• Agency Theory views business transactions as contracts between principals (who let out work) and agents (who perform it).

• In a crowdsourcing market, the buyer who posts the RFP is the principal, and the providers who bid on the RFP to deliver the requested services are the agents.

• Agency Theory generally explains the risks facing the principal. It groups these risks into three categories:

– Adverse Selection Risks.

– Moral Hazard Risks.

– Unexpected Contingencies.

RISKS INVOLVING THE PRINCIPAL IN THE AGENCY THEORY

• Adverse Selection Risks:

– The principal faces the risk of choosing the wrong person for the job because there is no mechanism for checking an agent’s qualifications or standard of work.

– Agents can mislead principals about their competencies.

• Moral Hazard Risks:

– Since the principal has no way to check on the agent, agents may act in their own interests, which may affect the quality of the project.

• Unexpected Contingencies:

– Contingencies (unforeseen events) usually occur in projects with long time frames, because the requirements might change during the software development process.

– This may happen if the technical environment changes and new software features are then needed, or if the buyer recognizes the need for more or different specifications.

REVERSING THE AGENCY THEORY

• Agency Theory brings out the risks facing principals, but in crowdsourcing environments the risks can be entirely different.

• The reason is that the principal can always back out and refuse to accept the developed project.

• The agent is therefore the one at real risk, and the theory needs to be reversed.

• To this end, several hypotheses were formulated and then tested through analysis of the relevant data and a post hoc analysis.

REVERSING THE AGENCY THEORY

1. Adverse Selection Risks:

– Whether to bid or not: based on the RFP and the expected project duration.

– Bidding amount: the bidding amount is a function of the time invested by the agent.

– Choosing an RFP based on the available information, which might be incomplete and vague.

2. Moral Hazard Risks: Not a concern, because the agent does not need to worry about what the principal is doing, as the principal is not serving the agent.

3. Contingency Risks: Not a concern, because projects are too short for changing specifications to be a major risk.

HYPOTHESIS

The hypotheses are as follows:

H1. Expected project duration as posted by the principal will affect the amount

bid by the agents.

H2. Project description length will affect the amount bid by the agents.

H3. Projects described at greater length will overall be more successful.

H4. Longer duration projects will be more successful.

H5. Projects with higher bid amounts will be more successful.

H6. Unsuccessful projects will have larger positive residuals in the amount bid.

THE DATA

• The data was collected from a well-known CSM located in the US. It includes all information on the RFPs posted in 2005 and 2006 and the bids made on them.

• The projects handled by this CSM include the design of user interfaces, coding in several programming languages, project testing, and other software development tasks. The projects are usually small in scope.

• Members register on the CSM by providing a short bio. They can then post RFPs, acting as principals, or bid on RFPs, acting as agents.

THE DATA (CON’T)

• A principal decides on the deliverables and description of the project, whereas an agent bids on the entire package of deliverables and description.

• After choosing an agent, the principal pays the money into an escrow account managed by the CSM site. The payment is released when the principal receives the project and is satisfied with it. The CSM site also charges a service fee for each RFP.

• Principals accessing the CSM: together, these principals posted 31,276 RFPs.

Country        # of Principals
US             17,919
UK             3,874
Canada         1,942
Australia      1,921
Germany        611
India          442
Sweden         317
Netherlands    317

• Agents accessing the CSM: agents made a total of 332,637 bids on the RFPs.

Country        # of Agents
US             6,070
India          5,632
Romania        4,968
Pakistan       1,670
Russia         1,327
Canada         1,241
UK             1,236
Ukraine        1,228
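As a quick piece of arithmetic on the two totals reported above (31,276 RFPs and 332,637 bids), the average number of bids per RFP can be worked out as follows. The variable names are illustrative only, and the average says nothing about how the bids were distributed across individual RFPs.

```python
# Totals as reported on the slide.
total_rfps = 31_276
total_bids = 332_637

# Average number of bids received per RFP.
bids_per_rfp = total_bids / total_rfps
print(f"Average bids per RFP: {bids_per_rfp:.2f}")
```

This gives roughly ten to eleven bids per RFP on average.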

THE DATA (CON’T)

• The following variables were used in the analyses. Definitions:

– Project duration: number of days the principal expects the project to take.

– Actual duration (used in the post-hoc analysis): number of days that elapsed between the project being signed and when it was delivered.

– Project description length: number of words used in the title, description, and deliverables depicting the project.

– Amount bid: amount the agent bid on this project.

– Project success: a dichotomous variable indicating whether the project was eventually paid for (=1) or not paid out (=0).

– Residual: an intermediary variable calculated as the difference between the amount bid and the best-fitting line that predicts what that amount should be based on duration and project description length.

• Descriptive statistics:

Name                          Mean      Std. Dev.   Median    Min       Max
Project duration              13.46     15.28       7         0.01      99.35
Actual duration               23.85     34.86       13.00     1         496
Project description length    241.85    56.99       239       11        856
Amount bid                    95.51     91.82       70        4         499.99
Project success               0.9989    0.0332      1         0         1
Residual                      -0.0002   0.9997      -0.2262   -2.2548   6.2896
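The residual variable defined above can be illustrated with a minimal sketch. It fits a least-squares line for the bid amount on duration and description length, then scales the residuals to unit standard deviation. The scaling is an assumption, suggested by the near-zero mean and near-unit standard deviation reported for the residual; the slides do not state how the residual was scaled. The function names and the six sample bids are made up for illustration and are not the study's data.

```python
from statistics import mean, stdev

def fit_two_predictors(x1, x2, y):
    """Closed-form least squares for y = b0 + b1*x1 + b2*x2."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2          # zero only if x1 and x2 are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

def standardized_residuals(x1, x2, y):
    """Actual minus predicted value, divided by the residual standard deviation."""
    b0, b1, b2 = fit_two_predictors(x1, x2, y)
    raw = [c - (b0 + b1 * a + b2 * b) for a, b, c in zip(x1, x2, y)]
    scale = stdev(raw)
    return [r / scale for r in raw]

# Hypothetical bids: expected duration (days), description length (words), amount bid.
duration = [3, 7, 7, 14, 21, 30]
desc_len = [120, 200, 260, 240, 310, 280]
amount = [40, 70, 95, 110, 180, 210]

z = standardized_residuals(duration, desc_len, amount)
for d, w, a, r in zip(duration, desc_len, amount, z):
    print(f"{d:>3} days, {w:>3} words, bid {a:>3}: residual {r:+.2f}")
```

A positive residual marks a bid above what duration and description length alone would predict, and a negative one marks a bid below it.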

DATA ANALYSIS

• Stage 1 (Testing H1 and H2):

– A GLM (Generalized Linear Model) was used to test H1 and H2.

– The covariates were project description length and project duration, and the unit of analysis was the bid.

– Both H1 and H2 were supported by the test.

– The time an agent must invest increases with both covariates.

– This is because the longer a project is and the more detailed its description, the more time the agent needs to spend on it, and so the more the agent charges.

– The standardized beta coefficients showed that project duration contributed 4 times more than project description length to shaping the bid amount.

– This means that agents were influenced more by the principal’s assessment of duration than by the length of the description of the project and its deliverables when setting their bid amounts.
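To show what comparing standardized betas means, here is a minimal sketch that uses ordinary least squares with two predictors as a stand-in for the GLM (an assumption; the slides do not give the model specification). For two predictors, the standardized coefficients follow directly from the pairwise correlations. The four data points are made up and deliberately constructed so that the ratio comes out at 4, mirroring the figure reported above. They are not the study's data.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def standardized_betas(x1, x2, y):
    """Standardized OLS coefficients for y regressed on x1 and x2."""
    r1y, r2y, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    denom = 1 - r12 ** 2
    return (r1y - r12 * r2y) / denom, (r2y - r12 * r1y) / denom

# Hypothetical bids built as amount = 8 * days + 0.2 * words.
duration = [5, 15, 5, 15]        # expected duration in days
desc_len = [200, 200, 300, 300]  # description length in words
amount = [80, 160, 100, 180]     # amount bid

beta_duration, beta_length = standardized_betas(duration, desc_len, amount)
ratio = beta_duration / beta_length
print(f"beta(duration) = {beta_duration:.3f}, beta(length) = {beta_length:.3f}")
print(f"ratio = {ratio:.2f}")
```

The raw slopes in this toy data (8 per day versus 0.2 per word) differ by a factor of 40 simply because days and words are on different scales. Standardizing removes the units, which is why standardized betas are the ones compared.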

DATA ANALYSIS (CON’T)

• Stage 2 (Testing H3, H4, and H5):

– This stage used logistic regression to determine how project description length, project duration, and bid amount affect the success of a project.

– The unit of analysis was the RFP, and the model correctly categorized 70.2% of the projects as successful or unsuccessful.

– The results show that description length and project duration affected whether the project was successful, supporting H3 and H4.

– However, bid amount had no effect on the success of the project, so H5 was not supported.
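The Stage 2 setup can be sketched as follows: a logistic regression of a success flag on three predictors, followed by the share of projects classified correctly. This is a minimal sketch fitted by plain gradient descent, not the estimation procedure used in the study. The data are a made-up design in which duration and description length carry signal while the bid amount, by construction, carries none. All names and values are hypothetical.

```python
from math import exp

def sigmoid(t):
    return 1.0 / (1.0 + exp(-t))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; w[0] is the intercept."""
    n, k = len(rows), len(rows[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            err = p - y
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def predict(w, x):
    """Classify as successful (1) when the fitted probability is at least 0.5."""
    p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
    return 1 if p >= 0.5 else 0

# Hypothetical outcomes per (duration, length) cell, coded -1 = short, +1 = long.
cells = {
    (-1, -1): [0, 0, 0, 1],
    (-1, 1): [0, 1, 1, 1],
    (1, -1): [0, 1, 1, 1],
    (1, 1): [1, 1, 1, 1],
}
rows, labels = [], []
for (dur, length), outcomes in cells.items():
    for outcome in outcomes:
        for bid in (-1, 1):  # every outcome appears once with a low and once with a high bid
            rows.append([dur, length, bid])
            labels.append(outcome)

w = fit_logistic(rows, labels)
accuracy = sum(predict(w, x) == y for x, y in zip(rows, labels)) / len(rows)
print("intercept, duration, length, bid coefficients:", [round(v, 3) for v in w])
print(f"share classified correctly: {accuracy:.4f}")
```

In this toy design the duration and length coefficients come out positive and the bid coefficient stays at zero, which is the pattern that H3, H4, and the rejection of H5 describe. The accuracy figure here is a property of the toy data only.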

DATA ANALYSIS (CON’T)

• Stage 3 (Testing H6):

– A t-test with the RFP as the unit of analysis was used to determine whether unsuccessful projects had larger positive residuals in the bid amount.

– H6 was supported by the test.

– The results show that unsuccessful projects were estimated by the principal to require a shorter duration, were described more briefly, and accordingly received lower bids from the agents.

– The residual analysis showed that the residual of the bid amount (i.e. the difference between the actual bid amount and the predicted amount) is usually higher in unsuccessful projects.

– This is because the principal’s estimate is not based on the project’s actual duration and description length.
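The Stage 3 comparison can be sketched as a two-sample t statistic on the bid residuals of unsuccessful versus successful projects. The slides say only that a t-test was used, so the choice of Welch's unequal-variance form here is an assumption. The residual values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical standardized bid residuals, one value per RFP.
unsuccessful = [0.9, 1.4, 0.3, 1.1]
successful = [-0.2, 0.1, -0.4, 0.3, -0.1, 0.0]

t_stat, dof = welch_t(unsuccessful, successful)
print(f"t = {t_stat:.2f} with about {dof:.1f} degrees of freedom")
```

A positive t statistic means the unsuccessful group has the higher mean residual, which is the direction H6 predicts. Turning it into a p-value requires the t distribution, which this sketch leaves out.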

RESULTS

After analyzing two years of data from a well-known crowdsourcing site, some behavioral conclusions can be drawn. The results fall into two categories:

• Projects with longer descriptions and longer durations, as defined by the principal, are bid higher by the agents. A longer description and duration show that the principal is aware that the project will take relatively long to build and needs more explanation of what has to be done, so the agents’ response is quite natural. The success rate of these projects is generally higher.

• Projects with shorter descriptions and shorter durations are also bid higher but, unlike the first category, they are over-bid. The over-bidding is a response from agents who are aware of the risks involved and try to offset them by controlling the only aspect of the project they can control, namely the bid. The failure rate of these projects is generally higher.

RESULTS IN VIEW OF THEORY

• Under Agency Theory the principals are at risk, because the agent knows more about the project than the principal does. In the crowdsourcing market this concept has to be reversed.

• In the crowdsourcing market the agents are the ones at risk.

• Analysis of the unsuccessful projects shows that failed projects had shorter descriptions and shorter expected durations.

• This may be caused by the principal’s inability to judge the project on a technical basis.

• This raises the issue of information asymmetry.

• To compensate, agents usually bid higher to reduce the risk they face, but the project remains at risk of failure because of the information asymmetry.

RESULTS IN VIEW OF THEORY (CON’T)

• In a usual crowdsourcing environment, lowballing can be seen.

• Lowballing is the act of bidding lower than should actually be bid in order to win the contract.

• In a typical software development environment lowballing does not occur because the number of bidders is low.

• In a crowdsourcing environment, where agents are numerous, lowballing should have been seen.

• That is, projects with shorter descriptions and shorter durations should have been bid low, but this behavior was not observed in this study.

PRACTICAL IMPLICATIONS

• In a crowdsourcing market, all the players involved (the agents, the principals, and the site itself) need the project to be successful.

• The data suggests that if the principal provides a long description and allows a long duration, the project will be bid higher.

• If the principal provides less description and less time, the project is at risk of failure, and the principal may also end up paying more than the project is worth.

• So, for the project to succeed at a fair price, a mechanism is needed that estimates the time a project requires, to assist the agents.

LIMITATIONS

The data studied in this paper is archival data. Archival data has certain merits: it is impartial and reflects exactly how the market behaved, and it comes from real-world settings rather than games or simulations. On the downside, archival data has the following constraints:

• The data already exists, so new constructs of interest cannot be added. As a result the data may fail to address some issues and may present only part of the story.

• Researchers are forced to assign meaning to the data. In survey research, by contrast, the survey can be planned so that the construct and content validity of each measure is established.

• The results might not apply to other software development auction markets, as the CSM studied deals with small projects, with a median project duration of one week. Additional research is needed to verify whether the results apply to more complex markets.

• The data is old, as it was collected 10 years ago. The statistics might not apply today.

CONCLUSION

• Agents take a safety precaution by bidding higher, and the projects they bid higher on tend to be less successful.

• Charging more is a way to deal with information asymmetry, and it also favors the principal as the success rate goes up.

• CSMs have a unique and growing importance in software development crowdsourcing.

• Reasons include that agents take a safety margin by bidding higher than expected, given the expected duration and description length, on projects that will eventually be unsuccessful, especially in the upper range of pricing.

• Theoretically, this bidding higher than expected may represent a method agents apply to deal with information asymmetry in contexts that favor the principal.

THE END

THANK YOU!
