1 Competitive Intelligence and the Web Presented at AMCIS2003 Tampa, Florida by Dr. Robert J. Boncella Washburn University


Competitive Intelligence and the Web

Presented at AMCIS2003

Tampa, Florida

by Dr. Robert J. Boncella

Washburn University


Competitive Intelligence

“the process of ethically collecting, analyzing and disseminating accurate, relevant, specific, timely, foresighted and actionable intelligence regarding the implications of the business environment, competitors and the organization itself”


Competitive Intelligence Process

– Planning and direction

• working with decision makers to discover and hone their intelligence needs

– Collection activities

• conducted legally and ethically

– Analysis

• interpreting data and compiling recommended actions

– Dissemination

• presenting findings to decision makers

– Feedback

• taking into account the response of decision makers and their needs for continued intelligence


CI and The Web

• A business Web site will contain a variety of useful information:

– company history, corporate overviews, business visions

– product overviews, financial data, sales figures

– annual reports, press releases, biographies of top executives, locations of offices, and hiring ads

– An example of this information is http://www.google.com/about.html

• The cost of this information is, for the most part, free.

• Access to open sources does not require proprietary software, as a number of commercial databases do.


The Web Structure and Information Retrieval

• HTTP protocol and the use of Uniform Resource Locators (URL)

• Mathematical network of nodes and arcs

• Information Retrieval (IR)

– follows the links (arcs)

– from document to document (node to node)

• Retrieve documents so that their content can be evaluated and a new set of URLs becomes available to follow
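The link-following retrieval described on this slide can be sketched as a breadth-first traversal of a graph. This is a minimal illustration, not a real crawler: the `TOY_WEB` dictionary, the `example.com` URLs, and the `toy_fetch_links` helper are hypothetical stand-ins for fetching a page over HTTP and extracting its links.

```python
from collections import deque

# Hypothetical in-memory "web": each URL (node) maps to the URLs it links to (arcs).
TOY_WEB = {
    "http://example.com/": ["http://example.com/about", "http://example.com/products"],
    "http://example.com/about": ["http://example.com/"],
    "http://example.com/products": ["http://example.com/press"],
    "http://example.com/press": [],
}


def toy_fetch_links(url):
    """Stand-in for downloading a page and extracting its outgoing links."""
    return TOY_WEB.get(url, [])


def crawl(start_url, fetch_links, max_pages=100):
    """Visit pages breadth-first, following each newly discovered URL once."""
    visited = []
    seen = {start_url}
    frontier = deque([start_url])
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)  # a real crawler would evaluate the page content here
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


order = crawl("http://example.com/", toy_fetch_links)
print(order)
```

The `seen` set keeps the traversal from looping when pages link back to each other, as the "about" page does here.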


Issues Associated With CI and The Web

• Information Gathering

• Information Analysis

• Information Verification

• Information Security


Information Gathering


General Web Search Engines

• Architecture

– Web crawlers (Web spiders) collect Web pages using graph searching techniques

– An indexing method indexes the collected Web pages and stores the indices in a database

– Retrieval and ranking methods retrieve search results from the database and present ranked results to users

– A user interface allows users to query the database and customize their searches
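The indexing and ranking components of this architecture can be illustrated with a toy inverted index. This is a sketch under stated assumptions: the page texts and `*.example` URLs are invented, and the score (summed term counts) is a deliberately naive stand-in for the ranking methods real engines use.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: term -> {url: number of occurrences}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][url] = count
    return index


def search(index, query):
    """Rank pages by the summed counts of the query terms they contain."""
    scores = Counter()
    for term in tokenize(query):
        for url, count in index.get(term, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return [url for url, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


# Hypothetical collected pages.
PAGES = {
    "a.example/annual-report": "Annual report: sales figures and financial data",
    "b.example/press": "Press release on new product sales and sales offices",
    "c.example/jobs": "Hiring ads for engineers",
}
INDEX = build_index(PAGES)
results = search(INDEX, "sales offices")
print(results)
```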


Domain Specific Web Search Engines

• Northern Light is a search engine for commercial publications in the domains of business and general interest.

• EDGAR is the United States Securities and Exchange Commission clearinghouse of publicly available company information and filings.

• Westlaw is a search engine for legal materials.

• OVID Technologies provides a user interface that unifies searching across many subfields and databases of medical information.


Meta-search engine

• Upon receipt of a query, connects to several general search engines

• Returns integrated results of searches

• Examples

– www.metacrawler.com

– www.dogpile.com
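How a meta-search engine might integrate results can be sketched as follows. The slide does not say how the named services merge rankings, so the reciprocal-rank scoring here is an assumption chosen for simplicity, and `engine_a` and `engine_b` are hypothetical stand-ins for real search engines.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Combine ranked lists: a URL earns 1/rank from every list that returns it."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


def meta_search(query, engines):
    """Send the query to every engine and return one integrated ranking."""
    return merge_results([engine(query) for engine in engines])


# Hypothetical engines that return fixed rankings regardless of the query.
def engine_a(query):
    return ["x.example", "y.example", "z.example"]


def engine_b(query):
    return ["y.example", "x.example", "w.example"]


merged = meta_search("competitive intelligence", [engine_a, engine_b])
print(merged)
```

Pages returned by both engines rise to the top, which is the practical benefit of querying several engines at once.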


Difficulties with Information Gathering

• Time to carry out search

• Number of pages returned

• Currency of information

• Accessible pages

– Web contains 552.5 billion pages

– Growth rate of 7.3 million per day

• “Surface Web” vs. “Deep Web”

– Surface Web: pages freely available to the public

– Deep Web

• dynamic pages, intranets & proprietary databases

– Surface Web contains about 2.5 billion pages

– Deep Web contains about 550 billion pages (200 times more)

• Charge for Web retrieval


Information Analysis (Web Mining)


Web Page Content

• Focused Spiders (On Line)

– Return Appropriate Set of Pages

• Intelligent Agent

• User Interface

– CI Spider by Chau & Chen - University of Arizona

– Answers On-line by Answer Chase


Search Result Mining

• Text Mining (Off Line)

– Automate the task of organizing and summarizing numerous pages

– Requires automated analysis of natural language texts

– Commercially available text mining applications, e.g. TextAnalyst by Megaputer

– ANN solution SITEX by Fukuda et al.
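A very crude first step toward summarizing a set of retrieved pages is to list their most frequent content words. The sketch below only illustrates that idea; it is not how TextAnalyst or SITEX work, and the stop-word list and sample texts are assumptions made for the example.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "its", "on"}


def top_terms(pages, n=3):
    """Return the n most frequent non-stop-words across all page texts."""
    counts = Counter()
    for text in pages:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Hypothetical search results about a competitor.
sample_pages = [
    "The company announced a new storage product.",
    "Analysts expect the storage product to lift revenue.",
    "Revenue from storage grew in the quarter.",
]
summary_terms = top_terms(sample_pages)
print(summary_terms)
```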


Web Structure

– Page Rank

• Utilized in keyword searching of the Web

• Measure of the number of “back links” to a page

• Importance of a page determined by the number of links to the page

• Page’s priority determined by this measure

• Implemented in the Google search engine

– Hyperlink-Induced Topic Search (HITS)

• Hub & Authority measures associated with a page

– Hub - a page that contains links to authoritative pages

– Authority - best page (source) for requested information

• Starts with a keyword search that returns a set of pages

– hubs and authorities
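Both link-based measures on this slide can be sketched on a toy graph. The back-link count follows the slide's simplified description; the published PageRank additionally weights each link by the rank of the linking page, which this sketch does not do. The HITS loop is a minimal version of the hub and authority iteration, and the page names and iteration count are assumptions.

```python
import math


def backlink_counts(links):
    """Count incoming links per page (the slide's simplified 'back link' measure)."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def hits(links, iterations=50):
    """Iteratively compute hub and authority scores for a link graph."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages that link in.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub score: sum of the authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both vectors so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical result set from a keyword search: page -> pages it links to.
LINKS = {
    "directory": ["vendor", "reviewer"],
    "blog": ["vendor"],
    "vendor": [],
    "reviewer": ["vendor"],
}
hub, auth = hits(LINKS)
counts = backlink_counts(LINKS)
print(max(auth, key=auth.get), max(hub, key=hub.get), counts)
```

In this graph "vendor" emerges as the top authority because every other page links to it, and "directory" as the top hub because it links to the two pages with non-zero authority.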


Web Usage

– Data mining on Web logs

– Web logs contain “clickstream” data

• Server side

– Information about pages provided

• Client side

– Information about pages requested
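Server-side clickstream mining can be sketched by grouping log entries per client. The slides do not specify a log format, so this example assumes the widely used Common Log Format; the sample lines and IP addresses are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumed format: host ident user [time] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+)$'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if not match:
        return None
    host, timestamp, method, path, status, _size = match.groups()
    return {"host": host, "time": timestamp, "method": method,
            "path": path, "status": int(status)}


def clickstreams(lines):
    """Group successfully served paths by client host, in log order."""
    streams = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record and record["status"] == 200:
            streams[record["host"]].append(record["path"])
    return dict(streams)


SAMPLE_LOG = [
    '192.0.2.1 - - [18/Apr/2003:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [18/Apr/2003:10:00:09 -0500] "GET /products.html HTTP/1.0" 200 1043',
    '192.0.2.7 - - [18/Apr/2003:10:01:30 -0500] "GET /products.html HTTP/1.0" 200 1043',
    '192.0.2.7 - - [18/Apr/2003:10:01:41 -0500] "GET /missing.html HTTP/1.0" 404 209',
]
streams = clickstreams(SAMPLE_LOG)
page_hits = Counter(path for paths in streams.values() for path in paths)
print(streams)
print(page_hits.most_common(1))
```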


Information Verification


Techniques to Verify Accuracy of Information

• Deep Web sources are more reliable than surface Web sources

• Confirm with a non-Web source

• Answer the following

– Who is the author?

– Who maintains the web site?

– How current is the web page?

• Observe the Top Level Domain (TLD) of the URL

– “~” within URL denotes a personal web page


Domain Names

• Original TLDs

– .com

– .edu

– .gov

– .net

– .org

• New TLDs

– .aero (for the air-transport industry)

– .biz (for businesses)

– .coop (for cooperatives)

– .info (for all uses)

– .museum (for museums)

– .name (for individuals)

– .pro (for professions)
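The two URL checks suggested for verification, the TLD and the “~” marker, can be sketched with Python's standard `urllib.parse`. The TLD groups are copied from the lists above; the function name `describe_url` and the shape of its result are assumptions made for this example.

```python
from urllib.parse import urlparse

# TLD groups as listed on the slide.
ORIGINAL_TLDS = {"com", "edu", "gov", "net", "org"}
NEW_TLDS = {"aero", "biz", "coop", "info", "museum", "name", "pro"}


def describe_url(url):
    """Report the TLD, its group, and whether the path suggests a personal page."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    if tld in ORIGINAL_TLDS:
        group = "original"
    elif tld in NEW_TLDS:
        group = "new"
    else:
        group = "other"
    return {"tld": tld, "group": group, "personal_page": "~" in parsed.path}


print(describe_url("http://www-db.stanford.edu/~backrub/google.html"))
print(describe_url("http://www.scip.org/"))
```

The URL must include its scheme (`http://`); without it `urlparse` does not recognize the host and the sketch reports an empty TLD.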


Information Security


Information Security Issues

• Assuring the privacy and integrity of private information

– Managed with usual computer and network security methods

• Assuring the accuracy of a firm’s public information

– Defend against:

• Web hijacking

• Web defacing

• Cognitive hacking (semantic attack)

• Negative information

• Reference - Cybenko, Giani, & Thompson

• Avoiding unintentionally revealing information that ought to be private


Web Hijacking

Due to a bug in CNN’s software, when people at the spoofed site clicked on the “E-mail This” link, the real CNN system distributed a real CNN e-mail to recipients with a link to the spoofed page.

With each click at the bogus site, the real site’s tally of most popular stories was incremented for the bogus story.

Allegedly this hoax was started by a researcher who sent the spoofed story to three users of AOL’s Instant Messenger chat software.

Within 12 hours more than 150,000 people had viewed the spoofed page.


Web Defacing

In February 2001 the New York Times web site was defaced by a hacker identified as “splurge” from a group called “Sm0ked Crew”, which had a few days previously defaced sites belonging to Hewlett-Packard, Compaq, and Intel.

THE-REV | SPLURGE

Sm0ked crew is back and better than ever!

“Well, admin I’m sorry to say by you have just got sm0ked by splurge. Don’t be scared though, everything will be all right, first fire your current security advisor . . .”


Cognitive Hacking

• Cognitive hacking is the manipulation of perception.

• Causes

– disgruntled customers/employees

– competition

– random act of vandalism


Two types of cognitive hacking

• Single source cognitive hacking

– occurs when a reader does not know who posted the information and has no way of verifying it or contacting its author.

• Multiple source cognitive hacking

– occurs when there are several sources for a topic; it becomes a concern when the information is not accurate.


Categories of Cognitive Attacks

• Overt

– No attempt is made to conceal overt cognitive attacks

• e.g. website defacements

• Covert

– Provision of misinformation

• the intentional distribution or insertion of false or misleading information intended to influence readers’ decisions and/or activities


Emulex & Mark Jakob

• On 8/25/2000 a press release distributed by financial news services stated that Emulex had revised its per-share gain to a per-share loss.

• The price per share of Emulex fell from $104.00 to $43.00 in 16 minutes.

• The press release was false; it was fabricated by Mark Jakob, who was at the time on the wrong side of a stock short sale.

• Jakob launched this press release via Internet Wire, a Los Angeles-based firm that distributes press releases.


The Jonathan Lebed Case

According to the US Securities and Exchange Commission, 15-year-old Jonathan Lebed earned between $12,000 and $74,000 daily over six months, for a total gain of $800,000. Lebed would buy a block of FTEC stock and then, using only AOL accounts with fictitious names, post a message like the one below. Doing this a number of times, he increased the daily trading volume of FTEC from 60,000 shares to more than one million.

DATE: 2/03/00 3:43pm Pacific Standard Time

FROM: LebedTG1

FTEC is starting to break out! Next week, this thing will EXPLODE . . .

Currently FTEC is trading for just $2 1/2. I am expecting to see FTEC at $20 VERY SOON . . .

Let me explain why . . .

Revenues for the year should very conservatively be around $20 million. The average company in the industry trades with a price/sales ratio of 3.45. With 1.57 million shares outstanding, this will value FTEC at . . . $44. It is very possible that FTEC will see $44, but since I would like to remain very conservative . . . my short term price target on FTEC is still $20!

The FTEC offices are extremely busy . . . I am hearing that a number of HUGE deals are being worked on. Once we get some news from FTEC and the word gets out about the company . . . it will take-off to MUCH HIGHER LEVELS!

I see little risk when purchasing FTEC at these DIRT-CHEAP PRICES. FTEC is making TREMENDOUS PROFITS and is trading UNDER BOOK VALUE!!!

This is the #1 INDUSTRY you can POSSIBLY be in RIGHT NOW. There are thousands of schools nationwide who need FTEC to install security systems . . . You can’t find a better positioned company than FTEC!

These prices are GROUND-FLOOR! My prediction is that this will be the #1 performing stock on the NASDAQ in 2000. I am loading up with all of the shares of FTEC I possibly can before it makes a run to $20.

Be sure to take the time to do your research on FTEC! You will probably never come across an opportunity this HUGE ever again in your entire life.
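The $44 valuation quoted in Lebed's message follows from simple price/sales arithmetic applied to the figures the message itself supplies. The worked calculation below only reproduces that arithmetic; the revenue estimate, the ratio, and the share count are the message's unverified claims, not established facts.

```latex
\[
\text{implied price per share}
  = \frac{\text{revenue} \times \text{price/sales ratio}}{\text{shares outstanding}}
  = \frac{\$20{,}000{,}000 \times 3.45}{1{,}570{,}000}
  \approx \$43.95
\]
```

The arithmetic is internally consistent, which is part of what makes covert misinformation of this kind persuasive: the conclusion is only as sound as the unverifiable inputs.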


POSSIBLE COUNTERMEASURES

• Single source

– Authentication of source

– Information "trajectory" modeling

– Ulam games

• Multiple Sources

– Source Reliability via Collaborative Filtering and Reliability reporting

– Detection of Collusion by Information Sources

– Byzantine Generals Models


Countermeasures: Single Source

• Authentication of Source

– Due diligence

– Implied verification - PKI (Digital Signature)

• Information Trajectory

– Variation on a theme

• e.g. the Lebed case is a variation of the “pump & dump” scheme

• Ulam Games

– Model that assumes false information

– How quickly can the false information be identified by putting questions to the source and examining the answers


Countermeasures: Multiple Sources

• Collaborative filtering and reliability reporting

– A site keeps records and uses those records to verify future claims by those with access to publishing on the site.

• Detection of Collusion by Information Sources

– Linguistic analysis

– Determine if different sources are by the same author

• Byzantine generals model

– A message-communicating system has two types of processes: reliable and unreliable.

– Given a number of processes from this system, determine the type of each process.
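One very simple form of the linguistic analysis mentioned for collusion detection is to compare the word-frequency profiles of two postings. The sketch below uses cosine similarity for that comparison. It is an illustration only: real authorship attribution relies on far richer features, and the `0.6` threshold and the sample posts are arbitrary assumptions.

```python
import math
import re
from collections import Counter


def profile(text):
    """Word-frequency profile of a text (a crude stylistic fingerprint)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors, from 0.0 to 1.0."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def likely_same_author(text_a, text_b, threshold=0.6):
    """Flag two texts as possibly sharing an author (threshold is an assumption)."""
    return cosine_similarity(profile(text_a), profile(text_b)) >= threshold


# Hypothetical postings from three different account names.
post_1 = "This stock will EXPLODE very soon, buy now"
post_2 = "Buy now, this stock will EXPLODE very soon"
post_3 = "Quarterly filing lists revenue and expenses"

print(likely_same_author(post_1, post_2), likely_same_author(post_1, post_3))
```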


Countermeasures: Negative Information

• Monitor Web Sites

– 5360 URLs with the phrase “Microsoft sucks”

– Use an intelligent agent (IA) to monitor

– Text mining for type of negative information

– Respond accordingly


Countermeasures: Unintentional Disclosure

• Carry out a CI project against yourself


Conclusions

• Reconcile “deep web” vs. “surface web”

• Determine when all pages are needed vs. “right” set of pages

• Automate “authoritative page selection”

– “Consumer Reports” type process

– e.g. posting a Web page in early 90s (Yahoo)

• Automate detection of

– false information

– inaccurate information

– negative information


Slides:

http://www.washburn.edu/cas/cis/boncella

E-mail:

[email protected]


References

Aaron, R. D. and Naylor, E. “Tools for Searching the ‘Deep Web’”, Competitive Intelligence Magazine, (4:4), Online at http://www.scip.org/news/cimagazine_article.asp?id=156. (date of access April 18, 2003).

Calishain, T. and Dornfest, R. (2003) Google Hacks: 100 Industrial-Strength Tips & Tools, Sebastopol, CA: O’Reilly & Associates.

Chakrabarti, S. (2003) Mining the Web: Discovering Knowledge from Hypertext Data, San Francisco, CA: Morgan Kaufmann.

Chen, H., Chau, M., and Zeng, D. (2002) “CI Spider: A Tool for Competitive Intelligence on the Web”, Decision Support Systems, (34:1) pp. 1-17.

Cybenko, G., Giani, A., and Thompson, P. (2002) “Cognitive Hacking: A Battle for the Mind”, IEEE Computer (35:8) August, pp. 50–56.

Dunham, M. H. (2003) Data Mining: Introductory and Advanced Topics, Upper Saddle River, NJ: Prentice Hall.

Fleisher, C. S. and Bensoussan, B. E. (2000) Strategic and Competitive Analysis, Upper Saddle River, NJ: Prentice Hall, 2003.

Fuld, L. (1995) The New Competitor Intelligence, New York: Wiley.

Herring, J. P. (1998) “What Is Intelligence Analysis?” Competitive Intelligence Magazine, (1:2), pp. 13-16. http://www.scip.org/news/cimagazine_article.asp?id=196


References

Kleinberg, J. M. (1999) “Authoritative Sources in a Hyperlinked Environment”, Journal of the ACM (46:5), pp. 604-632, September.

Krasnow, J. D. (2000) “The Competitive Intelligence and National Security Threat from Website Job Listings” http://csrc.nist.gov/nissc/2000/proceedings/papers/600.pdf. (date of access April 18, 2003).

Lyman, P. and Varian, H.R. (2000) “Internet Summary” Berkeley, CA: How Much Information Project, University of California, Berkeley, http://www.sims.berkeley.edu/research/projects/how-much-info/internet.html. (date of access April 18, 2003).

Murray, M. and Narayanaswamy, R. (2003) “The Development of a Taxonomy of Pricing Structures to Support the Emerging E-business Model of ‘Some Free, Some Fee’”, Proceedings of SAIS 2003, pp. 51-54.

Page, Lawrence, and Brin, Sergey, “The Anatomy of a Large-Scale Hypertextual Web Search Engine”, http://www-db.stanford.edu/~backrub/google.html, 1998. (date of access April 22, 2003).

Schneier, Bruce (2000) “Semantic Attacks: The Third Wave of Network Attacks”, Crypto-gram Newsletter, October 15, 2000, http://www.counterpane.com/crypto-gram-0010.html. (date of access April 18, 2003).

SCIP (Society of Competitive Intelligence Professionals) http://www.scip.org/. (date of access April 18, 2003).