Post on 16-Mar-2016

From Words to Meaning to Insight

Leximancer: Your First Analysis

Julia Cretchley & Mike Neal

Outline
• Getting started
• Creating projects and loading data
• Run the project
• Initial results interpretations
• The Concept Map

Getting Started

Help Button -->
• About --> Shows version of Leximancer
• Manual --> Access PDF Manual
• Contact --> Starts email

Projects
• Manage Projects --> Create folders under Leximancer Projects to organize your own projects.
• Create Project --> Create projects in current folder.
• Interviews --> Double click to open Project Panel

Planning Projects

Fast, first-cut analysis for pure discovery (grounded theory)
1. Load Data
2. Run steps with no editing or configuration
3. Examine results and explore data

Deliberate, planned analysis
1. Load Data
2. Set up custom configuration (tags, sentiment analysis)
3. Examine results; explore data; modify settings
4. Repeat 2 and 3

Project Control Panel

Four main interaction areas (numbered 1 to 4 on the slide), including:
• Current status
• Configure and option editors
• Reporting and Exploration Buttons

Steps to Analysis


Stages: Load Data

Data formats
• xls, csv, tsv for spreadsheet loading
• pdf, doc, docx, rtf, txt, html, xml, xhtml

Two options
1. Spreadsheet
2. Files and file folders of documents

Tags (briefly...)
• Organize data into folders or spreadsheet columns (automatic) by date or topic for Dashboard later

Stages: Run Project

What Did Leximancer Just Do?
• Split the text into sentences, paragraphs, and documents
• Divided the text into blocks of 2 sentences (by default)
• Identified proper nouns and multi-word (compound) names
• Removed non-lexical and weak semantic information (i.e., stop word list)
• Determined seed words via most frequent words and relationships
• Used seed words to build the coding dictionary (i.e., thesaurus)
• Used the thesaurus to code the text and tagged the blocks with the concepts they contain
• Measured co-occurrence between concepts
• Produced concepts, themes, final thesaurus
  • Statistics (frequencies, measurements)
  • Outputs (Dashboard only if configured)
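The blocking and co-occurrence steps above can be illustrated with a minimal Python sketch. It is not Leximancer's implementation: it treats plain words as stand-ins for concepts, skips seed-word selection and thesaurus learning entirely, and the stop word list, function names, and sample text are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stop word list (an assumption, not Leximancer's list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "it", "we", "with", "was", "in", "for"}

def sentence_blocks(text, block_size=2):
    """Split text into sentences, then group them into blocks of block_size."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [" ".join(sentences[i:i + block_size])
            for i in range(0, len(sentences), block_size)]

def content_words(block):
    """Lowercase words in a block with stop words removed."""
    return {w for w in re.findall(r"[a-z]+", block.lower()) if w not in STOP_WORDS}

def cooccurrence(blocks):
    """Count, per block, how often each word and each word pair appears."""
    counts, pairs = Counter(), Counter()
    for block in blocks:
        words = content_words(block)
        counts.update(words)
        pairs.update(combinations(sorted(words), 2))
    return counts, pairs

# Hypothetical sample text.
text = ("The printer jammed. The toner leaked. "
        "A technician fixed the printer. The toner is expensive.")
blocks = sentence_blocks(text)
counts, pairs = cooccurrence(blocks)
print(len(blocks), counts["printer"], pairs[("printer", "toner")])
```

Counting once per block rather than once per word occurrence mirrors the idea that a block is tagged with the concepts it contains.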

View Results

Concept Map and Concept Cloud are key interfaces

Activities analyst typically performs now
• Understand the initial run and data
• Explore thesaurus; links to actual data
• Look for concepts to merge, remove, or make compound
• Create Dashboard Report, export data; save map

Run analysis again; repeat as necessary

Concept Map
• Colored spheres are Themes
• Dots are concepts (size matters)
• Connections shown
• Control % of concepts, % of themes
• Rotate for better display
• Controls to toggle concept map, network display, center, zoom, save, export

Theme Summary
• Ranked list
• Examples
• more...

Concept Summary
• Ranked list
• Name-like
• Word-like

Concept Map
• Leximancer uses concept frequency and co-occurrence data to compile a matrix of concept co-occurrences
  • You can export this matrix to Excel for your own visualizations
• A statistical algorithm is then used to create a two-dimensional concept map based on the matrix
• Initially, concepts are dispersed randomly in the map space. Then the relationships between concepts act like attractive forces to guide concepts to their resting places.
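The "random start, then attractive forces" idea can be sketched as a generic force-directed layout. The slides do not specify Leximancer's algorithm, so this is only an illustration: the function name, the step size, the concept names, the weights, and in particular the small repulsion term (added so linked concepts do not collapse onto one point) are all assumptions.

```python
import math
import random

def layout(concepts, weights, steps=200, step_size=0.05, repulsion=0.05, seed=0):
    """Place concepts in 2D: co-occurrence weights pull pairs together,
    a mild repulsion (an assumption of this sketch) keeps them apart."""
    rng = random.Random(seed)
    # Start from random positions in the map space.
    pos = {c: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for c in concepts}
    for _ in range(steps):
        for a in concepts:
            fx = fy = 0.0
            for b in concepts:
                if a == b:
                    continue
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                dist = math.hypot(dx, dy) or 1e-9
                w = weights.get(frozenset((a, b)), 0.0)
                # Spring-like attraction minus short-range repulsion.
                force = w * dist - repulsion / dist
                fx += force * dx / dist
                fy += force * dy / dist
            pos[a][0] += step_size * fx
            pos[a][1] += step_size * fy
    return pos

def distance(pos, a, b):
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])

# Hypothetical concepts and weights: only one pair co-occurs.
concepts = ["donation", "redcross", "weather"]
weights = {frozenset(("donation", "redcross")): 1.0}
pos = layout(concepts, weights)
print(distance(pos, "donation", "redcross") < distance(pos, "donation", "weather"))
```

Concepts that co-occur strongly settle close together, while unrelated concepts drift apart, which is the behavior the map relies on when nearby concepts are read as related.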

Concept Cloud
• Concept relationships highlighted
• Colors are heat mapped (Themes)
• Rotate for better view
• Save Map/Export Image in case of new run

Concept Tab
• Top Name-like concepts at top (proper names by capital first letter)
• Click name and get ranked list of related concepts
• Count is number of times word (concept) appears in entire corpus (2-sentence blocks)
• Relevance takes the most frequent concept (Japan: 7010) as 100%. Divide counts by 7010 for percentages.
  • Shows proportionality (representative) relative to each other
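The relevance arithmetic described for the Concept Tab is a simple normalization by the top count. In this small Python sketch only the Japan count (7010) comes from the slide; the other concept names and counts are hypothetical, as is the function name.

```python
def relevance(counts):
    """Express each count as a percentage of the most frequent concept."""
    top = max(counts.values())
    return {concept: round(100 * n / top) for concept, n in counts.items()}

# Japan: 7010 is from the slide; the other entries are made up for illustration.
counts = {"Japan": 7010, "concept_a": 3505, "concept_b": 1402}
print(relevance(counts))  # {'Japan': 100, 'concept_a': 50, 'concept_b': 20}
```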

Concept Extraction: A Test!

"We use the laser 500 printer here at the office. We are pretty happy with it. Once there was a leak and all the toner spilled out of the machine, but a technician came out and fixed the problem for us. We still have to top the toner up often. The printer goes through ink quickly and the cartridges are expensive, but we put up with this because it delivers good results reliably. We are pleased with the quality of rinting we get. The laser 500 can batch process, and collate the pages to save us time. Sometimes paper gets jammed in the laser 500. Then we have to open it up to remove the crumpled paper. We have tried other machines in the past, but have not found an alternative that works better for us."

For printer concept:
• ____ occurrences from Leximancer
• ____ occurrences by ordinary keyword text search

____25

Terms highlighted in the passage: printer, printer, laser 500, laser 500, laser 500, toner, toner, machine, machines, rinting
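To make the contrast in this test concrete, the sketch below counts hits for the single keyword "printer" and then for the whole set of highlighted terms, used here as a stand-in thesaurus. It counts raw term occurrences, whereas Leximancer codes 2-sentence blocks, so the second number is not necessarily the figure Leximancer reports. The term list and function name are assumptions; the passage is copied from the slide, including its spelling "rinting".

```python
import re

# Passage copied from the slide, spelling preserved.
PASSAGE = (
    "We use the laser 500 printer here at the office. We are pretty happy with it. "
    "Once there was a leak and all the toner spilled out of the machine, but a "
    "technician came out and fixed the problem for us. We still have to top the "
    "toner up often. The printer goes through ink quickly and the cartridges are "
    "expensive, but we put up with this because it delivers good results reliably. "
    "We are pleased with the quality of rinting we get. The laser 500 can batch "
    "process, and collate the pages to save us time. Sometimes paper gets jammed in "
    "the laser 500. Then we have to open it up to remove the crumpled paper. We have "
    "tried other machines in the past, but have not found an alternative that works "
    "better for us."
)

# Highlighted terms from the slide, used as a stand-in thesaurus (assumption).
PRINTER_TERMS = ["printer", "laser 500", "toner", "machine", "machines", "rinting"]

def count_hits(text, terms):
    """Count whole-word, case-insensitive occurrences of each term."""
    lowered = text.lower()
    return {
        term: len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
        for term in terms
    }

keyword_hits = count_hits(PASSAGE, ["printer"])["printer"]
concept_hits = sum(count_hits(PASSAGE, PRINTER_TERMS).values())
print(keyword_hits, concept_hits)  # 2 10
```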

Select a Concept

Redcross clicked
• Lines drawn to related concepts
• Count is number of times concept is mentioned with Redcross. Example: donation, 196.
• So, of all comments about donation, 68% mention Redcross.
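The percentage in the donation example is a co-occurrence count divided by the concept's total count. The slide gives the co-occurrence count (196) and the result (68%) but not the total, so the total of 288 below is a hypothetical value chosen to be consistent with those two figures; the function name is also an assumption.

```python
def share_mentioning(co_count, concept_count):
    """Percentage of a concept's blocks that also contain the selected concept."""
    return 100 * co_count / concept_count

donation_with_redcross = 196  # from the slide
donation_total = 288          # hypothetical total, not given on the slide
share = share_mentioning(donation_with_redcross, donation_total)
print(round(share))  # 68
```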

Thesaurus
• Concepts here listed in alphabetical order
• Click concept to see thesaurus: evidence words describing concept.
• Score is z-score. Higher score is more relevant.

Higher relevance value means:

- Occur often in sentences containing the concept

- Rarely occur in sentences not containing the concept
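One statistic with both of these properties is a two-proportion z-score, sketched below. The slides do not give Leximancer's actual scoring formula, so this is a swapped-in textbook technique used purely for illustration; the function name, parameters, and sample numbers are assumptions.

```python
import math

def evidence_z(hits_with, n_with, hits_without, n_without):
    """Two-proportion z-score for an evidence word.

    hits_with / n_with: sentences containing the concept that also contain the word.
    hits_without / n_without: the same for sentences not containing the concept.
    Assumes the pooled proportion is strictly between 0 and 1.
    """
    p_with = hits_with / n_with
    p_without = hits_without / n_without
    pooled = (hits_with + hits_without) / (n_with + n_without)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_with + 1 / n_without))
    return (p_with - p_without) / std_err

# Hypothetical counts: a word common with the concept and rare elsewhere...
specific = evidence_z(40, 100, 10, 900)
# ...versus a word that is common everywhere.
generic = evidence_z(40, 100, 300, 900)
print(specific > generic)  # True
```

A word that shows up everywhere scores low even if it is frequent alongside the concept, which matches the two conditions listed above.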

Questions?