sad03 - natural language analysis

+ Systems Analysis and Design Natural Language Analysis

Post on 18-Oct-2014


Category:

Software



DESCRIPTION

This is a lecture on Systems Analysis and design. It focuses on natural language analysis.

TRANSCRIPT

Page 1: SAD03 - Natural Language Analysis

+

Systems Analysis and Design Natural Language Analysis

Page 2: SAD03 - Natural Language Analysis

+Introduction

It’s all fine and well to tell you how to draw class diagrams. Important and all, but not exactly a useful first step.

In order to know what classes you need, you must perform the appropriate amount of analysis. Look at the problem, work out how it relates to objects and classes.

Last week we talked about objects and classes. Classes are blueprints for objects.

We talked about attributes and behaviours. They belong to the class.

We talked about states. They belong to the object.
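As a quick reminder of that distinction, here is a minimal sketch. The `Zombie` class, its `health` attribute and the `take_damage` behaviour are illustrative assumptions, not part of the case study's design.

```python
class Zombie:
    """The class is the blueprint: it declares attributes and behaviours."""

    def __init__(self, name, health=10):
        self.name = name      # attribute declared by the class (hypothetical)
        self.health = health  # attribute declared by the class (hypothetical)

    def take_damage(self, amount):
        """A behaviour, defined once on the class and shared by every object."""
        self.health = max(0, self.health - amount)


# Two objects built from the same blueprint. Each one holds its own state.
shambler = Zombie("shambler")
crawler = Zombie("crawler")
shambler.take_damage(4)
```

After this runs, the two objects share the same attributes and behaviours but are in different states.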

Page 3: SAD03 - Natural Language Analysis

+The Problem Statement

Remember how in the first lecture we spoke about the importance of a problem statement? It comes back to haunt us now.

How do we go from a problem statement to a requirements specification? Lots of analysis.

We talk to people.

We look around to see what’s going on. We ask people what are the happy haps.

We use that to build a sensible description of what people need. Rather than what they initially asked for.

Page 4: SAD03 - Natural Language Analysis

+The Problem Domain

How do we know which parts of the problem domain are ours to worry about? In other words, how do we break down a problem domain into project scope?

There are various techniques for this. Some of them are just ‘gut feeling’.

Today we’re going to talk about a particular technique I find useful.

It’s used only as an example of how you can progress through the analysis. Its use is not mandatory.

Page 5: SAD03 - Natural Language Analysis

+Natural Language Analysis

We take the problem statement. Ideally, one refined by our discussions and observations.

We perform a process called natural language analysis on it.

We identify nouns, proper and common. We identify them separately.

We identify verbs.

We identify attributes.

We identify relationships.

And then we extract out all the meaning.

Page 6: SAD03 - Natural Language Analysis

+Natural Language Analysis

This is not a precise or especially accurate method. Our analysis will be first draft at best.

However, it is an effective technique for bridging the gap from ‘I know what must be done’ to ‘I know roughly how to model it’.

We will identify plenty of false positives.

We will identify plenty of things outside our scope.

Analysis is as much about your personal judgment as it is about tools and techniques.

Page 7: SAD03 - Natural Language Analysis

+Natural Language Analysis

We do the analysis in several passes: one for nouns, one for verbs, one for adjectives, and one for relationships.

We begin with the nouns. They form the core of our class diagram.

Common nouns are potential classes. We call them candidate classes.

Candidates for inclusion in our class diagram.

Proper nouns are potential objects.

Page 8: SAD03 - Natural Language Analysis

+Our First Example

Epitaph is a multiplayer text game that is set in the grim darkness of the zombie apocalypse. Players take on the role of survivors, and must battle to find food, weapons and supplies as well as fend off hostile zombies, feral animals and other survivors. They must battle against the weather and the environment, balancing their character’s wellbeing against their desire to improve skills, gain new commands, and earn new knacks.

Epitaph incorporates a random event generator, which throws new events at players based on their current situation – where they are, what they’re doing, the noises they are making, the light they are emitting, and so forth.

Page 9: SAD03 - Natural Language Analysis

+Find the Nouns

This is the first time we encounter our case study in the module. This is a simplified description, so that we can explore the idea of natural language analysis.

It serves as a reasonably refined problem statement. It is more readable and logical than many.

If you were working from first principles, this problem statement would have been derived from discussions with players and developers.

Let’s FIND THE NOUNS.

Page 10: SAD03 - Natural Language Analysis

+Common Nouns

Epitaph is a multiplayer text game that is set in the grim darkness of the zombie apocalypse. Players take on the role of survivors, and must battle to find food, weapons and supplies as well as fend off hostile zombies, feral animals and other survivors. They must battle against the weather and the environment, balancing their character’s wellbeing against their desire to improve skills, gain new commands, and earn new knacks.

Epitaph incorporates a random event generator, which throws new events at players based on their current situation – where they are, what they’re doing, the noises they are making, the light they are emitting, and so forth.

Page 11: SAD03 - Natural Language Analysis

+Proper Nouns

Epitaph is a multiplayer text game that is set in the grim darkness of the zombie apocalypse. Players take on the role of survivors, and must battle to find food, weapons and supplies as well as fend off hostile zombies, feral animals and other survivors. They must battle against the weather and the environment, balancing their character’s wellbeing against their desire to improve skills, gain new commands, and earn new knacks.

Epitaph incorporates a random event generator, which throws new events at players based on their current situation – where they are, what they’re doing, the noises they are making, the light they are emitting, and so forth.

Page 12: SAD03 - Natural Language Analysis

+Candidate Classes

This process gives us our list of candidate classes. They’re not all going to be winners.

From our analysis, we have the following candidates:

Game Darkness Apocalypse Player Survivor

Food Weapon Supply Zombie Animal

Weather Environment

Wellbeing Skill Knack

Command Generator Event Situation Noise

Light
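The noun pass can be sketched in a few lines. This is only an illustration: the nouns below were picked out of the problem statement by hand to mirror the candidate list above, not found by a real part-of-speech tagger, and the `singular` helper is a deliberately naive assumption that only copes with this word list.

```python
# Nouns picked out of the problem statement by hand (assumption: no tagger used).
COMMON_NOUNS = [
    "game", "darkness", "apocalypse", "players", "survivors",
    "food", "weapons", "supplies", "zombies", "animals",
    "weather", "environment", "wellbeing", "skills", "commands",
    "knacks", "generator", "events", "situation", "noises", "light",
]
PROPER_NOUNS = ["Epitaph"]

# Plurals the naive rule below would get wrong.
IRREGULAR = {"supplies": "supply"}


def singular(noun):
    """Naive singular form; good enough for this word list only."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]
    return noun


# Common nouns become candidate classes; proper nouns are potential objects.
candidate_classes = [singular(noun).capitalize() for noun in COMMON_NOUNS]
candidate_objects = list(PROPER_NOUNS)
```

The result is a first draft only. Pruning it still takes judgment.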

Page 13: SAD03 - Natural Language Analysis

+Managing Candidates

We now go through and prune candidate classes: obvious synonyms, the irrelevant, those outside our scope, and those that are too specific, too vague, or too abstract.

This gives us a more manageable list of candidate classes that we can later consider for inclusion in our class diagram. This is an iterative process, of course.

Page 14: SAD03 - Natural Language Analysis

+Get Rid of Synonyms

We look at likely synonyms that we can get rid of. Player and Survivor.

Players take on the role of survivors.

We look for overlapping cases. Supply versus Weapon/Food

Game Darkness Apocalypse Player Survivor

Food Weapon Supply Zombie Animal

Weather Environment

Wellbeing Skill Knack

Command Generator Event Situation Noise

Light

Page 15: SAD03 - Natural Language Analysis

+Get Rid of Synonyms

In the case of overlapping classes, we make a note that there is an implied relationship. Food is a supply. Weapon is a supply.

For the others, we pick one and discard the other.

Game Darkness Apocalypse Player

Food Weapon Supply Zombie Animal

Weather Environment

Wellbeing Skill Knack

Command Generator Event Situation Noise

Light
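The synonym pass can be sketched as follows. The names `SYNONYMS`, `IMPLIED_IS_A` and `drop_synonyms` are made up for this illustration. Which word of a synonym pair survives is a judgment call, and keeping Player over Survivor simply follows the list above.

```python
candidates = [
    "Game", "Darkness", "Apocalypse", "Player", "Survivor",
    "Food", "Weapon", "Supply", "Zombie", "Animal",
    "Weather", "Environment", "Wellbeing", "Skill", "Knack",
    "Command", "Generator", "Event", "Situation", "Noise", "Light",
]

# Discarded name -> the name we keep in its place.
SYNONYMS = {"Survivor": "Player"}

# Overlapping classes are not discarded; we only note the implied relationship.
IMPLIED_IS_A = [("Food", "Supply"), ("Weapon", "Supply")]


def drop_synonyms(names, synonyms):
    """Return the candidates with every discarded synonym removed."""
    return [name for name in names if name not in synonyms]


pruned = drop_synonyms(candidates, SYNONYMS)

for child, parent in IMPLIED_IS_A:
    print(f"{child} is a {parent}")
```

The noted relationships are candidates too. They may or may not end up as inheritance in the class diagram.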

Page 16: SAD03 - Natural Language Analysis

+Irrelevancies

Problem statements are often woolly. Because ‘precise’ writing is not entertaining writing.

As such, they often contain information that isn’t really relevant to us. It sets the scene. It provides context. It just veers off on a tangent.

We discard those things that are unlikely to be relevant to those of us developing a system. Just bear in mind that this is a first draft, and we need to be prepared to reassess that in the future.

Page 17: SAD03 - Natural Language Analysis

+Irrelevancies

Darkness?

Game?

Apocalypse?

Player

Food Weapon Supply Zombie Animal

Weather Environment

Wellbeing Skill Knack

Command Generator Event Situation Noise

Light
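Since this is a first draft, it helps to record why a candidate was dropped so it can be brought back later. A minimal sketch, assuming all three questioned candidates are discarded (the later adjective pass suggests they are). The `discard` and `reinstate` helpers and the wording of the reasons are hypothetical.

```python
candidates = [
    "Game", "Darkness", "Apocalypse", "Player",
    "Food", "Weapon", "Supply", "Zombie", "Animal",
    "Weather", "Environment", "Wellbeing", "Skill", "Knack",
    "Command", "Generator", "Event", "Situation", "Noise", "Light",
]

# Discarded name -> why we dropped it, so the decision can be revisited.
discarded = {}


def discard(name, reason):
    """Remove a candidate but keep a note of the reason."""
    candidates.remove(name)
    discarded[name] = reason


def reinstate(name):
    """Bring a discarded candidate back for a later draft."""
    del discarded[name]
    candidates.append(name)


discard("Darkness", "sets the scene")
discard("Game", "describes the whole system rather than a part of it")
discard("Apocalypse", "theme; important later, not for the first draft")
```

Calling `reinstate("Apocalypse")` in a later iteration puts it back on the list.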

Page 18: SAD03 - Natural Language Analysis

+Irrelevancies

The word ‘irrelevance’ is harsh. These things may be relevant to the project, just not relevant to our first draft of a class structure.

The fact that this game is apocalypse themed is going to be hugely important. But it’s not something we need to worry about just yet.

Some systems must concern themselves with everything to begin with.

This system has two main components: the Engine and the Content.

Page 19: SAD03 - Natural Language Analysis

+Outside our Scope

Some classes may fall outside the boundaries of our system. ‘This game will interact with an Apache web server to deliver dynamic content through the web’

When encountering classes like these, we make a note but we remove them from the diagram. They’re going to be external entities.

This problem statement doesn’t contain any. It’s entirely self-contained.

No need to prune anything at this stage.

Page 20: SAD03 - Natural Language Analysis

+Fuzzy Classes

In some cases, we may be dealing with classes that just don’t ‘feel’ right: not enough meat, too much meat, or too abstract a concept.

We don’t discard these, but we will often rename them. Sometimes merging two closely related classes together.

Such as Event and Situation, or Light and Noise.

This is all based on a judgement call. Consider it a kind of ‘exploring the solution space’

Page 21: SAD03 - Natural Language Analysis

+The Candidate Class List

The candidate class list serves as the basis for our next pass. Finding adjectives.

Adjectives often imply attributes. In order for an adjective to be useful, it must (presumably) be one of a range of options. Apocalypse is the noun.

Zombie apocalypse, swine flu apocalypse, alien apocalypse.

We make a note when we see adjectives attached to nouns, and consider whether there may be reason to incorporate an attribute to support it.

Page 22: SAD03 - Natural Language Analysis

+Adjectives

Epitaph is a multiplayer text game that is set in the grim darkness of the zombie apocalypse. Players take on the role of survivors, and must battle to find food, weapons and supplies as well as fend off hostile zombies, feral animals and other survivors. They must battle against the weather and the environment, balancing their character’s wellbeing against their desire to improve skills, gain new commands, and earn new knacks.

Epitaph incorporates a random event generator, which throws new events at players based on their current situation – where they are, what they’re doing, the noises they are making, the light they are emitting, and so forth.

Page 23: SAD03 - Natural Language Analysis

+Adjectives

Those adjectives that belong to nouns we have discarded are ignored. For now.

Look at what we have left: hostile zombies, feral animals, new commands, new knacks, random events.

Each of these implies something about the nouns.

Page 24: SAD03 - Natural Language Analysis

+Adjectives

Feral and hostile imply that Non Player Characters (NPCs) within the game may not necessarily attack players. There can be passive zombies. There can be domesticated animals.

If this were not the case, why would the adjective be required? Sloppy writing?

This implies in turn an attribute about NPCs: aggression.

Each of our NPCs will need to keep track of some kind of aggression determination.
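One way the aggression attribute might look, as a sketch only. The `Aggression` values, the shared `NPC` base class and the `will_attack` method are assumptions made for illustration. The problem statement tells us nothing about how aggression is actually determined.

```python
from enum import Enum


class Aggression(Enum):
    """Hypothetical range of options implied by 'hostile' and 'feral'."""
    PASSIVE = "passive"
    HOSTILE = "hostile"


class NPC:
    """Assumed common base for zombies and animals."""

    def __init__(self, name, aggression=Aggression.PASSIVE):
        self.name = name
        self.aggression = aggression  # the attribute the adjectives imply

    def will_attack(self):
        return self.aggression is Aggression.HOSTILE


class Zombie(NPC):
    pass


class Animal(NPC):
    pass


hostile_zombie = Zombie("shambler", Aggression.HOSTILE)
tame_dog = Animal("dog")  # a domesticated animal, passive by default
```

A two-value enumeration is the simplest thing that fits. A numeric scale would fit the problem statement equally well, which is exactly the kind of question to take back to the client.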

Page 25: SAD03 - Natural Language Analysis

+Adjectives

New commands and new knacks imply some kind of mechanism for earning these. We can’t tell what it is, but we can tell we will need to have something in place.

It also implies that players may not have access to particular commands. A model like World of Warcraft, as opposed to one like Quake.

As such, we are going to need to keep track of the commands our players have. As well as the full set of commands available in the game.
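A rough sketch of that bookkeeping follows. The command names and the `gain_command` and `can_use` methods are invented for illustration. How a command is actually earned is still unknown at this stage.

```python
# The full set of commands available in the game (names are hypothetical).
ALL_COMMANDS = frozenset({"look", "search", "barricade"})


class Player:
    def __init__(self, name):
        self.name = name
        self.commands = set()  # the commands this particular player has

    def gain_command(self, command):
        """Record a newly gained command; it must exist in the game."""
        if command not in ALL_COMMANDS:
            raise ValueError(f"unknown command: {command}")
        self.commands.add(command)

    def can_use(self, command):
        return command in self.commands


player = Player("drakkos")
player.gain_command("search")
```

Here the player can use `search` but not `barricade`, even though both exist in the game.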

Page 26: SAD03 - Natural Language Analysis

+Adjectives

‘Random events’ implies that some events may not be random. They may be time related or event related.

Thus, we need some kind of ‘delivery mechanism’ to go with events. They’ll need to be able to tell when they should be firing.

Thus, our random event generator is going to need to keep track of different categories of events. And work out how and when each should be called.
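A sketch of a generator that keeps events in categories. The category names (`random`, `timed`), the event names and the `register` and `pick` methods are all assumptions. The rules for when each category fires are left out because the problem statement does not give them.

```python
import random


class Event:
    def __init__(self, name, category):
        self.name = name
        self.category = category  # e.g. "random" or "timed" (hypothetical)


class EventGenerator:
    def __init__(self, rng=None):
        self.events_by_category = {}
        self.rng = rng or random.Random()

    def register(self, event):
        """File the event under its category."""
        self.events_by_category.setdefault(event.category, []).append(event)

    def pick(self, category):
        """Choose one event from a category, or None if it has no events."""
        events = self.events_by_category.get(category, [])
        return self.rng.choice(events) if events else None


generator = EventGenerator()
generator.register(Event("zombie horde", "random"))
generator.register(Event("stray dog", "random"))
generator.register(Event("nightfall", "timed"))
```

Working out how and when to call `pick` for each category is the ‘delivery mechanism’ we still need to ask about.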

Page 27: SAD03 - Natural Language Analysis

+Verbs

Verbs imply behaviours. Again, we ignore them when they link to a class that has been discounted as a candidate.

Behaviours in problem statements will rarely be specific enough to be directly useful. But they will imply a whole host of things that will follow.

We apply NLA once again in a further pass to tease out what behaviours are possible to know from the problem statement. At which point, we’re done with our draft of the analysis.

Page 28: SAD03 - Natural Language Analysis

+Verbs

Epitaph is a multiplayer text game that is set in the grim darkness of the zombie apocalypse. Players take on the role of survivors, and must battle to find food, weapons and supplies as well as fend off hostile zombies, feral animals and other survivors. They must battle against the weather and the environment, balancing their character’s wellbeing against their desire to improve skills, gain new commands, and earn new knacks.

Epitaph incorporates a random event generator, which throws new events at players based on their current situation – where they are, what they’re doing, the noises they are making, the light they are emitting, and so forth.

Page 29: SAD03 - Natural Language Analysis

+Verbs

As usual we discount synonyms. ‘Fend off’ and ‘Battle’

Verbs imply a lot about the functionality of the system. We need a combat system. We need a way of the generator ‘throwing’ events at players. Players can improve the skills that they have. Players can gain new commands (we knew that already). Players can earn knacks.

Is that a synonym? A judgment call, perhaps.

At the end of this process, we have a list of things that can serve as the basis for a rudimentary class diagram. More on that next week.
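The verb pass might leave us with behaviour stubs along these lines. Every method name and signature here is an assumption. The combat stub is left unimplemented on purpose, because the problem statement says nothing about how combat works.

```python
class Player:
    def __init__(self, name):
        self.name = name
        self.skills = {}     # skill name -> level
        self.knacks = set()

    def improve_skill(self, skill, amount=1):
        """From the verb 'improve'. How improvement is earned is unknown."""
        self.skills[skill] = self.skills.get(skill, 0) + amount

    def earn_knack(self, knack):
        """From the verb 'earn'."""
        self.knacks.add(knack)

    def battle(self, opponent):
        """From 'battle' and 'fend off', treated here as synonyms."""
        raise NotImplementedError("combat rules: a question for the client")


player = Player("drakkos")
player.improve_skill("scavenging")
player.improve_skill("scavenging")
player.earn_knack("quick hands")
```

Each stub that raises `NotImplementedError` marks a follow-up question rather than a design decision.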

Page 30: SAD03 - Natural Language Analysis

+What then?

NLA is not a one time process. You use it to build your understanding of a system.

Having teased out some candidate classes, you go back out to the client and ask follow-ups: ‘How do skills improve?’, ‘How does the combat system work?’, ‘How do we tell if NPCs are hostile?’

The answer to each of these questions will yield further detail. And this in turn will feed back into your NLA.

Page 31: SAD03 - Natural Language Analysis

+It’ll be done when it’s done.

There are diminishing returns to this kind of exercise. You can’t keep applying it in the hope you get a working system out of it in the end.

However, a few iterations around this process will yield considerable benefits.

It takes you from the general ‘What am I supposed to do with that nonsense?’ to the specific ‘I need to find out more about this specific nonsense’

As time goes by, the class diagram you develop will emerge as the architectural representation of your understanding.

Page 32: SAD03 - Natural Language Analysis

+Conclusion

Natural Language Analysis is a very useful tool. But it’s not mandatory.

It works by performing very simple linguistic extractions on a problem statement. Hauling out nouns, verbs and adjectives.

This serves as the basis for your first draft of the class diagram.

It’s an iterative process. You use it to understand the system. You ask questions based on what you uncover.