RedPen, a Document Checker

Takahiko Ito

Upload: recruit-technologies

Post on 25-May-2015

DESCRIPTION

An introduction to RedPen, an open source proofreading tool.

TRANSCRIPT

Page 1: RedPen, a document checker

RedPen, a Document Checker

Takahiko Ito

Page 2: RedPen, a document checker

Background: programming environment

Software engineers make use of many tools in the development of software.

Tools: CheckStyle, FindBugs, lint, Valgrind, CI, etc.

➔ Tools help maintain software quality.

Page 3: RedPen, a document checker

Background: writing situations

Software engineers write a large amount of natural-language documents.

Examples: manuals, tutorials, blog posts, specifications

Unfortunately, there is no handy tool for checking the quality of documents.

➔ The quality of documents does not improve.

Page 4: RedPen, a document checker

Motivation

Checking formatting issues can be done automatically.

Writers can concentrate on the contents of documents.

➔ We have made RedPen, a document checker.

Page 5: RedPen, a document checker

What is RedPen?

A validation tool for documents written in natural languages

E.g., English, Japanese, Chinese

Target: technical papers, manuals and so on.

Page 6: RedPen, a document checker

Function of RedPen

RedPen detects problems in input documents.

Problems:

Sentence Length

Inconsistency of terminology

Misspellings
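
To make "inconsistency of terminology" concrete, here is a minimal Python sketch. It is not RedPen's implementation (RedPen itself is written in Java); the function name `find_inconsistent_terms` and the rule it applies (the same word written with and without hyphens) are assumptions chosen only for illustration.

```python
import re
from collections import defaultdict


def find_inconsistent_terms(text):
    """Group words that differ only by hyphenation (hypothetical rule).

    Returns a dict mapping a normalized key to the sorted list of
    distinct spellings found, for keys with more than one spelling.
    """
    variants = defaultdict(set)
    for word in re.findall(r"[A-Za-z][A-Za-z-]*", text):
        lowered = word.lower()
        key = lowered.replace("-", "")  # "e-mail" and "email" share a key
        variants[key].add(lowered)
    return {key: sorted(forms) for key, forms in variants.items() if len(forms) > 1}


if __name__ == "__main__":
    sample = "Send an e-mail to the list. The email server is down."
    print(find_inconsistent_terms(sample))
```

A real checker would need a broader notion of "same term" (spacing, abbreviations, synonyms); this sketch only shows the shape of the problem.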

Page 7: RedPen, a document checker

Example: low quality text

Some of software works in more than one machines and such distributed software can handle large amount of data or works in severe environments because such software make use of much computer resources. In this paper we call a server works in a cluster as ‘instance.’ for example, in search engines or distributed databases, the fractions of indexes are stored in multiple instances.Such system need a component to merge the query results before the return the results to the users.

Too long sentence!

Small letter!

Need space!
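
The three annotations on the sample text correspond to checks that are simple to automate. The following Python sketch is an illustration under stated assumptions, not RedPen's code: the function name `check_text`, the 100-character limit, and the naive sentence splitting on `.`, `!` and `?` are all hypothetical.

```python
import re


def check_text(text, max_length=100):
    """Return (problem, sentence) pairs for three simple checks.

    max_length is a hypothetical limit in characters, not a RedPen default.
    Sentences are split naively at '.', '!' or '?'.
    """
    findings = []
    for match in re.finditer(r"[^.!?]+[.!?]+", text):
        raw = match.group()
        sentence = raw.strip()
        # Check 1: the sentence exceeds the length limit.
        if len(sentence) > max_length:
            findings.append(("too long sentence", sentence))
        # Check 2: the sentence starts with a lowercase letter.
        if sentence[0].islower():
            findings.append(("small letter", sentence))
        # Check 3: no whitespace between the previous end mark and this sentence.
        if match.start() > 0 and not raw[0].isspace():
            findings.append(("need space", sentence))
    return findings


if __name__ == "__main__":
    sample = "This works. for example, it fails.Such text needs a space."
    for problem, sentence in check_text(sample):
        print(f"{problem}: {sentence}")
```

The splitting is deliberately crude: abbreviations such as "e.g." or a period inside quotation marks would confuse it, which is one reason a dedicated tool is preferable to ad hoc scripts.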

Page 8: RedPen, a document checker

Features of RedPen

Handy configuration

Language independent

Page 9: RedPen, a document checker

Usage: RedPen

Users pick the checks (validators) they want to apply.

RedPen provides many validators

Page 10: RedPen, a document checker

Example of RedPen configuration

<validator-list>
  <validator name="SentenceLength" />
  <validator name="InvalidCharacter" />
  <validator name="SpellCheck" />
  <validator name="SectionLength" />
</validator-list>

Sentence length

Invalid character

Spell check
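
The configuration is a flat list of validator names, so it is easy to inspect programmatically. The Python sketch below reads the validator list from the slide using the standard library; it only illustrates the file's structure and is not how RedPen loads its configuration. The helper name `validator_names` is hypothetical, and the quotes are written as plain ASCII quotes, which XML requires.

```python
import xml.etree.ElementTree as ET

# The validator list from the slide, with plain ASCII quotes.
CONFIG = """
<validator-list>
  <validator name="SentenceLength" />
  <validator name="InvalidCharacter" />
  <validator name="SpellCheck" />
  <validator name="SectionLength" />
</validator-list>
"""


def validator_names(config_xml):
    """Return the validator names in the order they appear in the config."""
    root = ET.fromstring(config_xml.strip())
    return [element.get("name") for element in root.iter("validator")]


if __name__ == "__main__":
    for name in validator_names(CONFIG):
        print(name)
```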

Page 11: RedPen, a document checker

Available validators

SentenceLength

InvalidExpression

SpaceAfterPeriod

CommaNumber

WordNumber

SuggestExpression

InvalidCharacter

SpaceWithSymbol

KatakanaEndHyphen

KatakanaSpellCheck

SectionLength

ParagraphNumber

ParagraphStartWith

Page 12: RedPen, a document checker

Command

RedPen provides a simple command.

$ redpen -c config-file input

Supported formats: Markdown, Textile, PlainText

Page 13: RedPen, a document checker

Sample server

The sample server is launched by the following command.

$ java -jar redpen.war

Page 14: RedPen, a document checker

Demo

Page 15: RedPen, a document checker

Future work

The current version of RedPen focuses on simple functions.

In the future, RedPen will support more sophisticated and experimental functions proposed in research fields.

Provide a plugin system

Page 16: RedPen, a document checker

Summary

Introduction of RedPen

Validation tool for documents written in natural languages.

Usage:

Configurations

Handy command and server

Future work
