(12) unlted states patent (10) patent n0.2 us 7,451,155 b2 ... toolkits/the... · pp 6’ ' ar...

26
US007451155B2 (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 Slackman et a]. (45) Date of Patent: Nov. 11, 2008 (54) STATISTICAL METHODS AND APPARATUS 6,795,820 B2* 9/2004 Barnett ........................ .. 707/3 FOR RECORDS MANAGEMENT 6,912,549 B2 6/2005 Rotter et a1. 6,938,053 B2* 8/2005 Jaro ...................... .. 707/104.1 (75) Inventors: Richard Slackman, Shiloh, IL (US); lslkmlntczi’k et al' ~~~~~~ ~~~70771ig54mf 9 - . - - , , ung e a. ............. .. . K011105181’ €enlgkg(gss))’wllham 2002/0022956 A1* 2/2002 Ukrainczyk et a1. .......... .. 704/9 PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ............ .. 702/127 . 2002/0129021 Al* 9/2002 Brown ....................... .. 707/10 (73) Ass1gnee: AT&T Intellectual Property I, L.P., 2002/0l56793 A1 10/2002 Jam Reno, NV (Us) 2002/0161761 A1 10/2002 Forman etal. _ _ _ _ _ 2002/0188424 A1 12/2002 Grinstein et a1. ( * ) Not1ce: Subject to any disclaimer, the term of th1s 2003/0014420 A1 1/2003 lessee er a1, patent is extended or adjusted under 35 2003/0182304 A1 9/2003 Summerlin et a1. U.S.C. 15405) by 234 days. 2003/0233350 A1* 12/2003 Dedhia etal. ................ .. 707/3 2005/0060295 Al * 3/2005 Gould et a1. ................. .. 707/3 (21) APPLNQ; 11/245365 2005/0144184 A1 6/2005 Carus et a1. 2005/0149546 Al 7/2005 Prakash et a1. (22) Filed: 0&5 2005 2006/0101095 A1* 5/2006 Episale etal. ............. .. 707/204 , 2006/0106885 Al * 5/2006 Blumenau et a1. ......... .. 707/200 . . . 2006/0161423 Al * 7/2006 Scott et a1. . . . . . . . . . . . .. 704/10 (65) Pmr Pubhcatlon Data 2007/0174179 A1* 7/2007 Avery ........................ .. 705/37 US 2007/0088715 A1 Apr. 19, 2007 . . * cited by exammer (51) Int‘ Cl‘ Primary ExamineriYicun Wu G06F 17/30 (200601) (74) Attorney, Agent, or FirmiHanley, Flight & (52) us. Cl. ...................... .. 707/100; 707/101; 707/102 Zimmerman, LLC (58) Field of Classi?cation Search ................ .. 707/ 100 See application ?le for complete search history. (57) ABSTRACT (56) References Cited U.S. PATENT DOCUMENTS Statistical methods and apparatus for records management are disclosed. An example method for records management disclosed herein comprises classifying a record into at least one of a plurality of categories based on a statistical Value, and performing an operation With respect to the record according to a rule associated With the at least one of the plurality of categories, Wherein the operation is performed based on the statistical Value. 44 Claims, 13 Drawing Sheets 5,537,586 A * 7/1996 Amrametal. ............... .. 707/3 5,680,606 A * 10/1997 Nakajima m1. ............ .. 707/4 5,813,009 A 9/1998 Johnson et a1. 5,895,470 A * 4/1999 Pirolli et a1. 707/102 6,148,301 A * 11/2000 Rosenthal .................. .. 707/10 6,553,365 B1 4/2003 Summerlin et 31. 6,751,600 131* 6/2004 Wolin ........................ .. 706/12 6,782,390 B2 8/2004 Lee 851. /_21o 230 AL CATEGORY _ STQ/REFEC ASSESSOR DETERMINER ' : i - 1 l ' 220 ' 240 TATl T AL CATEGORY S VAEU'EC ASSESSOR DETERMINER ll 1 ‘y 250 270 STATISTICAL = 001582? —~» COMBINER : l | | |

Upload: phamhanh

Post on 04-Jul-2018

224 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US007451155B2

(12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 Slackman et a]. (45) Date of Patent: Nov. 11, 2008

(54) STATISTICAL METHODS AND APPARATUS 6,795,820 B2* 9/2004 Barnett ........................ .. 707/3 FOR RECORDS MANAGEMENT 6,912,549 B2 6/2005 Rotter et a1.

6,938,053 B2* 8/2005 Jaro ...................... .. 707/104.1

(75) Inventors: Richard Slackman, Shiloh, IL (US); lslkmlntczi’k et al' ~~~~~~ ~~~70771ig54mf 9 - . - - , , ung e a. ............. .. .

K011105181’ €enlgkg(gss))’wllham 2002/0022956 A1* 2/2002 Ukrainczyk et a1. .......... .. 704/9 PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ............ .. 702/127

. 2002/0129021 Al* 9/2002 Brown ....................... .. 707/10

(73) Ass1gnee: AT&T Intellectual Property I, L.P., 2002/0l56793 A1 10/2002 Jam Reno, NV (Us) 2002/0161761 A1 10/2002 Forman etal.

_ _ _ _ _ 2002/0188424 A1 12/2002 Grinstein et a1.

( * ) Not1ce: Subject to any disclaimer, the term of th1s 2003/0014420 A1 1/2003 lessee er a1, patent is extended or adjusted under 35 2003/0182304 A1 9/2003 Summerlin et a1. U.S.C. 15405) by 234 days. 2003/0233350 A1* 12/2003 Dedhia etal. ................ .. 707/3

2005/0060295 Al * 3/2005 Gould et a1. ................. .. 707/3

(21) APPLNQ; 11/245365 2005/0144184 A1 6/2005 Carus et a1. 2005/0149546 Al 7/2005 Prakash et a1.

(22) Filed: 0&5 2005 2006/0101095 A1* 5/2006 Episale etal. ............. .. 707/204 , 2006/0106885 Al * 5/2006 Blumenau et a1. ......... .. 707/200

. . . 2006/0161423 Al * 7/2006 Scott et a1. . . . . . . . . . . . .. 704/10

(65) Pmr Pubhcatlon Data 2007/0174179 A1* 7/2007 Avery ........................ .. 705/37

US 2007/0088715 A1 Apr. 19, 2007 . . * cited by exammer

(51) Int‘ Cl‘ Primary ExamineriYicun Wu G06F 17/30 (200601) (74) Attorney, Agent, or FirmiHanley, Flight &

(52) us. Cl. ...................... .. 707/100; 707/101; 707/102 Zimmerman, LLC (58) Field of Classi?cation Search ................ .. 707/ 100

See application ?le for complete search history. (57) ABSTRACT

(56) References Cited

U.S. PATENT DOCUMENTS

Statistical methods and apparatus for records management are disclosed. An example method for records management disclosed herein comprises classifying a record into at least one of a plurality of categories based on a statistical Value, and performing an operation With respect to the record according to a rule associated With the at least one of the plurality of categories, Wherein the operation is performed based on the statistical Value.

44 Claims, 13 Drawing Sheets

5,537,586 A * 7/1996 Amrametal. ............... .. 707/3

5,680,606 A * 10/1997 Nakajima m1. ............ .. 707/4

5,813,009 A 9/1998 Johnson et a1. 5,895,470 A * 4/1999 Pirolli et a1. 707/102 6,148,301 A * 11/2000 Rosenthal .................. .. 707/10

6,553,365 B1 4/2003 Summerlin et 31. 6,751,600 131* 6/2004 Wolin ........................ .. 706/12

6,782,390 B2 8/2004 Lee 851.

/_21o 230

AL CATEGORY _ STQ/REFEC ASSESSOR DETERMINER

' : i - 1 l ' 220 ' 240

TATl T AL CATEGORY S VAEU'EC ASSESSOR DETERMINER

ll 1

‘y 250 270

STATISTICAL = 001582? —~»

COMBINER : l | | |

Page 2: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 1 0113 US 7,451,155 B2

.223 ZOFODQOWE _\ .GE a .523 ZOFZmFmE

(a: v/looF

om?

Page 3: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 4: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 5: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 4 0f 13 US 7,451,155 B2

CATEGORY MAP SELECTOR / PARSER

RULE RETRIEVER

RULE COMBINER

RULE EVALUATOR

FIG. 4

Page 6: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 5 0113 US 7,451,155 B2

350

340

330\ \ r160 \ \ \ C1,L1,R

+__

v /-120

380

RETENTION UNIT

%510 FIG. 5

Page 7: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

Sheet 6 0f 13 US 7,451,155 B2 Nov. 11, 2008 US. Patent

.................................................... I |_ mowwmuoml mwwm?

?w?wmmmwml AI. w3<> All \ 1065mm

29555 n_<_>_ E0320

Q8 |\ 0% |\ o 6 |\ _ I I l l | | I l | | | l l l I l | I I | l | | l | I I l l | | | | | | | | l i I I \ | | I | | I | I I .I .I O2

Page 8: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 9: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent

BEGIN RECORD CLASSIFICATION

Nov. 11, 2008 Sheet 8 0f 13

800 /—

US 7,451,155 B2

YES

ADDITIONAL CATEGORIES?

812 /- 804 /- 808

ASSESS NEXT DETERMINE CATEGORY ’ LIKELIHOOD

ADDITIONAL YES CLASSIFIERS?

s24 -\

L?é??gt) STATISTICAL VALUES To CONICIIERJSION STATISTICAL REQUIRED? VALUES '

NO’ I, 828

STATISTICAL VALUE

COMBINING ENABLED?

NO

r 832

STATISTICAL COMBINE

VALUES

/— 836

DETERMINE CANDIDATE CATEGORIES

CATEGORY MAP

v 840

SIZE OF

RESTRICTED?

NO

/— 844

CAN DIDATE YES CATEGORIES

PRUNE

CATEGORY MAP DETERMINE

END RECORD CLASSIFICATION

FIG. 8

Page 10: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 9 0113 US 7,451,155 B2

Linguistic Assessment Numeric Statistical Value Absolutely in This Category 100% Very Likely in This Category 80%

Likely in This Category 50% Possibly in This Category 20% Unlikely in This Category 5%

Not in This Category 0%

FIG. 9

1010 —\ CATEGORY Cn STATISTICAL VALUE Ln] CLASSIFIER ACCURACY W1

,— 1030

\ CATEGORY C,, STATISTICAL VALUE

n I/V] + W2 n] VV] + W2 n2

1020 _\ / CATEGORY C,, STATISTICAL VALUE L,,2 CLASSIFIER ACCURACY W2

FIG. 10

Page 11: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 10 0113 US 7,451,155 B2

BEGIN RETENTION PROCESSING

1100 /- 1104 f

GET RECORD CATEGORY MAP

MULTIPLE CATEGORIES?

/- 1116

COMMON RULE COMBINE CATEGORIES WITH COMBINING COMMON RETENTION RULES ENABLED’? BASED ON STATISTICAL VALUES

/- 1120

SELECT RETENTION RULE(S) WITH HIGHEST STATISTICAL VALUE(S)

[- 112s

SELECT RETENTION RULE WITH LONGEST RETENTION PERIOD

NO

11:12T EVALUATE RETENTION PERIOD OF

SELECTED RETENTION RULE

/-1140 RETENTION PERIOD

EXCEEDED?

NO

FLAG RECORD FOR YES DESTRUCTION

4

II END RETENTION PROCESSING FIG. 11

Page 12: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 11 0113 US 7,451,155 B2

1210 —\ CATEGORY C1 STATISTICAL VALUE L1 RETENTION RULE R

/— 1230

\ CATEGORY cw2 STATISTICAL VALUE

LlvZ : L1 + L2 _L1L2

1220 —\ CATEGORY C 2 STATISTICAL VALUE L2 RETENTION RULE R

FIG. 12

Page 13: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 12 0113 US 7,451,155 B2

BEGIN PRODUCTION PROCESSING

/- 1304

GET RECORD CATEGORY MAP

1300 _\

RECORD CLASSIFIED INTO REQUESTED CATEGORY?

STATISTICAL VALUE

REQUIREMENT PRESENT?

/- 1316

COMPARE STATISTICAL VALUE TO REQUIREMENT

YES

1324 \

FLAG RECORD FOR PRODUCTION

FIG. 13

END PRODUCTION PROCESSING

Page 14: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US. Patent Nov. 11,2008 Sheet 13 0113 US 7,451,155 B2

MASS STORAGE

RANDOM ACCESS MEMORY H

H CODED

INSTRUCTIONS 1416

OUTPUT DEVICE(S)

H

H INTERFACE <—I>

/— 1420

READ ONLY MEMORY

1422? /— 1412

PROCESSOR

LOCAL MEMORY

1414

FIG. 14

Page 15: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US 7,451,155 B2 1

STATISTICAL METHODS AND APPARATUS FOR RECORDS MANAGEMENT

FIELD OF THE DISCLOSURE

This disclosure relates generally to records management, and, more particularly, to statistical methods and apparatus for records management.

BACKGROUND

Records management is employed by many entities, such as corporations, academic institutions, non-pro?t organiZa tions, etc., to manage records to meet regulatory and/or other compliance requirements. A typical records management program encompasses many procedures, including record identi?cation, classi?cation, retention, searching, production and destruction. Records management procedures may be dif?cult and/or expensive to implement, especially in cases in which the entity possesses and/ or processes a large number of records, utiliZes a complex record categorization scheme, employs personnel have a wide range of expertise to imple ment its records management program, etc. Known records management programs are usually based

on a ?le plan which, for example, in the case of a corporation, may be created by a corporate records manager in consulta tion with the corporation’s legal department/counsel. The ?le plan typically de?nes multiple categories into which a record may be classi?ed. Associated with each category are implied or explicit membership rules which de?ne membership in the particular category, and implied or explicit management rules which determine how records in the particular category should be managed. After the creation of the ?le plan, com pany records are classi?ed into one or more categories based on the membership rules of each category. After classi?ca tion, a record is managed under the records management program according to the management rules associated with the category or categories into which the record is classi?ed. As an example, an accounts receivable bill may be classi

?ed under a records management program into an accounts receivable category of an example ?le plan. Furthermore, the accounts receivable category may include a management rule which indicates that all accounts receivable records should be retained for a minimum number of years, such as, for example, seven years. In another example, a record may be classi?ed into both an accounts receivable category and an accounts payable category of an example ?le plan. In this example, a management rule associated with the accounts receivable category may indicate that all accounts receivable records should be retained for seven years, whereas another management rule associated with the accounts payable cat egory may indicate that all accounts payable records should be retained for ?ve years. In such a case, the record retention procedure of the records management program may require that the record be retained for the longer of the two periods, which would be seven years.

Accuracy is often an important concern of the entity imple menting a records management program. For example, records management programs designed to address federal regulations, such as the Sarbanes-Oxley Act or the Health Insurance Portability and Accountability Act (HIPAA), may require accurate records classi?cation and retention/destruc tion procedures to achieve desired compliance levels. How ever, the desired accuracy may be di?icult to achieve with existing records management programs because of limited ?exibility in the rules governing category membership and record management. For example, classi?cation of a record

20

25

30

35

40

45

50

55

60

65

2 into a particular category is typically a binary decision to either include or not include the record in the category. Addi tionally, in cases in which the record is classi?ed into multiple categories having different management rules, processing the record according to these multiple management rules is typi cally restricted to simply selecting one of the rules according to some predetermined criteria. Furthermore, another factor affecting the accuracy of existing records management pro grams is the variability in classi?cation results between two or more different people and/or tools performing the classi ?cation procedure and evaluating the category membership rules.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an example records manage ment system.

FIG. 2 is a block diagram of an example classi?er that may be used in the example records management system of FIG. 1.

FIG. 3 illustrates an example operation of the example classi?er of FIG. 2.

FIG. 4 is a block diagram of an example retention unit that may be used in the example records management system of FIG. 1.

FIG. 5 illustrates an example operation of the example retention unit of FIG. 4.

FIG. 6 is a block diagram of an example production unit that may be used in the example records management system of FIG. 1.

FIG. 7 illustrates an example operation of the example production unit of FIG. 2.

FIG. 8 is a ?owchart representative of example machine readable instructions that may be executed to implement the example classi?er of FIG. 2.

FIG. 9 illustrates an example conversion table that may be used to implement the example classi?er of FIG. 2.

FIG. 10 illustrates an example technique to combine sta tistical values that may be used to implement the example classi?er of FIG. 2.

FIG. 11 is a ?owchart representative of example machine readable instructions that may be executed to implement the example retention unit of FIG. 4.

FIG. 12 illustrates an example technique to combine sta tistical values that may be used to implement the example retention unit of FIG. 4.

FIG. 13 is a ?owchart representative of example machine readable instructions that may be executed to implement the example production unit of FIG. 6.

FIG. 14 is a block diagram of an example computer that may execute the example machine readable instructions of FIGS. 8, 11 and/or 13 to implement the example records management system of FIG. 1.

DETAILED DESCRIPTION

An example records management system 100 that may be used to implement a records management program is illus trated in FIG. 1. In the illustrated example, the records man agement system 100 supports classifying a record into at least one of a plurality of categories based on a statistical value. The records management system 100 also supports perform ing an operation with respect to the record according to a rule associated with the at least one of the plurality of categories, wherein the operation is performed based on the statistical value.

Example records that may be processed by the example records management system 100 include, but are not limited

Page 16: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US 7,451,155 B2 3

to, medical records, accounting records, legal opinions, insur ance claims, employment applications, payroll records, ship ping records, invoices, accounts receive and/or payable records, etc. Such example records may exist in a tangible form by comprising, for example, one or more paper docu ments, or the records may exist in an intangible form, for example, as data stored in an electronic or other suitable medium, or any combination thereof. Example rules by Which the example records management system 100 may perform one or more operations on a record include, but are not limited to, retention rules that govern the conditions under Which the record is retained, con?dentiality rules that govern to What extent the contents of the record may be disclosed, security rules that govern to What extent the record may be accessed and/ or distributed, etc.

The example records management system 100 of FIG. 1 includes a classi?er 110, a retention unit 120, a production unit 130 and a storage unit 140 to process records, such as record 150, according to a ?le plan. As discussed above, an entity, such as a corporation, academic institution, non-pro?t organization, etc., may create the ?le plan as part of a records management program, for example, to meet regulatory and/ or other compliance requirements.

The classi?er 110 of the example records management system 100 is con?gured to classify records, such as the record 150, into one or more categories de?ned by the ?le plan. The classi?er 110 begins the classi?cation procedure by assessing the contents of the record 150 against membership rules associated With each category. The assessment may be made, for example, by an automated tool capable of process ing the contents of the record 150 and comparing the contents to the membership rules, by one or more people assessing the contents of the record 150 manually, and/or any combination thereof. After performing an assessment for a particular cat egory, the classi?er 110 then determines a likelihood that the record 150 should be classi?ed into the particular category. Again, the likelihood may be determined by an automated tool, one or more people performing the assessment, and/or any combination thereof. Then, based on this likelihood, the classi?er 110 decides Whether to classify the record 150 into the particular category being assessed.

The classi?er 110 classi?es the record 150 into a particular category by determining a statistical value to represent the association of the record 150 With the category. The statistical value may be, for example, a probability, in Which case the likelihood resulting from the assessment procedure may rep resent the probability that the record 150 belongs in the cat egory based on assessing the record contents against the membership rules. Additionally or alternatively, the statisti cal value may be, for example, a con?dence level, in Which case the likelihood resulting from the assessment procedure may represent an expected accuracy that the assessment pro cedure correctly classi?ed the record 150 into the category. In the example ofFlG. 1, the classi?er uses a category map 160 to represent the association of the record 150 With the par ticular category. After the classi?er 110 ?nishes processing the record 150, the category map 160 represents the classi? cation of the record 150 into one or more categories de?ned by the ?le plan. The contents of the category map 160 are discussed in greater detail beloW in connection With FIG. 3. Example machine readable instructions 800 that may be executed to implement the classi?er 10 are discussed in the detailed description of FIG. 8 beloW.

The retention unit 120 of the example records management system 100 is con?gured to operate on the record 150 accord ing to one or more managements rules associated With the one or more categories into Which the record 150 is classi?ed to

20

25

30

35

40

45

50

55

60

65

4 determine Whether the record 150 should be retained or destroyed. A records management program may use different management rules to specify different retention periods for records falling into different categories. In such cases, a par ticular category may have an associated retention manage ment rule having one or more criteria (e.g., such as a duration) Which specify the retention period for a record in the particu lar category. Thus, the retention unit 120 may examine, for example, the category map 160 to determine into Which cat egory or categories the record 150 Was classi?ed and evaluate the retention management rule or rules associated With respective category or categories. Furthermore, the retention unit 120 may evaluate the statistical value for each category in the category map 160 to determine Whether the record 150 should be considered a member of the particular category for purposes of making a retention decision concerning the record 150. Additionally or alternatively, the retention unit 120 may employ one or more techniques to combine retention management rules if the record 150 is determined to be clas si?ed into more than one category. Example machine read able instructions 1100 that may be executed to implement the retention unit 120 are discussed in the detailed description of FIG. 11 beloW.

The production unit 130 of the example records manage ment system 100 is con?gured to operate on the record 150 according to one or more criteria of the records management program to determine Whether the record 150 should be pro duced for subsequent inspection and/ or use. The production unit 130 may be con?gured to produce the record 150, for example, in response to a request for a speci?c record, a query to produce records classi?ed into a particular category, the occurrence of an event, a litigation discovery request, etc. Furthermore, depending on the reason for determining Whether to produce the one or more records 150, the produc tion unit 130 may evaluate the statistical values included in the category map 160 of one or more records 150 to make its determination. For example, in the case of a query to produce records classi?ed into a particular category, the production unit 130 may examine the category map 160 of a particular record 150 to determine Whether the particular category is included in the category map 160. If the category is present, the production unit 130 may additionally evaluate the statis tical value associated With the category to determine Whether it meets one or more record production criteria (e.g., such as Whether the statistical value exceeds a threshold). Then, if the statistical value meets the one or more criteria, the production unit 130 may decide to produce the record 150 in response to the query. Example machine readable instructions 1300 that may be executed to implement the production unit 130 are discussed in the detailed description of FIG. 13 beloW.

The storage unit 140 of the example records management system 100 is con?gured to store category maps 160 corre sponding to records 150. For example, category maps 160 may be stored as one or more data ?les, as entries in one or

more databases, etc. Depending on the type of record 150, the storage unit 140 may be con?gured to also store the record 150 itself (e.g., such as in the case of an electronic record) or an indirect reference to the location of the record 150 (e.g., such as in the case of a paper record physically stored in a ?ling room or other storage area). The storage unit 140 is con?gured to associate, either explicitly or implicitly, a stored category map 160 With its corresponding record 150 to alloW other elements of the records management system 100 (e. g., such as the retention unit 120 and/or the production unit 130) to operate on a particular record 150 based on its category map 160.

Page 17: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US 7,451,155 B2 5

An example implementation of the classi?er 110 of the records management system 100 of FIG. 1 is shown in FIG. 2. The example classi?er 110 of FIG. 2 is con?gured to classify records subject to a records management program into one or more categories de?ned by a ?le plan. The classi?er 110 of FIG. 2 includes one or more category assessors, shoWn as category assessors 210-220. In some examples, the classi?er 110 may employ a single category assessor 210, such as in the case of record classi?cation performed automatically by a single automated tool or manually by a single person. In other examples, the classi?er 110 may employ multiple category assessors 210-220, such as in cases in Which automatic clas si?cationperforrned by an automatic tool is supervised and/or checked by a human operator, or in Which manual classi?ca tion performed by a person is supervised/ checked by another person, or in Which separate manual classi?cations are per formed by multiple people, etc. Persons of ordinary skill in the art Will appreciate that the multiple category assessors 210-220 may perform their respective assessments at sub stantially the same time (e. g., such as in the case of multiple assessments of a record Which are later combined) or at sub stantially different times (e.g., such as in the case of a ?rst assessment of a record With is later veri?ed by a second

assessment). A category assessor 210 or 220 examines or otherWise

operates on an input record to assess a likelihood of Whether the record should be classi?ed into one more categories de?ned by a ?le plan created to implement a records manage ment program. For example, a category assessor 210 imple mented by an automated tool may process and compare some or all of the contents of the input record to one or more membership rules associated With a particular category to determine a numeric likelihood assessment of Whether the record should be classi?ed into the particular category. The numeric assessment could be a correlation score from Zero to some maximum value representative of the correlation of some or all of the contents of the record to the category membership rules, With Zero indicating that the record con tents have no correlation With the category membership rules, and the maximum value indicating that the record contents have the strongest possible correlation With the category membership rules. In another example, a category assessor 220 implemented manually by a person may process and compare some or all of the contents of the input record to the one or more membership rules to determine a linguistic like lihood assessment of Whether the record belongs in the par ticular category. Example linguistic assessments include phrases such as “absolutely in this category,” “very likely in this category,” “likely in this category,” “possibly in this cat egory,” “unlikely in this category, not in this category,” etc. In yet another example, the classi?er 110 may include both a category assessor 210 implemented by an automated tool and yielding a numeric likelihood assessment and a category assessor 220 implemented manually by a person and yielding a linguistic likelihood assessment.

The example classi?er 110 of FIG. 2 includes one or more statistical value determiners 230-240. The statistical values determined by the statistical value determiners 230-240 rep resent the association betWeen the input record and the par ticular category as assessed by the category assessors 210 220. For example, in the case of a category assessor 210 implemented by an automatic tool and yielding a numeric assessment, the statistical value determiner 230 may be con ?gured to convert the numeric likelihood assessment indi cated by a correlation score into a probability that the record belongs in the particular category, a con?dence level associ ated With the correctness of classifying the record into the

20

25

30

35

40

45

50

55

60

65

6 particular category, etc. As another example, in the case of a category assessor 220 implemented manually by a person, the statistical value determiner 240 may be con?gured to map the linguistic likelihood assessment to a probability or con? dence level, or to ?rst convert the linguistic assessment to a numeric assessment and then convert the numeric assessment to a probability or con?dence level, etc. Persons of ordinary skill in the art Will recogniZe that statistical value determiners 230 and/or 240 may not be required if the category assessors 210 and/or 220 yield an assessment Which is already in the form of the expected statistical value, such as a probability, a con?dence level, etc. The example classi?er 110 also includes a statistical value

combiner 250. If the classi?er 110 includes multiple category assessors 210-220, the statistical value combiner 250 may be con?gured to combine the individual statistical values asso ciated With the multiple category assessors 210-220 and, for example, determined by the multiple statistical value deter miners 230-240 into a composite statistical value represent ing the association of the record to the particular category. Example techniques Which may be used by the statistical value combiner to combine statistical values include: averag ing the statistical values, performing a Weighted average of the statistical values With the Weights representative of a particular category assessor’ s expected accuracy, ranking the statistical values and choosing the median, the mode, the maximum value, the minimum value, etc., and/ or any other technique for combining statistical values. An example pro cedure for combining statistical values is discussed in more detail beloW in connection With FIG. 10. Additionally or alternatively, the statistical value combiner 250 may be con ?gured to process an input 260 Which provides the results of a previous classi?cation procedure such that the statistical value combiner 250 may update the results based on the current classi?cation operation. Alternatively, the classi?er 110 may not include a statistical value combiner 250. For example, the statistical value combiner 250 may be excluded if the results of each category assessor/ statistical value deter miner pair are maintained separately for processing by other processing units in the records management system 100. The example classi?er 110 of FIG. 2 includes a category

mapper 270, for example, to create and/ or update a category map associated With the record. As mentioned previously, the category map is used to represent the classi?cation of the input record into one or more categories de?ned by the ?le plan. In the example of FIG. 2, the category mapper 270 uses the composite statistical value output by the statistical value combiner 250 to determine Whether the particular category being assessed should be included in the category map for the input record. The category mapper 270 may include a par ticular category in the category map based on various criteria, such as, for example, if the category is associated With a non-Zero statistical value, if the statistical value associated With the particular category is not less than a threshold, if the statistical value exceeds a thresholds, etc. Additionally or alternatively, the category mapper 270 may be con?gured to limit the number of categories into Which an input record may be classi?ed. For example, the category mapper 270 may rank the statistical values for categories already included in the category map for the input record; add the particular category under consideration only if there is room in the category map and/or if the statistical value associated With the particular category exceeds the minimum statistical value already included in the category map; and, if necessary, remove the existing category having the minimum statistical value from the category map to make room for the particular category under consideration. An example category map Which may be

Page 18: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US 7,45l,l55 B2 7

created and/or updated by the category mapper 270 is dis cussed in greater detail below in connection With FIG. 3. An example operation of the example classi?er 110 is

depicted in FIG. 3. In the example of FIG. 3, the classi?er 110 operates on an input record 150 to produce a category map 160. As discussed above, the category map 160 represents the classi?cation of the record 150 into one or more categories of a ?le plan. The example category map 160 contains category map entries 310-320, one entry for each category into Which the record is classi?ed. Persons of ordinary skill in the art Will appreciate the many alternative formats for the category map 160 may be used to associate the record 150 With one or more categories.

The category map entry 310 of the category map 160 includes a particular category 330 into Which the record 150 is classi?ed and a statistical value 340 to represent the asso ciation betWeen the record 150 and the category 330. For example, and as discussed above, the statistical value 340 may be a probability that the record 150 belongs to the cat egory 330, or the statistical value 340 may be a con?dence level representative of the expected accuracy of classifying the record 150 into the category 330, etc. Additionally, the category map entry 310 may include and/ or reference one or more management rules 350 associated With the category 330. The management rules 350 may be indirect references/ pointers to a repository of possible management rules, the actual management rules speci?cally associated With the par ticular category 330, a combination thereof, etc. Similarly, the category map entry 320 of the category map 160 includes a particular category 360 into Which the record 150 is classi?ed and a statistical value 370 to represent the association betWeen the record 150 and the category 360. Additionally, the category map entry 320 may include and/ or reference one or more management rules 380 associated With the category 360.

After the classi?er 110 produces the category map 160, the record 150 may be processed according to any or all of the management rules 350 and 380 based on the respective sta tistical values 340 and 370. For example, the statistical value 340 may be evaluated to determine Whether the association betWeen the record 150 and the category 330 is suf?cient enough to consider the record 150 as classi?ed into the cat egory 330 and, therefore, subject to one or more of the man agement rules 350. Similarly, the statistical value 370 may be evaluated to determine Whether the association betWeen the record 150 and the category 360 is su?icient enough to con sider the record 150 as classi?ed into the category 360 and, therefore, subject to one or more of the management rules 380. Additionally, if the record 150 is suf?ciently associated With both the category 330 and the category 360, similar management rules 350 and 380 may need to be combined before the record 150 may be processed by that rule. For example, if both the management rules 350 and 380 contain a respective retention rule, the retention rules may need to be combined before determining Whether to retain or destroy the record 150. Additional examples of proces sing the record 150 according to the category map 160 are discussed beloW in connection With FIGS. 5 and 7. An example implementation of the retention unit 120 of the

records management system 100 of FIG. 1 is shoWn in FIG. 4. The example retention unit 120 of FIG. 4 is con?gured to process records subject to a records management program to determine Whether or not the records should be retained or destroyed. The retention unit 120 includes a category map selector/parser 410 to select and parse a category map (e.g., such as the category map 160 of FIG. 1) associated With a record to be processed. For example, the category map selec

20

25

30

35

40

45

50

55

60

65

8 tor/parser 410 may be invoked to select a category map asso ciated With a speci?ed record to alloW the retention unit 120 to determine Whether to retain or destroy the record. Addi tionally or alternatively, the category map selector/parser 410 may be invoked on a periodic basis (e.g., such as annually, quarterly, monthly, etc.) or an a periodic basis to select one or more category maps corresponding to one or more records governed by the records management program to determine Whether such record or records should be retained or destroyed. Additionally or alternatively, the category map selector/parser 410 may be invoked based on the occurrence of an event (e.g., exceeding a speci?ed storage capacity limit, classi?cation of a neW record, etc.) to select one or more category maps corresponding to one or more records to deter mine Whether such record or records should be retained to destroyed. After selecting a category map, the category map selector/parser 410 parses the entries of the category map (e.g., such as category map entries 310 and 320) for further processing by the retention unit 120.

The example retention unit 120 of FIG. 4 also includes a rule retriever 420 to retrieve one or more management rules identi?ed in the category map entry or entries provided by the category map selector/parser 410. For example, the rule retriever 420 may be con?gured to retrieve one or more man agement rules from a repository of possible management rules indirectly referenced by one or more pointers provided in the category map entry or entries. Additionally or altema tively, the rule retriever 420 may be con?gured to directly process one or more actual rules included in the category map entry or entries. The rule retriever 410 may also determine Whether or not to retrieve a particular rule provided by a category map entry based on the statistical value associated With the particular map entry. For example, the rule retriever may evaluate the statistical value against one or more criteria to determine Whether the association betWeen the record and the category corresponding to the entry is suf?cient to justify retrieval of the particular rule. Processing of category map entries by the rule retriever 420 is further discussed in the context of FIG. 5 beloW. The example retention unit 120 of FIG. 4 also includes a

rule combiner 430 to combine similar rules retrieved by the rule retriever 420 and included in tWo or more category map entries provided by the category map selector/parser 410. For example, the rule retriever 420 may combine tWo or more similar rules associated to tWo or more respective categories based on the statistical values associated With the respective categories to generate a composite rule having an associated composite statistical value. In the case of retention rule com bining, tWo or more rules may be considered similar if they have similar retention periods. Thus, these rules may be com bined into a single retention rule possessing a particular reten tion period (e.g., the longer of the similar retention periods) and having an associated composite statistical value based on combining the statistical values associated With the constitu ent category map entries. Combining of management rules by the rule combiner 430 is further discussed in the context of FIG. 5 beloW. Also, an example technique for combining similar management rules is illustrated in FIG. 12. The example retention unit 120 of FIG. 4 includes a rule

evaluator 440 to evaluate the record under consideration against rules retrieved by the rule retriever 420 (and possibly combined by the rule combiner 430) to determine hoW the record under consideration should be processed. In the case of evaluating records/retention rules, the rule evaluator 440 may make a decision regarding Whether the record under consid eration should be retained or destroyed and output such a decision for execution under the records management pro

Page 19: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US 7,451,155 B2

gram. An example rule evaluation procedure Which may be performed by the rule evaluator 440 is depicted in the context of the example machine readable instructions 1100 of FIG. 11 that may be executed to implement the retention unit 120.

Persons of ordinary skill in the art Will appreciate that the architecture depicted in FIG. 4 may be adapted to implement other record processing operations in addition to record reten tion. For example, most record processing operations Will require selection and parsing of the category map associated With a particular record (e.g., by the category map selector/ parser 410). Additionally, the category map Will probably include management rules governing the record processing operation of interest Which Will need to be retrieved (e.g., by the rule retriever 420) andpossibly combined (e.g., by the rule combiner 430). Finally, such management rules Will likely need to be evaluated in the context of the record processing operation to be performed (e. g., by the rule evaluator 440). An example operation of the example retention unit 120 is

depicted in FIG. 5. In the example of FIG. 5, the retention unit 120 operates on a category map 160 associated With an input record 150 to produce a decision 510 regarding Whether to retain or destroy the record 150. As discussed above, the category map 160 represents the classi?cation of the record 150 into one or more categories of a ?le plan. As discussed above in the context of FIG. 3, the example category map 160 contains category map entries 310-320, one entry for each category into Which the record is classi?ed. Persons of ordi nary skill in the art Will appreciate that many alternative formats for the category map 160 may be used to associate the record 150 With one or more categories. As discussed above in the context of FIG. 3, the category

map entry 310 of the category map 160 includes a particular category 330 into Which the record 150 is classi?ed and a statistical value 340 to represent the association betWeen the record 150 and the category 330. Additionally, the category map entry 310 may include and/or reference one or more management rules 350 associated With the category 330. The management rules 350 may be indirect references/pointers to a repository of possible management rules, the actual man agement rules speci?cally associated With the particular cat egory 330, a combination thereof, etc. Similarly, the category map entry 320 of the category map 160 includes a particular category 360 into Which the record 150 is classi?ed and a statistical value 370 to represent the association betWeen the record 150 and the category 360. Additionally, the category map entry 320 may include one or more management rules 380 associated With the category 360.

The example retention unit 120 of FIG. 5 determines Whether to retain or destroy the record 150 based on one or more retention rules included in the management rules 350 and 380 and according to the respective statistical values 340 and 370. For example, the rule retriever 420 of FIG. 4 may retrieve retention rules included in and/or referenced by the management rules 350 and 380. Additionally, the rule evalu ator 440 may evaluate the statistical value 340 to determine Whether the association betWeen the record 150 and the cat egory 330 is su?icient enough to consider the record 150 as classi?ed into the category 330 and, therefore, subject to a ?rst retention rule included in and/or referenced by the man agement rules 350. Similarly, the rule evaluator 440 may evaluate the statistical value 370 to determine Whether the association betWeen the record 150 and the category 360 is suf?cient enough to consider the record 150 as classi?ed into the category 360 and, therefore, subject to a second retention rule included in and/or referenced by the management rules 380. Furthermore, if the record 150 is suf?ciently associated With both the category 330 and the category 360, the rule

20

25

30

35

40

45

50

55

60

65

10 combiner 43 0 may need to combine the ?rst retention rule and the second rule included in and/or referenced by the manage ment rules 350 and 380, respectively, before evaluation by the rule evaluator 440. For example, if both the ?rst retention rule and the second retention rule include a similar retention period, the rule combiner 430 may combine these rules before the rule evaluator 440 determines Whether to retain or destroy the record 150. Finally, the retention unit 120 outputs a deci sion 51 0 regarding Whether to retain or destroy the record 150 based on the rule evaluation performed by the rule evaluator 440. An example implementation of the production unit 130 of

the records management system 100 of FIG. 1 is shoWn in FIG. 6. The example production unit 130 ofFIG. 6 is con?g ured to process records subject to a records management program to determine Whether or not the records should be produced for subsequent inspection and/or use. The produc tion unit 130 includes a category map selector/parser 610 (similar to the category map selector/parser 410 of FIG. 4) to select and parse a category map (e.g., such as the category map 160 of FIG. 1) associated With a record to be processed. For example, the category map selector/parser 610 may be invoked to select a category map associated With a speci?ed record to alloW the production unit 130 to determine Whether to produce the record. Additionally or alternatively, the cat egory map selector/parser 610 may be invoked on a periodic basis (e.g., such as annually, quarterly, monthly, etc.) or an a periodic basis to select one or more category maps corre sponding to one or more records governed by the records management program to determine Whether such record or records should be produced. Additionally or alternatively, the category map selector/parser 610 may be invoked based on the occurrence of an event (e.g., exceeding a speci?ed storage capacity limit, classi?cation of a neW record, etc.) to select one or more category maps corresponding to one or more

records to determine Whether such record or records should be produced. After selecting a category map, the category map selector/parser 610 parses the entries of the category map (e.g., such as category map entries 310 and 320) for further processing by the production unit 130. The example production unit 130 of FIG. 6 also includes a

statistical value processor 620 to process the statistical value or values associated With the one or more categories included in the category map provided by the category map selector/ parser 610. For example, the production unit 130 may be invoked in response to a query or event requiring production of records classi?ed into a particular category. In such cases, the statistical value processor 620 may process the statistical value associated With the particular category to determine Whether the record associated With the category map should be considered as being classi?ed into the particular category. The processing performed by the statistical value processor 620 may include comparing the statistical value (e.g., Which may be a probability, a con?dence level, etc.) to a threshold indicative of a required degree of association betWeen a record and a particular category for record production. The example production unit 130 of FIG. 6 also includes a

production evaluator 630 con?gured to evaluate the results of the statistical value processor to determine Whether to pro duce the record associated With the category map provided by the category map selector/parser 610. For example, if the statistical value processor 620 determines that the statistical value for the particular queried category exceeds the required threshold, the production evaluator 63 0 may output a decision indicating that the record should be produced. An example production evaluation procedure Which may be performed by the production evaluator 630 is depicted in the context of the

Page 20: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5

US 7,451,155 B2 11

example machine readable instructions 1300 of FIG. 13 that may be executed to implement the production unit 130. An example operation of the example production unit 130

is depicted in FIG. 7. In the example of FIG. 7, the production unit 130 operates on a category map 160 associated with an input record 150 to produce a decision 710 regarding whether to produce the record 150 for subsequent inspection and/or use. As discussed above, the category map 160 represents the classi?cation of the record 150 into one or more categories of a ?le plan. As discussed above in the context of FIG. 3, the example category map 160 contains category map entries 310-320, one entry for each category into which the record is classi?ed. Persons of ordinary skill in the art will appreciate the many alternative formats for the category map 160 may be used to associate the record 150 with one or more categories. As discussed above in the context of FIG. 3, the category

map entry 310 of the category map 160 includes a particular category 330 into which the record 150 is classi?ed and a statistical value 340 to represent the association between the record 150 and the category 330. Additionally, the category map entry 310 may include and/or reference one or more management rules 350 associated with the category 330. The management rules 350 may be indirect references/pointers to a repository of possible management rules, the actual man agement rules speci?cally associated with the particular cat egory 330, a combination thereof, etc. Similarly, the category map entry 320 of the category map 160 includes a particular category 360 into which the record 150 is classi?ed and a statistical value 370 to represent the association between the record 150 and the category 360. Additionally, the category map entry 320 may include and/or reference one or more management rules 380 associated with the category 360.

The production unit 130 may determine whether to pro duce the record 150 based on the category or categories 330 and 3 60 included in the category map 160 and their respective statistical values 340 and 370. For example, in response to a record production query including category 330, the statisti cal value processor 620 may process the statistical value 340 to determine whether the association between the record 150 and the category 330 is su?icient enough to consider the record 150 as classi?ed into the category 330. Similarly, in response to a record production query including category 360, the statistical value processor 620 may evaluate the statistical value 370 to determine whether the association between the record 150 and the category 360 is suf?cient enough to consider the record 150 as classi?ed into the cat egory 360. Then, the production evaluator 63 0 may determine whether the record should be produced based on whether the statistical value processor 620 indicates that the record is classi?ed into the particular queried category, such as cat egory 330 or 360. Finally, the production unit 130 outputs a decision 710 regarding whether to produce the record 150 based on the record production evaluation performed by the production evaluator 630.

Flowcharts representative of example machine readable instructions that may be executed to implement the classi?er 110 of FIG. 2, the retention unit 120 of FIG. 4 and the productionunit130 ofFlG. 6 are shown in FIGS.8,11 and 13, respectively. In these examples, the machine readable instruc tions represented by each ?owchart may comprise one or more programs for execution by: (a) a processor, such as the processor 1412 shown in the example computer 1400 dis cussed below in connection with FIG. 14, (b) a controller, and/or (c) any other suitable device. The one or more pro grams may be embodied in software stored on a tangible medium such as, for example, a ?ash memory, a CD-ROM, a ?oppy disk, a hard drive, a DVD, or a memory associated with

20

25

30

35

40

45

50

55

60

65

12 the processor 1412, but persons of ordinary skill in the art will readily appreciate that the entire program or programs and/ or portions thereof could alternatively be executed by a device other than the processor 1412 and/ or embodied in ?rmware or dedicated hardware in a well-known manner (e. g., imple mented by an application speci?c integrated circuit (ASIC), a programmable logic device (PLD), a ?eld programmable logic device (FPLD), discrete logic, etc.). For example, any or all of the classi?er 110, the retention unit 120 and/or the production unit 130 could be implemented by any combina tion of software, hardware, and/or ?rmware. Also, some or all of the machine readable instructions represented by the ?ow chart of FIGS. 8, 11 and/or 13 may be implemented manually. Further, although the example machine readable instructions are described with reference to the ?owcharts illustrated in FIGS. 8, 11 and 13, persons of ordinary skill in the art will readily appreciate that many other techniques for implement ing the example methods and apparatus described herein may alternatively be used. For example, with reference to the ?owcharts illustrated in FIGS. 8, 11 and 13, the order of execution of the blocks may be changed, and/ or some of the blocks described may be changed, eliminated, combined and/ or subdivided into multiple blocks.

Example machine readable instructions 800 that may be executed to implement the classi?er 110 of FIGS. 1 and 2 are shown in FIG. 8. The example machine readable instructions 800 operate to classify a record into one or more categories de?ned by a ?le plan used to implement a records manage ment program. The example machine readable instructions 800 may be executed each time a new record is presented for classi?cation, at predetermined intervals (such as, for example, a part of a larger control loop implementing a records management process), based on an occurrence of a predetermined event (such as, for example, after a predeter mined number of records have been presented to a records management system for classi?cation), etc., and/or any com bination thereof.

The machine readable instructions 800 begin execution at block 804 at which the classi?er 110 initiates a control loop to assess whether the input record should be classi?ed into each category in the ?le plan. For example, at block 804 the cat egory assessor 210 of FIG. 2 processes some or all of the contents of the record against the membership rules of the particular category under consideration. As discussed above, either or both of an automatic assessment or a manual assess

ment may be performed at block 804. After the particular category is assessed at block 804, control proceeds to block 808 at which the category assessor 210 determines a likeli hood of whether the record should be classi?ed into the par ticular category in the set of categories included in the ?le plan. As discussed above, the category assessor 210 may determine a numeric likelihood assessment of whether the record should be classi?ed into the particular category (e. g., such as a correlation score between Zero and a maximum

value) and/or a linguistic assessment (e.g., such as one of the phrases “absolutely in this category,” “very likely in this category,” “likely in this category,” “possibly in this cat egory,” “unlikely in this category,” “not in this category,” etc.) of whether the record should be classi?ed into the particular category. After determining the likelihood assessment of whether to classify the record into the particular category under consideration, control proceeds to block 812 at which the classi?er 110 determines whether there are additional categories in the ?le plan to assess. If there are additional categories to assess (block 812), control returns to block 804 and blocks subsequent thereto at which the next category is

Page 21: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 22: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 23: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 24: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 25: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5
Page 26: (12) Unlted States Patent (10) Patent N0.2 US 7,451,155 B2 ... Toolkits/The... · PP 6’ ' ar es’ 2002/0087286 A1* 7/2002 Mitchell ... v /-120 380 RETENTION UNIT %510 FIG. 5