think automation first

15
PDF Redactor UiPathTeam PDF Redactor Custom Activity Version 3.0 May 2021 Bernard Lawes [email protected]

Upload: others

Post on 24-Mar-2022

3 views

Category:

Documents


0 download

TRANSCRIPT

PDF RedactorUiPathTeam PDF Redactor Custom Activity

Version 3.0

May 2021

Bernard Lawes

[email protected]

2

Content

1. What Export Control is and why it is important

2. The legislation applicable to UiPath and the penalties

for not complying with it

3. Our obligations and sales responsibilities

4. The red flags you should be able to identify on your

own

5. Detailed 3rd Party Screening Process

6. Whom should you contact for further guidance

3

● Lack of Awareness of redaction protocols

● Manual, Tedious, and Error Prone

● Time-Consuming even with Desktop Software

● Batch Processing is not readily available

● Wastes lots of Paper (Environmental Impact)

● Mistakes can lead to costly compliance risks

Why is Redaction so challenging?

4

What does it do?

5

1

2

3 Reduce time required to safely publish or deliver requested documents

Ensure released documents are in compliance with data privacy laws

Fast and secure automated redaction without 3rd party software

Value Proposition

6

Digitize Extract Redact

Digitize OCR Activities Extraction Activities Redaction Activity

Redaction Plugin for Document Understanding

Leverage Studio’s available OCR

Libraries

Identify PII data using Regex, Form

Extractor, or Machine Learning

Irreversibly redact any text on the

document

7

Where to find help for PDF Redactor?

https://forum.uipath.com/t/pdf-redaction-

custom-activity/236846

https://connect.uipath.com/marketplace/co

mponents/pdf-redactor

Download Custom Activity Here

Forum Discussion and Examples

8

How does PDF Redactor work?

Leveraging OCR, a UiPath Robot locates and redacts sensitive data based on given Regex patterns and extracted data from UiPath Document Understanding.

The activity comes with pre-built regex patterns for items such as social security number, phone number, dates and others. In addition, the user can define their

own custom regex patterns as well.

Enabled to ingest ExtractionResults and DocumentObjectModels from Document Understanding, the activity can redact any text value in the document.

Moreover, the activity benefits from Document Understanding’s ever-growing extraction capabilities using Form Extraction, Machine Learning, and others

9

Arg Description Available Options Type Direction Examples

FileInput Input PDF File String In C:\temp\input.pdf

FileOutput Full path of output redacted PDF File String In/Out C:\temp\output.pdf

FormulaAutoPre-Built and auto-generated regular expressions for PII

data

• ssn

• phone

• email

• ein

• dates

• currency

String Array In {"ssn","ein","dates","currency","email","phone"}

Formula User-defined Custom Regular expression String In \d{6,}

Keywords String array of keywords to be redacted String Array In {“John”,”MacArthur”,”Jack”,”Hibbs”}

HighlightOnly To Highlight words, set to True; Leave False to Redact• True

• FalseString In

RedactColor Sets the color of the Redaction polygon or highlight System.Drawing.Color In System.Drawing.Color.Black

Silent Set to true to Hide Status Bar• True

• FalseBoolean In

OCREngine Sets the OCR Engine. Defaults to Omnipage

• OmniPage

• Google

• UiPath

String In “omnipage”

OCRAPIKey API Key for the selected OCR Engine String In

OCREndpoint Endpoint URL for the selected OCR Engine String In

PDFRedaction Activty Arguments

10

Arg Description Available Options Type Direction Example

FileInput Input PDF File String In C:\temp\input.pdf

FileOutput Full path of output redacted PDF File String In/Out C:\temp\output.pdf

FormulaAutoPre-Built and auto-generated regular expressions for PII

data

• ssn

• phone

• email

• ein

• dates

• currency

String Array In {"ssn","ein","dates","currency","email","phone"}

Formula User-defined Custom Regular expression String In \d{6,}

Keywords String array of keywords to be redacted String Array In {“John”,”MacArthur”,”Jack”,”Hibbs”}

HighlightOnly To Highlight words, set to True; Leave False to Redact• True

• FalseString In

RedactColor Sets the color of the Redaction polygon or highlight System.Drawing.Color In System.Drawing.Color.Black

Silent Set to true to Hide Status Bar• True

• FalseBoolean In

WaternarkFileFile to a PNG use as a watermark or logo on the

redacted document

• OmniPage

• Google

• UiPath

String In “omnipage”

WatermarkLocation Set the origin point for the Watermark to appear System.Drawing.Point In New System.Drawing.Point(0,0)

RedactFieldsRedactFields is an Array of the Fields from the existing

taxonomy, that you want to redact.String Array In {“name”,”address”,”invoice”,”dob”}

DocumentObjectModelDocumentObjectModel - Result from Document

Understanding Digitize ActivityDocumentObjectModel In

ExtractionResultExtractionResult > Result from the Document

Understanding Extraction Scope ActivityExtractionResult In

DU Redaction Plugin Activity Arguments

11

Limitations

Limitation Description

Contiguous Words Only The Basic PDFRedaction activity activity does not redact phrases, sentences, or any group of words like

paragraphs or characters separated by a space, tab, or carriage return. Regex patterns will match only contiguous

words.

To redact groups of words, please use the DU Redaction Plugin activity and specify the fields from the taxonomy to

be redacted

OCR Engines Available OCR Engines

• Google Vision OCR

• OMNIPage

• UiPath Document OCR

12

Dependencies

Library Description Version License Information

PDFSharp.ExtensionsExtension methods for PDFSharp to support and simplify some common operations

including image extraction.0.1.2.2

https://github.com/gheeres/P

DFSharp.Extensions/blob/ma

ster/LICENSE

PDFSharp.Standard 1.51.12

https://github.com/CLMSUK/

PDFsharp/blob/master/LICE

NSE

UiPath.OCR.ActivitiesThe UiPath.OCR.Activities package contains the "UiPath Screen OCR" and "UiPath

Document OCR" activities, that use UiPath's OCR engines.2.41

https://www.uipath.com/hubfs

/legalspot/UiPath_MSSA.pdf

UiPath.OmniPage.ActivitiesThe UiPath OmniPage Activities pack contains the OmniPage OCR activity, using the

Nuance OmniPage OCR Engine.1.50

UiPath.OmniPageBundle.ExtendedThe package contains the binaries needed by OmniPage OCR Engine.Powered by

OmniPage OCR.

UiPath.PDF.Activities 3.2.2

UiPath.System.ActivitiesPackage contains core activities which enable the automation of desktop applications,

browsers, and virtual machines. Please check the documentation for more details.20.4.0

UiPath.UIAutomation.Activities

Core activities which enable the robots to manipulate data tables and collections, work with

files and folders, communicate with Orchestrator. Package also contains workflow

operators, dialog forms, debugging and invoking methods.

20.4.2

UiPath.IntelligentOCR.Activities

Core activities that enable the usage of a complete document processing framework, from

taxonomy definition, digitization, document classification, data extraction, data validation

and classifier / extractor training.

4.5.2

UiPath.OCR.ActivitiesThe UiPath.OCR.Activities package contains the "UiPath Screen OCR" and "UiPath

Document OCR" activities, that use UiPath's OCR engines2.40

UiPathTeam.StatusProgress.Activities Status Message and Progress Bar that can be positioned anywhere on the screen. 1.02

13

Where to find help for PDF Redactor?

https://forum.uipath.com/t/pdf-redaction-

custom-activity/236846

https://connect.uipath.com/marketplace/co

mponents/pdf-redactor

Download Custom Activity Here

Forum Discussion and Examples

https://www.uipath.com/hubfs/legalspot/UiP

ath_Activity_License_Agreement.pdf

License Agreement

14

➢ UiPath Document Understanding webpage ← for trial and

more resources

➢ How AI Can Continuously Improve and Scale Automations

(webinar with customers sharing case studies)

➢ Guide on Document Understanding (white paper)

➢ Academy training

➢ Documentation on Document Understanding framework &

extractors and ML model training via AI Fabric

➢ AI & RPA webpage

Where can you find out more?

15

www.uipath.com

Bernard Lawes

Pre-Sales Engineer

[email protected]