Techniques for Manipulating Text (and why that’s useful) presented by Evan Schiff

Post on 01-Apr-2020

What do these files have in common?

ALE

STL Subtitles

SubCap

Avid Bin Export

FCP XML

They are all plain text.

You Can Do A Lot With Plain Text

Reformat one type of file into another

Add, remove, or fix data generated by other applications

Quickly hunt down a specific piece or pattern of information within a large document

Make changes en masse to avoid time-consuming manual adjustments

Parse it using a Terminal command or script

Some Real World Examples

Convert an EDL with Locators into a SubCap file for importing back into Avid

Convert Avid Locators into a DVDSP or Compressor Chapter Markers file

Create an EDL out of data in your FileMaker codebook

Batch rename files with advanced substitution patterns

Process a list of missing Pro Tools media to automatically hunt it down

What Tools Do You Need?

A good text editor.

TextMate ($64)

Atom (Free)

Sublime Text ($70)

(Not TextEdit or Notepad.)

Starting Out

A) What type of data do you have?

B) What type of data do you need?

How do you turn A into B?

Starting Out

Get familiar with everything a text editor can do.

Learn to navigate using the keyboard.

Experiment.

After Learning the Basics

Learn Regular Expressions (RegEx)

Learn a programming language such as Python, JavaScript, or Bash.

Combine scripting with RegEx to accomplish complex tasks

Constantly reassess your workflow to find faster and easier methods

If you come up with something cool, share it!

Text Manipulation Without Regular Expressions

Multiple Cursors Demo

Most of the time when we want to manipulate text, we want to change a lot of it all at once

One way to do that is with Multiple Cursors

Let’s look at what that is, and how it can be useful

Multiple Cursors Demo

Cmd-F: Find

Opt-Enter: Find All, add a Cursor at every occurrence

Cmd-Click Text: Add a Cursor manually

Cmd-Shift-L: For every line of selected text, add a Cursor

OS X Keyboard Navigation

• Cmd ←/→: Go to start/end of line
• Cmd ↑/↓: Go to top/bottom of document
• Opt ←/→: Go to previous/next word
• Add Shift to select or unselect text

Windows Keyboard Navigation

• Home/End: Go to start/end of line
• Ctrl Home/End: Go to top/bottom of document
• Ctrl ←/→: Go to previous/next word
• Add Shift to select or unselect text

Text Manipulation Using Regular Expressions

What are Regular Expressions?

Regular Expressions (RegEx) are a way to define patterns of text

They enable you to find and select text that matches those patterns

And with that matching text selected you can then change it to suit your needs
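To make the find-then-change idea concrete, here is a minimal Python sketch. The sample text (timecodes typed with periods instead of colons) is made up for illustration and is not from the slides.

```python
import re

# Hypothetical sample: two timecodes typed with periods instead of colons.
text = "IN 01.02.03.04 OUT 01.02.08.10"

# The pattern: four pairs of digits separated by literal periods.
# Each pair is wrapped in parentheses so it can be reused in the replacement.
pattern = re.compile(r"(\d{2})\.(\d{2})\.(\d{2})\.(\d{2})")

# Step 1: find every piece of text that matches the pattern.
found = [m.group(0) for m in pattern.finditer(text)]

# Step 2: change the matches, rebuilding each one with colons.
fixed = pattern.sub(r"\1:\2:\3:\4", text)

print(found)
print(fixed)
```

The same search and replacement strings can usually be pasted into a text editor's Find & Replace panel with RegEx mode switched on, although some editors write the captured groups as $1 instead of \1.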

What Does It Look Like?

(\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n[\s\S]*?(?=^\d{3}|^>|^$))

Wait, what the #\.$*^# is that?!

Hold it, Hold it, What the hell is that shit?!

RegEx is made up of codes

Each code represents a set of characters such as letters, numbers, and $#!*

When written in a specific sequence,

they define a pattern of text.

Let’s take a closer look.

What does RegEx look like?

Simple:

Timecode: \d{2}:\d{2}:\d{2}:\d{2}
Example: 01:02:03:04

E-Mail Address: [\w.-]+@[A-Za-z0-9.-]+\.[A-Z]{2,4}
Example: [email protected]

Complex:

EDL Event: (\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n[\s\S]*?(?=^\d{3}|^>|^$))
Reference: 003 08P013V V C 13:31:20:14 13:31:28:16 04:00:42:22 04:00:51:00

Locator in an EDL: \* LOC.*[\d:]{11}\s+([\w]+) +\b([^\r\n]*?)\r?\n
Reference: * LOC: 04:00:47:20 BLUE CS0020
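The patterns above can be tried out in Python as well as on a testing site. The sketch below is only an illustration: the event 003 and locator lines are the reference lines from the slide, while the event 004 line and the e-mail address are made up so the example has something to run against. Two assumptions are worth calling out: the EDL Event pattern needs multi-line mode so that ^ can match at the start of each line, and the e-mail pattern only accepts a lowercase ending such as .com when case-insensitive matching is switched on.

```python
import re

TIMECODE = re.compile(r"\d{2}:\d{2}:\d{2}:\d{2}")
EMAIL = re.compile(r"[\w.-]+@[A-Za-z0-9.-]+\.[A-Z]{2,4}", re.IGNORECASE)
EDL_EVENT = re.compile(
    r"(\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n[\s\S]*?(?=^\d{3}|^>|^$))",
    re.MULTILINE,  # lets ^ and $ match at every line, not just the whole text
)
LOCATOR = re.compile(r"\* LOC.*[\d:]{11}\s+([\w]+) +\b([^\r\n]*?)\r?\n")

# Event 003 and its locator come from the slide; event 004 is hypothetical.
edl = (
    "003 08P013V V C 13:31:20:14 13:31:28:16 04:00:42:22 04:00:51:00\n"
    "* LOC: 04:00:47:20 BLUE CS0020\n"
    "004 08P014V V C 13:40:00:00 13:40:02:00 04:00:51:00 04:00:53:00\n"
)

timecode_ok = TIMECODE.fullmatch("01:02:03:04") is not None
email_ok = EMAIL.fullmatch("editor@example.com") is not None  # made-up address

# The event match covers the event line plus everything up to the next event.
event = EDL_EVENT.search(edl)
record_in, record_out = event.group(2), event.group(3)

# Search for the locator inside the text of that one event.
locator = LOCATOR.search(event.group(1))
color, comment = locator.group(1), locator.group(2)

print(timecode_ok, email_ok)
print(record_in, record_out, color, comment)
```

Running the locator pattern against the text of a single event, rather than the whole file, keeps each locator tied to the event it belongs to.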

What are some other patterns you can think of?

Let’s test them with Rubular.com

RegEx Structure

Regular Expressions consist of character codes and quantifiers

Or in other words, what is the character you are looking for and how many times do you expect to see it?

Search Criteria            Code       Example

3 Digits:                  \d{3}      315

A word of any length:      \w+        Avid

Zero or one ‘u’:           colou?r    color or colour

RegEx Symbols

Symbol       What It Represents                                  Such As

.            Any character except line break                     A-Z 0-9 Special Characters

\d           Any digit                                           0-9

\s           Whitespace (spaces, tabs, line breaks)

\t           Tab

\n and \r    Line Break: \n is Mac/Linux, \r\n is Windows

\w           Any character that could be part of a word          A-Z 0-9 _

[ ]          Match the letters or symbols inside the brackets    [DEF] matches only the letters D, E, or F
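A quick way to get a feel for these symbols is to run each one against a short string and look at what it picks out. This Python sketch uses a made-up sample string containing a space, a tab, and a Windows-style line break.

```python
import re

# Hypothetical sample: space, tab, and a Windows line break (\r\n) included.
sample = "Reel 4A\tV1\r\nTake_2"

digits = re.findall(r"\d", sample)           # every single digit
words = re.findall(r"\w+", sample)           # runs of word characters
whitespace = re.findall(r"\s", sample)       # space, tab, \r and \n
tabs = re.findall(r"\t", sample)             # only the tab
lines = re.split(r"\r?\n", sample)           # split on Mac/Linux or Windows breaks
letters_def = re.findall(r"[DEF]", "ABCDEFG")  # only D, E, or F
no_line_break = re.findall(r".", "a\nb")     # . skips the line break

print(digits, words, lines, letters_def, no_line_break)
```

Writing the line break as \r?\n means "an optional \r followed by \n", so the same pattern works on files saved on either platform.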

RegEx Quantifiers

Symbol    What It Represents

?         Zero or one occurrence

+         One or more occurrences

*         Zero or more occurrences

{2}       Exactly 2 occurrences

{2,5}     Between 2 and 5 occurrences
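Each quantifier can be checked by giving it a few candidate strings and seeing which ones match in full. In this Python sketch the helper name keep and the candidate strings are made up for illustration.

```python
import re

def keep(pattern, candidates):
    """Return only the candidates that the pattern matches from start to end."""
    return [text for text in candidates if re.fullmatch(pattern, text)]

zero_or_one = keep(r"colou?r", ["color", "colour", "colouur"])
one_or_more = keep(r"ab+", ["a", "ab", "abbb"])
zero_or_more = keep(r"ab*", ["a", "ab", "abbb"])
exactly_two = keep(r"\d{2}", ["7", "07", "007"])
two_to_five = keep(r"\d{2,5}", ["7", "42", "12345", "123456"])

print(zero_or_one, one_or_more, zero_or_more, exactly_two, two_to_five)
```

A quantifier always applies to whatever comes immediately before it, which is why colou?r makes only the u optional.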

Some Real World Demos

Convert an EDL with Locators into a SubCap file for importing back into Avid

Convert Avid Locators into an FCP or Compressor Chapter Markers file

Batch rename files with advanced substitution patterns
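The demos themselves are not captured in this transcript, so the following is only a sketch of what a batch rename with a substitution pattern might look like in Python. The naming scheme (shot_7_v2.mov becoming SHOT007_v02.mov) and the function names new_name and batch_rename are assumptions made up for illustration.

```python
import re
from pathlib import Path
from typing import List, Optional, Tuple

# Hypothetical naming scheme: shot_<number>_v<version>.<extension>
PATTERN = re.compile(r"^shot_(\d+)_v(\d+)(\.\w+)$")

def new_name(old_name: str) -> Optional[str]:
    """Build the new file name, or return None if the name doesn't fit the pattern."""
    match = PATTERN.match(old_name)
    if match is None:
        return None
    shot, version, extension = match.groups()
    # Zero-pad the captured numbers: 7 -> 007, 2 -> 02.
    return f"SHOT{int(shot):03d}_v{int(version):02d}{extension}"

def batch_rename(folder: Path, dry_run: bool = True) -> List[Tuple[str, str]]:
    """List (old, new) name pairs; only touch the files when dry_run is False."""
    plan = []
    for path in sorted(folder.iterdir()):
        target = new_name(path.name)
        if path.is_file() and target is not None and target != path.name:
            plan.append((path.name, target))
            if not dry_run:
                path.rename(path.with_name(target))
    return plan
```

Calling batch_rename(Path("some_folder")) with the default dry_run=True only reports what would change, which is a sensible first step before renaming anything for real.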

When are RegEx Not Useful?

When you don’t know or can’t clearly define a pattern

If it’s faster to make the changes by hand than figure out what the pattern is

When there’s no variation in what you’re searching for

Sometimes a normal Find & Replace is all you need

When there’s too much variation, don’t try the all-in-one approach. Maybe you can break it down into multiple smaller patterns
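Two of these points are easy to show side by side in Python. The sample strings below are made up: the first needs nothing more than a plain Find & Replace, and the second is cleaned up with two small patterns run one after the other instead of one all-in-one pattern.

```python
import re

# No variation in what we're searching for: a plain replace is enough.
plain = "TK 1, TK 2, TK 3".replace("TK", "Take")

# Too much variation for one tidy pattern: mixed separators and missing zeros.
messy = "1.2.3;04"

# Pass 1: turn every period or semicolon into a colon.
step_one = re.sub(r"[.;]", ":", messy)

# Pass 2: put a leading zero in front of any digit standing on its own.
step_two = re.sub(r"\b(\d)\b", r"0\1", step_one)

print(plain)
print(step_one)
print(step_two)
```

Each pass is simple enough to check by eye, which is usually easier than debugging a single pattern that tries to handle every case.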

So What’s the Next Step?

Shell Scripting

What is Shell Scripting?

Shell scripting uses a programming language

to execute a series of commands

in order to accomplish a more complex task

Pro Tools Example

In 40 lines of code, and using regular expressions,

this script takes a text file of media that Pro Tools can’t find,

locates it, and copies it to a directory on your desktop.
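The script itself is not reproduced in this transcript, so what follows is only a rough sketch of the same idea, written in Python rather than as a shell script. Everything specific here is an assumption: that the text file lists one missing file per line, that the names end in .wav, .aif, or .aiff, and the function names missing_names and collect_media.

```python
import re
import shutil
from pathlib import Path
from typing import List, Set

# Assumption: a media name is any run of characters without slashes, colons,
# tabs, or line breaks that ends in .wav, .aif, or .aiff.
MEDIA_NAME = re.compile(r"[^\\/:\t\r\n]+\.(?:wav|aiff?)\b", re.IGNORECASE)

def missing_names(report_text: str) -> Set[str]:
    """Pull one media file name out of each line of the report, if there is one."""
    names = set()
    for line in report_text.splitlines():
        match = MEDIA_NAME.search(line)
        if match:
            names.add(match.group(0).strip())
    return names

def collect_media(report_text: str, search_root: Path, destination: Path) -> List[str]:
    """Copy every listed file found under search_root into destination."""
    wanted = {name.lower() for name in missing_names(report_text)}
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in search_root.rglob("*"):
        if destination in path.parents:
            continue  # don't re-copy files already sitting in the destination
        if path.is_file() and path.name.lower() in wanted:
            shutil.copy2(path, destination / path.name)
            copied.append(path.name)
    return sorted(copied)
```

A call might look like collect_media(Path("missing.txt").read_text(), Path("/Volumes"), Path.home() / "Desktop" / "Found Media"), where the file name and both folders are placeholders to adjust for your own system.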

How Do I Learn?

Google it!

Check out sites like Codecademy, Code Avengers, Khan Academy, etc.

Pick a language to learn,

and of the many languages out there, I would probably start with Python

Thanks!