Post on 01-Apr-2020
Techniques for Manipulating Text
(and why that’s useful)
presented by Evan Schiff
What do these files have in common?
ALE, STL Subtitles, SubCap, Avid Bin Export, FCP XML
They are all plain text.
You Can Do A Lot With Plain Text
Reformat one type of file into another
Add, remove, or fix data generated by other applications
Quickly hunt down a specific piece or pattern of information within a large document
Make changes en masse to avoid time-consuming manual adjustments
Parse it using a Terminal command or script
Some Real World Examples
Convert an EDL with Locators into a SubCap file for importing back into Avid
Convert Avid Locators into a DVDSP or Compressor Chapter Markers file
Create an EDL out of data in your Filemaker codebook
Batch rename files with advanced substitution patterns
Process a list of missing ProTools media to automatically hunt it down
What Tools Do You Need?
A good text editor.
TextMate ($64), Atom (Free), Sublime ($70)
(Not TextEdit or Notepad.)
Starting Out
A) What type of data do you have?
B) What type of data do you need?
How do you turn A into B?
Starting Out
Get familiar with everything a text editor can do.
Learn to navigate using the keyboard.
Experiment.
After Learning the Basics
Learn Regular Expressions (RegEx)
Learn a programming language such as Python, JavaScript, or Bash.
Combine scripting with RegEx to accomplish complex tasks
Constantly reassess your workflow to find faster and easier methods
If you come up with something cool, share it!
Text Manipulation Without Regular Expressions
Multiple Cursors Demo
Most of the time when we want to manipulate text, we want to change a lot of it all at once
One way to do that is with Multiple Cursors
Let’s look at what that is, and how it can be useful
Multiple Cursors Demo
• Cmd-F: Find
• Opt-Enter: Find All, add a Cursor at every occurrence
• Cmd-Click Text: Add a Cursor manually
• Cmd-Shift-L: Add a Cursor for every line of selected text
OS X Keyboard Navigation
• Cmd ←/→: Go to start/end of line
• Cmd ↑/↓: Go to top/bottom of document
• Opt ←/→: Go to previous/next word
• Add Shift to select or unselect text
Windows Keyboard Navigation
• Home/End: Go to start/end of line
• Ctrl Home/End: Go to top/bottom of document
• Ctrl ←/→: Go to previous/next word
• Add Shift to select or unselect text
Text Manipulation Using Regular Expressions
What are Regular Expressions?
Regular Expressions (RegEx) are a way to define patterns of text
They enable you to find and select text that matches those patterns
And with that matching text selected you can then change it to suit your needs
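To make "find, select, change" concrete, here is a minimal Python sketch. The sample sentence is made up for illustration; the pattern is the timecode pattern used elsewhere in this talk.

```python
import re

# Hypothetical sample text containing two timecodes.
note = "Shot starts at 01:02:03:04 and ends at 01:02:08:12."

# Pattern: four two-digit groups separated by colons.
timecode = re.compile(r"\d{2}:\d{2}:\d{2}:\d{2}")

# Find and select every piece of text that matches the pattern.
found = timecode.findall(note)

# Change every match: here we simply wrap it in square brackets.
changed = timecode.sub(lambda m: f"[{m.group(0)}]", note)

print(found)
print(changed)
```

The same three steps (define a pattern, find the matches, rewrite them) are what a text editor's RegEx Find & Replace does for you.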
What Does It Look Like?
(\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n[\s\S]*?(?=^\d{3}|^>|^$))
Wait, what the #\.$*^# is that?!
Hold it, Hold it, What the hell is that shit?!
RegEx is made up of codes
Each code represents a set of characters such as letters, numbers, and $#!*
When written in a specific sequence, they define a pattern of text.
Let’s take a closer look.
What does RegEx look like?
Simple:
Timecode: \d{2}:\d{2}:\d{2}:\d{2}
  Matches: 01:02:03:04
E-Mail Address: [\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}
  Matches: evan@evanschiff.com
Complex:
EDL Event: (\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n[\s\S]*?(?=^\d{3}|^>|^$))
  Reference: 003 08P013V V C 13:31:20:14 13:31:28:16 04:00:42:22 04:00:51:00
Locator in an EDL: \* LOC.*[\d:]{11}\s+([\w]+) +\b([^\r\n]*?)\r?\n
  Reference: * LOC: 04:00:47:20 BLUE CS0020
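If you want to try the two complex patterns outside a RegEx tester, here is a rough Python sketch. Event 003 and its locator are the reference lines from the slide; event 004 is invented so there is a second event for the lookahead to stop at. Running the event pattern with `re.MULTILINE` is my assumption about how it is meant to be used, since `^` and `$` need to match at line boundaries.

```python
import re

# Patterns copied from the slide. re.MULTILINE lets ^ and $ inside the
# lookahead match at the start and end of each line (an assumption).
EVENT = re.compile(
    r"(\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n[\s\S]*?(?=^\d{3}|^>|^$))",
    re.MULTILINE,
)
LOCATOR = re.compile(r"\* LOC.*[\d:]{11}\s+([\w]+) +\b([^\r\n]*?)\r?\n")

# Event 003 and its locator come from the slide; event 004 is made up.
edl = (
    "003 08P013V V C 13:31:20:14 13:31:28:16 04:00:42:22 04:00:51:00\n"
    "* LOC: 04:00:47:20 BLUE CS0020\n"
    "004 08P014V V C 13:40:00:00 13:40:02:00 04:00:51:00 04:00:53:00\n"
    "\n"
)

events = []
for match in EVENT.finditer(edl):
    block, rec_in, rec_out = match.groups()
    # Each locator yields a (color, comment) pair from the event's block.
    locators = LOCATOR.findall(block)
    events.append((rec_in, rec_out, locators))

for rec_in, rec_out, locators in events:
    print(rec_in, rec_out, locators)
```

The event pattern captures the last two timecodes on the event line (the record in and out), and the locator pattern captures the color and the comment.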
What are some other patterns you can think of?
Let’s test them with Rubular.com
RegEx Structure
Regular Expressions consist of character codes and quantifiers
Or in other words, what is the character you are looking for and how many times do you expect to see it?
Search Criteria          Code       Example
3 Digits                 \d{3}      315
A word of any length     \w+        Avid
Zero or one ‘u’          colou?r    color or colour
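The three examples above can be checked in a few lines of Python. This is only a quick sketch; `re.fullmatch` is used so that the whole string has to fit the pattern, not just part of it.

```python
import re

# (pattern, text) pairs taken from the examples above.
examples = [
    (r"\d{3}", "315"),       # 3 digits
    (r"\w+", "Avid"),        # a word of any length
    (r"colou?r", "color"),   # zero 'u'
    (r"colou?r", "colour"),  # one 'u'
]

results = [re.fullmatch(pattern, text) is not None for pattern, text in examples]

# Two 'u's is outside "zero or one", so this one should not match.
too_many = re.fullmatch(r"colou?r", "colouur") is not None

print(results, too_many)
```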
RegEx Symbols
Symbol       What It Represents                                  Such As
.            Any character except line break                     A-Z 0-9 Special Characters
\d           Any digit                                           0-9
\s           Whitespace (spaces, tabs, line breaks)
\t           Tab
\n and \r    Line Break: \n is Mac/Linux, \r\n is Windows
\w           Any character that could be part of a word          A-Z a-z 0-9 _
[ ]          Match the letters or symbols inside the brackets    [DEF] matches only the letters D, E, or F
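Here is a small Python sketch that runs each symbol against one made-up sample string, so you can see what each one picks out. The sample text is hypothetical.

```python
import re

# Hypothetical sample: words, digits, an underscore, a tab, and a line break.
sample = "Reel DEF_07\tTake 3\n"

digits = re.findall(r"\d", sample)          # every single digit
whitespace = re.findall(r"\s", sample)      # spaces, the tab, the line break
tabs = re.findall(r"\t", sample)            # only the tab
words = re.findall(r"\w+", sample)          # runs of word characters
def_letters = re.findall(r"[DEF]", sample)  # only the letters D, E, or F
non_breaks = re.findall(r".", sample)       # everything except the line break

print(digits, whitespace, tabs, words, def_letters, len(non_breaks))
```

Note that `[DEF]` is case sensitive here, so the lowercase "e" in "Reel" and "Take" is not matched.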
RegEx Quantifiers
Symbol    What It Represents
?         Zero or one occurrence
+         One or more occurrences
*         Zero or more occurrences
{2}       Exactly 2 occurrences
{2,5}     Between 2 and 5 occurrences
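A quick way to get a feel for the quantifiers is to test each one against strings of different lengths. The sketch below uses the letter "a" as a stand-in character; the helper name `matches` is just something made up for this example.

```python
import re

def matches(pattern, text):
    """Return True when the whole of text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

# (pattern, text, should it match?)
cases = [
    ("a?", "", True), ("a?", "a", True), ("a?", "aa", False),
    ("a+", "", False), ("a+", "a", True), ("a+", "aaa", True),
    ("a*", "", True), ("a*", "aaa", True),
    ("a{2}", "aa", True), ("a{2}", "aaa", False),
    ("a{2,5}", "a", False), ("a{2,5}", "aa", True),
    ("a{2,5}", "aaaaa", True), ("a{2,5}", "aaaaaa", False),
]

# Each entry is True when Python agrees with the expectation in the table.
results = [matches(pattern, text) == expected for pattern, text, expected in cases]

print(all(results))
```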
Some Real World Demos
Convert an EDL with Locators into a SubCap file for importing back into Avid
Convert Avid Locators into a FCP or Compressor Chapter Markers file
Batch rename files with advanced substitution patterns
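As a taste of the batch rename demo, here is a minimal Python sketch. The naming scheme is entirely hypothetical: it assumes files named like `SC12_TK3.mov` that should become `sc012_tk03.mov`. The function names `new_name` and `rename_all` are made up for this example; swap in your own pattern and format.

```python
import re
from pathlib import Path

# Hypothetical scheme: SC<scene>_TK<take>.<extension>
PATTERN = re.compile(r"^SC(\d+)_TK(\d+)(\.\w+)$")

def new_name(old):
    """Return the reformatted name, or the old name if it doesn't fit."""
    match = PATTERN.match(old)
    if match is None:
        return old
    scene, take, ext = match.groups()
    # Zero-pad the scene to 3 digits and the take to 2 digits.
    return f"sc{int(scene):03d}_tk{int(take):02d}{ext}"

def rename_all(folder):
    """Rename every matching file in folder, skipping name collisions."""
    for path in sorted(Path(folder).iterdir()):
        target = path.with_name(new_name(path.name))
        if target != path and not target.exists():
            path.rename(target)

print(new_name("SC12_TK3.mov"))
```

It is worth printing the old and new names first, and only calling `rename_all` once the list looks right.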
When Is RegEx Not Useful?
When you don’t know or can’t clearly define a pattern
If it’s faster to make the changes by hand than figure out what the pattern is
When there’s no variation in what you’re searching for
Sometimes a normal Find & Replace is all you need
When there’s too much variation, don’t try the all-in-one approach. Maybe you can break it down into multiple smaller patterns
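Both ideas can be sketched in Python. The first part is a plain find and replace with no pattern at all. The second breaks a messy cleanup into two small patterns instead of one big one. The cleanup task itself is hypothetical: it assumes timecodes typed by hand with inconsistent separators and missing leading zeros, and it would not be appropriate where a semicolon is meant to mark drop-frame timecode.

```python
import re

# No variation in the search text: a normal find and replace is enough.
renamed = "Reel 1A".replace("1A", "2A")

def normalize_timecode(raw):
    """Clean up a hand-typed timecode in two small steps."""
    # Step 1: turn ; or . separators into colons.
    step1 = re.sub(r"[;.]", ":", raw)
    # Step 2: pad any single-digit field with a leading zero.
    return re.sub(r"\b(\d)\b", r"0\1", step1)

print(renamed)
print(normalize_timecode("1.02;3:04"))
```

Each step is easy to test on its own, which is usually less painful than debugging one pattern that tries to handle every case.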
So What’s the Next Step?
Shell Scripting
What is Shell Scripting?
Shell scripting uses a programming language to execute a series of commands in order to accomplish a more complex task.
ProTools Example
In 40 lines of code, and using regular expressions, this script takes a text file of media that ProTools can’t find, locates it, and copies it to a directory on your desktop.
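The script from the talk is not reproduced here, so the following is only a rough Python sketch of the same idea, not the original. It rests on several assumptions: the missing-media report is tab-separated with the file name as the last column, media files end in .wav, .aif, or .aiff, and the function names `parse_missing` and `copy_missing` are made up for this example.

```python
import re
import shutil
from pathlib import Path

# Assumption: the file name is the last tab-separated field on its line.
MEDIA_NAME = re.compile(r"([^\t/]+\.(?:wav|aiff?))\s*$", re.IGNORECASE)

def parse_missing(report_text):
    """Pull media file names out of the report, one per matching line."""
    names = []
    for line in report_text.splitlines():
        match = MEDIA_NAME.search(line)
        if match:
            names.append(match.group(1))
    return names

def copy_missing(names, search_root, dest):
    """Search search_root for each name and copy what is found into dest."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    wanted = set(names)
    copied = []
    for path in Path(search_root).rglob("*"):
        if path.is_file() and path.name in wanted:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
            wanted.discard(path.name)  # only copy the first hit for each name
    # Return what was copied and what is still missing.
    return sorted(copied), sorted(wanted)
```

A typical call would read the report with `Path(report).read_text()`, pass the result to `parse_missing`, and then hand the names to `copy_missing` along with the drive to search and a folder on the desktop.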
How Do I Learn?
Google it!
Check out sites like Codecademy, Code Avengers, Khan Academy, etc.
Pick a language to learn.
Of the many languages out there, I would probably start with Python.
Thanks!