Page 1: What would it take to

What would it take to record, remember, and retrieve all text read or seen or heard during one’s lifetime?

2/13/2018 A billion words to remember 1

Page 2: What would it take to

The Lifetime Reader

George Nagy

Professor Emeritus

Rensselaer Polytechnic Institute

Page 3: What would it take to

seen, read

Page 4: What would it take to

or heard

Page 5: What would it take to

Not a new idea

In 1945, Vannevar Bush proposed the Memex:

The camera hound of the future wears on his forehead a lump a little larger than a walnut. It takes pictures 3 millimeters square … only a factor of 10 beyond present practice. … Wholly new forms of encyclopedias … with a mesh of associative trails … The entire material of the Britannica in reduced microfilm form would go on a sheet eight and one-half by eleven inches. …

Page 6: What would it take to

What will it take today?

http://pngimg.com/upload/laptop_PNG5940.png

A Sensor Module + A Host Computer

Page 7: What would it take to

Pattern Recognition

1966 IEEE Workshop…

Self-Organizing, Bionic, Synnoetic, Heuristic, Adaptive, Cybernetic, Pattern Recognizing, Self-repairing

George I read about it in IEEE Pervasive Computing

WEARABLE SENSOR TEXT PROCESSOR

Page 8: What would it take to

Spectacle-mounted camera

https://www.kjbsecurity.com/products/detail/stylish-glasses-dvr-camera/785/

Page 9: What would it take to

Quantifying Reading Habits – Counting How Many Words You Read

Kai Kunze, Katsutoshi Masai, Masahiko Inami, Omer Sacakli, Marcus Liwicki, Andreas Dengel, Shoya Ishimaru, Koichi Kise

Page 10: What would it take to

Functionality

IMAGE & AUDIO SENSOR
• Text-image and speech detection and compression

TEXT PROCESSOR
• Character and speech recognition
• Data compression and indexing
• Retrieval and display

Page 11: What would it take to

Sensor Module

Camera: text-image* capture
• 1 frame per second (FPS)
• 20 megapixels, RGB
• 60° × 60° field of view (FOV)
• Autofocus, 25 cm to ∞
• < 10 g

Mic: audio capture

Onboard processor:
• Text and speech detection
• Data compression
• Log (time and space stamp)
• Encryption?
• 20 GB memory (images)

Bluetooth or Wi-Fi

GPS (or link to ...)

Page 12: What would it take to

*Definitions:

Text-image: Image (bitmap) of some text visible to the wearer, e.g., a printed page, a billboard, a computer screen.

Image-text: Image encoded via Optical Character Recognition (OCR) into some searchable text format, e.g., .txt, .pdf, .doc.

Page 13: What would it take to

Camera-based OCR in 1960 (20 x 20 pixel camera)

Page 14: What would it take to

Text Detection and Recognition in Imagery: A Survey

Qixiang Ye and David Doermann

7.3 Remaining Problems
• Processing multilingual text
• Processing incidental text
• Real-time detection and recognition
• End-to-end recognition
• Open vocabulary recognition

IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 37, no. 7, July 2015

> 200 references

Page 15: What would it take to

Host module: laptop, tablet, or smartphone

Store
• Duplicate detection
• OCR
• Reading order
• Speech recognition
• Text compression, ~10 GB (text)
• Progressive indexing

Retrieve
• Browser & digilib search tools
• Inverted index, concordance
• Vector-space model
• Latent semantic indexing
• Temporal and spatial proximity
• Pattern matching
• Perfect hashing
• Signature files
• Graph edit distance
• Relevance feedback
• User model
• …
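
Of the retrieval tools listed, the inverted index is the simplest to illustrate. The following is a minimal sketch, not the Lifetime Reader's actual design: the class name `InvertedIndex`, the `tokenize` helper, and the fragment ids are all hypothetical, and the sample fragments are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy inverted index: word -> set of ids of the fragments containing it."""

    def __init__(self):
        self.postings = defaultdict(set)   # word -> fragment ids
        self.fragments = {}                # fragment id -> original text

    def add(self, frag_id, text):
        """Store a fragment and post each of its words to the index."""
        self.fragments[frag_id] = text
        for word in tokenize(text):
            self.postings[word].add(frag_id)

    def search(self, query):
        """Return ids of fragments containing every query word (boolean AND)."""
        words = tokenize(query)
        if not words:
            return set()
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result


# Hypothetical fragments, as they might arrive from OCR.
index = InvertedIndex()
index.add(1, "The camera hound of the future wears on his forehead a lump")
index.add(2, "A spectacle-mounted camera captures one frame per second")
index.add(3, "Latent semantic indexing of a personal collection")

print(sorted(index.search("camera")))
print(sorted(index.search("camera frame")))
```

At roughly 10 GB of lifetime text, an index of this kind fits comfortably on a laptop or smartphone; ranking (vector-space model, relevance feedback) would be layered on top of the postings.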

Page 16: What would it take to

Display of retrieved data

Page 17: What would it take to

Metadata, data, and options for a slide seen a year ago:

Page 18: What would it take to

Data volume

text-image: 4K × 5K pixels × 3 B/pixel × 1 fps × 3600 s/h × 8 h/day ÷ 100× compression ≈ 17 GB/day

audio: 4 KB/s × 3600 s/h × 8 h/day ≈ 115 MB/day
(estimates vary from 300 B/s for a vocoder to about 176 KB/s, i.e. 1.4 Mbit/s, for high-fidelity stereo CD audio books)

image-text: 2 B/char × 5 chars/word × 300 words/min × 60 min/h × 8 h/day ÷ 5× compression ≈ 300 KB/day
300 KB/day × 365 days/year × 100 years ≈ 10 GB/lifetime

audio-text: same
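
The estimates above can be checked with a few lines of arithmetic. This is a sketch under the slide's own assumptions (8 hours of use per day, a 100-year lifetime, decimal units with 1 KB = 1000 B, and 4K × 5K read as 4000 × 5000 pixels); the function and parameter names are illustrative only.

```python
# All inputs are the slide's figures; 8 hours of wear per day is assumed.
SECONDS_WORN_PER_DAY = 3600 * 8


def text_image_bytes_per_day(width=4000, height=5000, bytes_per_pixel=3,
                             fps=1, compression=100):
    """Compressed image volume captured in one day of wear."""
    raw = width * height * bytes_per_pixel * fps * SECONDS_WORN_PER_DAY
    return raw / compression


def audio_bytes_per_day(bytes_per_second=4000):
    """Audio volume captured in one day of wear."""
    return bytes_per_second * SECONDS_WORN_PER_DAY


def image_text_bytes_per_day(bytes_per_char=2, chars_per_word=5,
                             words_per_minute=300, compression=5):
    """Compressed OCR text volume for one day of continuous reading."""
    minutes = SECONDS_WORN_PER_DAY / 60
    raw = bytes_per_char * chars_per_word * words_per_minute * minutes
    return raw / compression


def lifetime_bytes(bytes_per_day, years=100):
    """Scale a daily volume to a lifetime of 365-day years."""
    return bytes_per_day * 365 * years


print(f"text-image: {text_image_bytes_per_day() / 1e9:.2f} GB/day")
print(f"audio:      {audio_bytes_per_day() / 1e6:.1f} MB/day")
print(f"image-text: {image_text_bytes_per_day() / 1e3:.0f} KB/day")
print(f"lifetime:   {lifetime_bytes(image_text_bytes_per_day()) / 1e9:.1f} GB")
```

The unrounded figures are 17.28 GB/day of images, 115.2 MB/day of audio, and 288 KB/day of text, which gives about 10.5 GB of text per lifetime. The slide rounds the daily text volume up to 300 KB before scaling.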

Page 19: What would it take to

Three advantages of searching a personal collection compared to web search:

1. Total lifetime volume 10 GB compared to millions of times as much on WWW

2. Retrieved items already familiar and therefore easier to identify from top returns

3. Fractured prose & OCR errors not bothersome because we won’t re-broadcast retrieved items
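
The third point implies that search itself has to tolerate OCR noise, even if the reader does not mind seeing it. One way to do that, sketched here with Python's standard `difflib` module, is approximate word matching. The helper name `fuzzy_find`, the 0.75 cutoff, and the sample words (with typical OCR confusions such as "1" for "i" and "rn" for "m") are assumptions for illustration.

```python
import difflib


def fuzzy_find(query_word, ocr_words, cutoff=0.75):
    """Return the OCR'd words whose similarity to query_word is >= cutoff.

    Similarity is difflib's ratio: 2 * matched chars / total chars, in [0, 1].
    """
    query = query_word.lower()
    return [
        word for word in ocr_words
        if difflib.SequenceMatcher(None, query, word.lower()).ratio() >= cutoff
    ]


# Hypothetical OCR output containing two misread words.
ocr_words = ["Britannica", "Britann1ca", "microfilm", "rnicrofilm", "walnut"]

print(fuzzy_find("britannica", ocr_words))
print(fuzzy_find("microfilm", ocr_words))
```

A linear scan like this is only reasonable over a small vocabulary. For a lifetime collection it would more plausibly run over the index's word list than over the text itself.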

Page 20: What would it take to

Some underlying research problems

Hardware

Image acquisition

Text-image analysis

Information retrieval

Ethical and legal issues

Page 21: What would it take to

Hardware

• Camera size and weight

• Camera optics – resolution, field of view, focus

• Power consumption

• Power source – battery, heat, light, or motion?

Page 22: What would it take to

Image acquisition

• Text detection in spatial context: at home, at work, in the neighborhood, in transit, abroad

• Mosaicking required by head and body motion while reading

• Lazy compression of text images

• Optional hands-free (via mic) annotation

• Optional visible (gestural) annotation, e.g. by tracing a phrase on a printed page or computer screen with a designated finger

Page 23: What would it take to

Text-image analysis

• Perspective-invariant recognition instead of skew removal

• Reading-order without gaze tracking

• Duplicate detection from consecutive frames and after (possibly lengthy) interruptions

• Retention policy for undecipherable and unindexable fragments of text, and for near-duplicates

• Adaptation to predictable reading material like our daily newspaper, favored magazines, the remaining volumes of the Jack Aubrey series, Computer, Python v2.7.6 documentation …
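
Duplicate detection between consecutive frames can be approximated on the recognized text rather than on pixels. The sketch below uses word shingles and Jaccard similarity, a standard near-duplicate technique chosen here for illustration and not taken from the slides. The shingle length `k=3` and the 0.5 threshold are arbitrary assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word k-grams) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (1.0 if both empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, k=3, threshold=0.5):
    """True if the two texts share at least `threshold` of their shingles."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold


# Two hypothetical OCR results for the same line; the second misreads "form".
frame_1 = "the entire material of the britannica in reduced microfilm form"
frame_2 = "the entire material of the britannica in reduced microfilm forn"
other = "a spectacle mounted camera captures one frame per second"

print(is_near_duplicate(frame_1, frame_2))
print(is_near_duplicate(frame_1, other))
```

Because one misread word spoils only the shingles that contain it, the measure degrades gradually with OCR errors. That matters when the same page is captured again after a lengthy interruption under different lighting or perspective.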

Page 24: What would it take to

Information retrieval - 1

• Retrieval strategies that mesh with our own mental recall, e.g. “I read it in high school”, or “around Thanksgiving at my sister’s place”.

• Creating and exploiting an evolutionary personal information model

• Personalization: scripts and languages, reading speed, reading postures, computer display settings, and work, leisure, shopping and napping habits
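
A recall cue such as “around Thanksgiving at my sister’s place” translates into a filter on the time and space stamps logged by the sensor module. The following is a minimal sketch under assumed names: the `Fragment` record, its `place` label (imagined as derived from GPS), and the sample log are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Fragment:
    text: str
    seen_on: date   # time stamp from the sensor log
    place: str      # hypothetical place label derived from the GPS stamp


def recall(fragments, keyword, around, within_days, place=None):
    """Return fragments containing keyword, seen within `within_days` of
    `around` and, if given, at `place`; closest in time first."""
    window = timedelta(days=within_days)
    hits = []
    for frag in fragments:
        if keyword.lower() not in frag.text.lower():
            continue
        if abs(frag.seen_on - around) > window:
            continue
        if place is not None and frag.place != place:
            continue
        hits.append(frag)
    return sorted(hits, key=lambda frag: abs(frag.seen_on - around))


# Made-up log entries for illustration.
log = [
    Fragment("Recipe: cranberry sauce with orange zest", date(2017, 11, 22), "sister"),
    Fragment("Cranberry futures fall on weak demand", date(2017, 6, 3), "office"),
    Fragment("Cranberry bog tours open this weekend", date(2017, 11, 25), "home"),
]

thanksgiving = date(2017, 11, 23)
for frag in recall(log, "cranberry", thanksgiving, within_days=7, place="sister"):
    print(frag.seen_on, frag.text)
```

Sorting by distance from the remembered date reflects the idea that such cues are approximate: the user supplies a rough anchor, and the nearest matches surface first.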

Page 25: What would it take to

Information retrieval - 2

• Selective, topic-, time-, or location-specific summarization

• Mathematical formula and specialized notation (chess, bridge, organic chemistry) retrieval

• Logging queries, responses, and user reactions for improving the system even as one’s own memory deteriorates

Page 26: What would it take to

Ethical and legal issues - 1

• Security and privacy: what do these mean over a lifetime?

• Copyright: should we have permanent access to anything that we have read?

• Text piracy: would the likely lack of source metadata encourage it?

• What is the legal difference between deliberately acquired information, as with a smartphone or camera, and autonomously acquired information?

Page 27: What would it take to

Ethical and legal issues - 2

• What responsibility does delayed discovery of a crime entail (for instance, on an airplane seat neighbor’s laptop screen that one glanced at two years ago)? Can the recorded data be subpoenaed?

• What are the social and marketing implications of lifetime text logging?

Page 28: What would it take to

Conclusion

• The major challenge is a 10 g camera capable of taking 60,000 30-megapixel images between battery charges.

• The software required seems simpler than that of browsers, smartphones, or self-driving cars.

• The initial model need not be perfect and can be developed incrementally. Early adopters are likely to take their Lifetime Reader to where they now take their laptops, tablets and smartphones: to class, seminars, meetings, and perhaps even the library.

• Is occasional retrieval of dimly remembered read or heard nuggets worth the burden of another wearable?

Page 29: What would it take to

Product announcement expected on or about April 1, 2021.

Thank you for your interest and support!

Page 31: What would it take to

No, I won’t wear it!
