

List of Suggested Reviewers or Reviewers Not To Include (optional)

SUGGESTED REVIEWERS:Not Listed

REVIEWERS NOT TO INCLUDE:Not Listed


Pursuant to PAPPG Chapter II.C.1.e., each PI, co-PI, and other senior project personnel identified on a proposal must provide collaborator and other affiliations information to help NSF identify appropriate reviewers. (v. 4/21/2017)

There are five tables: A: Your Name & Affiliation(s); B: PhD Advisors/Advisees (all); C: Collaborators; D: Co-Editors; E: Relationships.

List names as Last Name, First Name, Middle Initial. Additionally, provide email, organization, and department (optional) to disambiguate common names. "Last active" dates are optional, but will help NSF staff easily determine which information remains relevant for reviewer selection.

Fixed column widths keep this sheet one page wide; if you cut and paste text, set the font size at 10pt or smaller, and abbreviate, where necessary, to make the data fit. To insert n blank rows, select n row numbers to move down, right-click, and choose Insert from the menu. You may fill down (Ctrl-D) to mark a sequence of collaborators, or copy affiliations. Excel has arrows that enable sorting.

Please complete this template (e.g., Excel, Google Sheets, LibreOffice), save as .xlsx or .xls, and upload directly as a FastLane Collaborators and Other Affiliations single copy doc. Do not upload a .pdf.

Table A: List your Last Name, First Name, Middle Initial, and organizational affiliation (including considered affiliation) in the last 12 months.
A: Fragkiadaki, Katerina - Carnegie Mellon University

Table B: List names as Last Name, First Name, Middle Initial, and provide organizational affiliations, if known, for the following. G: Your PhD Advisor(s); T: All your PhD Thesis Advisees; P: Your Graduate Advisors.
G: Shi, Jianbo - University of Pennsylvania
G: Malik, Jitendra - University of California, Berkeley
T: Harley, Adam - Carnegie Mellon University
T: Tung, Hsiao-Yu Fish - Carnegie Mellon University

Table C: List names as Last Name, First Name, Middle Initial, and provide organizational affiliations, if known, for the following. A: Co-authors on any book, article, report, abstract or paper (with collaboration in the last 48 months; publication date may be later); C: Collaborators on projects, such as funded grants, graduate research or others (in the last 48 months).
C: Agarwal, Arpit - Carnegie Mellon University
A: Agrawal, Pulkit - University of California, Berkeley
C: Alemi, Alex - Google Research
A: Arbelaez, Pablo - Universidad de los Andes, Colombia
C: Atkeson, Chris - Carnegie Mellon University
A: Carreira, Joao - DeepMind
A: Efros, Alexei - University of California, Berkeley
A: Felsen, Panna - University of California, Berkeley
A: Girshick, Ross - Facebook
A: Gkioxari, Georgia - Facebook
A: Gupta, Saurabh - Facebook
A: Hariharan, Bharath - Facebook
C: Huang, Henry - Carnegie Mellon University
C: Huang, Jonathan - Google Research
A: Kar, Abhishek - University of California, Berkeley
A: Levine, Sergey - University of California, Berkeley
C: Ricco, Susanna - Google Research
C: Salakhutdinov, Ruslan - Carnegie Mellon University
C: Schmid, Cordelia - INRIA Grenoble Rhône-Alpes
C: Sukthankar, Rahul - Google Research
A: Tulsiani, Shubham - University of California, Berkeley
C: Vijayanarasimhan, Sudheen - Google Research

Table D: List editorial board, editor-in-chief and co-editors with whom you interact. An editor-in-chief should list the entire editorial board. B: Editorial board: name(s) of editor-in-chief and journal (in past 24 months); E: Other co-editors of journals or collections with whom you directly interacted (in past 24 months). (No entries listed.)

Table E: List persons for whom a personal, family, or business relationship would otherwise preclude their service as a reviewer. (No entries listed.)


COVER SHEET FOR PROPOSAL TO THE NATIONAL SCIENCE FOUNDATION

FOR NSF USE ONLY: NSF PROPOSAL NUMBER; DATE RECEIVED; NUMBER OF COPIES; DIVISION ASSIGNED; FUND CODE; DUNS# (Data Universal Numbering System); FILE LOCATION

FOR CONSIDERATION BY NSF ORGANIZATION UNIT(S) (Indicate the most specific unit known, i.e. program, division, etc.): CNS - MAJOR RESEARCH INSTRUMENTATION (continued)

PROGRAM ANNOUNCEMENT/SOLICITATION NO./DUE DATE: NSF 18-513
Special Exception to Deadline Date Policy

EMPLOYER IDENTIFICATION NUMBER (EIN) OR TAXPAYER IDENTIFICATION NUMBER (TIN): 250969449

SHOW PREVIOUS AWARD NO. IF THIS IS: A RENEWAL / AN ACCOMPLISHMENT-BASED RENEWAL

IS THIS PROPOSAL BEING SUBMITTED TO ANOTHER FEDERAL AGENCY? YES / NO; IF YES, LIST ACRONYM(S)

NAME OF ORGANIZATION TO WHICH AWARD SHOULD BE MADE: Carnegie-Mellon University

AWARDEE ORGANIZATION CODE (IF KNOWN): 0001057000

ADDRESS OF AWARDEE ORGANIZATION, INCLUDING 9 DIGIT ZIP CODE: 5000 Forbes Avenue, WQED Building, Pittsburgh, PA 15213-3815

IS AWARDEE ORGANIZATION (Check All That Apply): SMALL BUSINESS / MINORITY BUSINESS / FOR-PROFIT ORGANIZATION / WOMAN-OWNED BUSINESS; IF THIS IS A PRELIMINARY PROPOSAL, THEN CHECK HERE

NAME OF PRIMARY PLACE OF PERFORMANCE: Carnegie-Mellon University
ADDRESS OF PRIMARY PLACE OF PERFORMANCE, INCLUDING 9 DIGIT ZIP CODE: Carnegie-Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213-3890, US

TITLE OF PROPOSED PROJECT: MRI: Development of a Mobile Human Behavior Capture System

REQUESTED AMOUNT: $1,866,666

PROPOSED DURATION (1-60 MONTHS): 48 months

REQUESTED STARTING DATE: 09/01/18

SHOW RELATED PRELIMINARY PROPOSAL NO. IF APPLICABLE

THIS PROPOSAL INCLUDES ANY OF THE ITEMS LISTED BELOW:
BEGINNING INVESTIGATOR
DISCLOSURE OF LOBBYING ACTIVITIES
PROPRIETARY & PRIVILEGED INFORMATION
HISTORIC PLACES
COLLABORATIVE STATUS: Not a collaborative proposal
VERTEBRATE ANIMALS (IACUC App. Date; PHS Animal Welfare Assurance Number)
HUMAN SUBJECTS (Human Subjects Assurance Number; Exemption Subsection or IRB App. Date)
INTERNATIONAL ACTIVITIES: COUNTRY/COUNTRIES INVOLVED

TYPE OF PROPOSAL: Equipment

PI/PD DEPARTMENT: Robotics Institute & HCI Institute
PI/PD POSTAL ADDRESS: 5000 Forbes Avenue, Pittsburgh, PA 15213, United States
PI/PD FAX NUMBER: 412-268-6436

NAMES (TYPED), High Degree, Yr of Degree, Telephone Number, Email Address:
PI/PD NAME: Christopher Atkeson, PhD, 1986, 412-681-8354, [email protected]
CO-PI/PD: Katerina Fragkiadaki, DPhil, 2013, 412-268-9527, [email protected]
CO-PI/PD: Jessica Hodgins, PhD, 1989, 412-268-6795, [email protected]
CO-PI/PD: Yaser Sheikh, PhD, 2006, 412-268-1138, [email protected]

DUNS# (Data Universal Numbering System): 052184116

Page 1 of 3


CERTIFICATION PAGE

Certification for Authorized Organizational Representative (or Equivalent) or Individual Applicant

By electronically signing and submitting this proposal, the Authorized Organizational Representative (AOR) or Individual Applicant is: (1) certifying that statements made herein are true and complete to the best of his/her knowledge; and (2) agreeing to accept the obligation to comply with NSF award terms and conditions if an award is made as a result of this application. Further, the applicant is hereby providing certifications regarding conflict of interest (when applicable), drug-free workplace, debarment and suspension, lobbying activities (see below), nondiscrimination, flood hazard insurance (when applicable), responsible conduct of research, organizational support, Federal tax obligations, unpaid Federal tax liability, and criminal convictions as set forth in the NSF Proposal & Award Policies & Procedures Guide (PAPPG). Willful provision of false information in this application and its supporting documents or in reports required under an ensuing award is a criminal offense (U.S. Code, Title 18, Section 1001).

Certification Regarding Conflict of Interest

The AOR is required to complete certifications stating that the organization has implemented and is enforcing a written policy on conflicts of interest (COI), consistent with the provisions of PAPPG Chapter IX.A.; that, to the best of his/her knowledge, all financial disclosures required by the conflict of interest policy were made; and that conflicts of interest, if any, were, or prior to the organization's expenditure of any funds under the award, will be, satisfactorily managed, reduced or eliminated in accordance with the organization's conflict of interest policy. Conflicts that cannot be satisfactorily managed, reduced or eliminated, and research that proceeds without the imposition of conditions or restrictions when a conflict of interest exists, must be disclosed to NSF via use of the Notifications and Requests Module in FastLane.

Drug Free Work Place Certification

By electronically signing the Certification Pages, the Authorized Organizational Representative (or equivalent), is providing the Drug Free Work Place Certification contained in Exhibit II-3 of the Proposal & Award Policies & Procedures Guide.

Debarment and Suspension Certification (If the answer is "yes", please provide an explanation.)

Is the organization or its principals presently debarred, suspended, proposed for debarment, declared ineligible, or voluntarily excluded from covered transactions by any Federal department or agency? Yes No

By electronically signing the Certification Pages, the Authorized Organizational Representative (or equivalent) or Individual Applicant is providing the Debarment and Suspension Certification contained in Exhibit II-4 of the Proposal & Award Policies & Procedures Guide.

Certification Regarding Lobbying

This certification is required for an award of a Federal contract, grant, or cooperative agreement exceeding $100,000 and for an award of a Federal loan or a commitment providing for the United States to insure or guarantee a loan exceeding $150,000.

Certification for Contracts, Grants, Loans and Cooperative Agreements

The undersigned certifies, to the best of his or her knowledge and belief, that:

(1) No Federal appropriated funds have been paid or will be paid, by or on behalf of the undersigned, to any person for influencing or attempting to influence an officer or employee of any agency, a Member of Congress, an officer or employee of Congress, or an employee of a Member of Congress in connection with the awarding of any Federal contract, the making of any Federal grant, the making of any Federal loan, the entering into of any cooperative agreement, and the extension, continuation, renewal, amendment, or modification of any Federal contract, grant, loan, or cooperative agreement.

(2) If any funds other than Federal appropriated funds have been paid or will be paid to any person for influencing or attempting to influence an officer or employee of any agency, a Member of Congress, an officer or employee of Congress, or an employee of a Member of Congress in connection with this Federal contract, grant, loan, or cooperative agreement, the undersigned shall complete and submit Standard Form-LLL, "Disclosure of Lobbying Activities," in accordance with its instructions.

(3) The undersigned shall require that the language of this certification be included in the award documents for all subawards at all tiers including subcontracts, subgrants, and contracts under grants, loans, and cooperative agreements and that all subrecipients shall certify and disclose accordingly.

This certification is a material representation of fact upon which reliance was placed when this transaction was made or entered into. Submission of this certification is a prerequisite for making or entering into this transaction imposed by section 1352, Title 31, U.S. Code. Any person who fails to file the required certification shall be subject to a civil penalty of not less than $10,000 and not more than $100,000 for each such failure.

Certification Regarding Nondiscrimination

By electronically signing the Certification Pages, the Authorized Organizational Representative (or equivalent) is providing the Certification Regarding Nondiscrimination contained in Exhibit II-6 of the Proposal & Award Policies & Procedures Guide.

Certification Regarding Flood Hazard Insurance

Two sections of the National Flood Insurance Act of 1968 (42 USC §4012a and §4106) bar Federal agencies from giving financial assistance for acquisition or construction purposes in any area identified by the Federal Emergency Management Agency (FEMA) as having special flood hazards unless: (1) the community in which that area is located participates in the national flood insurance program; and (2) the building (and any related equipment) is covered by adequate flood insurance.

By electronically signing the Certification Pages, the Authorized Organizational Representative (or equivalent) or Individual Applicant located in FEMA-designated special flood hazard areas is certifying that adequate flood insurance has been or will be obtained in the following situations: (1) for NSF grants for the construction of a building or facility, regardless of the dollar amount of the grant; and (2) for other NSF grants when more than $25,000 has been budgeted in the proposal for repair, alteration or improvement (construction) of a building or facility.

Certification Regarding Responsible Conduct of Research (RCR) (This certification is not applicable to proposals for conferences, symposia, and workshops.)

By electronically signing the Certification Pages, the Authorized Organizational Representative is certifying that, in accordance with the NSF Proposal & Award Policies & Procedures Guide, Chapter IX.B., the institution has a plan in place to provide appropriate training and oversight in the responsible and ethical conduct of research to undergraduates, graduate students and postdoctoral researchers who will be supported by NSF to conduct research. The AOR shall require that the language of this certification be included in any award documents for all subawards at all tiers.

Page 2 of 3


CERTIFICATION PAGE - CONTINUED

Certification Regarding Organizational Support

By electronically signing the Certification Pages, the Authorized Organizational Representative (or equivalent) is certifying that there is organizational support for the proposal as required by Section 526 of the America COMPETES Reauthorization Act of 2010. This support extends to the portion of the proposal developed to satisfy the Broader Impacts Review Criterion as well as the Intellectual Merit Review Criterion, and any additional review criteria specified in the solicitation. Organizational support will be made available, as described in the proposal, in order to address the broader impacts and intellectual merit activities to be undertaken.

Certification Regarding Federal Tax Obligations

When the proposal exceeds $5,000,000, the Authorized Organizational Representative (or equivalent) is required to complete the following certification regarding Federal tax obligations. By electronically signing the Certification pages, the Authorized Organizational Representative is certifying that, to the best of their knowledge and belief, the proposing organization: (1) has filed all Federal tax returns required during the three years preceding this certification; (2) has not been convicted of a criminal offense under the Internal Revenue Code of 1986; and (3) has not, more than 90 days prior to this certification, been notified of any unpaid Federal tax assessment for which the liability remains unsatisfied, unless the assessment is the subject of an installment agreement or offer in compromise that has been approved by the Internal Revenue Service and is not in default, or the assessment is the subject of a non-frivolous administrative or judicial proceeding.

Certification Regarding Unpaid Federal Tax Liability

When the proposing organization is a corporation, the Authorized Organizational Representative (or equivalent) is required to complete the following certification regarding Federal Tax Liability: By electronically signing the Certification Pages, the Authorized Organizational Representative (or equivalent) is certifying that the corporation has no unpaid Federal tax liability that has been assessed, for which all judicial and administrative remedies have been exhausted or lapsed, and that is not being paid in a timely manner pursuant to an agreement with the authority responsible for collecting the tax liability.

Certification Regarding Criminal Convictions

When the proposing organization is a corporation, the Authorized Organizational Representative (or equivalent) is required to complete the following certification regarding Criminal Convictions: By electronically signing the Certification Pages, the Authorized Organizational Representative (or equivalent) is certifying that the corporation has not been convicted of a felony criminal violation under any Federal law within the 24 months preceding the date on which the certification is signed.

Certification Dual Use Research of Concern

By electronically signing the certification pages, the Authorized Organizational Representative is certifying that the organization will be or is in compliance with all aspects of the United States Government Policy for Institutional Oversight of Life Sciences Dual Use Research of Concern.

AUTHORIZED ORGANIZATIONAL REPRESENTATIVE SIGNATURE DATE

NAME

TELEPHONE NUMBER EMAIL ADDRESS FAX NUMBER

fm1207rrs-07

Page 3 of 3


COVER SHEET FOR PROPOSAL TO THE NATIONAL SCIENCE FOUNDATION
FOR CONSIDERATION BY NSF ORGANIZATION UNIT(S) - continued from page 1 (Indicate the most specific unit known, i.e. program, division, etc.)

Continuation Page

IIS - ROBUST INTELLIGENCE


TABLE OF CONTENTS
For font size and page formatting specifications, see PAPPG section II.B.2.

(Columns: Total No. of Pages; Page No.* (Optional))

Cover Sheet for Proposal to the National Science Foundation

Project Summary (not to exceed 1 page)

Table of Contents

Project Description (Including Results from Prior NSF Support) (not to exceed 15 pages) (Exceed only if allowed by a specific program announcement/solicitation or if approved in advance by the appropriate NSF Assistant Director or designee)

References Cited

Biographical Sketches (Not to exceed 2 pages each)

Budget (Plus up to 3 pages of budget justification)

Current and Pending Support

Facilities, Equipment and Other Resources

Special Information/Supplementary Documents (Data Management Plan, Mentoring Plan and Other Supplementary Documents)

Appendix (List below.) (Include only if allowed by a specific program announcement/solicitation or if approved in advance by the appropriate NSF Assistant Director or designee)

Appendix Items:

*Proposers may select any numbering mechanism for the proposal. The entire proposal, however, must be paginated. Complete both columns only if the proposal is numbered consecutively.

Total No. of Pages entries: 1, 17, 5, 7, 5, 5, 0


Figure 1: Mobile behavior capture for animals. (From National Geographic)

MRI: Development of a Mobile Human Behavior Capture System

Instrument Location: Carnegie Mellon University, as well as deployments in the field, such as in subjects' homes. The system development will be located in CMU's Motion Capture Lab and various mechanical and electronics fabrication areas in CMU's Robotics Institute.

Instrument Type: A behavior capture system, similar to a motion capture system but more general.

Research Activities to be Enabled

We propose building a ground-breaking Mobile Behavior Capture system that goes far beyond current laboratory markerless motion capture systems used in research and movie production. We will build on our success with CMU's Motion Capture Lab and Panoptic Studio (a markerless motion capture facility, Figures 3 and 4). This system will be transformative by offering a combination of new capabilities: portability, so behavior can be captured in natural environments outside the laboratory; mobile measurement that tracks moving behavior with multiple system-wide "foveas"; support for interactive behavioral experiments involving humans, robots, and human-robot interaction; markerless capture as well as marker tracking; spatial and temporal scalability, so that fine as well as gross behaviors, and quick as well as long duration behaviors, can be captured; and multi-modal capture: coordinated motion, sound, contact, interaction force, and physiological measurements. Our system will support research on human behavior and human-technology interaction, diagnosis and therapy for people with disabilities, and robot programming and learning techniques. This is exciting because it allows us to build more realistic models of human behavior, perform more sophisticated robot experiments, and make a real difference in people's lives through better rehabilitation and therapy in place.

Why get out of the lab? Behavior of animals in a zoo is quite different from behavior of animals in the wild,

and robots are being used to extend behavior capture to natural environments (Figure 1). The same is true of

humans: behavior in a lab or motion capture studio is often unnatural. Behavior is shaped by its context. For

example, in order to assist older adults to live independently for as long as possible, we need to understand how

specific individuals behave in their own homes. Figure 2 shows the Aware Home, a house we built to capture

behavior. Unfortunately, it became clear that subjects still treated the house as a laboratory and not as their own

home.

Why measure mobile behavior? Much behavior, especially social behavior, is expressed while on the move, such as behavior on sidewalks, hallways, stairs, elevators, and outdoors in general, so mobile measurement is also important (Figure 2). Instrumenting a large volume leads to poor spatial resolution, unless one has a Hollywood movie-sized budget.

Figure 2: Left: The Aware Home, a house we built for behavior capture. Right: Social interactions measured by our CareMedia system in a hallway of an Alzheimer's care facility.

Figure 3: Left and Middle: Capturing deformation typically involves many markers and cumbersome camera arrangements. Right: The CMU Panoptic Studio, a video-based capture area within a 6m diameter dome.

Why interactive? Most motion measurement systems are not real time, and results are typically available the

next day or week. We will build a system that can track selected quantities in real time, such as the motion of a

hand, or behavior transitions. The real time portion of the system will support interactive behavioral experiments

with humans, as well as experiments involving real time control of robots.
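As a concrete illustration of the interactive component, the minimal Python sketch below (a sketch only: the single OpenCV camera, the sparse optical flow tracker, and the motion threshold are illustrative assumptions, not the proposed system design) tracks feature points in a live video stream and flags frames whose mean motion exceeds a threshold as candidate behavior transitions.

import cv2
import numpy as np

# Sketch: track feature points in real time and flag large motions as candidate
# behavior transitions. Camera index, tracker choice, and threshold are assumptions.
cap = cv2.VideoCapture(0)
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50, qualityLevel=0.01, minDistance=7)
MOTION_THRESHOLD = 2.0  # mean pixels per frame; a tuning assumption

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good_new = new_pts[status.ravel() == 1].reshape(-1, 2)
    good_old = pts[status.ravel() == 1].reshape(-1, 2)
    if len(good_new) == 0:
        pts = cv2.goodFeaturesToTrack(gray, 50, 0.01, 7)  # re-seed if tracks are lost
        prev_gray = gray
        continue
    motion = np.linalg.norm(good_new - good_old, axis=1).mean()
    if motion > MOTION_THRESHOLD:
        print("possible behavior transition: mean motion %.1f px/frame" % motion)
    prev_gray, pts = gray, good_new.reshape(-1, 1, 2)

In the actual system, the same kind of real-time loop would run over many multi-modal cameras and track richer quantities than sparse image features.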

Why markerless? Soft materials such as human skin (Figure 3), liquids, and granular materials pose chal-

lenges to measurement systems that rely on markers [30]. Imagine putting markers on vegetables being cut up

during food preparation, or on Alzheimer’s patients to assess their needs for assistance in dressing, feeding, or

cleaning themselves. We have found that such patients immediately focus on and pick at the markers.

Why scalable? Currently, we cannot get high resolution images of details like facial expressions or finger

movements if we allow subjects to move around significantly. We need to pre-plan where the high resolution

measurements should be, instead of following the subjects. Handling quick captures is relatively easy, in that

we can start capture well before an event of interest and stop capture well after it. Capturing long duration behaviors, and capturing continuously 24/7, creates a flood of data. We are able to compress the data in real time using standards like H.264, but to reduce the flood further we need to do compression and data forgetting that is tuned to the experiment being performed or the hypothesis being tested.
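As one hedged illustration of experiment-tuned data forgetting (the interest score, event window, and thinning factor below are assumptions, not the system's actual policy), the following Python sketch keeps every frame near events an experiment flags as interesting and only every Nth frame elsewhere.

# Sketch of hypothesis-tuned "data forgetting": retain full frame rate near
# interesting events, thin everything else. All parameters are illustrative.
def select_frames(timestamps, interest_scores, threshold=0.8, window=2.0, keep_every=30):
    """Return the indices of frames to retain."""
    event_times = [t for t, s in zip(timestamps, interest_scores) if s >= threshold]
    keep = []
    for i, t in enumerate(timestamps):
        near_event = any(abs(t - e) <= window for e in event_times)
        if near_event or i % keep_every == 0:
            keep.append(i)
    return keep

A per-frame interest score could come from a detector relevant to the hypothesis being tested (for example, a contact event or a particular gesture), so what is forgotten depends on the experiment.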

Why multi-modal? Human manipulation behavior and social interaction often involve touching and forces. We need ways to measure contact and forces, especially in situations where we can't easily use kinematic

measurements to estimate them. A more complete measurement and understanding of physiological variables

helps us better understand human motion, and also other factors such as human emotion.

What is captured? Our system will include measurement-at-a-distance components such as cameras, ther-

mal imaging, microphone arrays, and radar imaging and velocity measurement, contact measurement compo-

nents such as traditional strain gage instrumentation and resistive and capacitive touch and force sensors as well

as optically measured deformation of elastic materials in contact [29], and physiological measurement compo-

nents that range from current worn devices such as Fitbits and heart monitors to electromyographic sensing and

ultrasound and radar imaging of internal tissues such as muscles and bones [7, 23]. Novel challenges include in-

tegrating this wide range of multi-modal sensors, capturing behavior involving soft deformable objects, liquids,

and granular materials (such as salt, sugar, and flour used in cooking), tracking internal tissue movement using

ultrasound and radar, creating a system that is easy to deploy, calibrate, use, and maintain, and that provides

results quickly and conveniently, and, most importantly, is accepted or ignored by subjects.

Who will use the system? A good predictor of who will use the proposed Mobile Behavior Capture system is who already uses CMU's Motion Capture Lab and Panoptic Studio. Research, courses, and independent student projects from the Robotics Institute (robot and drone testing, robot programming studies such as robots learning from imitation or demonstration, human-robot interaction studies, human motion capture for animation, computer vision research and obtaining ground truth measurements, humanoid robot real time control studies as well as ground truth measurements, and human behavior research), Drama (research and teaching how to act for and use motion capture), Art (research and teaching animation), the Entertainment Technology Center (ETC) (research and teaching how to use motion capture), and Biomedical Engineering (research on disease, therapies, and medical devices) use the Motion Capture Lab and Panoptic Studio.

Figure 4: Tracking humans in the Panoptic Studio.

Disney Research Pittsburgh and Boston

Dynamics are two companies that made use of the Motion Capture Lab recently. Perhaps the biggest users so

far are from the computer graphics, animation, and vision communities worldwide. Data made available on the

web has been acknowledged in several hundred papers.

Where has funding come from? To predict future funding, we review past funding. The Motion Capture Lab and various versions of the Panoptic Studio have helped us obtain the following NSF funding (total $18,516,823). The titles of these awards give an indication of the breadth of research supported by our existing capture facilities: NSF Young Investigator: Coordination and Control of Dynamic Physical Systems,

Algorithms for Automating the Animation of Human Motion, 0196221, $7,740; CADRE: Digital Muybridge:

A Repository for Human Motion Data, 0196217, $1,253,648; Programming Entertainment Robots 0203912,

$66,000; ITR: CareMedia: Automated Video and Sensor Analysis for Geriatric Care, 0205219, $2,131,000;

ITR: Providing Intuitive Access to Human Motion Databases, 0205224, $528,000; Collaborative Research Re-

sources: An Experimental Platform for Humanoid Robotics Research, 0224419, $1,015,000; CISE Research

Instrumentation: Data-Driven Modeling for Real-Time Interaction and Animation, 0242482, $48,394; ITR:

Human Activity Monitoring Using Simple Sensors, 0312991, $338,646; ITR: Collaborative Research: Using

Humanoids to Understand Humans, 0325383, $1,484,667; ITR Collaborative Research: Indexing, Retrieval,

and Use of Large Motion Databases; 0326322, $1,454,000; Collaborative Research: DHB: Human Dynamics

of Robot-Supported Collaborative Work, 0624275, $544,000; Data-Driven Animation of Skin Deformations,

0702556, $349,000; Exploring the Uncanny Valley, 0811450, $362,000; Approximate Dynamic Programming

Using Random Sampling, 0824077, $348,199; II-EN The Human Virtualization Studio: From Distributed

Sensor to Interactive Audiovisual Environment, 0855163, $600,000; RI: Small: Spacetime Reconstruction of

Dynamic Scenes from Moving Cameras, 0916272, $445,771; RI: Medium: Collaborative Research: Trajec-

tory Libraries for Locomotion on Rough Terrain, 0964581, $699,879; CPS: Medium: Collaborative Research:

Monitoring Human Performance with Wearable Accelerometers, 0931999, $1,206,078; Collaborative Research:

Computational Behavioral Science: Modeling, Analysis, and Visualization of Social and Communicative Be-


havior, 1029549, $1,531,518; EAGER: 3D Event Reconstruction from Social Cameras, 1353120, $216,000;

RI: Medium: Combining Optimal and Neuromuscular Controllers for Agile and Robust Humanoid Behavior,

1563807, $1,000,000; SCH: EXP: Monitoring Motor Symptoms in Parkinson’s Disease with Wearable Devices,

1602337, $678,850; RI: Small: Optical Skin For Robots: Tactile Sensing and Whole Body Vision, 1717066,

$440,000; and NRI: INT: Individualized Co-Robotics, 1734449, $1,500,000.

This list does not include larger group grants such as IGERT: Interdisciplinary Research Training in Assistive

Technology, 0333420, $3,718,105; and Quality of Life Technology Engineering Research Center, 0540865,

$29,560,917. Disney Research Pittsburgh also provided substantial funding, as did DARPA.

Where will new funding come from? We expect the transformative capabilities of this facility: portabil-

ity, mobility, multiple foveation, real time, markerless, scalable, and multi-modal, to attract new sources of

funding for new types of research, as well as further NSF support. As an example of relevant funding, this

proposal complements a recently funded NSF Expeditions project which includes CMU, Computational Photo-Scatterography: Unraveling Scattered Photons for Bio-imaging. This project will develop optical wearable

devices to make real-time physiological measurements. The proposed capture system would be useful for each

of Atkeson’s current awards: “RI: Medium: Combining Optimal and Neuromuscular Controllers for Agile

and Robust Humanoid Behavior,” 1563807, $1,000,000, to assess robot performance and support new kinds of

feedback for robot control; “RI: Small: Optical Skin For Robots: Tactile Sensing and Whole Body Vision,”

1717066, $440,000, to evaluate robot skin as well as provide ground truth data; and "NRI: INT: Individualized Co-Robotics," 1734449, $1,500,000, to assess exoskeleton performance and support new kinds of feedback for exoskeleton control. It would also be useful for Hodgins' project "SCH: EXP: Monitoring Motor Symptoms in Parkinson's Disease with Wearable Devices," 1602337, $678,850, to provide additional monitoring information

as well as ground truth data. Atkeson is involved in a study of cheetah motor control in South Africa using

motion capture. Although it is not realistic to suggest this capture system would be sent to Africa, it is clear

that a portable motion capture system would be very useful for animal studies in the wild. In terms of the im-

mediate future, CMU and University of Pittsburgh faculty are also preparing an NSF Science and Technology

Center (STC) proposal on Understanding Action Through High Definition Behavioral Analysis. The goal is to

use high quality behavioral monitoring of humans and animals to better understand the neuro-biological basis

of behavior. This proposal has been selected and will be submitted by CMU this year. In addition to funding

similar to the previous funding we have received, we expect to be able to develop new funding sources due to the

coordinated visual, contact, and physiological capture in the areas of wearable devices for medical, entertain-

ment, and other purposes, soft robotics, including robot skin and sensors, and to support a national facility for

exoskeleton and robot evaluation. We hope to be more successful in seeking NIH funding for wearable devices

for preventive medicine and therapy.

Who will be the future users? We expect the Mobile Behavior Capture system to be used in similar ways as

well as new ways we cannot predict that take advantage of its transformative capabilities. We expect the Mo-

bile Behavior Capture system to be a national facility, with CMU hosting visiting researchers, and researchers

worldwide using our data. We expect our data repositories to be widely used by diverse research communities

worldwide, as our current repositories are.

Specific users during development: The development process will be stimulating for the co-PIs, the graduate

students supported by this award, and other students of the co-PIs or involved in the development of this system.

These researchers will largely be from the Robotics Institute and the Machine Learning Department.

Specific anticipated users: As the new system comes online, we expect the pattern of usage described

above to continue: 10s of faculty, postdocs, and graduate students and hundreds of undergraduates in various

courses and projects, as well as visitors and a large number of researchers using the captured data. In addition

to the co-PIs and their students, we expect users from the human-robot interaction (HRI) research community

such as Henny Admoni, a new faculty member in the Robotics Institute (RI). We expect users from the robot

manipulation research community such as Katharina Muelling, Oliver Kroemer and David Held, also new fac-

ulty in the RI. Nancy Pollard, who works on robot hand design, will be a user. Carmel Majidi, soft robotics,

is a likely user. The researchers involved in the NSF STC proposal on Understanding Action Through High Definition Behavioral Analysis are likely users: Deva Ramanan, RI, Machine Vision; Mike Tarr, Psychology,

Human behavior; Rita Singh, Language Technologies Institute (LTI), Human verbal behavior; LP Morency, LTI,

Human affect from video and audio; Maysam Chamanazar, Biomedical Engineering, Devices/recording tech-


nology; Marios Savides, ECE, Behavior/biometrics/face and posture recognition; Pulkit Grover, ECE, Devices

including dense array EEG; Hae Young Noh, Civil Engineering, Vibration sensors for detecting human actions;

Brooke Feeney, Psychology, Human social behavior; Nathan Urban, U. Pitt. Neurobiology; Avniel Ghuman, U.

Pitt. Neurosurgery; Julie Fiez, U. Pitt. Psychology; and Doug Weber, U. Pitt. BioEngineering, Behavior and

brain-computer interfaces. Rory Cooper and other members of the University of Pittsburgh School of Health

and Rehabilitation Science will be users. This is in addition to continued use by the CMU Drama and Art

Departments and the ETC.

Prior Work and Results from Prior NSF Support

Before discussing NSF support in the last five years, we would like to present earlier research done with NSF

support. 30 years ago Atkeson developed special purpose video hardware to track colored markers in real time,

leading to a spinoff company [14]. This system was used to measure human movement as well as supporting re-

search on visually guided robot behavior and robot learning by watching human teachers. It was clear, however,

that many other aspects of human behavior needed to be measured beyond just optical tracking of movement.

20 years ago at Georgia Tech, Atkeson co-led the construction of the Aware Home (Figure 2), which was in-

strumented with cameras and radio-frequency tracking systems to support research on how technology can help

older adults and people with disabilities live independently in their own homes as long as possible, as well as

other human-technology interaction issues [24]. At CMU he developed prototype behavior measurement sys-

tems using floor vibration and elements of commonly available home security systems such as motion detectors.

Based on this work, it became clear that to observe natural behavior, one had to capture it “in the wild”, rather

than in a zoo or laboratory setting. Living in someone else’s home for a few days of a study, or even sleeping

in a sleep lab for a night, leads to unnatural behavior. Also, many subjects, especially older adults or people

with disabilities, were not willing or able (due to mobility or transportation issues) to travel to or participate in

a lab study, no matter how naturalistic it was. People change their behavior when they are out of their natural

environment (e.g., home, work, school, etc.). Robots and smart or assistive environments, and other intelligent

interactive systems need to be developed and tested with actual end users in their natural environments.

As an example of the philosophy that we must capture behavior 24/7 in the wild, 15 years ago Atkeson

helped lead the instrumentation of a skilled nursing facility (an Alzheimer’s unit) as part of the CMU Care-

Media project [1]. Patients in this unit were not able to describe other medical problems or side effects of

medications they were already taking. We developed a camera and microphone network to measure behavior

such as locomotion, eating, and social interaction to try to identify medical problems and drug side effects (Fig-

ure 2). This deployment forced us to address challenges such as automatically processing large amounts of data

to find rare or sparse events, as well as acceptance, privacy, and ethical issues. This work is closely related

to the CMU/Pitt work on using facial capture to track medical issues such as depression, which is best done

continuously 24/7 in the wild [25].

This work led to educational support in the form of an NSF IGERT on Interdisciplinary Research Training

Opportunities in Assistive Technology at CMU and the University of Pittsburgh (PIs Atkeson and Cooper). One

of our guiding principles was that students learn more when they need to leave the academic campus and collect

data and test their systems in the real world. The work also led to a CMU/Pitt NSF Engineering Research Center

on Quality of Life Technology.

15 years ago Hodgins established the CMU Motion Capture Lab, which has collected and provided move-

ment data for a wide range of research ([11], which has been acknowledged in several hundred papers). With

the support of the NSF Engineering Research Center on Quality of Life Technology, this lab made available

movement data and other forms of behavior capture [12]. Similarly, the 10 year old CMU Panoptic Studio built

by Sheikh, an enclosed space instrumented with hundreds of cameras of several types, has provided data on and

software for measuring social interactions ([17], Figures 3, 4, and 5). Narasimhan at CMU is currently develop-

ing outdoor behavior capture devices to be installed on light and sign poles throughout the city of Pittsburgh [8].

A prototype of the capture station is in front of Newell Simon Hall at CMU. One goal is to provide a test

bed for many types of research, including urban planning and policy, as well as smart transportation projects such as CMU's Traffic 21 [5], CMU's real-time bus tracking [22], and Pittsburgh's many autonomous transportation companies. Another goal is to extend current automobile traffic behavior capture techniques to pedestrians and street life.

Figure 5: OpenPose is widely used human tracking software that came out of the work on the Panoptic Studio [6, 3, 18].

CMU has some of the best facilities for optically tracking movement (the geometry of behavior) in the world.

However, multi-modal sensing technology has advanced greatly in the past decade, driven by the popularity of

smart phones, and we still have not achieved our goal of ubiquitous unobtrusive 24/7 behavior capture in the

wild.

In terms of a large NSF equipment award, Hodgins was PI and Atkeson was a co-PI for “Collaborative

Research Resources: An Experimental Platform for Humanoid Robotics Research”, 0224419, $1,015,000. This

equipment award was for the development of a humanoid robot in collaboration with a company, Sarcos. This

development was successful, and the robot continues to be an important component of CMU’s research in hu-

manoid robotics. Approximately 10 students, 50 papers, several NSF awards totaling approximately $3,500,000,

and significant DARPA support were enabled by this equipment award.

The most relevant recent award for Atkeson is: (a) NSF award: IIS-1717066 (PI: Atkeson); amount: $440,000; period: 8/1/17 - 7/31/20.

(b) Title: RI: Small: Optical Skin For Robots: Tactile Sensing and Whole Body Vision

(c) Summary of Results: This recent grant has supported work on developing optical approaches for tactile sensing as well as whole body vision (eyeballs all over the body).

Intellectual Merit: This project will enable robots to feel what they touch. The key idea is to put cameras

inside the body of the robot, looking outward at the robot skin as it deforms, and also through the robot skin

to see nearby objects as they are grasped or avoided. This approach addresses several challenges: 1) achieving

close to human resolution (a million biological sensors) using millions of pixels, 2) reducing occlusion during

grasping and manipulation, and detecting obstacles before impact, and 3) protecting expensive electronics and

wiring while allowing replacement of worn out or damaged inexpensive skin. Technical goals for the project

include first building and then installing on a robot a network of about 100 off-the-shelf small cameras (less than

1 cubic centimeter) that is capable of collecting information, deciding what video streams to pay attention to, and

processing the video streams to estimate forces, slip, and object shape. A transformative idea is to aggressively

distribute high resolution imaging over the entire robot body. This reduces occlusion, a major issue in perception

for manipulation. Given the low cost of imaging sensors, there is no longer a need to restrict optical sensing

to infrared range finders (single pixel depth cameras), line cameras, or low resolution area cameras. Building

a camera network of hundreds of cameras on a mobile skin, and building a multi-modal sensing skin, will be

highly synergistic with developing the proposed mobile behavior capture system.
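As one hedged illustration of how internal cameras watching a deforming skin could yield contact estimates (this is a sketch, not the project's calibrated method; the blob detector settings, the nearest-neighbour matching, and the stiffness constant are assumptions), the Python sketch below derives a rough force proxy from the displacement of dots printed on an elastic skin between a reference image and a deformed image.

import cv2
import numpy as np

def dot_centers(gray):
    # Detect dark dots printed on the skin as blobs (8-bit grayscale image assumed).
    params = cv2.SimpleBlobDetector_Params()
    params.filterByArea = True
    params.minArea = 5
    detector = cv2.SimpleBlobDetector_create(params)
    return np.array([kp.pt for kp in detector.detect(gray)], dtype=np.float32)

def force_proxy(reference_gray, deformed_gray, stiffness=0.05):
    ref, cur = dot_centers(reference_gray), dot_centers(deformed_gray)
    if len(ref) == 0 or len(cur) == 0:
        return 0.0
    # Nearest-neighbour matching of dots between the reference and deformed frames.
    distances = np.linalg.norm(ref[:, None, :] - cur[None, :, :], axis=2)
    displacement = distances.min(axis=1)
    # Summed dot displacement as a very rough stand-in for normal force.
    return stiffness * float(displacement.sum())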

Broader Impacts: Robots with better sensing can more safely help people. In terms of outreach, we are

developing a robot museum, as described in the broader impacts portion of this proposal.

Development of Human Resources. The project involves one graduate student. We have weekly individual

meetings and weekly lab meetings. The graduate student is performing research, making presentations to our

group, and will give conference presentations and lectures in courses. We will put the graduate student in a

position to be a success in academia and industry.

(d) Publications resulting from this NSF award: [31].

(e) Other research products: We have made instructions on how to build our tactile sensors available on the

web.


(f) Renewed support. This proposal is not for renewed support.

The most relevant recent award for Sheikh is: (a) NSF award: IIS-1353120 (PI: Sheikh); amount: $216,000;

period: 9/15/13 - 8/30/15.

(b) Title: EAGER: 3D Event Reconstruction from Social Cameras

(c) Summary of Results: This award supported work on combining information from uninstrumented unsyn-

chronized mobile cameras.

Intellectual Merit: This EAGER project helped establish a new area of visual analysis by providing the

requisite framework for social activity understanding in 3D rather than in 2D. It explored the use of social

cameras to reconstruct and understand social activities in the wild. Users naturally direct social cameras at

areas of activity they consider significant, by turning their heads towards them (with wearable cameras) or by

pointing their smartphone cameras at them. The core scientific contribution of this work is the joint analysis

of both the 3D motion of social cameras (that encodes group attention) and the 3D motion in the scene (that

encodes social activity) towards understanding the social interactions in a scene. A number of internal models

(such as maximizing rigidity or minimizing effort) for event reconstruction were investigated to address the

ill-posed inverse problems involved.
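The flavor of this joint analysis can be written down as a least-squares problem that combines reprojection error from several cameras with a minimum-effort (smoothness) prior on the reconstructed 3D trajectory. The Python sketch below is a simplified, hedged illustration: the known projection matrices, the data layout, and the weighting are assumptions, not the published formulation, which also estimates the camera motion itself.

import numpy as np
from scipy.optimize import least_squares

def project(P, X):
    """Project 3D points X (T, 3) with a 3x4 camera matrix P."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

def residuals(flat_X, cameras, observations, smooth_weight=1.0):
    X = flat_X.reshape(-1, 3)                 # one 3D point per time step
    res = [(project(P, X) - obs).ravel()      # reprojection error per camera
           for P, obs in zip(cameras, observations)]
    # "Minimum effort" prior: penalize second differences of the trajectory.
    res.append(smooth_weight * np.diff(X, n=2, axis=0).ravel())
    return np.concatenate(res)

def reconstruct(cameras, observations, X0):
    """cameras: list of 3x4 matrices; observations: list of (T, 2) pixel tracks."""
    sol = least_squares(residuals, X0.ravel(), args=(cameras, observations))
    return sol.x.reshape(-1, 3)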

Broader Impacts: The ability to analyze social videos in 3D space and time provides useful tools for

almost any activity that involves social groups working together, such as citizen journalism, search-and-rescue

team coordination, or collaborative assembly teams. The project was integrated with education through teaching

and student training, and collaborated with industry.

Development of Human Resources. The project supported one graduate student.

(d) Publications resulting from this NSF award: [4, 26, 10, 2, 16, 20].

(e) Other research products: This work contributed to the publicly available Openpose software (Figure 5).

(f) Renewed support. This proposal is not for renewed support.

The most relevant recent award for Hodgins is: (a) NSF award: IIS-1602337 (PI: Hodgins); amount: $678,850; period: 9/1/16 - 8/31/19.

(b) Title: SCH: EXP: Monitoring Motor Symptoms in Parkinson’s Disease with Wearable Devices

(c) Summary of Results: This project aims to promote a paradigm shift in PD management through in-home

monitoring using wearable accelerometers and machine learning (Figure 10). Novel algorithms and experimen-

tal protocols are developed to allow for robust detection and assessment of PD motor symptoms in daily living environments.

Intellectual Merit: Specifically, this project develops algorithms for weakly-supervised learning, time se-

ries analysis, and personalization of classifiers. This project collects long-term (several weeks), in-home data

where the participants’ actions are natural and unscripted. Participants use a cell phone app to label their

own data, marking segments of time as containing or not containing the occurrence of a PD motor symptom.

This project extends multiple-instance learning algorithms for learning from weakly-labeled data in time series.

Additional major technical challenges include detection of subtle motor symptoms and local minima during

optimization. To further increase robustness and generalization, this project explores the use of personalization

algorithms to learn person-specific models of motor symptoms from unsupervised data. The proposed tech-

niques for weakly-supervised learning and personalization are general, and they can be applied to other human

sensing problems.
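A hedged sketch of this weak-label setup is given below: a patient marks a whole segment (a "bag" of short accelerometer windows) as containing a symptom or not, only the bag label is trusted, and a simple max-pooling baseline calls a bag positive if any window looks symptomatic. The features and the linear scorer are illustrative assumptions, not the project's algorithms.

import numpy as np

def window_features(window):
    # window: (n_samples, 3) accelerometer data -> a few summary statistics.
    mag = np.linalg.norm(window, axis=1)
    return np.array([mag.mean(), mag.std(), np.abs(np.diff(mag)).mean()])

def bag_score(windows, weights, bias):
    # Multiple-instance max pooling: a bag is as symptomatic as its worst window.
    return max(weights @ window_features(w) + bias for w in windows)

def predict_bag(windows, weights, bias, threshold=0.0):
    return bag_score(windows, weights, bias) > threshold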

Broader Impacts: This project aims to promote a paradigm shift in Parkinson’s Disease (PD) management.

This disease poses a serious threat to the elderly population, affecting as many as one million Americans.

Costs associated with PD, including treatment, social security payments, and lost income from inability to

work, are estimated to be nearly $25 billion per year in the United States alone. The current state of the art

in PD management suffers from several shortcomings: (1) frequent clinic visits are a major contributor to the

high cost of PD treatment and are inconvenient for the patient, especially in a population for which traveling

is difficult; (2) inaccurate patient self-reports and 15-20 minute clinic visits are not enough information for

doctors to accurately assess their patients, leading to difficulties in monitoring patient symptoms and medication

response; and (3) motor function assessments are subjective, making it difficult to monitor disease progression.

Furthermore, because they must be performed by a trained clinician, it is infeasible to do frequent motor function

assessments. This project explores how we can do better using wearable devices to monitor this disease.

Development of Human Resources. The project supported one graduate student.


Figure 6: Left: Two visible light cameras combining multiple lenses and image chips. Right: Three RGBD

cameras combining infrared illumination, a visible light camera, and an infrared camera (a single camera, a

stereo pair or a time-of-flight depth measurement).

(d) Publications resulting from this NSF award: [32].

(e) Other research products: None yet.

(f) Renewed support. This proposal is not for renewed support.

There is no prior NSF support for Katerina Fragkiadaki.

State of the Art in Mobile Human Behavior Capture

Robot cameras were pioneered in the making of 2001: A Space Odyssey and the first Star Wars movie [28].

Robotic pan/tilt mounts for cameras are now commonly available. There are commercial sources for pan/tilt

robot cameras on trolleys to provide mobile capture for television and movie studios as well as mobile capture

for sports events (such as track and field events). These trolleys are limited to tracks on the ground, floor, or

ceilings. Companies such as Bot&Dolly, Ross, Telemetrics, Mark Roberts Motion Control, and Camerobot

Systems sell cameras mounted on robot arms for six degree of freedom camera position and orientation control

(within the workspace of the robot arm, which is often quite limited). Consumer-level robots such as Jibo and

mobile robots such as Pepper can serve as (slow) camera platforms. Telepresence robots also serve as camera

platforms. There are a few “selfie-bots” that can move autonomously. Remotely controlled and autonomous

flying drones with cameras are widely available. Omnidirectional cameras are used to capture for virtual reality

playback where only orientation can be controlled by the viewer. There are marker-based motion capture sys-

tems such as Vicon that offer a real time tracking option as well as provide support for capturing simultaneous

physiological measurements such as EMG. Multi-modal cameras are common in RGBD cameras (Figure 6),

which typically have a visible light camera, an infrared illuminator or pattern projector, one or more infrared

cameras, and a microphone, and in the form of robot heads, which typically combine some of visible light

stereo imaging, infrared depth measurement, visible light and infrared illumination, microphone arrays, and

infrared lidar. Self-driving cars add radar to the mix. However, we are not aware of any integrated system with

multiple mobile camera platforms that combines the attributes we will develop: markerless, interactive, scalable, multi-modal behavior capture. One technological wave we are surfing is driven by smart phones and the race

to provide smart phone-based sensing such as optical, depth (Figure 7), thermal, ultrasound, and radar imaging

(Figure 8). Another technological wave is the development of miniaturized “soft” electronics for wearable sen-

sors (Figure 12). We are adopting the philosophy of including as many of these sensors as we can, limited by

cost, so that the system can support a wide range of studies we are planning, as well as those we cannot predict

at this time.

Research Needs

We will expand on the discussion in the initial section of what motivates each of the desired attributes of the

system we will develop: portable, self-mobile, markerless, interactive, scalable, multi-modal behavior capture. This proposal is based on our experience with our Panoptic Studio (Figures 3 and 4) and our Motion Capture Lab. We are frustrated with the limited sensing volume provided by a 6m diameter dome, or any markerless capture system. The scale of the behavior and the number of available cameras define the camera arrangement, achievable

sensing volume, and resulting spatial resolution. We cannot simultaneously capture fine scale behavior such


Figure 7: Depth images from time of flight imagers.

as facial expressions, finger movements, or gestures like a slight shrug, while capturing large scale behavior

such as dance, running, or just walking around. Our cameras are fixed, with a fixed orientation, and a fixed

lens setting. Often the behavioral context is unnatural: it is like being in a children’s playhouse or treehouse, or

living in the currently popular “tiny houses”. We are also tired of putting hundreds of tiny markers on subjects

in the Motion Capture Lab to capture facial expressions and skin deformation. This process takes a long time at

the start and end of each capture session, is cumbersome for the subject, and fundamentally limits the resolution

of what can be captured.

We propose to build a Behavior Capture System to address the above needs, as well as other needs that are

not met by our current behavior capture systems: portability and flexibility so behavior can be captured in its

natural domain (such as in the home, or at a sports event or concert) rather than in a lab setting, spatial scalability so that fine as well as gross behaviors can be captured, temporal scalability so that long duration or continuous

24/7 behaviors can be captured, mobility during capture so the fovea(s) of the system can accurately track a

moving locus of behavior, and greater multi-modality to more fully capture human and robot behavior including

devices to capture touch, contact force, and physiological behavior as well as behavior at a distance. We will

go beyond capturing rigid bodies to capturing the behavior of deformable bodies, liquids, and granular materi-

als. We will integrate measurements of internal tissue movement and changes along with other physiological

measurements.

Description of the Research Instrument

Our design involves: a) A modular system of Multi-Modal Cameras (MMCams), arrays of measurement-at-a-distance components: multiple coordinated visible light imaging devices, optical depth measurement devices,

microphone arrays, thermal imaging, radar imaging and distance and velocity measurement, other radio fre-

quency and electric and magnetic field measurements, and lighting for night and low light situations.

b) The MMCams can be assembled, calibrated, and synchronized into various size groups to match the

capture environment. We will use the modular panels of the Panoptic Studio dome as the starting point for

our design. These panels hold a mix of high resolution and low resolution cameras, a depth camera, and

synchronization hardware (Figure 3).

c) The MMCams can be mounted on robots, vehicles, drones, and even the subjects themselves to track

mobile behavior. We plan to purchase omnidirectional mobile robot bases to explore this capability (Figure 9).

d) Additional contact measurement components, which include traditional strain gage instrumentation and resistive and capacitive touch and force sensors embedded in manipulated objects and surfaces such as furniture, appliances and their controls, floors, and walls, as well as extending our work on optically measured deformation of materials in contact [29]. These components will also be mountable on subjects. We have extensive experience attaching small accelerometers, gyroscopes, and magnetic field sensors to humans and robots, for example. We also will explore mounting very small cameras on subjects to augment the capture system's views and reduce occlusion (Figure 9).

Figure 8: Phone-based thermal imagers (2), an ultrasound imager, and phone-based radar.

Figure 9: Left: A Segway omnidirectional robot base. Middle: Currently available small cameras such as the Naneye (1x1x1mm). Right: Stereo Naneye to also measure depth.

e) Additional physiological measurement components which generalize from current worn devices such as

Fitbit and heart monitors to include electromyographic activity and recently developed ultrasound imaging of

internal tissues such as muscle fiber movement [7, 23]. We will extend newly available radar and ultrasound

chips and cell phone-based devices such as those used for gesture recognition to physiological measurements

such as breathing, heartbeat, and muscle state [9]. Hodgins’ work on monitoring Parkinson’s Disease using

wearable devices is an example of using physiological sensors (Figure 10) [32]. Markvicka and Majidi at CMU

are developing small bandaid-like physiological sensors that we plan to use (Figure 12) [15].

Real time tracking will be included to support tracking mobile behaviors as well as supporting studies

of interactive robots and real-time robot control and learning. Integration of many multimodal sensors will

be a major emphasis. This includes time synchronization as well as convenient calibration and rectification

of different measurements. For example, we will combine optical, audio, contact and force, and vibration

measurements to understand how an older adult with a motor disability such as tremor can more effectively

utilize current tablet and phone technology. People with motor difficulties repeatedly “press” or touch the

wrong area of the screen, and are unable to undo or recover. Many older adults cannot use ride services like Uber, which they desperately need once they have lost their driver’s license, because they cannot operate the smartphone interface.
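As a concrete illustration of the temporal alignment step (a minimal sketch we include here for exposition, not the system’s actual synchronization code; the sampling rates and signals below are placeholders), a lower rate contact-force stream can be resampled onto the video clock once all devices share a global time base:

import numpy as np

def align_to_reference(ref_times, src_times, src_values):
    """Linearly interpolate a source stream at the reference timestamps (seconds)."""
    return np.interp(ref_times, src_times, src_values)

# Hypothetical example: a 30 Hz contact-force signal aligned to a 100 Hz video clock.
video_times = np.arange(0.0, 1.0, 1.0 / 100.0)        # 100 Hz reference clock
force_times = np.arange(0.0, 1.0, 1.0 / 30.0)         # 30 Hz contact-force samples
force_values = np.sin(2 * np.pi * 2.0 * force_times)  # synthetic 2 Hz force signal

force_on_video_clock = align_to_reference(video_times, force_times, force_values)
print(force_on_video_clock.shape)                     # one force estimate per video frame: (100,)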

Our existing software base is described in [13]. We have developed a method to automatically reconstruct

full body motion of interacting multiple people. Our method does not rely on a 3D template model or any

subject-specific assumptions such as body shape, color, height, and body topology. Our method works robustly in various challenging social interaction scenes with an arbitrary number of people, producing temporally coherent

time-varying body structures. Furthermore, our method is free from error accumulation and, thus, enables

capture of long term group interactions (e.g., more than 10 minutes). Our algorithm is designed to fuse the weak

Figure 10: Left: A wearable accelerometer system. Right: A Parkinson’s patient whose tremor is being

monitored by cameras and wearable accelerometers (red circles) [32].


Figure 11: Several levels of proposals generated by our method. (a) Images from up to 480 views. (b) Per-

joint detection score maps. (c) Node proposals generated after non-maxima suppression. (d) Part proposals by

connecting a pair of node proposals. (e) Skeletal proposals generated by piecing together part proposals. (f)

Labeled 3D patch trajectory stream showing associations with each part trajectory. In (c-f), color indicates the joint or part labels shown below the figure.

perceptual processes in the large number of views by progressively generating skeletal proposals from low-

level appearance cues, and a framework for temporal refinement is also presented that associates body parts with the reconstructed dense 3D trajectory stream (Figure 11). Our system and method are the first to reconstruct full body motion of more than five people engaged in social interactions without using markers. We also empirically

demonstrate the impact of the number of views in achieving this goal.
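To make the flow of this coarse-to-fine proposal generation concrete, the following minimal sketch (a toy illustration with placeholder thresholds and a two-joint skeleton that we introduce here, not the implementation of [13]) shows how per-joint score maps can be turned into node, part, and skeletal proposals:

# Illustrative sketch: per-joint score maps -> node proposals (non-maximum suppression)
# -> part proposals (pairs of nodes) -> skeletal proposals (greedy assembly).
import numpy as np

def node_proposals(score_map, threshold=0.5, radius=1):
    """Non-maximum suppression on a 2D score map: keep local maxima above threshold."""
    peaks = []
    h, w = score_map.shape
    for y in range(h):
        for x in range(w):
            s = score_map[y, x]
            if s < threshold:
                continue
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            if s >= score_map[y0:y1, x0:x1].max():
                peaks.append(((y, x), float(s)))
    return peaks

def part_proposals(nodes_a, nodes_b, max_dist=6.0):
    """Pair node proposals of two adjacent joints; score by detection strength and proximity."""
    parts = []
    for (pa, sa) in nodes_a:
        for (pb, sb) in nodes_b:
            d = np.hypot(pa[0] - pb[0], pa[1] - pb[1])
            if d <= max_dist:
                parts.append((pa, pb, sa + sb - 0.1 * d))
    return sorted(parts, key=lambda p: -p[2])

def skeletal_proposals(parts):
    """Greedily piece together part proposals so that no joint location is reused."""
    used, skeletons = set(), []
    for pa, pb, score in parts:
        if pa not in used and pb not in used:
            skeletons.append((pa, pb, score))
            used.update([pa, pb])
    return skeletons

# Toy example: two synthetic score maps for a two-joint chain (e.g., shoulder-elbow).
rng = np.random.default_rng(0)
map_a = rng.random((16, 16)) * 0.3
map_b = rng.random((16, 16)) * 0.3
map_a[4, 5] = 0.9   # simulated shoulder detection
map_b[8, 6] = 0.8   # simulated elbow detection
parts = part_proposals(node_proposals(map_a), node_proposals(map_b))
print(skeletal_proposals(parts))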

Intellectual Merit

We will address several intellectual and practical challenges in the development of this system. The system will

be a combination of moving sensor panels, so an initial calibration is not adequate. A critical challenge is to continuously calibrate the system so that we are able to integrate information across panels. We will make this problem

easier by marking the panels and the environment and dedicating sensors on each panel to continuously track

neighboring panels and static fiducial markers. We will use optimization to continuously estimate all the panel

locations by minimizing the error in fitting these measurements, as well as the measurements of the subjects,

which also provide calibration information. We will also explore a number of Simultaneous Localization and

Mapping (SLAM) techniques from robotics. Another challenge for a distributed system is synchronization

(sharing a global clock). We will use wired connections rather than wireless when necessary to simplify syn-

chronization. We will also use wired umbilicals to the mobile robots to provide power rather than relying on

cumbersome and dangerous batteries.
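As a simplified illustration of this continuous-calibration idea (a sketch under assumptions we introduce here: planar panel poses, known fiducial positions, and synthetic noise-free measurements; the deployed system would estimate full 6-DoF poses and fuse many more cues), panel poses can be re-estimated every frame by least-squares fitting of their fiducial observations:

import numpy as np
from scipy.optimize import least_squares

FIDUCIALS = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [4.0, 3.0]])  # known static markers (m)

def observe(pose, fiducials):
    """Fiducial positions expressed in a panel's local frame for pose = (x, y, yaw)."""
    x, y, th = pose
    c, s = np.cos(th), np.sin(th)
    R = np.array([[c, -s], [s, c]])
    # world-to-panel rotation is R^T; (p - t) @ R computes R^T (p - t) row-wise
    return (fiducials - np.array([x, y])) @ R

def residuals(flat_poses, observations):
    poses = flat_poses.reshape(-1, 3)
    res = []
    for pose, obs in zip(poses, observations):
        res.append((observe(pose, FIDUCIALS) - obs).ravel())
    return np.concatenate(res)

# Two moving panels with (unknown) true poses; simulate what they would measure.
true_poses = np.array([[1.0, 1.0, 0.2], [3.0, 2.0, -0.4]])
observations = [observe(p, FIDUCIALS) for p in true_poses]

# Start from a rough guess (e.g., last frame's estimate) and re-fit every frame.
guess = true_poses + np.array([[0.3, -0.2, 0.1], [-0.2, 0.3, -0.1]])
sol = least_squares(residuals, guess.ravel(), args=(observations,))
print(sol.x.reshape(-1, 3))  # recovers the true poses up to numerical precision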

Measuring small details like finger movements, facial expressions, and subtle social cues requires high

resolution, but monitoring a large space limits resolution if it is uniform. We plan to address this challenge by

implementing movable cameras and computer controlled lenses. The MMCam components will be mounted

on separate pan/tilt mounts with computer controlled lenses (zoom and focus) and the camera panels will also

be pointable. The mobile bases will be steerable. This introduces a new challenge. We will need to be able to

process enough information in real time to provide guidance information to all these actuators. We will simplify

this problem by using depth cameras to reduce the computational load to track objects in 3D, and thermal

cameras to make it easier to find and track human body parts. We will also take advantage of the continual

improvement of GPUs to process more of the more complex visible light imaging information in real time.
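The pointing computation itself is straightforward; the following sketch (a minimal example with an idealized gimbal that we include for illustration, not the eventual control software) converts a tracked 3D point into rate-limited pan/tilt commands:

import math

def point_to_pan_tilt(x, y, z):
    """Pan/tilt angles (radians) that aim the optical axis at point (x, y, z):
    x forward, y left, z up, pan about +z, tilt positive upward."""
    pan = math.atan2(y, x)
    tilt = math.atan2(z, math.hypot(x, y))
    return pan, tilt

def rate_limited_step(current, target, max_step):
    """Move one control cycle toward the target angle without exceeding the slew limit."""
    delta = max(-max_step, min(max_step, target - current))
    return current + delta

# Example: track a subject detected 3 m ahead, 1 m to the left, 0.5 m above the camera.
pan_goal, tilt_goal = point_to_pan_tilt(3.0, 1.0, 0.5)
pan, tilt = 0.0, 0.0
for _ in range(10):  # ten control cycles with a 0.05 rad/cycle slew limit
    pan = rate_limited_step(pan, pan_goal, 0.05)
    tilt = rate_limited_step(tilt, tilt_goal, 0.05)
print(round(pan, 3), round(tilt, 3), "goal:", round(pan_goal, 3), round(tilt_goal, 3))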

Even with pointable cameras occlusion is a major challenge. In addition, human appearance and configu-

ration variation is immense. Clothes and skin sliding across a muscle are difficult challenges, since the visible

texture moves relative to the actual limb. Usually fixed cameras are pointed to the center of a capture area.

Multiple people in such an area tend to move to the edges, where resolution is lower, just as people in an

elevator move to the walls. We will extend our work on the Panoptic Studio to handle these and other issues that

arise.

Practical challenges include: How do we minimize the cost and hassle of deploying the system? How do

we maximize the acceptance and minimize the invasiveness of the system for subjects? How do we minimize

the cost and hassle of analyzing the data?


We expect our behavior capture work to support the development of a great deal of other behavior recognition and monitoring technology, both in research and for consumers. From a machine learning point of view, our data can be used to train learning systems to process data using many fewer sensors. This may lead to less expensive ways to recognize and track human behavior from devices like phones, sensors placed in the

environment, and wearable technology.

We also expect our system to be used to design and prototype, and potentially train operators for, future

monitoring systems in care facilities (hospitals, nursing homes, assisted living, and homes in general).

Justification for a Development Proposal

How will the end result of the effort be a stable shared-use research instrument, rather than technology development, a device, a product or a technique/protocol? We hope our description above and our track record with building the Panoptic Studio will convince you we can create a stable shared-use research instrument. What significant new capabilities, not available in an instrument provided by a vendor, will the new instrument provide? This is discussed above. In what way does the instrument development require design and development work that must be undertaken or has been undertaken in-house, rather than through readily available/published designs found in the literature? There are companies (particularly in the movie and special effects business) that can

help in setting up many cameras (movie sets) or capturing an expensive stunt. There are robotic camera systems

(as described in the state of the art section). However, these companies are only interested in capturing beautiful

images and videos, not in data or accurate measurement, and they are out of the typical academic’s price range.

We have the expertise to combine imaging, contact, and physiological measurements into useful integrated data.

To what extent does the instrument development require/benefit from a team of scientists/engineers/technicians that bring a variety of skills to the project? This work combines expertise in computer vision, robotics, hardware, instrumentation, and physiological sensors. For what activities does the instrument development require a significant number of person-hours, more so than simple “assembly” of purchased parts? See the challenges listed in the section on Intellectual Merit. To what extent does the instrument development require time-frames for completion that are longer than are required for plug-and-play or assembled instruments? We expect to

spend a full year designing the measurement-at-a-distance component, and a full year constructing that part. We

expect to spend another year perfecting and implementing contact and physiological sensing systems. The final

year focuses on evaluation and refinement of the integrated system. Does the instrument development require the use of a machine shop or a testbed to fabricate/test unique components? Yes. Does the instrument development effort involve risks in achieving the required specifications, and what is the risk mitigation plan? Risks are listed in the section on Intellectual Merit. Risk mitigation is achieved by using a wide variety of sensors to simplify computation, and by dedicating some sensors to continuous self-calibration of the system.

Management Plan

The mobile behavior capture system will be housed in, and deployable from, a number of rooms at CMU, including

the existing Motion Capture Lab. The system will be operated as needed. We expect it will be used daily, with

downtimes caused by a need to transport the system long distances to a subject’s home or a rehabilitation facility,

for example. The system will be maintained by the developers. The computers will be maintained by CMU’s

Information Technology staff. We will allocate instrument time in the same way we do for current facilities: a simple web-based signup system. The students supported by the CMU cost sharing funding will assist

new users. As the system comes online, we will advertise it both locally and nationally on appropriate email

lists.

Organization of the project team: We will have three tracks: measurement at a distance component

development, system integration, and wearable and deployed sensor design and fabrication. Development of

the measurement-at-a-distance MMCams will be led by Katerina Fragkiadaki, a new professor who works on

machine learning for computer vision. System integration, development of the robotics aspects of the system, and development of the contact, force, and physiological sensors will be led by the PI, Chris Atkeson. We will use

our technical expertise developed in building the Panoptic Studio (Sheikh) and Motion Capture Lab (Hodgins).


Figure 12: Smart adhesive patch. Photographs of three variations of the smart adhesive patch including, from

left to right, an accelerometer, pulse oximeter, and pressure sensor. Each patch includes a coin cell battery,

power regulation, wireless processor, and a digital (or analog) sensor that is assembled on a medical grade

adhesive. (left) For size comparison an adhesive bandage is included (Tough-Strips, Band-Aid) [15].

This work will require, and Atkeson has, extensive experience in robot design, system integration, and multi-modal sensor

design.

In the first year we will perform the detailed design of the system, and build prototypes of the MMCams

mounted on omnidirectional mobile robots (see budget justification for more information). We will be able to

evaluate these prototypes by operating them in conjunction with the existing Motion Capture Lab and Panoptic

Studio. In the second year we will build the complete measurement-at-a-distance system, and continue prototyping and evaluating the other parts of the system. In the third year we will build the complete set of other sensors:

contact, force, and physiological measurement systems. We will both integrate and evaluate the system. In the

fourth year we will focus on evaluating our system by testing a variety of experiment designs and other usage

scenarios, and remedying any deficiencies we find. We will do extensive testing against ground truth data as

well as our existing systems, the Motion Capture Lab and the Panoptic Studio.

During the development process the graduate students performing the development will be closely super-

vised by the co-PIs. We will have weekly project meetings where we perform activities such as design reviews,

code walkthroughs, performance assessment, API design, and overall project coordination. The development

group will invite users to these meetings and listen carefully to user feedback.

There are four co-PIs involved in the project, and approximately four graduate students (depending on

recruitment). The graduate students are supported by $800,000 (approximately 8 student-years) CMU cost

sharing support. The co-PIs are supported by other projects.

The design of the mobile behavior capture system is described in the section “Description of the Research

Instrument and Needs”. For the measurement-at-a-distance system we will use the construction techniques used to build the Panoptic Studio and mobile robots in the CMU Robotics Institute. We use both our machine shop and web interfaces to order parts fabricated by others. In year 4 we will certify and then commission the

system.

Project activities include: measurement-at-a-distance: designing and building MMCams, camera panels,

mobile robots, mounting the cameras on the robots, creating the power and signal wiring, getting the computers

functioning, writing the software, analyzing the data, and designing and building calibration and test fixtures.

Include a description of parts and materials, deliverables and estimated schedules and costs for each phase of the project as appropriate.


The total estimated costs for each year are described in the budget section. In the first year we expect

to purchase five computers at approximately $12,000 each (quote from Exxact). These computers are each equipped with four state-of-the-art GPUs. We expect the computers and GPUs we will actually purchase a

year from now will cost about the same but be even more powerful. To network the above computers we will

buy an Infiniband network switch for approximately $8000 (quote from Dell). We also plan to purchase two

mobile bases which we estimate will cost approximately $40,000 each (quote from Segway). This mobile base

is one of the few omnidirectional bases we have found that are fast enough to keep up with human walking

(1.3m/s) and strong enough to carry up to two of the above computers and 8 MMCams. A year from now we will again survey available mobile bases. We have estimated the cost of MMCams by pricing visible

light cameras (a cluster of 6 cameras attached to an NVIDIA board with a TX2 Jetson GPU: Leopard Imaging

LI-JETSON-KIT-IMX477CS-X $1600) and time of flight depth cameras (Basler tof640-20gm 850nm $2340).

There are additional costs for synchronization hardware and other wiring. We have based this cost estimate on

costs we saw building the Panoptic Studio.

In the first year we will also begin to prototype contact, force, and physiological sensors. The costs of

individual components are relatively cheap (in the hundreds of dollars: consumer level thermal, ultrasound, and

radar imaging sensors are typically $200-300). We have based the total costs for this year on our historical costs for this type of development.

We have provided a total estimate for costs involving components less than $5000 each as “fabricated equip-

ment”. For fabricating equipment we have based our estimates on our historical costs for developing this type

of equipment.

In year 2 we will build the full measurement-at-a-distance system, adding 15 computers at $12,000 (quote from Exxact) each and 8 mobile bases at $40,000 (quote from Segway) each. Additional funds are requested for

64 MMCams, and continuing development of contact, force, and physiological sensors.

In years 3 and 4 we focus on building out the full system, developing ground truth testing equipment, and

fixing any design flaws. We expect to develop custom electronics for the contact, force, and physiological

sensors.

We have attempted to reduce risk as much as possible. We are reusing much of the Panoptic Studio design

for the MMCam panel and synchronization hardware. We will dedicate hardware on the mobile elements to

directly measure the location of other elements and fiducials placed around the capture volume, rather than

trying to infer camera position only from image data. We have extensive experience with contact and force

sensors, and will take advantage of Atkeson’s work in the project “RI: Small: Optical Skin For Robots: Tactile Sensing and Whole Body Vision.” We will benefit from the assistance of Carmel Majidi, an expert in soft

sensors in the CMU Mechanical Engineering Department. We will use our weekly meetings to assess new risks and to re-analyze and modify the project plan quarterly to keep it within scope, schedule, and budget.

We have had great success making public most data collected in the Motion Capture Lab (mocap.cs.cmu.edu and kitchen.cs.cmu.edu) and Panoptic Studio (domedb.perception.cs.cmu.edu).

Data made available on the web has been acknowledged in several hundred papers.

Broader Impacts of the Proposed Work

Impact on the research community of interest, and how will we attract and involve other researchers? We

expect to continue to make capture data available on the web. Data made available so far has been acknowledged

in several hundred papers, mostly from the computer graphics, animation, and vision communities worldwide.

This form of usage is freely available to all, including those from non-Ph.D. and/or minority-serving institutions.

We will host visitors who wish to use our facilities, as we do now. As we have described in this proposal, this

instrument development will result in a mobile behavior capture device that is not only unique across CMU, but

also worldwide, making a substantial improvement in our capabilities to conduct leading-edge research as well

as leading edge research training.

Outreach: A major outreach initiative led by Atkeson is the creation of a physical and virtual Robot Mu-

seum. So far we have created physical exhibits on juggling robots, robot actuation (gears vs. direct drive),

mobile robots, soft robots, Steve Jacobsen and Sarcos, robots in literature, legged robots, computer graphics


(Ivan Sutherland), and AI (Newell and Simon). Our next major initiatives are 1) to develop cell phone apps that trigger off augmented reality (AR) tags and robot pictures in halls to provide a self-guided tour of the Robotics Institute, and 2) to use virtual reality (VR) to provide access to our collection from anywhere in the world. We want

anyone to be able to design, build, debug, evaluate, and repair a historical robot in virtual reality. The impact

of our outreach will be increased by a new Disney TV show based on the characters from the Disney movie

Big Hero 6, including the inflatable medical robot Baymax inspired by Atkeson’s work on inflatable robots. We

have coordinated our outreach activities with the larger outreach efforts of CMU’s Robotics Institute to scale

up reach and effectiveness. Our technologies are being shared by being published, and papers and software are

available electronically.

Is student participation in development appropriate? This project will engage several graduate students in

instrument development activities. We miss the old days when students had to make their own oscilloscopes.

We believe this is an excellent way for a new student to get to know the field, while dealing with a concrete set

of problems. We believe this instrument development work will inspire students to ask new questions and enrich

the rest of their graduate and future careers. An excellent example of that is the students who participated in

developing the Panoptic Studio. Their work led to many papers, participation in workshops sharing information

between related groups, and prominence in their fields [13, 21, 4, 19, 27, 26, 10, 16, 20].

Participation of Underrepresented Groups: Two of the co-PIs of this proposal are female. We expect this

will encourage female students to participate in the instrument development. Because one of the foci of the

system is for rehabilitation and therapy, and we will work with the University of Pittsburgh School of Health

and Rehabilitation Sciences, we also expect participation by faculty and students with disabilities. In terms of

more general outreach to underserved populations, we will make use of ongoing efforts in the Robotics Institute

and CMU-wide. These efforts include supporting minority visits to CMU, recruiting at various conferences and

educational institutions, and providing minority fellowships. As the Robotics Institute PhD admissions chair in

2016, Atkeson led a process which resulted in 31% of acceptances going to female applicants. As a member of

the Robotics Institute faculty hiring committee in 2016, Atkeson participated in a process that led to 10 out of

18 interviewees being female. Half of the faculty hired were women. As the head of Robotics Institute hiring

in 2018, Atkeson is leading a process in which 9 out of 20 interviewees are female. Atkeson is assisting efforts

at CMU to raise money for fellowships for students who will help us in our efforts to serve diverse populations

and communities, including our own.

Dissemination Plan: For a more complete description of our dissemination plan, see our Data Management

Plan. We will maintain a public website to freely share our captured data with video material. We will present

our work at conferences and publish it in journals, and will use these vehicles to advertise our work to potential

collaborators in science and industry.

Technology Transfer: The best way to transfer technology is by having students go to industry. Three recent

students work at Boston Dynamics transferring our work in robotics to commercial applications, one recent student and a recent postdoc work on self-driving cars at Uber, one recent student works on self-driving cars at

Apple, and one recent student works on humanoid robotics at the Toyota Research Institute. An older former

student is the CTO of the Amazon drone effort. Several older former students work at Google. We are thrilled

that we and our students are part of the robotics revolution. Sheikh leads a Facebook/Oculus research lab in

Pittsburgh as well as being a professor at CMU, which is another form of technology transfer.


References

[1] A. J. Allin, C. G. Atkeson, H. Wactlar, S. Stevens, M. J. Robertson, D. Wilson, J. Zimmerman, and

A. Bharucha. Toward the automatic assessment of behavioral disturbances of dementia. In Fifth International Conference on Ubiquitous Computing (UbiComp’03), 2nd International Workshop on Ubiquitous Computing for Pervasive Healthcare Applications, 2003.

[2] Ido Arev, Hyun Soo Park, Yaser Sheikh, Jessica Hodgins, and Ariel Shamir. Automatic video-editing of

footage from multiple social cameras. In ACM SIGGRAPH, 2014.

[3] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose estimation using

part affinity fields. In CVPR, 2017.

[4] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose estimation using

part affinity fields. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[5] CMU. Traffic21. traffic21.heinz.cmu.edu. [Online; accessed Oct-16-2017].

[6] CMU-Perceptual-Computing-Lab. Openpose: A real-time multi-person keypoint detection and multi-

threading C++ library. github.com/CMU-Perceptual-Computing-Lab/openpose. [Online;

accessed Oct-16-2017].

[7] D. J. Farris and G. S. Sawicki. Human medial gastrocnemius force–velocity behavior shifts with locomo-

tion speed and gait. In Proc. Natl. Acad. Sci. USA, volume 109, pages 977–982, 2012.

[8] K. Fatahlian. LED street light research project part II: New findings. repository.cmu.edu/architecture/117. [Online; accessed Oct-16-2017].

[9] Google. Soli. atap.google.com/soli. [Online; accessed Oct-16-2017].

[10] Paulo Gotardo, Tomas Simon, Yaser Sheikh, and Iain Matthews. Photogeometric scene flow for high-detail

dynamic 3d reconstruction. In International Conference on Computer Vision (ICCV), 2015.

[11] J. K. Hodgins. CMU graphics lab motion capture database. mocap.cs.cmu.edu. [Online; accessed

Oct-16-2017].

[12] J. K. Hodgins. Grand challenge data collection. kitchen.cs.cmu.edu. [Online; accessed Oct-16-

2017].

[13] Hanbyul Joo, Hao Liu, Lei Tan, Lin Gui, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara,

and Yaser Sheikh. Panoptic studio: A massively multiview system for social motion capture. In The IEEE International Conference on Computer Vision (ICCV), 2015.

[14] Newton Labs. Cognachrome. www.newtonlabs.com/cognachrome. [Online; accessed Oct-16-

2017].

[15] Eric Markvicka. Soft-Matter Robotic Materials. PhD thesis, Carnegie Mellon University, 2018.

[16] Varun Ramakrishna, Daniel Munoz, Drew Bagnell, Martial Hebert, and Yaser Sheikh. Pose machines:

Articulated pose estimation via inference machines. In European Conference on Computer Vision (ECCV), 2014.

[17] Y. Sheikh. CMU panoptic dataset. domedb.perception.cs.cmu.edu. [Online; accessed Oct-16-

2017].


[18] Tomas Simon, Hanbyul Joo, Iain Matthews, and Yaser Sheikh. Hand keypoint detection in single images

using multiview bootstrapping. In CVPR, 2017.

[19] Tomas Simon, Hanbyul Joo, Iain Matthews, and Yaser Sheikh. Hand keypoint detection in single images

using multiview bootstrapping. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[20] Tomas Simon, Jack Valmadre, Iain Matthews, and Yaser Sheikh. Separable spatiotemporal priors for

convex reconstruction of time-varying 3d point clouds. In European Conference on Computer Vision (ECCV), 2014.

[21] Tomas Simon, Jack Valmadre, Iain Matthews, and Yaser Sheikh. Kronecker-markov prior for dynamic 3d

reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 39(11):2201–

2214, 2017.

[22] A. Steinfeld. Tiramisu transit. www.tiramisutransit.com. [Online; accessed Oct-16-2017].

[23] J. M. D. Taylor, A. S. Arnold, and J. M. Wakeling. Quantifying achilles tendon force in vivo from ultra-

sound images. Journal of Biomechanics, 49(14):3200–3208, 2016.

[24] Georgia Tech. Aware home. www.awarehome.gatech.edu. [Online; accessed Oct-16-2017].

[25] F. De La Torre. Intraface. www.humansensing.cs.cmu.edu/intraface. [Online; accessed

Oct-16-2017].

[26] Minh Vo, Srinivas Narasimhan, and Yaser Sheikh. Spatiotemporal bundle adjustment for dynamic 3d

reconstruction. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[27] Shih-En Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh. Convolutional pose machines. In

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[28] Wikipedia. Motion control photography. https://en.wikipedia.org/wiki/Motion_control_photography. [Online; accessed Feb-4-2018].

[29] A. Yamaguchi and C. G. Atkeson. Combining finger vision and optical tactile sensing: Reducing and

handling errors while cutting vegetables. In IEEE-RAS International Conference on Humanoid Robotics,

2016.

[30] A. Yamaguchi and C. G. Atkeson. Stereo vision of liquid and particle flow for robot pouring. In IEEE-RAS International Conference on Humanoid Robotics, 2016.

[31] Akihiko Yamaguchi and Christopher G. Atkeson. Implementing tactile behaviors using fingervision. In

IEEE-RAS International Conference on Humanoid Robotics, 2017.

[32] Ada Zhang, Alexander Cebulla, Stanislav Panev, Jessica Hodgins, and Fernando de la Torre. Weakly-

supervised learning for Parkinson’s disease tremor detection. In IEEE Eng. Med. Biol. Soc., pages 143–

147, 2017.


BIOGRAPHICAL SKETCH

No Bio Data Provided


KATERINA FRAGKIADAKI
Assistant Professor

Machine Learning Department, Carnegie Mellon University

5000 Forbes Avenue, Pittsburgh, PA, 15213

2675289476

[email protected], www.cs.cmu.edu/∼katef

Professional Preparation
National Technical University of Athens EECS Diploma 2007

University of Pennsylvania CIS M.S. 2011

University of Pennsylvania CIS Ph.D. 2013

EECS, UC Berkeley PostDoctoral Fellow, 2013–2015

Google Research PostDoctoral Fellow, Oct. 2015–December 2016

Appointments
Assistant Professor MLD, CMU September 2016–present

Products
Five Closely Related Products to the Proposed Project
1. Tung F., Tung W., Yumer E., Fragkiadaki K., 2017, Self-supervised Learning of Motion Capture, Neural Information Processing Systems (NIPS)
2. Tung Hsiao-Yu F., Harley A., Seto W., Fragkiadaki K., 2017, Adversarial Inverse Graphics Networks: Learning 2D-to-3D Lifting and Image-to-Image Translation from Unpaired Supervision, International Conference on Computer Vision (ICCV)
3. Vijayanarasimhan S., Ricco S., Schmid C., Sukthankar R., Fragkiadaki K., 2017, SfM-Net: Learning of Structure and Motion from Video, arXiv
4. Fragkiadaki K., Salas M., Arbelaez P., Malik J., 2014, Grouping-based Low-Rank Trajectory Completion and 3D Reconstruction, Neural Information Processing Systems (NIPS)
5. Fragkiadaki K., Levine S., Felsen P., Malik J., 2015, Recurrent Network Models for Human Dynamics, IEEE International Conference on Computer Vision (ICCV)

Five Other Products
1. Carreira J., Agrawal P., Fragkiadaki K., Malik J., 2016, Human Pose Estimation with Iterative Error Feedback, IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
2. Fragkiadaki K., Agrawal P., Levine S., Malik J., 2016, Learning Visual Predictive Models of Physics for Playing Billiards, International Conference on Learning Representations (ICLR)
3. Ying C., Fragkiadaki K., 2017, Depth-Adaptive Computational Policies for Efficient Visual Tracking, EMMCVPR
4. Fragkiadaki K., Arbelaez P., Felsen P., Malik J., 2015, Learning to Segment Moving Objects in Videos, IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
5. Felsen P., Fragkiadaki K., Malik J., Efros A., 2015, Learning Feature Hierarchies from Long Term Trajectory Associations in Videos, Transfer and Multitask Learning Workshop in NIPS

Synergistic Activities


• Creation of a new course in Fall 2017 at MLD CMU, to cover recent advances on Lan-

guage grounding. Course Title: Language grounding in Vision and Control (Undergradu-

ate+Graduate)

• Creation of a new course in Spring 2017 at MLD CMU, to cover recent advances on Deep

Reinforcement Learning and Deep Robotic Learning. Course Title: Deep Reinforcement

Learning and Control (Undergraduate+Graduate)

• Area Chair for CVPR 2018

• Organizer: The 11th Perceptual Organization for Computer Vision Workshop, CVPR 2016:

“The role of feedback in Recognition and Segmentation”, a workshop that brought together human and computer vision scientists to investigate the incorporation of feedback in visual architectures.

• Best Ph.D. Thesis, Computer and Information Science Department, University of Pennsyl-

vania, 2013.


SUMMARY PROPOSAL BUDGET
Funds Requested By Proposer / Funds Granted by NSF (if different)
Date Checked / Date Of Rate Sheet / Initials - ORG
NSF Funded Person-months
fm1030rs-07
FOR NSF USE ONLY: ORGANIZATION / PROPOSAL NO. / DURATION (months)

Proposed Granted

PRINCIPAL INVESTIGATOR / PROJECT DIRECTOR AWARD NO.

A. SENIOR PERSONNEL: PI/PD, Co-PI’s, Faculty and Other Senior Associates (List each separately with title, A.7. show number in brackets) CAL ACAD SUMR

1.

2.

3.

4.

5.

6. ( ) OTHERS (LIST INDIVIDUALLY ON BUDGET JUSTIFICATION PAGE)

7. ( ) TOTAL SENIOR PERSONNEL (1 - 6)

B. OTHER PERSONNEL (SHOW NUMBERS IN BRACKETS)

1. ( ) POST DOCTORAL SCHOLARS

2. ( ) OTHER PROFESSIONALS (TECHNICIAN, PROGRAMMER, ETC.)

3. ( ) GRADUATE STUDENTS

4. ( ) UNDERGRADUATE STUDENTS

5. ( ) SECRETARIAL - CLERICAL (IF CHARGED DIRECTLY)

6. ( ) OTHER

TOTAL SALARIES AND WAGES (A + B)

C. FRINGE BENEFITS (IF CHARGED AS DIRECT COSTS)

TOTAL SALARIES, WAGES AND FRINGE BENEFITS (A + B + C)

D. EQUIPMENT (LIST ITEM AND DOLLAR AMOUNT FOR EACH ITEM EXCEEDING $5,000.)

TOTAL EQUIPMENT

E. TRAVEL 1. DOMESTIC (INCL. U.S. POSSESSIONS)

2. INTERNATIONAL

F. PARTICIPANT SUPPORT COSTS

1. STIPENDS $

2. TRAVEL

3. SUBSISTENCE

4. OTHER

TOTAL NUMBER OF PARTICIPANTS ( ) TOTAL PARTICIPANT COSTS

G. OTHER DIRECT COSTS

1. MATERIALS AND SUPPLIES

2. PUBLICATION COSTS/DOCUMENTATION/DISSEMINATION

3. CONSULTANT SERVICES

4. COMPUTER SERVICES

5. SUBAWARDS

6. OTHER

TOTAL OTHER DIRECT COSTS

H. TOTAL DIRECT COSTS (A THROUGH G)

I. INDIRECT COSTS (F&A)(SPECIFY RATE AND BASE)

TOTAL INDIRECT COSTS (F&A)

J. TOTAL DIRECT AND INDIRECT COSTS (H + I)

K. SMALL BUSINESS FEE

L. AMOUNT OF THIS REQUEST (J) OR (J MINUS K)

M. COST SHARING PROPOSED LEVEL $ AGREED LEVEL IF DIFFERENT $

PI/PD NAME / FOR NSF USE ONLY: INDIRECT COST RATE VERIFICATION

ORG. REP. NAME*

*ELECTRONIC SIGNATURES REQUIRED FOR REVISED BUDGET

YEAR 1
ORGANIZATION: Carnegie-Mellon University
PI/PD NAME: Christopher Atkeson
A–C. Salaries, wages, and fringe benefits requested: $0
D. Equipment: 5 computers $60,000; 2 mobile bases $80,000; 1 network switch $8,000; Fabricated Equipment (mobile capture system) $200,000; TOTAL EQUIPMENT $348,000
E–F. Travel and participant support costs: $0
G. Other direct costs (6. Other: maintenance): $1,000; TOTAL OTHER DIRECT COSTS $1,000
H. TOTAL DIRECT COSTS: $349,000
I. Indirect costs (Modified Total Direct Costs base $1,000 at 58.10%): $581
J/L. TOTAL DIRECT AND INDIRECT COSTS / AMOUNT OF THIS REQUEST: $349,581
M. Cost sharing proposed level: $200,000


SUMMARY PROPOSAL BUDGET
Funds Requested By Proposer / Funds Granted by NSF (if different)
Date Checked / Date Of Rate Sheet / Initials - ORG
NSF Funded Person-months
fm1030rs-07
FOR NSF USE ONLY: ORGANIZATION / PROPOSAL NO. / DURATION (months)

Proposed Granted

PRINCIPAL INVESTIGATOR / PROJECT DIRECTOR AWARD NO.

A. SENIOR PERSONNEL: PI/PD, Co-PI’s, Faculty and Other Senior Associates (List each separately with title, A.7. show number in brackets) CAL ACAD SUMR

1.

2.

3.

4.

5.

6. ( ) OTHERS (LIST INDIVIDUALLY ON BUDGET JUSTIFICATION PAGE)

7. ( ) TOTAL SENIOR PERSONNEL (1 - 6)

B. OTHER PERSONNEL (SHOW NUMBERS IN BRACKETS)

1. ( ) POST DOCTORAL SCHOLARS

2. ( ) OTHER PROFESSIONALS (TECHNICIAN, PROGRAMMER, ETC.)

3. ( ) GRADUATE STUDENTS

4. ( ) UNDERGRADUATE STUDENTS

5. ( ) SECRETARIAL - CLERICAL (IF CHARGED DIRECTLY)

6. ( ) OTHER

TOTAL SALARIES AND WAGES (A + B)

C. FRINGE BENEFITS (IF CHARGED AS DIRECT COSTS)

TOTAL SALARIES, WAGES AND FRINGE BENEFITS (A + B + C)

D. EQUIPMENT (LIST ITEM AND DOLLAR AMOUNT FOR EACH ITEM EXCEEDING $5,000.)

TOTAL EQUIPMENT

E. TRAVEL 1. DOMESTIC (INCL. U.S. POSSESSIONS)

2. INTERNATIONAL

F. PARTICIPANT SUPPORT COSTS

1. STIPENDS $

2. TRAVEL

3. SUBSISTENCE

4. OTHER

TOTAL NUMBER OF PARTICIPANTS ( ) TOTAL PARTICIPANT COSTS

G. OTHER DIRECT COSTS

1. MATERIALS AND SUPPLIES

2. PUBLICATION COSTS/DOCUMENTATION/DISSEMINATION

3. CONSULTANT SERVICES

4. COMPUTER SERVICES

5. SUBAWARDS

6. OTHER

TOTAL OTHER DIRECT COSTS

H. TOTAL DIRECT COSTS (A THROUGH G)

I. INDIRECT COSTS (F&A)(SPECIFY RATE AND BASE)

TOTAL INDIRECT COSTS (F&A)

J. TOTAL DIRECT AND INDIRECT COSTS (H + I)

K. SMALL BUSINESS FEE

L. AMOUNT OF THIS REQUEST (J) OR (J MINUS K)

M. COST SHARING PROPOSED LEVEL $ AGREED LEVEL IF DIFFERENT $

PI/PD NAME / FOR NSF USE ONLY: INDIRECT COST RATE VERIFICATION

ORG. REP. NAME*

*ELECTRONIC SIGNATURES REQUIRED FOR REVISED BUDGET

YEAR 2
ORGANIZATION: Carnegie-Mellon University
PI/PD NAME: Christopher Atkeson
A–C. Salaries, wages, and fringe benefits requested: $0
D. Equipment: 15 computers $180,000; 8 mobile bases $320,000; Fabricated Equipment (mobile capture system) $300,000; TOTAL EQUIPMENT $800,000
E–F. Travel and participant support costs: $0
G. Other direct costs (6. Other: maintenance): $2,000; TOTAL OTHER DIRECT COSTS $2,000
H. TOTAL DIRECT COSTS: $802,000
I. Indirect costs (Modified Total Direct Costs base $2,000 at 58.10%): $1,162
J/L. TOTAL DIRECT AND INDIRECT COSTS / AMOUNT OF THIS REQUEST: $803,162
M. Cost sharing proposed level: $200,000


SUMMARY PROPOSAL BUDGET
Funds Requested By Proposer / Funds Granted by NSF (if different)
Date Checked / Date Of Rate Sheet / Initials - ORG
NSF Funded Person-months
fm1030rs-07
FOR NSF USE ONLY: ORGANIZATION / PROPOSAL NO. / DURATION (months)

Proposed Granted

PRINCIPAL INVESTIGATOR / PROJECT DIRECTOR AWARD NO.

A. SENIOR PERSONNEL: PI/PD, Co-PI’s, Faculty and Other Senior Associates (List each separately with title, A.7. show number in brackets) CAL ACAD SUMR

1.

2.

3.

4.

5.

6. ( ) OTHERS (LIST INDIVIDUALLY ON BUDGET JUSTIFICATION PAGE)

7. ( ) TOTAL SENIOR PERSONNEL (1 - 6)

B. OTHER PERSONNEL (SHOW NUMBERS IN BRACKETS)

1. ( ) POST DOCTORAL SCHOLARS

2. ( ) OTHER PROFESSIONALS (TECHNICIAN, PROGRAMMER, ETC.)

3. ( ) GRADUATE STUDENTS

4. ( ) UNDERGRADUATE STUDENTS

5. ( ) SECRETARIAL - CLERICAL (IF CHARGED DIRECTLY)

6. ( ) OTHER

TOTAL SALARIES AND WAGES (A + B)

C. FRINGE BENEFITS (IF CHARGED AS DIRECT COSTS)

TOTAL SALARIES, WAGES AND FRINGE BENEFITS (A + B + C)

D. EQUIPMENT (LIST ITEM AND DOLLAR AMOUNT FOR EACH ITEM EXCEEDING $5,000.)

TOTAL EQUIPMENT

E. TRAVEL 1. DOMESTIC (INCL. U.S. POSSESSIONS)

2. INTERNATIONAL

F. PARTICIPANT SUPPORT COSTS

1. STIPENDS $

2. TRAVEL

3. SUBSISTENCE

4. OTHER

TOTAL NUMBER OF PARTICIPANTS ( ) TOTAL PARTICIPANT COSTS

G. OTHER DIRECT COSTS

1. MATERIALS AND SUPPLIES

2. PUBLICATION COSTS/DOCUMENTATION/DISSEMINATION

3. CONSULTANT SERVICES

4. COMPUTER SERVICES

5. SUBAWARDS

6. OTHER

TOTAL OTHER DIRECT COSTS

H. TOTAL DIRECT COSTS (A THROUGH G)

I. INDIRECT COSTS (F&A)(SPECIFY RATE AND BASE)

TOTAL INDIRECT COSTS (F&A)

J. TOTAL DIRECT AND INDIRECT COSTS (H + I)

K. SMALL BUSINESS FEE

L. AMOUNT OF THIS REQUEST (J) OR (J MINUS K)

M. COST SHARING PROPOSED LEVEL $ AGREED LEVEL IF DIFFERENT $

PI/PD NAME / FOR NSF USE ONLY: INDIRECT COST RATE VERIFICATION

ORG. REP. NAME*

*ELECTRONIC SIGNATURES REQUIRED FOR REVISED BUDGET

YEAR 3
ORGANIZATION: Carnegie-Mellon University
PI/PD NAME: Christopher Atkeson
A–C. Salaries, wages, and fringe benefits requested: $0
D. Equipment: Fabricated Equipment (mobile capture system) $382,303; TOTAL EQUIPMENT $382,303
E–F. Travel and participant support costs: $0
G. Other direct costs (6. Other: maintenance): $10,000; TOTAL OTHER DIRECT COSTS $10,000
H. TOTAL DIRECT COSTS: $392,303
I. Indirect costs (Modified Total Direct Costs base $10,000 at 58.10%): $5,810
J/L. TOTAL DIRECT AND INDIRECT COSTS / AMOUNT OF THIS REQUEST: $398,113
M. Cost sharing proposed level: $200,000


SUMMARY PROPOSAL BUDGET
Funds Requested By Proposer / Funds Granted by NSF (if different)
Date Checked / Date Of Rate Sheet / Initials - ORG
NSF Funded Person-months
fm1030rs-07
FOR NSF USE ONLY: ORGANIZATION / PROPOSAL NO. / DURATION (months)

Proposed Granted

PRINCIPAL INVESTIGATOR / PROJECT DIRECTOR AWARD NO.

A. SENIOR PERSONNEL: PI/PD, Co-PI’s, Faculty and Other Senior Associates (List each separately with title, A.7. show number in brackets) CAL ACAD SUMR

1.

2.

3.

4.

5.

6. ( ) OTHERS (LIST INDIVIDUALLY ON BUDGET JUSTIFICATION PAGE)

7. ( ) TOTAL SENIOR PERSONNEL (1 - 6)

B. OTHER PERSONNEL (SHOW NUMBERS IN BRACKETS)

1. ( ) POST DOCTORAL SCHOLARS

2. ( ) OTHER PROFESSIONALS (TECHNICIAN, PROGRAMMER, ETC.)

3. ( ) GRADUATE STUDENTS

4. ( ) UNDERGRADUATE STUDENTS

5. ( ) SECRETARIAL - CLERICAL (IF CHARGED DIRECTLY)

6. ( ) OTHER

TOTAL SALARIES AND WAGES (A + B)

C. FRINGE BENEFITS (IF CHARGED AS DIRECT COSTS)

TOTAL SALARIES, WAGES AND FRINGE BENEFITS (A + B + C)

D. EQUIPMENT (LIST ITEM AND DOLLAR AMOUNT FOR EACH ITEM EXCEEDING $5,000.)

TOTAL EQUIPMENT

E. TRAVEL 1. DOMESTIC (INCL. U.S. POSSESSIONS)

2. INTERNATIONAL

F. PARTICIPANT SUPPORT COSTS

1. STIPENDS $

2. TRAVEL

3. SUBSISTENCE

4. OTHER

TOTAL NUMBER OF PARTICIPANTS ( ) TOTAL PARTICIPANT COSTS

G. OTHER DIRECT COSTS

1. MATERIALS AND SUPPLIES

2. PUBLICATION COSTS/DOCUMENTATION/DISSEMINATION

3. CONSULTANT SERVICES

4. COMPUTER SERVICES

5. SUBAWARDS

6. OTHER

TOTAL OTHER DIRECT COSTS

H. TOTAL DIRECT COSTS (A THROUGH G)

I. INDIRECT COSTS (F&A)(SPECIFY RATE AND BASE)

TOTAL INDIRECT COSTS (F&A)

J. TOTAL DIRECT AND INDIRECT COSTS (H + I)

K. SMALL BUSINESS FEE

L. AMOUNT OF THIS REQUEST (J) OR (J MINUS K)

M. COST SHARING PROPOSED LEVEL $ AGREED LEVEL IF DIFFERENT $

PI/PD NAME / FOR NSF USE ONLY: INDIRECT COST RATE VERIFICATION

ORG. REP. NAME*

*ELECTRONIC SIGNATURES REQUIRED FOR REVISED BUDGET

YEAR 4
ORGANIZATION: Carnegie-Mellon University
PI/PD NAME: Christopher Atkeson
A–C. Salaries, wages, and fringe benefits requested: $0
D. Equipment: Fabricated Equipment (mobile capture system) $300,000; TOTAL EQUIPMENT $300,000
E–F. Travel and participant support costs: $0
G. Other direct costs (6. Other: maintenance): $10,000; TOTAL OTHER DIRECT COSTS $10,000
H. TOTAL DIRECT COSTS: $310,000
I. Indirect costs (Modified Total Direct Costs base $10,000 at 58.10%): $5,810
J/L. TOTAL DIRECT AND INDIRECT COSTS / AMOUNT OF THIS REQUEST: $315,810
M. Cost sharing proposed level: $200,000


SUMMARY PROPOSAL BUDGET
Funds Requested By Proposer / Funds Granted by NSF (if different)
Date Checked / Date Of Rate Sheet / Initials - ORG
NSF Funded Person-months
fm1030rs-07
FOR NSF USE ONLY: ORGANIZATION / PROPOSAL NO. / DURATION (months)

Proposed Granted

PRINCIPAL INVESTIGATOR / PROJECT DIRECTOR AWARD NO.

A. SENIOR PERSONNEL: PI/PD, Co-PI’s, Faculty and Other Senior Associates (List each separately with title, A.7. show number in brackets) CAL ACAD SUMR

1.

2.

3.

4.

5.

6. ( ) OTHERS (LIST INDIVIDUALLY ON BUDGET JUSTIFICATION PAGE)

7. ( ) TOTAL SENIOR PERSONNEL (1 - 6)

B. OTHER PERSONNEL (SHOW NUMBERS IN BRACKETS)

1. ( ) POST DOCTORAL SCHOLARS

2. ( ) OTHER PROFESSIONALS (TECHNICIAN, PROGRAMMER, ETC.)

3. ( ) GRADUATE STUDENTS

4. ( ) UNDERGRADUATE STUDENTS

5. ( ) SECRETARIAL - CLERICAL (IF CHARGED DIRECTLY)

6. ( ) OTHER

TOTAL SALARIES AND WAGES (A + B)

C. FRINGE BENEFITS (IF CHARGED AS DIRECT COSTS)

TOTAL SALARIES, WAGES AND FRINGE BENEFITS (A + B + C)

D. EQUIPMENT (LIST ITEM AND DOLLAR AMOUNT FOR EACH ITEM EXCEEDING $5,000.)

TOTAL EQUIPMENT

E. TRAVEL 1. DOMESTIC (INCL. U.S. POSSESSIONS)

2. INTERNATIONAL

F. PARTICIPANT SUPPORT COSTS

1. STIPENDS $

2. TRAVEL

3. SUBSISTENCE

4. OTHER

TOTAL NUMBER OF PARTICIPANTS ( ) TOTAL PARTICIPANT COSTS

G. OTHER DIRECT COSTS

1. MATERIALS AND SUPPLIES

2. PUBLICATION COSTS/DOCUMENTATION/DISSEMINATION

3. CONSULTANT SERVICES

4. COMPUTER SERVICES

5. SUBAWARDS

6. OTHER

TOTAL OTHER DIRECT COSTS

H. TOTAL DIRECT COSTS (A THROUGH G)

I. INDIRECT COSTS (F&A)(SPECIFY RATE AND BASE)

TOTAL INDIRECT COSTS (F&A)

J. TOTAL DIRECT AND INDIRECT COSTS (H + I)

K. SMALL BUSINESS FEE

L. AMOUNT OF THIS REQUEST (J) OR (J MINUS K)

M. COST SHARING PROPOSED LEVEL $ AGREED LEVEL IF DIFFERENT $

PI/PD NAME / FOR NSF USE ONLY: INDIRECT COST RATE VERIFICATION

ORG. REP. NAME*

*ELECTRONIC SIGNATURES REQUIRED FOR REVISED BUDGET

CUMULATIVE
ORGANIZATION: Carnegie-Mellon University
PI/PD NAME: Christopher Atkeson
A–C. Salaries, wages, and fringe benefits requested: $0
D. TOTAL EQUIPMENT: $1,830,303
E–F. Travel and participant support costs: $0
G. TOTAL OTHER DIRECT COSTS (maintenance): $23,000
H. TOTAL DIRECT COSTS: $1,853,303
I. TOTAL INDIRECT COSTS (F&A): $13,363
J/L. TOTAL DIRECT AND INDIRECT COSTS / AMOUNT OF THIS REQUEST: $1,866,666
M. Cost sharing proposed level: $800,000


Budget Justification
This proposal is to fabricate equipment, a mobile behavioral capture system, with a total project cost of

$2,666,666, over 4 years. CMU is providing cost sharing of $200,000 in graduate student support per year,

totalling $800,000. The source of the cost sharing is the Dean’s office of the School of Computer Science of

Carnegie Mellon University.

To prepare this budget, given that in the first year we will perfect the design of the fabricated equipment

using the latest technology, we have used prices of relevant currently available technology. We provide quotes

for individual components that cost more than $5000 as supplementary documents.

In the first year we expect to purchase five computers at approximately $12,000 each (quote from Exxact).

These computers are each equipped with four state-of-the-art GPUs. We expect the computers and GPUs

we will actually purchase a year from now will cost about the same but be even more powerful. To network

the above computers we will buy an Infiniband network switch for approximately $8000 (quote from Dell).

We also plan to purchase two mobile bases which we estimate will cost approximately $40,000 each (quote

from Segway). This mobile base is one of the few omnidirectional bases we have found that are fast enough

to keep up with human walking (1.3m/s) and strong enough to carry up to two of the above computers and 8

MMCams. A year from now we will again survey available mobile bases. We have estimated the cost

of MMCams by pricing visible light cameras (a cluster of 6 cameras attached to an NVIDIA board with a TX2

Jetson GPU: Leopard Imaging LI-JETSON-KIT-IMX477CS-X $1600) and time of flight depth cameras (Basler

tof640-20gm 850nm $2340). There are additional costs for synchronization hardware and other wiring. We

have based this cost estimate on costs we saw building the Panoptic Studio.

In the first year we will also begin to prototype contact, force, and physiological sensors. The costs of

individual components are relatively cheap (in the hundreds of dollars: consumer level thermal, ultrasound, and

radar imaging sensors are typically $200-300). We have based the total costs for this year on our historical

costs for this type of development.

We have also included $1000 in maintenance costs, which also pay for installing some computers in an

appropriately cooled computer room.

In year 2 we will build the full measurement-at-a-distance system, adding 15 computers at $12,000 (quote from Exxact) each and 8 mobile bases at $40,000 (quote from Segway) each. Additional funds are requested for

64 MMCams, and continuing development of contact, force, and physiological sensors. There will be additional

maintenance costs.

In years 3 and 4 we focus on building out the full system, developing ground truth testing equipment, and

fixing any design flaws. We expect to develop custom electronics for the contact, force, and physiological

sensors. At this point we start to pay for a full year of maintenance for the full system. For fabricating equipment

with low cost components, we have based our estimates on our historical costs for developing this type of

equipment.

Equipment

Line D reflects the equipment needed for the project.

Other Direct Costs

Line G6 reflects the costs associated with maintenance of the equipment for the project.

Indirect Costs

Indirect Costs on this proposal have been calculated at our current proposed or negotiated rate for all fiscal years

in accordance with the OMB Uniform Guidance on Cost Principles, Audit, and Administrative Requirements


for Federal Awards. The modified total direct cost base (MTDC) amount used in calculating the indirect costs

is the total direct costs, excluding capital equipment, charges for tuition remission, and participant support.

Overhead Rate: 58.10% (capped rate for grants and cooperative agreements)

Requested Table
ITEM                    YEAR 1             YEAR 2             YEAR 3             YEAR 4             TOTAL
                        NSF      Cost      NSF      Cost      NSF      Cost      NSF      Cost      NSF        Cost
                        Request  Sharing   Request  Sharing   Request  Sharing   Request  Sharing   Request    Sharing
1 computers             60,000   0         180,000  0         0        0         0        0         240,000    0
2 mobile bases          80,000   0         320,000  0         0        0         0        0         400,000    0
3 network switch        8,000    0         0        0         0        0         0        0         8,000      0
4 fabricated equipment  200,000  0         300,000  0         382,303  0         300,000  0         1,182,303  0
5 maintenance           1,000    0         2,000    0         10,000   0         10,000   0         23,000     0
6 other (indirect)      581      0         1,162    0         5,810    0         5,810    0         13,363     0
7 graduate students     0        200,000   0        200,000   0        200,000   0        200,000   0          800,000
TOTAL                   349,581  200,000   803,162  200,000   398,113  200,000   315,810  200,000   1,866,666  800,000
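The following short check (an illustrative calculation we include here, not part of the submitted budget forms) reproduces the yearly request totals from the line items above by applying the 58.10% rate to the maintenance-only MTDC base:

# Arithmetic check of the requested amounts: equipment is excluded from the MTDC
# base, so indirect costs apply only to the maintenance line (58.10% capped rate).
equipment = {  # NSF request, by year
    "computers":            [60000, 180000, 0, 0],
    "mobile bases":         [80000, 320000, 0, 0],
    "network switch":       [8000, 0, 0, 0],
    "fabricated equipment": [200000, 300000, 382303, 300000],
}
maintenance = [1000, 2000, 10000, 10000]
rate = 0.581

for y in range(4):
    equip = sum(item[y] for item in equipment.values())
    indirect = round(rate * maintenance[y])
    total = equip + maintenance[y] + indirect
    print(f"Year {y+1}: equipment {equip:,}, maintenance {maintenance[y]:,}, "
          f"indirect {indirect:,}, request {total:,}")
# Year totals: 349,581 / 803,162 / 398,113 / 315,810; grand total 1,866,666,
# plus $200,000/year ($800,000) in CMU cost sharing for graduate students.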


Current and Pending Support (See GPG Section II.D.8 for guidance on information to include on this form.)

The following information should be provided for each investigator and other senior personnel. Failure to provide this information may delay consideration of this proposal. Other agencies (including NSF) to which this proposal has been/will be submitted.

Investigator: Christopher Atkeson

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: RI: Medium: Combining Optimal and Neuromuscular Controllers for Agile and Robust Humanoid Behavior
Source of Support: National Science Foundation
Total Award Amount: $1,000,000
Total Award Period Covered: 08/01/16 – 07/31/19
Location of Project: Carnegie Mellon University
Person-Months Per Year Committed to the Project: Cal: Acad: Sumr: 1.0

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: NSF: INT: Individualized Co-Robotics
Source of Support: National Science Foundation
Total Award Amount: $1,500,000
Total Award Period Covered: 09/01/17 – 08/31/20
Location of Project: Carnegie Mellon University
Person-Months Per Year Committed to the Project: Cal: Acad: Sumr: 1.0

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: MRI: Development of a Mobile Human Behavior Capture System (this proposal)
Source of Support: National Science Foundation
Total Award Amount: $1,866,666
Total Award Period Covered: 09/01/18 – 08/31/22
Location of Project: Carnegie Mellon University (NO EFFORT)
Person-Months Per Year Committed to the Project: Cal: Acad: Sumr:

*If this project has previously been funded by another agency, please list and furnish information for immediately pre-ceding funding period. NSF Form 1239 USE ADDITIONAL SHEETS AS NECESSARY


CURRENT AND PENDING SUPPORT
Investigator: Katerina Fragkiadaki
Other agencies including NSF to which this proposal has been/will be submitted: None

Support: Pending
Title: RI: Small: Large Scale Imitation Learning and Language Understanding from Narrated Videos
Source of Support: National Science Foundation
Total Award Amount: $499,835
Period of Performance: 9/1/2018 to 8/31/2021
Location of Project: Carnegie Mellon University
Number of Person-Months: 1.2 SU

Support: Pending
Title: MRI: Development of a Mobile Human Behavior Capture System
Source of Support: National Science Foundation
Total Award Amount: $1,866,666
Period of Performance: 9/1/2018 to 8/31/2022
Location of Project: Carnegie Mellon University
Number of Person-Months: N/A


Current and Pending Support (See GPG Section II.D.8 for guidance on information to include on this form.)

The following information should be provided for each investigator and other senior personnel. Failure to provide this information may delay consideration of this proposal. Other agencies (including NSF) to which this proposal has been/will be submitted:

Investigator: Jessica Hodgins

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: SCH: EXP: Monitoring Motor Symptoms in Parkinson's Disease with Wearable Devices
Source of Support: National Science Foundation
Total Award Amount: $678,850
Total Award Period Covered: 09/01/16 – 08/31/19
Location of Project: Carnegie Mellon University
Person-Months Per Year Committed to the Project: Cal: Acad: Sumr: 1.00

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: CMLH: In-Home Movement Therapy Data Collection
Source of Support: UPMC
Total Award Amount: $300,000
Total Award Period Covered: 06/01/17 – 05/31/18
Location of Project: Carnegie Mellon University
Person-Months Per Year Committed to the Project: Cal: Acad: 1.25 Sumr: 0.40

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: Affective State Estimation From Wearable Sensors
Source of Support: Sony Corporation
Total Award Amount: $155,689
Total Award Period Covered: 08/21/17 – 03/31/18
Location of Project: Carnegie Mellon University
Person-Months Per Year Committed to the Project: Cal: Acad: 0.75 Sumr: 0.25

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: MRI: Development of a Mobile Human Behavior Capture System (this proposal)
Source of Support: National Science Foundation
Total Award Amount: $1,866,666
Total Award Period Covered: 09/01/18 – 08/31/22
Location of Project: Carnegie Mellon University (NO EFFORT)
Person-Months Per Year Committed to the Project: Cal: Acad: Sumr:

*If this project has previously been funded by another agency, please list and furnish information for immediately preceding funding period. NSF Form 1239 USE ADDITIONAL SHEETS AS NECESSARY


Current and Pending Support (See GPG Section II.D.8 for guidance on information to include on this form.)

Please note that Dr. Yaser Sheikh is a Research Professor, and not tenure track faculty with teaching responsibilities. As such, his effort is supported solely by research funding. Other agencies (including NSF) to which this proposal has been/will be submitted:

Investigator: Yaser Sheikh

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: Data-driven 3D Event Browsing from Multiple Mobile Cameras
Source of Support: Office of Naval Research
Total Award Amount: $498,499
Total Award Period Covered: 05/01/2015 – 04/30/2018
Location of Project: Carnegie Mellon University
Person-Months Per Year Committed to the Project: Cal: Acad: Sumr: 0.69

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: Robust Automatic Activity Detection for a Multi-Camera Streaming Video Environment
Source of Support: DoI/IARPA
Total Award Amount: $9,972,762
Total Award Period Covered: 09/20/2017 – 09/19/2021
Location of Project: Carnegie Mellon University
Person-Months Per Year Committed to the Project: Cal: Acad: 2.25 Sumr: 0.75

Support: Current / Pending / Submission Planned in Near Future / *Transfer of Support
Project/Proposal Title: MRI: Development of a Mobile Human Behavior Capture System (this proposal)
Source of Support: National Science Foundation
Total Award Amount: $1,866,666
Total Award Period Covered: 09/01/2018 – 08/31/2022
Location of Project: Carnegie Mellon University (NO EFFORT)
Person-Months Per Year Committed to the Project: Cal: Acad: Sumr:

*If this project has previously been funded by another agency, please list and furnish information for immediately preceding funding period. NSF Form 1239 USE ADDITIONAL SHEETS AS NECESSARY


CMU Facilities

Motion Capture Lab

The 1700 square foot Motion Capture Lab provides a resource for behavior capture of humans as well as for measuring and controlling robot behavior in real time. It includes a Vicon optical motion capture system with sixteen 200 Hz, 4-megapixel cameras (MX-40). In addition to traditional motion capture, the Vicon system can be used in real time to track robot motion, providing the equivalent of very high quality inertial feedback. Beyond capturing motion, we have instrumentation to capture contact forces at the hands and feet (one force gauge (IMADA DPS-44), one ATI Industrial Automation Mini85 wrist force-torque sensor, and two AMTI AccuSway PLUS force plates that measure the six-axis contact force and torque at a rate of 1 kHz), as well as electromyographic activity (EMG, a measure of muscle activation, via an Aurion ZeroWire wireless system with 16 pairs of electrodes sampled at 5 kHz). A high-speed video camera is also used to capture skin deformation at 1 kHz. This capture of forces and muscle activation takes behavior capture beyond traditional motion capture.

The Humanoid Robotics Lab (located in the Motion Capture Lab described above) provides a state-of-the-art full-sized humanoid for research and education. We have developed a hydraulic humanoid in collaboration with Sarcos (with NSF equipment funding). We use this robot because of the speed, power, and achievable joint compliance of the hydraulics, and the range of motion of the joints.

Panoptic Studio

The Panoptic Studio is a multiview capture system with 521 heterogeneous sensors, consisting of 480 VGA cameras, 31 HD cameras, and 10 Kinect v2 RGB+D sensors, distributed over the surface of a geodesic sphere with a 5.49 m diameter (Figure 1). The large number of lower-resolution VGA cameras at unique viewpoints provides a large capture volume with robustness against occlusions, and places no restriction on the view direction of the subjects. The HD views provide detail (zoom) of the scene. Multiple Kinects provide initial point clouds used to generate a dense trajectory stream.

The structure consists of pentagonal panels, hexagonal panels, and trimmed base panels. Our design is modularized so that each hexagonal panel houses a set of 24 VGA cameras. The HD cameras are installed at the center of each hexagonal panel, and projectors are installed at the center of each pentagonal panel. Additionally, a total of 10 Kinect v2 RGB+D sensors are mounted at heights of 1 and 2.6 meters, forming two rings with 5 evenly spaced sensors each.

Figure 1: Panoptic Studio layout. (Top Row) The exterior of the dome with the equipment mounted on the surface. (Bottom Row) The interior of the dome. VGA cameras are shown as red circles, HD cameras as blue circles, Kinects as cyan rectangles, and projectors as green rectangles.

System Architecture: Figure 2 shows the architecture of our system. The 480 cameras are arranged modularly, with 24 cameras in each of 20 standard hexagonal panels on the dome. Each module is managed by a Distributed Module Controller (DMC) that triggers all cameras in the module, receives data from them, and consolidates the video for transmission to the local machine. Each individual camera is a global-shutter CMOS sensor with a fixed focal length of 4.5 mm that captures VGA (640×480) resolution images at 25 Hz. Each panel produces an uncompressed video stream at 1.47 Gbps, so for the entire set of 480 cameras the data rate is approximately 29.4 Gbps (a back-of-the-envelope check of these figures follows below). To handle this stream, the system pipeline has been designed with a modularized communication and control structure. For each subsystem, the clock generator sends a frame counter, trigger signal, and pixel clock signal to each DMC associated with a panel. The DMC uses this timing information to initiate and synchronize capture of all cameras within the module. Upon trigger and exposure, each of the 24 camera heads transfers image data back via the camera interconnect to the DMC, which consolidates the image data and timing from all cameras. This composite data is then transferred via optical interconnect to the module node, where it is stored locally. Each module node has a dual purpose: it serves as a distributed RAID storage unit and participates as a multicore computational node in a cluster. Each module has 3 HDDs integrated as RAID-0 to provide sufficient write speed without data loss, totaling 60 HDDs for the 20 modules. All the local nodes of our system are on a local network behind a gigabit switch. Acquisition is controlled via a master node that the system operator uses to control all functions of the studio. Similar to the VGA cameras, the HD cameras are modularized: each pair of cameras is connected to a local node machine via SDI cables, and each local node saves the data from its two cameras to two separate RAID storage units. Each RGB+D sensor is connected to a dedicated capture node mounted on the dome exterior. To capture at rates of approximately 30 Hz, these nodes are equipped with two SSD drives each and store color, depth, and infrared frames as well as body and face detections from the Kinect SDK. A separate master node controls and coordinates the 10 capture nodes via the local network.
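As a sanity check on the bandwidth figures quoted above, the per-panel and aggregate data rates can be reproduced with simple arithmetic. The minimal sketch below assumes 8-bit raw pixel data per camera; that byte depth is an assumption made for illustration, not a specification of the actual sensor output format.

    # Back-of-the-envelope data-rate check for the VGA camera modules.
    # Assumption (not specified above): each camera streams 8-bit raw pixels.

    WIDTH, HEIGHT = 640, 480          # VGA resolution
    FPS = 25                          # VGA capture rate (Hz)
    BYTES_PER_PIXEL = 1               # assumed 8-bit raw data
    CAMERAS_PER_PANEL = 24
    NUM_PANELS = 20

    bits_per_camera = WIDTH * HEIGHT * BYTES_PER_PIXEL * 8 * FPS
    panel_gbps = bits_per_camera * CAMERAS_PER_PANEL / 1e9
    total_gbps = panel_gbps * NUM_PANELS

    print(f"per-panel: {panel_gbps:.2f} Gbps")   # ~1.47 Gbps
    print(f"total:     {total_gbps:.2f} Gbps")   # ~29.5 Gbps

Under this assumption the calculation reproduces the 1.47 Gbps per-panel figure and is consistent with the quoted aggregate of roughly 29.4 Gbps.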

Figure 2: Modularized system architecture. The studio houses 480 VGA cameras synchronized to a central clock system and controlled by a master node. 31 synchronized HD cameras are also installed with another clock system. The VGA clock and HD clock are temporally aligned by recording them as a stereo signal. 10 RGB-D sensors are also located in the studio. All the sensors are calibrated to the same coordinate system.

Temporal Calibration for Heterogeneous Sensors: Synchronizing the cameras is necessary to use geometric constraints (such as triangulation) across multiple views. In our system, we use hardware clocks to trigger cameras at the same time. Because the frame rates of the VGA and HD cameras are different (25 fps and 29.97 fps, respectively), we use two separate hardware clocks to achieve shutter-level synchronization among all VGA cameras, and independently among all HD cameras. To precisely align the two time references, we record the timecode signals generated by the two clocks as a single stereo audio signal, which we then decode to obtain a precise alignment at sub-millisecond accuracy.

Time alignment with the Kinect v2 streams (RGB and depth) is achieved with a small hardware modification: each Kinect's microphone array is rewired to instead record an LTC timecode signal. This timecode signal is the same one produced by the genlock and timecode generator used to synchronize the HD cameras, and is distributed to each Kinect via a distribution amplifier. We process the Kinect audio to decode the LTC timecode, yielding temporal alignment between the recorded Kinect data (which is timestamped by the capture API for accurate relative timing between color, depth, and audio frames) and the HD video frames. Empirically, we have confirmed the temporal alignment obtained by this method to be of at least millisecond accuracy.
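As an illustration of how decoded timecodes can be used downstream, the following minimal sketch maps frames from one clock domain onto the temporally nearest frames of another, given per-frame timestamps in seconds. The function names, and the assumption that timecodes have already been decoded to seconds, are ours for illustration; the actual decoding and alignment tooling used with the studio is not described here.

    # Sketch: map frames from two clock domains onto a common timeline
    # using decoded timecode values. Assumes timestamps are in seconds and
    # that each list of frame times is sorted (monotonic timecodes).

    from bisect import bisect_left

    def nearest_frame(target_time, frame_times):
        """Return the index of the frame whose timestamp is closest to target_time."""
        i = bisect_left(frame_times, target_time)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(frame_times)]
        return min(candidates, key=lambda j: abs(frame_times[j] - target_time))

    def align_streams(hd_times, vga_times):
        """For each HD frame time (~29.97 fps), find the closest VGA frame (25 fps)."""
        return [nearest_frame(t, vga_times) for t in hd_times]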

Spatial Calibration: We use Structure from Motion (SfM) to calibrate all 521 cameras. To easily generate feature points for SfM, five projectors are also installed on the geodesic dome. For calibration, they project a random pattern onto a white structure (we use a portable white tent), and multiple scenes (typically three) are captured by moving the structure within the dome. We perform SfM for each scene separately and perform a bundle adjustment by merging all the matches from each scene. We use the VisualSfM software with 1 distortion parameter to produce an initial estimate and a set of candidate correspondences, and subsequently run our own bundle adjustment implementation with 5 distortion parameters for the final refinement (an illustrative reprojection residual is sketched below). The computation time is about 12 hours for 6 scenes (521 images each) using a 6-core machine. In this calibration process, we use only the color cameras of the Kinects. We additionally calibrate the transformation between the color and depth sensor for each Kinect with a standard checkerboard pattern, placing all cameras in alignment within a global coordinate frame.
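As an illustration of what a bundle adjustment residual with 5 distortion parameters might look like, the sketch below uses the common Brown-Conrady model (radial coefficients k1, k2, k3 plus tangential coefficients p1, p2). The choice of this particular parameterization is an assumption made for illustration; the exact distortion model used in the implementation is not specified in this description.

    import numpy as np

    def reproject(X_world, R, t, f, c, dist):
        """Project a 3D point with a 5-parameter Brown-Conrady distortion model.

        X_world : (3,) point in world coordinates
        R, t    : camera rotation (3x3) and translation (3,)
        f, c    : focal lengths (fx, fy) and principal point (cx, cy)
        dist    : (k1, k2, p1, p2, k3)
        """
        k1, k2, p1, p2, k3 = dist
        Xc = R @ X_world + t                   # camera coordinates
        x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]    # normalized image coordinates
        r2 = x * x + y * y
        radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
        x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
        y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
        return np.array([f[0] * x_d + c[0], f[1] * y_d + c[1]])

    def residual(observed_uv, X_world, R, t, f, c, dist):
        """Reprojection residual of the kind a bundle adjuster would minimize."""
        return observed_uv - reproject(X_world, R, t, f, c, dist)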

Software: A more detailed description of the software is presented in [1]. We have developed a method to automatically reconstruct the full-body motion of multiple interacting people. Our method does not rely on a 3D template model or any subject-specific assumptions such as body shape, color, height, or body topology. It works robustly in various challenging social-interaction scenes with an arbitrary number of people, producing temporally coherent, time-varying body structures. Furthermore, our method is free from error accumulation and thus enables capture of long-term group interactions (e.g., more than 10 minutes).

Our algorithm is designed to fuse the weak perceptual processes in the large number of views by progressively generating skeletal proposals from low-level appearance cues; a framework for temporal refinement is also presented, which associates body parts with a reconstructed dense 3D trajectory stream. Our system and method are the first to reconstruct the full-body motion of more than five people engaged in social interactions without using markers. We also empirically demonstrate the impact of the number of views in achieving this goal.

Our algorithm is composed of two major stages. The first stage takes as input images from multiple views at a time instance (calibrated and synchronized), and produces 3D body skeletal proposals for multiple human subjects. The second stage further refines the output of the first stage by using a dense 3D patch trajectory stream, and produces temporally stable 3D skeletons and an associated set of labeled 3D patch trajectories for each body part, describing subtle surface motions.

In the first stage, a 2D pose detector is run independently on all 480 VGA views at each time instant, generating detection score maps for each body joint. Our approach then generates several levels of proposals, as shown in Figure 3. A set of node proposals for each joint is generated by non-maxima suppression of the 3D score map (obtained by fusing the per-view detection maps), where the k-th node proposal is a putative 3D position of that anatomical landmark. A part proposal is a putative body part connecting two node proposals. As the output of the first stage, our algorithm produces skeletal proposals. A skeletal proposal is generated by finding an optimal combination of part proposals using a dynamic programming method (a minimal sketch follows below). After reconstructing skeletal proposals at each time t independently, we associate skeletons from the same identities across time and generate skeletal trajectory proposals, which are sets of part trajectory proposals (a moving part across time).
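To make the dynamic programming step concrete, the following minimal sketch shows one way an optimal skeletal proposal can be assembled from node and part proposals over a tree-structured skeleton. The data structures (node_scores, part_score, children) and the scoring are placeholders chosen for illustration; the actual formulation in [1] differs in its details.

    # Sketch: choose one node proposal per joint so that the total score is
    # maximized over a tree-structured skeleton (dynamic programming on a tree).
    # node_scores[j][i]      : score of the i-th 3D node proposal for joint j
    # part_score(j, i, c, k) : score of the part proposal linking proposal i of
    #                          joint j to proposal k of its child joint c
    # children[j]            : list of child joints of joint j

    def best_skeleton(root, children, node_scores, part_score):
        best = {}      # best[j][i] = best score of the subtree rooted at (j, i)
        choice = {}    # choice[j][i][c] = chosen proposal index for child c

        def solve(j):
            best[j], choice[j] = {}, {}
            for c in children[j]:
                solve(c)
            for i, s in enumerate(node_scores[j]):
                total, picks = s, {}
                for c in children[j]:
                    k = max(best[c], key=lambda k: part_score(j, i, c, k) + best[c][k])
                    total += part_score(j, i, c, k) + best[c][k]
                    picks[c] = k
                best[j][i], choice[j][i] = total, picks

        def backtrack(j, i, out):
            out[j] = i
            for c, k in choice[j][i].items():
                backtrack(c, k, out)
            return out

        solve(root)
        i_root = max(best[root], key=best[root].get)
        return backtrack(root, i_root, {})   # joint -> selected proposal index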

In the second stage, we refine the skeletal trajectory proposals generated in the first stage using dense 3D patch trajectories. To produce evidence of the motion of different anatomical landmarks, we compute a set of dense 3D trajectories, which we refer to as a 3D patch trajectory stream, by tracking each 3D patch independently. Each patch trajectory is initiated at an arbitrary time (every 20th frame in our results) and tracked for an arbitrary duration (30 frames backward and forward in our results). Our method associates a part trajectory with a set of patch trajectories, and these trajectories determine rigid transformations between any time t and t+1 for this part (see the sketch below). These labeled 3D trajectories associated with each part provide surface deformation cues and also help refine the reconstruction quality by reducing motion jitter, filling in missing parts, and detecting erroneous parts.
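A common way to estimate the per-part rigid transformation between times t and t+1 from the associated 3D patch trajectories is a least-squares fit via SVD (the Kabsch algorithm). The sketch below is a generic illustration of that estimator under the assumption of matched 3D point sets, not necessarily the exact procedure used in [1].

    import numpy as np

    def rigid_transform(P, Q):
        """Least-squares rigid transform (R, t) mapping point set P onto Q.

        P, Q : (N, 3) arrays of matched 3D patch positions of one body part
               at time t and t+1 respectively. Returns R (3x3), t (3,) such
               that Q is approximately P @ R.T + t.
        """
        p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
        H = (P - p_mean).T @ (Q - q_mean)                # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflection
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = q_mean - R @ p_mean
        return R, t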

Figure 3: Several levels of proposals generated by our method. (a) Images from up to 480 views. (b) Per-joint detection score maps. (c) Node proposals generated after non-maxima suppression. (d) Part proposals formed by connecting pairs of node proposals. (e) Skeletal proposals generated by piecing together part proposals. (f) Labeled 3D patch trajectory stream showing associations with each part trajectory. In (c-f), colors indicate the joint or part labels shown below the figure.

Social Interaction Dataset: We publicly share a novel dataset which is the largest to date in terms of the number of views (521 views), duration (3+ hours in total), and the number of subjects in the scenes (up to 8 subjects) for full-body motion capture. Our dataset is distinct from previously presented datasets in that it captures natural group interactions without controlling the participants' behavior or appearance, and it contains motions with rich social signals. The system described here provides empirical data of unprecedented resolution, with the promise of facilitating data-driven exploration of scientific conjectures about the communication code of social behavior. All the data and output are publicly shared on our website (https://domedb.perception.cs.cmu.edu).

References

[1] Hanbyul Joo, Hao Liu, Lei Tan, Lin Gui, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara, and Yaser Sheikh. Panoptic Studio: A massively multiview system for social motion capture. In IEEE International Conference on Computer Vision (ICCV), 2015.
