ReCap Product Design Report


Upload: vanessa-li

Post on 16-Jul-2015


A Real-time Captioning Solution for Class

Course

96717 - Special Topics: Technology-based Product Innovation and Enterprise Creation

Instructors

Prof. Jonathan Cagan, Prof. John Evans

Technologist

Prof. Ian Lane

Entrepreneurship Mentor

Tom Chiu

Team

Adhithi AJI, Cheng (Vanessa) LI, Fan Sai KUOK, Sanika KOKOTE

ReCap | A Real­time Captioning Solution for Class | 1 

Contents

1 Introducing ReCap
1.1 What is ReCap
1.2 Stakeholders
1.3 System Structure
1.4 Core Use Case
1.5 Service and Product Use Flow

2 Designs in Detail
2.1 Industrial Design
2.2 User Interface Design
2.3 Technical Details
2.4 Cost Estimation

3 Stakeholder Testimonials

4 Value of ReCap
4.1 What Problem Is ReCap Solving?
4.2 What Makes ReCap Unique
4.3 10-Fold Improvement
4.4 Competitors
4.5 Value Opportunity Analysis
4.6 Value Opportunity Analysis in Detail
4.7 Secondary Competitors

5 About the Business
5.1 Market Size
5.2 Go to Market Strategy
5.3 Business Model
5.4 Financial
5.5 Funding Requirements

6 Project Development Process

7 Vision
7.1 ReCap as An Enterprise
7.2 Exit Strategy

Appendices
I Business canvas
II Survey from Western Pennsylvania Deaf School
III Stakeholder Analysis
IV Things We Have Done
V VOA for Phase 2 & 3 Product (previous plan)

References


1 Introducing ReCap

1.1 What is ReCap

ReCap is a combined product and service, built on speech recognition, that provides live captioning for deaf and hard of hearing students in class. Unlike CART, ReCap solves the problem with modern technology instead of human labor, which greatly lowers the cost and improves the experience of using a live captioning service.

1.2 Stakeholders

The customers are schools and institutions, which will purchase the service and devices and provide them to students.

End users are hard of hearing students and professors. We are targeting students who can read English and need more assistance than hearing aids provide.

1.3 System Structure

ReCap consists of a mic system (a portable mic and a mic base), an app for hard of hearing students (compatible with mobile platforms such as laptops, tablets and phones), and a website for back-end management, which will be used by the officers in the disability office.

Figure: Portable Mic, Mic Base, Mobile App for students, Website for officers

1.4 Core Use Case

Professors or conference speakers hold the mic in hand or clip it to their clothing. The mic captures the sound and transmits it to the mic base. The mic base can be placed on a table. It receives the input as radio signals and converts it into data packets, which it transmits via wifi (or bluetooth as a backup when wifi is not available). The mic base acts as the mic charger, as the microcontroller that converts radio signals to data packets, and as a modem/router.

1.5 Service and Product Use Flow

1. The disability office purchases the service after getting approval and funding from the university.

2. Students with hearing loss can apply for the service by visiting the disability office in person or by submitting the application form through the online student system.

3. Once the application is approved, the student receives an email with the ReCap app download link and instructions. He or she can log in to ReCap using a student ID.

4. To estimate how many devices the university needs, the disability office officer logs in to ReCap’s website and manages student accounts and classrooms. If 15 students have classes in 9 different classrooms this semester, the university should install at least 9 mic/mic base sets.

5. When a professor enters the classroom, all he or she needs to do is pick up the mic and speak into it.
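The device estimate in step 4 can be sketched in a few lines. This is only an illustration: the list of (student, classroom) pairs is a hypothetical data shape, not something ReCap’s website is known to expose. The assumption is that one mic/mic base set is installed per distinct classroom in which at least one registered student has a class.

```python
def mic_sets_needed(enrollments):
    """Return how many mic/mic base sets should be installed.

    `enrollments` is an iterable of (student_id, classroom) pairs
    (a hypothetical data shape used only for this sketch). One set is
    counted per distinct classroom, however many students share it.
    """
    return len({classroom for _student, classroom in enrollments})


# Example mirroring the text: 15 students spread over 9 classrooms.
example = [(f"student-{i}", f"room-{i % 9}") for i in range(15)]
print(mic_sets_needed(example))  # 9
```

Counting classrooms rather than students is what keeps the hardware purchase small when several students attend the same class.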


2 Designs in Detail

2.1 Industrial Design

The mic and mic base shells will be made of ABS and aluminum. The mic measures 0.8 x 1.4 x 0.4 inches and the mic base measures 3 x 2 x 1 inches. The base is designed to be a signal hub and an easily accessible recharger, so the mic can be kept to a minimal size and weight. The high-fidelity mic provides a reliable audio source, which helps the speech recognition achieve high accuracy.

2.2 User Interface Design

After the speaker’s voice is captured by the mic and transmitted to the mic base, the radio signal is converted into a wifi or bluetooth signal and pushed to students’ devices.

The core function of the user interface is to show the captions as clearly as possible. In captioning mode the whole screen is used to show the captions, and other interface elements stay hidden until the user interacts with (taps on) the screen.

Further design and development can add more features related to the classroom scenario, such as note taking.

2.3 Technical Details

Mic, mic base and how signals are transmitted - The speaker speaks into the mic, which converts the sound into audio signals. In a conventional wireless mic setup, these are converted into radio signals and sent to a receiver that feeds the loudspeakers. In our design, the radio signals are instead transmitted to the mic base, where they are converted to data packets using its on-board CPUs and GPUs. These data packets are then sent via wifi or bluetooth to the end user’s phone, tablet or laptop. The data can be accessed via a mobile or web app with a secure login.
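As a rough illustration of the conversion into data packets, the sketch below splits a stream of raw audio bytes into fixed-size packets with a small header, and prefers wifi with bluetooth as the fallback. The packet layout, the 320-byte payload size and all function names are assumptions made for this example. The report does not specify the format actually used by the mic base.

```python
import struct

# Assumed header: sequence number (uint32) and payload length (uint16),
# in network byte order with no padding.
HEADER = struct.Struct("!IH")
PAYLOAD_BYTES = 320  # assumed: 10 ms of 16 kHz, 16-bit mono audio


def packetize(audio: bytes, payload_bytes: int = PAYLOAD_BYTES):
    """Yield packets, each a header followed by a slice of the audio."""
    for seq, start in enumerate(range(0, len(audio), payload_bytes)):
        chunk = audio[start:start + payload_bytes]
        yield HEADER.pack(seq, len(chunk)) + chunk


def depacketize(packets):
    """Reassemble the audio from packets, ordering by sequence number."""
    parts = []
    for packet in packets:
        seq, length = HEADER.unpack_from(packet)
        parts.append((seq, packet[HEADER.size:HEADER.size + length]))
    return b"".join(chunk for _seq, chunk in sorted(parts))


def choose_transport(wifi_available: bool) -> str:
    """Prefer wifi and fall back to bluetooth, as the text describes."""
    return "wifi" if wifi_available else "bluetooth"
```

The sequence number lets the student’s device restore the original order if packets arrive out of order, which matters for the speech recognizer that consumes the audio.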

Integration of speech recognition technology - The speech recognition will run on the student’s device, so its CPU and GPU requirements will be met by the student’s laptop, tablet or phone. As long as the device can receive the data packets, it can transcribe the voice in real time.

2.4 Cost Estimation

Hardware: One set of hardware will cost approximately $31. We plan to produce 10,000 sets of mic and mic base in the first round. Including the molding tool costs, the shells will cost around $6 per set. As market demand increases and the manufacturing technique matures, the cost will fall. The additional electronic components are a microcontroller and a modem chip, which will cost roughly $25 per set.

Service: The main cost of the service is providing reliable web servers. Due to the nature of the service, the load will not be too heavy, so the cost will be fairly low.
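The hardware figures can be checked with a short calculation. The per-set costs and the run size come from the text. The total for the first production run is derived here and is not a figure stated in the report.

```python
SHELL_COST = 6           # USD per set, including molding tools (from the text)
ELECTRONICS_COST = 25    # USD per set: microcontroller and modem chip (from the text)
FIRST_RUN_SETS = 10_000  # planned first production round (from the text)


def hardware_cost_per_set(shell=SHELL_COST, electronics=ELECTRONICS_COST):
    """Cost of one mic/mic base set in USD."""
    return shell + electronics


def first_run_cost(sets=FIRST_RUN_SETS):
    """Derived total hardware cost of the first production round in USD."""
    return sets * hardware_cost_per_set()


print(hardware_cost_per_set())  # 31
print(first_run_cost())         # 310000
```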


3 Stakeholder Testimonials

“It’ll be very useful if the hearing impaired person is not with an interpreter.”

“We can’t wait to try it out. Do let me know when you get the prototype ready.”

- Sally, Western Pennsylvania School for the Deaf


“The price will make it very competitive.”

“I like the feeling of independence and confidence. I can imagine how helpful it will be.”

- Maria, a hard of hearing student at Carnegie Mellon University

“Universities are always monitoring new technology. We are willing to evaluate the new device.”

- Lawrence, Disability Office of Carnegie Mellon University

Overall, we got very positive feedback from the stakeholders, and we’d love to keep in touch with them and update them on our progress.


4 Value of ReCap

4.1 What Problem Is ReCap Solving?

The global population with a hearing disability is 360 million (5% of the total population). This means that out of every 100 people, five are isolated from the rest. Hearing loss has a huge negative impact on the quality of communication.

Hearing impaired children can’t go to public school and receive the same education as other children. Later, when they choose to go to mainstream schools, they struggle to make friends because it is difficult for them and their classmates to understand each other. The most common result is that they talk only with their interpreters and eventually give up on entering hearing society. Many of them are confined to the hearing impaired community for their whole lives, and can’t enjoy music, lectures, movies or radio as other people do.

For all hearing impaired people, choosing a suitable device can be hard. A hearing aid is an electroacoustic device designed to amplify sound for the wearer, usually with the aim of making speech more intelligible. Cochlear implants may help provide hearing in patients who are deaf because of damage to sensory hair cells in their cochleas. In those patients, the implants can often enable sufficient hearing for better understanding of speech. But the quality of sound is different from natural hearing, with less sound information being received and processed by the brain. While these products do help make speech comprehensible, they are suboptimal solutions in noisy situations. The hearing aid amplifies the noise along with the speech, and the decoding quality of the cochlear implant is not very good.

Besides sound quality, other issues also bother hearing impaired people: devices can be bulky, expensive, inconvenient or risky. For example, the cochlear implant requires surgery (which may bring risks and suffering), has a high cost (the portion covered varies by state), is not easy to keep (young children can lose it when they are playing or taking a shower), and is not good-looking when worn.

In the classroom, many universities now use CART (Communication Access Realtime Translation), which sends an interpreter to sit with the student and type on a special keyboard, so that the student can read the text on a computer display. It has many drawbacks: it is difficult to schedule time with the interpreter, it is expensive, it feels awkward, and it prevents the deaf student from communicating with classmates.

More on competitors can be found in the “Competitors” section.

To conclude, hearing impaired people (especially students) lack a cheap, easy and reliable way to help with communication. Our product is here to provide one.

Apart from the end users, we also give considerable value to our immediate customers, the educational institutions, by offering a superior service for their students at a much lower cost than is possible with existing solutions. Moreover, the professors, who are important stakeholders, have complete control over the content that is being delivered.

4.2 What Makes ReCap Unique

ReCap is a live transcribing device that has been optimized for the classroom setting. Students can place a phone or tablet in front of them and read the real-time transcription on the screen.

With the product, hearing impaired students can easily capture the content of the lecture even if they don’t have other devices to help them hear. If a student already has a cochlear implant or hearing aids, ReCap is a helpful supplement: it helps them adapt to different accents and follow the lecture when ambient noise is distracting.

ReCap will be designed exclusively for classroom and seminar scenarios, so we will provide features such as note taking or annotation on audio, so that students can better review the lecture. We expect that university policy will not be against recording in the classroom, because CART already makes the text accessible to deaf students.

4.3 10-Fold Improvement

CART, the service currently used in classrooms to provide transcription for hard of hearing students, costs around $100/hour. Assuming an average of 20 hours of classes a week, that comes to $2,000 a week, i.e. $8,000 a month. This is a tremendous expense for the student or even the school to bear. Our technology, on the other hand, costs $200 per month per student, a huge improvement on what is used currently. We also offer packages for schools with more students who require transcription in class, drastically decreasing the expense borne by these institutions.

Our product requires minimal maintenance and does not depend on any individual, so there are no scheduling conflicts. Scheduling is one of the major problems faced by students who use CART now: they cannot spontaneously attend a class whose time slot has shifted, because they have to inform CART and find out whether a captionist is available in the new time slot.
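The cost comparison can be reproduced with simple arithmetic. The rates and hours come from the text. The four-weeks-per-month approximation is the one implied by the $2,000 a week and $8,000 a month figures, and the resulting ratio is derived here and is not a figure stated in the report.

```python
CART_RATE_PER_HOUR = 100   # USD, from the text
CLASS_HOURS_PER_WEEK = 20  # from the text
WEEKS_PER_MONTH = 4        # approximation implied by the text's figures
RECAP_PER_MONTH = 200      # USD per student, from the text


def cart_monthly_cost(rate=CART_RATE_PER_HOUR, hours=CLASS_HOURS_PER_WEEK,
                      weeks=WEEKS_PER_MONTH):
    """Monthly CART cost per student in USD."""
    return rate * hours * weeks


def cost_ratio(recap_monthly=RECAP_PER_MONTH):
    """How many times more expensive CART is than ReCap per month."""
    return cart_monthly_cost() / recap_monthly


print(cart_monthly_cost())  # 8000
print(cost_ratio())         # 40.0
```

Under these assumptions the saving comfortably exceeds the 10-fold improvement named in the section title.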

4.4 Competitors

As we look to address the needs of hearing impaired individuals in a classroom (academic) scenario, our main competitor is one of the currently used tools, CART.

CART is basically a service provided by the Communication Access Realtime Translation Inc in which a person, a “captioner”, accompanies the hearing impaired person to particular events such as classes, conferences and meetings and manually transcribes the content. Currently CART can cost a minimum fee of $225 to $300 for the first two hours or any part thereof, and $75 to $100 per hour thereafter.

Our key competitive advantages over CART would be:

● Independence - An attribute very highly valued by hearing impaired individuals.

● Power - Gives the hearing impaired individual power, confidence and flexibility.

● Cost - As we eliminate the human factor, the cost incurred by the individual or the institute is greatly reduced.

Attribute | ReCap | CART
Price | Around $200/month per student | Around $100/h per student
Weight | Less than 1 lb | 40 pounds +
Skill Requirement | None | Experienced captionist
Speed | 5x (people read at 250 words/min) | 140 words/min

4.5 Value Opportunity Analysis

We had the opportunity to interview a hearing impaired person to get inputs for the VOA. Our end user is currently pursuing her MBA at the Tepper School of Business. She has a cochlear implant in her left ear and uses the CART service to attend classes. The university currently pays for the CART service.

The VOA developed below is a direct reflection of our end user’s thoughts about and aspirations for our product. We are currently interviewing more people from the hearing impaired community through Facebook groups, with the intention of including our end users in the design process so that we deliver value that is substantial for our customers.

[Chart: Value Opportunity Analysis of CART and our product, with each sub-value rated Low, Medium or High; NA = not applicable]

Emotion: Sense of adventure, Feel of independence, Sense of security, Sensuality, Confidence, Power

Aesthetics: Visual (CART: NA), Tactile (CART: NA), Auditory (CART: NA, our product: NA), Olfactory (CART: NA, our product: NA), Gustatory (CART: NA, our product: NA)

Impact: Social, Environmental

Identity: Personality, Point in time, Sense of place

Ergonomics: Ease of use, Safety (CART: NA), Comfort

Core Technology: Reliable, Enabling

Quality: Craftsmanship (CART: NA), Durability (CART: NA)

4.6 Value Opportunity Analysis in Detail

Emotion

- Sense of adventure

Sense of adventure is the ability of a product or service to let the individual be adventurous and impulsive. It removes hesitation in trying out new scenarios and experiences.

As the CART service requires the involvement of an additional person, its sense of adventure is very low. It does not provide the flexibility to choose different situations, and it does not encourage individuals to be impulsive and spontaneous.

ReCap, on the other hand, offers a better sense of adventure, as it can be used in any scenario as long as a compatible electronic gadget is present. It gives the individual the flexibility and confidence to engage in different experiences and be spontaneous.

- Feel of Independence

The CART service offers a very low feel of independence, as it involves another person on whom the hearing impaired individual depends at all times during the class. If the service is unable to allocate a transcriber, the individual is left in an awkward situation, having been completely dependent on it.

Our technology offers a very high feel of independence, as it can be used in any situation and does not require the speaker to be visible. It can also work with multiple individuals talking, using just a smartphone or a tablet, and it does not require an internet connection. Also, one of the key insights we found was that hearing impaired individuals don’t like to stand out in class because of the interpreter’s presence. So on this factor our technology gives very high satisfaction.

- Sense of Security

The CART service ranks medium on sense of security. It involves a third party, which diminishes the sense of security.

Our technology is accessed on the individual’s personal electronic gadget, such as a smartphone or a tablet, whose security settings are entirely controlled by the individual. The technology also does not require an internet connection, so all transcribed conversation remains on the device and is not visible to anyone who cannot access the gadget. Hence, our technology induces a high sense of security.

- Confidence

The CART service instills a moderate amount of confidence in the hearing impaired individual, as they can understand what is being discussed in class. But it does not give the individual enough confidence to actually participate in the discussion, as there is a considerable time lag between the spoken word and the user reading it on the screen.

Our technology, on the other hand, gives the user high confidence: they do not depend on another person or on an internet connection, and the transcription is real time, which gives the individual the confidence to participate in conversations or discussions.

- Power

The CART service gives the user a low sense of power, as he or she depends on the transcriber for all the content being said in class. The user doesn’t feel very empowered with just the CART service.

Our technology gives the user a very high sense of power, as they can understand and participate in discussion using just their phone, tablet or laptop. Many students already use these devices to take notes, so the user can feel like a full participant in the class.

Aesthetics

- Visual

Not applicable to the CART service, as it does not possess any aesthetic aspects.

Our technology ranks high from the visual aesthetic point of view. It involves everyday electronic gadgets and hence looks very familiar. The user interface is also designed to be convenient and visually appealing.

- Tactile

Not applicable to the CART service, as it does not possess any aesthetic aspects.

Our technology runs on a device such as a smartphone, tablet or laptop, and so has the same tactile features as the host device. Hence, it has been ranked medium on this parameter.

Impact

- Social

The CART service has a low social impact, because it is accessible to the individual only in a classroom scenario. The service is not available for situations unrelated to academic lectures, and is sometimes not available for conferences and lectures outside of regular classes.

Our technology ranks high on social impact because it can be used in any environment. It gives the hearing impaired individual an incentive to engage confidently in social settings. As it requires little additional hardware beyond commonly used electronic gadgets and a small mic system, it blends well into a social situation, giving the user a feeling of normalcy.

- Environment

The CART service relies on a heavy, bulky, specially designed keyboard used by the interpreter, which has a negative impact on the environment.

Our technology is an app on an electronic device plus a small mic system, so the environmental degradation it causes is very limited. We therefore ranked it medium from the environmental impact point of view.

Identity

- Personality

Identity is the expression and conception of a person. The clothes a person wears, the computer they use and the bags they own are all factors that form and influence an identity. For a hearing impaired person, the medium of communication is a crucial part of that identity. Its physical shape, its sound and the way it is used all affect how other people and society perceive the person.

As for personality, because the CART service means a companion follows the hearing impaired person wherever they go, it is difficult to imagine the person as energetic, independent and well-rounded, with a high level of social involvement and sense of security. Take Maria as an example. Since another person is assigned to be near her during class, and she needs to pay attention to the transcriber all the time, her communication with her classmates is blocked to some extent. This makes her feel isolated and vulnerable, and makes her classmates think she is not easy to get along with. So the CART service gets a very low score for reflecting personality.

As for our product, since it is used on smartphones, it makes people think that the user knows technology well and is open to trying new things. According to statistics released by Nielsen earlier this year, two thirds of Americans have smartphones, and 80% of people aged 18-24 use smartphones. Using a smartphone is a good sign of keeping up with trends and not being isolated or out of date. And compared to having someone sit next to you and type for you, or wearing a device in your ear all day, it’s not awkward at all to put a smartphone on the desk in class.

- Point in time

In an age of digitalization, CART still relies on human transcription, so it gets a low score.

Our product is perfectly timed. A huge number of people use smartphones and rely on apps, so our app should be one of the best solutions at this time.

- Sense of place

In classrooms, or any other venue, the people present should be students or other audience members, but the CART transcriber is neither. The sense of place is therefore low for the CART service.

Our product has a good sense of place, because it is natural to use, easy to stop using, and does not suggest any difference between hearing impaired people and hearing people.

Ergonomics

- Ease of use

The CART service is not easy to use. People have to coordinate schedules with the transcriber, and if something changes without notice, it is difficult to coordinate with the CART provider. Based on Maria’s experience, CART only helps her in class, so if she wants to attend a seminar or conference, it is hard to get a CART transcriber to come along. Though the quality is fine according to Maria’s description, we still consider that it deserves a low score for usability.

Our product relies on a smartphone but doesn’t need an internet connection, so it is very straightforward to use. Since most people carry their phones every day, they can easily open the app and begin using our product. We will also design an easy-to-use interface that requires little learning time. Hence, we gave our product full marks on this evaluation.

- Safety

The CART service is not related to safety, so this attribute is not applicable.

Our product is, as far as we can see, the safest solution for people suffering from hearing loss. All we require is a smartphone, tablet or laptop as the physical device that stores and runs the app. Smartphones have to pass multiple regulations to ensure they are safe to use, so we can be confident in the safety of using our app.

- Comfort

Having someone beside you is not always comfortable, especially when that person is acting as a “tool”. It feels awkward to relate to someone this way, as if the transcriber were just a machine there to complete a task. As Maria said, it takes some time to get used to CART, and having CART somewhat negatively affects the relationship between her and her classmates.

Our app did very well when we evaluated the level of comfort. Compared to CART, hearing impaired people can use a phone, tablet or computer to read the text silently, without disturbing anyone. Compared to hearing aids, they do not need to put up with amplified noise or strain to pick out a voice from the noise.

Core technology

- Reliable

The reliability of CART is low. CART depends on the transcriber, who is a human being. Common sense says that human beings are more reliable than machines at understanding natural language. However, this depends on professional ability and varies between transcribers. A CART transcriber needs to train for a long time to perform at a highly reliable level. The accuracy of the result can also be influenced by many factors, such as environmental noise, the transcriber’s understanding of the speech topic, typos, and even the transcriber’s physical condition. This influence can be reduced as the transcriber’s professional ability and experience grow, but it cannot be eliminated.

Our app uses high-accuracy, high-speed speech recognition technology. The stability of the system is high and will not be affected by other factors.

- Enabling

CART ranks medium in core technology enabling. It is based on manual transcribing and involves little hard technology. The only factor that needs to be considered is the transcribers’ availability, in terms of time slot and location, which is not always stable.

Speech recognition, the technology our product is based on, is an emerging and flourishing technology. It has been on the market and widely used for several years. More and more research and development is focusing on it, so it will get faster and more accurate in the coming years, which is also to our advantage.

Quality

- Craftsmanship

CART is a manual transcribing service, so craftsmanship is not applicable.

For our product, craftsmanship lies in the interaction design and the interface design of the app. It is much easier for an emerging company to build a high-craftsmanship software product than a high-craftsmanship hardware product. Once our product is well designed and developed, it can be reproduced indefinitely on every user’s device, so the craftsmanship of our product can be high.

- Durability

CART is a manual transcribing service, so durability is not applicable.

Our product’s durability is high because it is a pure software solution, so the information can easily be transferred from device to device. Users can switch to any device and keep their own settings in their own account without noticing any difference. This provides a good user experience and results in high durability.

Conclusion

ReCap provides a good solution for communicating with society for people who have hearing disabilities and difficulties. Moreover, by providing instant transcription on every mobile platform, our product will make daily life easier not only for users with hearing disabilities but also for international students, international conferences and so on. In the value opportunity analysis, our product exceeds existing products on the market in every value opportunity, especially in emotion and ergonomics. With mature technology and good execution, our product will provide extra value for the target users and succeed in the market.

4.7 Secondary Competitors

Apart from CART, we have some indirect competitors that use speech recognition technology but in a different market space.

Transcense: An Android app that uses speech recognition to convert speech to text in group conversations. The app connects to several phones and activates their mics to capture what everyone is saying, then uses voice recognition to assign each person in the group a color for their speech bubbles.

Transcense requires each speaker to have a phone or device near them, and that device must be connected to the end user’s phone via wifi. Hence it doesn’t work in an environment without internet, whereas ReCap does. This gives us a wider usability range as well as more flexibility, as ReCap can function with just one device.

Nuance Dragon: Converts speech to text. It is mainly used for taking notes, writing emails, etc., and has been on the market for a long time. It is used by iCommunicator, which targets the deaf community by translating speech to sign language.

Our research tells us that a large number of hard of hearing individuals do not know or use sign language, so we target a market that is not occupied by iCommunicator. Nuance also requires a good internet connection, whereas ReCap doesn’t.

5 About the Business

5.1 Market Size

We are targeting people who are hard of hearing and who rely on external devices such as cochlear implants and hearing aids to help them hear better. Since our product provides one-directional speech-to-text capability, we assume that it will be most valuable to people who can speak comfortably and respond, but need help understanding the dialogue around them. Currently, we plan to deploy the product in classrooms so that hard of hearing students can participate better in classroom discussions and follow the professor’s lecture in readable form.

Therefore, our ideal target customers are hard of hearing students whose primary language of communication is English.

The global population with a hearing disability is 360 million (5% of the total population). This population can be segmented into two parts: completely deaf and hard of hearing. It is estimated that about 12% of the hearing disabled population is completely deaf (see footnote 1).

Hence, the remaining 88%, about 316.8 million people, is the total servable market.

Out of this total addressable market, our target market is the educated hard of hearing community, by which we mean the hearing impaired population with 12+ years of education. It is assumed that this population understands English. Assuming that about 9% of the hard of hearing community is educated, our target market is 9% of 316.8 million = 28,512,000.

1 In 2011, 30 million out of 250 million were completely deaf, which is around 12%.

 


5.2 Go to Market Strategy

Based on our detailed stakeholder analysis (enclosed in the appendix), we realized that schools and educational institutions are the strongest influencers in the value chain that leads to our target market of hard of hearing students. Hence, our product would be offered to students via deaf schools and the disability centers of regular educational institutions catering to the hard of hearing. This has the added advantage of allowing verification of the end users, which makes sure that the product reaches the intended user. This is important because the device could be misused by hearing students to record class lectures and circulate them.

5.3 Business Model


The revenue model would be a monthly subscription at about $200 a month per user, with a group discount of up to 50%. Based on our research into the expenses of the technology currently used in deaf schools, our service will be disruptive because it offers superior technology at a lower cost.

Our product will be an app that can be downloaded only by hard of hearing students on their phone/iPad. Additionally, we intend to provide special mics to the professors to improve speech capture. The mic would be a one-time purchase for the institutes.

There are about 16,000 education institutions across the world (we assume that half of them have at least 1 deaf student) and about 90 deaf schools in the US alone (we assume that they will buy the service for all their students, and that the average number of students is 100). Assuming a penetration of about 20% in two years, subscriptions will bring in $3.84 million annually (16000 * 50% * 20% * 200 * 12) from public schools and $2.16 million annually (90 * 20% * 100 * 100 * 12, at the discounted group price of $100) from deaf schools. The one-time mic/mic base set purchases will bring in $85,600 (16000 * 50% * 20% * 40 + 90 * 20% * 30 * 40) over the first two years.
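The projection above can be checked with a short calculation. This is a sketch under the report's stated assumptions; reading the formulas, we additionally assume one subscriber and one mic set per public institution, 30 mic sets per deaf school, and a price of $40 per mic set. All names are illustrative.

```python
# Revenue sketch reproducing the projection in this section.
# Figures come from the report; one subscriber and one mic set per public
# institution, 30 mic sets per deaf school and $40 per set are assumptions
# read off the report's formulas.
INSTITUTIONS = 16_000
SHARE_WITH_DEAF_STUDENT = 0.5
DEAF_SCHOOLS = 90
STUDENTS_PER_DEAF_SCHOOL = 100
PENETRATION = 0.20
PRICE_PER_MONTH = 200         # USD per user per month
GROUP_PRICE_PER_MONTH = 100   # USD per user per month after the 50% discount
MIC_SET_PRICE = 40            # USD, one-time
MIC_SETS_PER_DEAF_SCHOOL = 30


def projection():
    """Return (public_annual, deaf_school_annual, mic_one_time) in USD."""
    public_sites = round(INSTITUTIONS * SHARE_WITH_DEAF_STUDENT * PENETRATION)
    deaf_sites = round(DEAF_SCHOOLS * PENETRATION)
    public_annual = public_sites * PRICE_PER_MONTH * 12
    deaf_annual = deaf_sites * STUDENTS_PER_DEAF_SCHOOL * GROUP_PRICE_PER_MONTH * 12
    mic_one_time = (public_sites + deaf_sites * MIC_SETS_PER_DEAF_SCHOOL) * MIC_SET_PRICE
    return public_annual, deaf_annual, mic_one_time


public_annual, deaf_annual, mic_one_time = projection()
print(f"Public schools: ${public_annual:,} per year")
print(f"Deaf schools:   ${deaf_annual:,} per year")
print(f"Mic sets:       ${mic_one_time:,} one time")
```

Changing `PENETRATION` or the group price shows directly how much of the projected revenue depends on those two assumptions.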

The detailed business model canvas is enclosed in the appendix.

5.4 Financial

Fixed costs include rent, utility bills, phone bills/communication costs, accounting/bookkeeping, legal/insurance/licensing fees, postage, technology, advertising & marketing, and salaries. Variable costs are mainly materials and supplies and the packaging of the mic.


Customer demand for ReCap is expected to increase steadily during the initial period. Once a customer (school or institute) has started using ReCap, it will be hard for them to find a replacement that provides more value.

The profit from the first two years will be reinvested in expanding our customer base.

ReCap’s profitability is expected to be high. Because of the breakthrough technology ReCap is based on, the profit margin will remain high even at a disruptive price. ReCap expects to break even within 1 year.

5.5 Funding Requirements

ReCap is seeking seed funding in the amount of $1.2 million for staffing, purchasing software and hardware computing equipment, office costs, and other Internet-related costs. This funding will also be used to develop the initial version of our products and to run a pilot test.

The company is also seeking Series A funding in the amount of $3 million for developing more markets, improving the product and building more web service facilities.


6 Project Development Process

We took a top-down approach to identify the right product-market fit. As a first step, we brainstormed at least 20 applications that could leverage the speed and accuracy that are the unique selling points (USPs) of the speech processing platform. We then took a structured approach to filtering these applications, to zero in on the most lucrative one that is both viable and profitable. Each application was described with a use case, after which we analyzed the following questions:

1. Is there an actual pain point that we are solving?

2. If yes, is the pain point big enough for the customer to buy our product? In other words,

are we giving the 10x difference?

3. How urgent is the need?

4. What are the market trends driving the industry?

5. Is the market saturated? How is the competitive landscape of the industry?

6. Is there a niche market that we could find in the big market?

7. Is it technically feasible?

Of the questions above, the first three were the most important criteria for eliminating or advancing an application idea. The information was obtained through interviews with actual stakeholders relevant to each use case. After validating the applications against market needs, we were able to eliminate most of them and come down to three promising applications. We then used a second-level filter of market and competitor analysis to eliminate two of these, since they belonged to the home automation space, which was already saturated.

At the end of this process, we found that there was a pain point faced by the handicapped, particularly the deaf and hard of hearing: they are not able to communicate or follow conversations effectively because they cannot hear properly. This is a big hindrance that can render them unemployable despite being well educated. We saw a potential use of our speech processing technology that not only solves this pain point but also leverages the speed and accuracy of the technology, since real-time translation requires such attributes. Therefore, an ideal target-market fit was identified.

Initially, we had envisioned our product as a device that allows two-way communication: the deaf person gestures into a phone/computer, which converts the gestures to voice for a hearing person, and in the other direction speech is converted into sign and appears as an avatar on the screen for the deaf person. However, achieving this product entailed a huge scope of technical development. As an alternative, we explored possibilities where one-way communication can be established by using speech-to-text conversion.

We used end user input at every decision juncture in our commercialization process. For example, we had assumed that our product would provide more value only if it were bidirectional, so that both hearing and deaf people could communicate effectively. Contrary to our belief, the end user, a hard of hearing student from the Tepper School of Business, CMU, found more value in a simple speech-to-text application that would allow her to follow and comprehend conversations better, especially in noisy situations. This was a turning point in our commercialization process: we pivoted from catering to the completely deaf with a high-end gesture-speech-gesture application to a much simpler speech-to-text conversion. This pivot changed the target market, the use case and the product design that we had envisioned earlier, resulting in a much finer product-market fit.

Our target market was now identified as hard of hearing people who are not completely deaf, who are educated and who can speak fluently. Secondly, we realized that this one-way communication of speech to text is most effective in classrooms, where the professor does most of the speaking and other interactions are minimal. Hence, our subsequent study centered on the classroom scenario as the use case and hard of hearing students as the target market.


7 Vision

7.1 ReCap as An Enterprise

ReCap would be a high tech firm with a core competency in speech processing platforms. Since our first product caters to the handicapped, our company’s core mission and values would be to continue helping the handicapped through the power of technology. We believe that this section of people is often ignored and that the power of technology has been under-harnessed in this area. Having said this, we envision ReCap scaling up quickly to a mid-size company with more intelligent product offerings in its portfolio.

7.2 Exit Strategy

There is a general trend of big players such as Microsoft and Google investing in technologies for the handicapped. For example, Microsoft Research has partnered with a firm in China to use Microsoft Kinect as a means to convert sign to speech and speech to sign. As a corollary of these activities, we foresee a possible trend of the big players scouting for companies operating in this niche area, either for acquisition or for a joint venture. As an exit strategy, we believe that we would eventually come onto their radar, to be acquired or to form long-term partnerships for future endeavours.


Appendices

I Business canvas

II Survey from Western Pennsylvania Deaf School

We provided survey forms to the deaf school to be filled out by the faculty. We will send these separately once we have them.

III Stakeholder Analysis


IV Things We Have Done

1. Idea brainstorming:

Top down approach; Came up with 20 ideas

2. Idea evaluation:

Validated the market needs through secondary research and by talking to end users

3. Narrowed down on deaf device

4. Came up with long term vision:

Two way communication, which is sign to speech and speech to sign

5. Broke down the vision into three phases:


Speech -> Text, Speech <-> Text, Speech <-> Sign language

6. Evaluated technical feasibility

7. Interviewed end user to understand which phase had maximum value

8. Also identified the urgent need in this target market: educated hard of hearing people

9. Did market sizing to understand if market size is big enough

10. Identified use cases and target market to get a niche that had no competitors playing:

Classroom, seminars, conferences

11. Identified actual value to the end user - independence, security, quality transmission:

Our interview with the end user revealed the key aspects of a product that are most valuable to the user: independence, security and quality transmission

12. Competitor analysis: Transcence, RogerVoice, etc.

13. Understood current use of technology: CART, cochlear implant (previous VOA)

14. Detailed VOA on all three phases

15. Based on insights from the above points, narrowed down to phase 1 since it had maximum value

16. Interviewed professors to check if they are ok with such a product in class

17. Stakeholder analysis to understand stakeholders who influence the end user:

parents/friends and educational institutions

18. Customer discovery phase II:

Went to deaf school in Pittsburgh to understand current technology used and challenges

faced

Interviewed official in office of disability to understand CART

19. Identified value for immediate customers: lower cost (existing technologies are very expensive and do not perform well)

20. Developed a go to market strategy via school institutions to mitigate misuse of product;

initial roll out planned in early 2015

21. Product design: came up with sketch

22. First stage industrial design

23. Product cost estimation

24. A complete business model canvas


V VOA for Phase 2 & 3 Product (previous plan)

1. Introduction of the competitor product

The product we compare against is the sign language translator using Microsoft Kinect. In late 2013, Microsoft Research released a prototype using Microsoft Kinect that allows deaf and hard of hearing people to communicate comfortably with people outside their community using sign. The envisioned product would convert sign language to speech and text in real time and vice versa, thus bridging the communication gap. The prototype uses the Kinect as a depth camera to capture the gestures used in sign language, which are then converted into speech and text for the hearing person to understand. Conversely, when the hearing person speaks, the mic captures the speech and converts it into sign language, which is shown as an avatar on the screen for the deaf person, thus allowing seamless communication.

2. Introduction of our product

Our product, signtovoice, is a device that acts almost like an interpreter alongside the deaf person. With high-end technology that converts sign to speech and vice versa, the deaf can now communicate beyond their usual world in the method they are most comfortable with: sign language. The software for the device can be downloaded as an app or web application onto a phone/tablet/desktop, whose screen is used to display sign and to capture sign. Additionally, a small, low-cost portable piece of hardware would contain the 3D camera that captures the sign, which is then processed into speech. The hardware will be ergonomically designed so that it can be attached to the phone as a case. In the other direction, the mic captures the voice of the hearing person and converts it to sign, which appears on the screen of the portable device that the deaf person is carrying. In this way, we enable the deaf to communicate with the hearing with minimal additional hardware and low-latency translation.

3. VOA

[Chart: each sub-value below is rated Low, Medium or High for Kinect Sign Language Translator and for our product; NA marks sub-values that do not apply to either product.]

Emotion: Sense of adventure, Feel of independence, Sense of security, Sensuality, Confidence, Power

Aesthetics: Visual, Tactile (NA), Auditory, Olfactory (NA), Gustatory (NA)

Impact: Social, Environmental

Identity: Personality, Point in time, Sense of place

Ergonomics: Ease of use, Safety (NA), Comfort

Core Technology: Reliable, Enabling

Quality: Craftsmanship, Durability

4. Explanation

Emotion

- Sense of adventure: Microsoft Kinect and our technology offer a similar sense of adventure because the primary functions of the two products are very similar; hence, both rank medium for sense of adventure.

- Feel of independence: Microsoft Kinect uses more hardware than our technology, so the overall feel of independence of our technology is greater than that of Microsoft Kinect.

- Sense of security: Microsoft Kinect and our product induce an almost identical sense of security in the hearing impaired individual.

- Confidence: Microsoft Kinect and our technology both give the hearing impaired individual a high level of confidence to communicate with the world. Both products make the user feel self-assured in a social scenario.

- Power: Microsoft Kinect ranks higher on power than our technology because Microsoft is an internationally recognized brand, which makes the user feel more powerful.

Aesthetics

- Visual: Our product ranks higher than Microsoft Kinect from the visual aesthetics point of view because it uses everyday electronic gadgets for input, whereas Microsoft Kinect uses additional hardware.

- Auditory: Our technology and Microsoft Kinect both rank moderate from the auditory aesthetics perspective. Both technologies have AI software for converting text/sign language to speech and hence sound similar.

Impact

- Social: Our technology has a moderate social impact, as we try to change the way the deaf community communicates with the world. Microsoft Kinect, on the other hand, ranks high on social impact because Microsoft, being a globally recognized brand, can impact the community on a greater scale with the resources it possesses.

- Environmental: Both Microsoft Kinect and our product have low environmental impact, as both products are mostly software based and hence do not require extensive hardware and manufacturing processes.

Identity

- Personality: Using Kinect Sign Language Translator and using our product are similar experiences, but due to the brand image of Microsoft and Kinect, the former gives the user a stronger sense of personality.

- Point in time: Both are good but not perfect in terms of timing. The technology is still developing, so the speed and accuracy are not good enough yet.

- Sense of place: Kinect Sign Language Translator requires more hardware, which makes it more cumbersome. Our product uses image processing and gesture recognition, so it is more portable and suitable in more places.

Ergonomics

- Ease of use: Both products translate sign language without requiring the user to wear rings, gloves or other gadgets, so they are quite easy to use. Given that Microsoft needs the Kinect to recognize gestures, we consider Kinect Sign Language Translator a little more difficult to use than our product.

- Safety: Safety is not relevant to these two products.

- Comfort: Kinect Sign Language Translator and our product are both comfortable to use because people can use them naturally, without the restraints of other gadgets like rings or gloves.

Core Technology

- Reliable: Both products rank high in technological reliability. Our technology is provided by Carnegie Mellon University, which is a leader in this area. Kinect Sign Language Translator has powerful technology support from Microsoft, which might be even stronger than ours.

- Enabling: As mentioned above, both products have strong technology support, and Kinect’s might be even stronger than ours.

Quality

- Craftsmanship: Kinect Sign Language Translator and our product both have high craftsmanship. Kinect has the manufacturing support of Microsoft to produce high craftsmanship. Our product’s structure is simple to build, so finding a supplier with high craftsmanship should be easy.

- Durability: Our product has medium durability compared to Kinect Sign Language Translator’s low. Kinect Sign Language Translator depends heavily on the sensor of the Kinect device and is not portable, which reduces the durability of the product, while our product is portable and easily replaceable.

5. Conclusion

From the value opportunity analysis, we can see that our phase 2 & 3 product has advantages over Microsoft Kinect in 5 items: feel of independence, visual, sense of place, ease of use and durability. However, our product has shortcomings in 3 items: power, social and personality. This shows that we would face strong competition from Microsoft in the market, and in some areas we could hardly supersede Microsoft. So we decided to give up phases 2 & 3 and develop only phase 1 of our product.


References

1. http://www.who.int/mediacentre/factsheets/fs300/en/

2. http://www.gallaudet.edu/clerc_center/information_and_resources/info_to_go/resources/websites_of_schools_and_programs_for_deaf_students_.html

3. Hands up for help! Giving deaf children a fair chance at school, 2010

4. http://nces.ed.gov/pubs94/94394.pdf

5. http://www.academia.edu/1542352/Amazon_Closed_Captioning_Formal_Report

6. http://www.cs.rochester.edu/~sadilek/publications/Real-Time_Captioning_by_Groups_of_Non-Experts_UIST-12.pdf

7. http://www.start-american-sign-language.com/dianrez.html

8. http://www.phonak.com/com/b2c/en/products.html

9. http://www.necc.mass.edu/academics/support-services/learning-accommodations/deaf-and-hard-of-hearing-services/student-resources/accommodations-tipsheets/communication-access-realtime-translation/

10. https://www.youtube.com/watch?v=qn4B0gyDosA

11. https://graysdeafblog.wordpress.com/about/

12. http://misskatsmom.blogspot.com/

13. http://becomingdeaf.com/

14. http://lizsdeafblog.blogspot.com/p/contact-me.html

15. http://campustechnology.com/articles/2014/10/06/ga-tech-google-glass-app-does-captioning.aspx
