

Outro: a Google Glass application as a cognitive tool to enhance second language acquisition (2014)


Role: all content of the full paper (user interface, flowchart, personas, scenario design, research on theories, prototyping, and development of the application for Google Glass) was produced by the author, DoHyun Kim.


Design principles are based on the design references and guidelines from the Google Developers website and on cognitive load theories, such as schematic learning, the dual-channel assumption, and color-coding principles.


Google Glass Development Kit: Preview version (the latest version at the time)

Progress: real-time captioning, photo taking, and vocabulary editing are working in the prototype (Dec 2014)

Main references for programming:

Tang, J. (2014). Beginning Google Glass Development. Apress.

Scheel, M. (2014). Software Development for Google Glass.

Redmond, E. (2013). Programming Google Glass.

Hardy, B., & Phillips, B. (2013). Android Programming.

Additional material from the Google Developers website and GitHub.


Introduction videos:

Video shot and edited by the author, DoHyun Kim 

Software: Final Cut Pro 10.13

Outro full paper

Introduction video

About the Video:

This video clip shows one example of using Outro for language learning (a user walkthrough).

In this video, Outro helps people maintain a better conversation flow in face-to-face talk with the aid of on-screen speech-to-text. The application provides definitions of target words that users have added when they encounter those words in real-time conversation. In the video, the target word is "bran." In addition, users can take a photo with a gesture command in seconds. These data, text and pictures, can be saved and shared with other applications and devices, such as Evernote on a desktop. By using Evernote's diverse features, students can organize and share their notes from Google Glass more effectively than with Outro alone.


1. Outro's approach to solving problems associated with language learning


Why (Educational perspective): There are more than 800,000 international students in the United States who have to learn English for work or study. Although they devote countless hours to acquiring proficiency in English, they often do not achieve great success. Many of them seldom socialize with native speakers because of the language barrier. They struggle with these frustrations every day, which inspired me to start this project to help address the problem.


There are three main problems with authentic language learning: 

1. A lack of socialization and interaction with native speakers (Krashen, 2003)


2.A lack of in-time and on-demand learning with immediate feedback (Chapelle, 1997; Gass, 1997)


3. A lack of situated learning for authentic expressions with rich context (Chang, 2014; Mayer, 2005) 


What (Technology perspective): Google Glass provides two main advantages to language learners: a see-through display and a micro-interaction-based user interface, both of which help prevent interruptions in the flow of conversation. Unlike a smartphone, the transparent prism display lets users view either a heads-up display of information or their surroundings, so they can maintain eye contact with the person they are talking to. The interface's brief micro-interactions also help learners stay attentive to the real world rather than being pulled away from it. The reduced access time to reach relevant information is a considerable advantage for language learners.


How (Design perspective): Outro for Google Glass is a cognitive tool that enhances people's ability to achieve goals they usually cannot reach without it. It does not provide specific learning content; instead, it encourages users to perform authentic learning in real-life situations with the aid of technology. This approach addresses a root problem inherent in many other language-learning methods by allowing users to learn target information in the actual environment rather than in an artificial or online one.



2. Structure, architecture, and systems of the application: Outro

Outro has four main features: ShowMe, MyNotes, Vocab, and Wink.

A. ShowMe provides real-time captioning to encourage face-to-face conversations.

B. MyNotes is a place where users can save and review their dialogues made by real-time captioning.

C. Vocab allows L2 learners to edit their vocabulary in Google Glass without an Internet connection.

D. Wink is a feature that allows L2 learners to take a photo by using the gesture command, “wink.”


ShowMe, Wink, and several linked features are described in this section.


Figure 1. ShowMe
This figure shows how ShowMe, Outro's real-time captioning feature, looks when students use it in conversation.
Figure 2. Wink
Wink lets learners take a photo with a gesture command. The feature is implemented in a shared superclass (an override), which means that wherever learners are in the application, they can use it to take a photo and then use Captioning. Using Captioning activates Google Glass's voice recognition, so learners can speak sentences aloud to add them to the picture. The annotated picture can be shared with other applications and devices, such as Evernote or Facebook on desktops or tablets.
Figure 3. The combination of ShowMe, Wink, and Vignette
With these features, users can make annotated pictures. Once a user makes a comment through real-time captioning, s/he can take a picture by winking, then combine the text with the picture using the Vignette feature. Since Wink is implemented in a superclass (an override) in the application, users can take a photo regardless of where they are in the application.
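The superclass idea behind Wink can be sketched as plain inheritance: every feature screen extends one base class that owns photo capture, so a wink triggers the camera from anywhere in the app. This is a minimal illustration of the pattern only; the class and method names are assumptions, not the actual Outro source.

```java
// Sketch of the shared-superclass pattern: photo capture lives in the base
// class, so every screen inherits it. Names here are illustrative.
abstract class GlassScreen {
    // Invoked by a (hypothetical) eye-gesture listener when a wink is detected.
    final String onWink() {
        return takePhoto();
    }

    // Stand-in for the real camera call; every subclass inherits it unchanged.
    String takePhoto() {
        return "photo captured";
    }

    abstract String name();
}

// Each feature screen only supplies its own behavior; Wink comes for free.
class ShowMeScreen extends GlassScreen {
    String name() { return "ShowMe"; }
}

class VocabScreen extends GlassScreen {
    String name() { return "Vocab"; }
}
```

Because `onWink()` is final on the base class, no screen can accidentally lose the gesture, which matches the "use it anywhere in the application" behavior described above.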
Figure 4. The combination of Vocab and ShowMe
Vocab (the vocabulary database) and ShowMe (real-time captioning) are linked. Specifically, ShowMe uses the words in Vocab as a database of definitions for target words during real-time captioning. To get the definition of a target word during captioning, the word has to be added to Vocab beforehand. The target word in the image above is "Starbucks," and information about the word is displayed at the bottom.
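The Vocab–ShowMe link described above amounts to a dictionary lookup against the live caption stream: words added beforehand are matched as each caption arrives. A minimal sketch, where the class name, method names, and token-matching strategy are all assumptions rather than the shipped code:

```java
import java.util.*;

// Minimal sketch of matching live captions against the Vocab database.
class VocabLookup {
    private final Map<String, String> vocab = new HashMap<>();

    // Words must be added to Vocab before captioning starts.
    void add(String word, String definition) {
        vocab.put(word.toLowerCase(Locale.US), definition);
    }

    // Scan one caption and return definitions for any target words it contains.
    List<String> definitionsIn(String caption) {
        List<String> hits = new ArrayList<>();
        for (String token : caption.toLowerCase(Locale.US).split("\\W+")) {
            String def = vocab.get(token);
            if (def != null) {
                hits.add(token + ": " + def);
            }
        }
        return hits;
    }
}
```

Any hit would then be rendered at the bottom of the display, as in the "Starbucks" example in Figure 4.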
Figure 5. Outro working with Evernote
Text and pictures created on the Glass can be shared with other applications and devices, such as Evernote on desktops and tablets. The picture on the top is the image before editing, and the picture on the bottom is the image after editing in Evernote. Learners may add more descriptions or highlight parts using arrows in various colors. Outro users can take advantage of the well-designed features of other applications, such as editing and organizing their notes.
3. Theoretical foundations
Outro is built on theoretical foundations: second language acquisition, multimedia learning, mobile-assisted language learning, and cognitive load theories. These hypotheses and principles explain how users can improve their second language acquisition by using Outro's features. The slides below cover the core parts of the theoretical frameworks.
Please refer to my presentation in the video below or to the full paper for more information about the theories.
Chapelle's hypothesis 1

Chapelle's hypothesis 2

Mayer's hypothesis 1

Mayer's hypothesis 2

Mobile Assisted Language Learning 1

Mobile Assisted Language Learning 2

Mobile Assisted Language Learning 3

Advantages of using Outro

Presentation about Outro with a live demo

About the video:
DMDL MA Thesis Colloquium, December 12, 2014, NYU MAGNET Town Hall. OUTRO: A Google Glass Application, Do Hyun Kim


4. User scenario and persona
Two user scenarios and personas will be introduced: Tae-Yang Lee is an engineer in California, and Amy Zhou is a chef at a Chinese restaurant in New York. These personas were created to typify two target users and to put developers and designers in their shoes. Their occupations and education are based on the demographics of our target audience: Korean and Chinese students and workers in the United States. Personas are archetypes built to identify real users' profiles, needs, wants, and expectations in order to design the right experience for them. (Full scenarios are in the full paper.)
Scenario A

Lee is having trouble learning American idiomatic and colloquial expressions due to his limited cultural experience in the States. He is eager to improve his English for enhanced communication at work. He starts with a mobile app that defines common idioms and expressions for ESL students.


However, there is a problem in learning idioms with a smartphone app. Since there are thousands of idioms in his references, he does not know where to start. Even if he could memorize all of those idioms, he may never use most of them. He believes that it would be a waste of time to learn idioms or expressions that are unrelated to his job. He needs customized and personalized content and so he turns to Outro for Google Glass.

Before attending a meeting at work, he turns on the device and launches the application. The application dictates what he and everyone else in the meeting are saying, displaying the text on the Glass in real time. Lee can glance at the display to read what his colleagues are saying.


By using the device, Lee is able to understand most of the items at the meeting. In the middle of the meeting, however, his manager says, “We’re gonna discuss about if we can start new business. We are interested in developing applications for smart watches, a wearable computer, if you will.” His colleague, Tom, responds, “I think we have fingers in too many pies. We have limited resources, so we should focus.” Although Lee can read what his colleagues are saying, he does not know what the idioms “if you will” and “fingers in too many pies” mean. Such expressions often make it difficult for him to communicate with his colleagues. As a result, he cannot follow the rest of the meeting, and he misses much of the information that is significant for his work (see figure).


Lee is able to read the dialogue with Outro. Outro not only provides real time captioning, but also saves the dialogue with Save to My Notes. After the meeting, Lee just taps the touch pad on the right side of the Glass to stop and save the real-time captioning. Then, he swipes forward and backward to Share and sends it to Evernote (see figure). 

When Lee returns to his desk, he uses a desktop computer to check the note he just saved on Evernote (see figure).

He can review the note and all the items he has to know for his work. In addition, he can find the parts he did not understand at the meeting and look up online references, or he can ask his native-speaker friends about the notes. Evernote provides an editing tool so that he can add, remove, save, and share his vocabulary and dialogues. He uses a combination of applications and features, especially when he wants to add many words or change the color, font, and size of the text. To apply these changes to his devices, he clicks the sync button in the top left corner of the Evernote window.

Scenario B

Most vocabulary books do not provide rich context with pictures, audio, and full dialogues; they offer only simple definitions and a few sample sentences. Amy finds English difficult and confusing. She needs a way to organize and remember new vocabulary words, so she uses Google Glass and Outro.

After she arrives at the restaurant where she works, she has to talk with the other American chefs while cooking for the diners. However, it is difficult for her to communicate with her American colleagues because the names of ingredients in the States differ from the ones used in China. She has to learn the English names of the ingredients. Google Glass's gesture and voice commands enable her to cook and learn English at the same time. She turns her device on and launches real-time captioning. Whenever she finds an ingredient whose English name she needs to know, she uses Wink to take a photo of it and annotates it in English.


One day Jennifer asks Amy for some star anise. Amy does not know what star anise is, so she checks the word on the Glass screen and asks, “What is star anise?” Jennifer points at the spice and says, “We call this star anise.” At that moment, Amy uses Wink to take a photo of it, adds “Star anise,” and saves it to MyNotes (see figure).

Jennifer then needs Amy to help her make soup. “Amy, can you trim the bamboo shoots and chicken over there?” Since Amy has already added the words “bamboo shoots” to her vocabulary database, she can see their description in real time. Her Glass immediately provides the information in both English and Chinese, “Bamboo shoots 竹笋” (see figure). Amy answers, “I am working on it!” She is getting more comfortable speaking English with the aid of Outro. She does not have to go to ESL classes, since she can learn English by using Outro.

The entire flowchart of Outro 

High-fidelity concept flowchart:

White-colored cards stand for menu pages,

Blue-colored cards stand for gesture-based control,

Yellow-colored cards stand for options that users have to choose, and

Green-colored cards stand for events shown on the display.

5. Evaluation & beta test
In order to evaluate and improve Outro, interviews and a beta test were conducted with a group of designers, engineers, teachers, and ESL students.
Ruth Sherman is a graduate student in the master’s program in Digital Media Design for Learning at New York University:  
“Voice recognition is more accurate than I thought.”

"How you gonna manage heavy text on the small screen?” 

Rui Quo is a graduate student at the NYU Polytechnic School of Engineering who majored in computer science:
“Because my pronunciation is not good, voice recognition and control do not work well for me...” 
"I think Cortana (Windows Phone's intelligent personal assistant) is better for me."
Siyad Dhaihalah is an international student at the American Language Institute in New York:
“What if multiple people are speaking at the same time?”
“What if speaker and I have some distance?”
6. Future research topics:
Areas of improvement for Outro:
1. accuracy of voice recognition
2. efficiency of managing a vocabulary database 
3. UI design that reduces steps so users can finish their tasks in less time
From user feedback:
1. More language support like Chinese or Spanish
2. External microphone to recognize voices from the people at a distance.
Long-term goals:
1. Whether it can be applied to other purposes, such as helping people with disabilities
2. Whether it can be used with the technology in health and fitness kits
3. Whether it can interact with other devices and tools, such as cars built on Android OS