Up: Full Project Video
Down: System Demo Set Up

AR Captioning for Deaf or Hard of Hearing Students in Lectures Using Semantic Aids

In Submission for IEEE VR
Role: Lead Researcher and Developer

How can we help Deaf and Hard of Hearing (DHH) students learn in college lectures?
CapAR is a real-time augmented reality (AR) captioning system that leverages Large Language Models (LLMs) to present Semantic Aid Visualizations on captions and slides, such as contextual highlights and real-time explanations of unfamiliar terms. In an evaluation study with DHH students, the system was shown to enhance learning gain and engagement, ultimately improving learning outcomes for DHH students in classroom settings.

Description:
Deaf and hard of hearing (DHH) students face significant challenges in classrooms dominated by spoken communication. They must constantly switch their focus between the instructor and interpretation services (sign language, captions, etc.), which often leads to high cognitive load or even to falling behind their peers. Augmented Reality (AR) head-worn displays present a promising solution by positioning captions closer to lecture content, but current tools primarily focus on basic transcription and lack integration with instructional material.

To address this gap, we present CapAR, an Augmented Reality (AR) system designed to enhance learning for Deaf and Hard of Hearing (DHH) students by providing semantic visualization aids in lecture classrooms. Our results indicate that keyword highlights helped users keep pace with the instructor, and that the additional explanations provided useful and accurate information during lectures.

(Using: Swift, Python, OpenAI, OCR, NLP, HTTP, Apple Vision Pro, RealityKit, UIKit, ARKit, AVFoundation)

#Augmented Reality, LLMs, Learning Tool, Accessibility

Full Paper:
Link


Visual Walkthrough

We present CapAR, an Augmented Reality (AR) system designed to enhance learning for Deaf and Hard of Hearing (DHH) students by providing semantic visualization aids in lecture classrooms.


System Design of CapAR

The CapAR system captures the instructor’s speech, converts it to real-time captions, and displays them on the user’s AR headset. The Processing Module matches keywords from pre-processed lecture slides against the captions; matched terms are highlighted on both the slides and the captions, and keyword explanations are provided alongside them.
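
The keyword-matching step can be illustrated with a minimal Python sketch. This is not the CapAR implementation: the project description does not say how matching is performed, and the real system relies on LLMs. The sketch assumes plain case-insensitive string matching, and the names `KeywordMatch`, `find_keyword_matches`, and `highlight` are hypothetical, introduced only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class KeywordMatch:
    keyword: str  # keyword as listed for the slide (hypothetical structure)
    start: int    # start offset of the match in the caption
    end: int      # end offset (exclusive) of the match in the caption


def find_keyword_matches(caption: str, slide_keywords: list[str]) -> list[KeywordMatch]:
    """Return non-overlapping, case-insensitive matches in left-to-right order.

    Assumption: simple string matching stands in for whatever matching
    the actual Processing Module performs.
    """
    # Try longer keywords first so "neural network" wins over "network".
    ordered = sorted((k for k in slide_keywords if k.strip()), key=len, reverse=True)
    if not ordered:
        return []
    # Lookarounds keep matches from starting or ending inside a longer word.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    canonical = {k.lower(): k for k in ordered}
    return [
        KeywordMatch(canonical.get(m.group(0).lower(), m.group(0)), m.start(), m.end())
        for m in pattern.finditer(caption)
    ]


def highlight(caption: str, matches: list[KeywordMatch],
              open_mark: str = "[", close_mark: str = "]") -> str:
    """Wrap each matched span in markers; a renderer would style these spans."""
    parts, cursor = [], 0
    for m in matches:
        parts.append(caption[cursor:m.start])
        parts.append(open_mark + caption[m.start:m.end] + close_mark)
        cursor = m.end
    parts.append(caption[cursor:])
    return "".join(parts)


caption = "A neural network learns weights by Gradient Descent."
keywords = ["network", "neural network", "gradient descent"]
print(highlight(caption, find_keyword_matches(caption, keywords)))
# prints: A [neural network] learns weights by [Gradient Descent].
```

Returning character offsets rather than a pre-formatted string keeps the matching step separate from rendering, so the same matches could drive highlights on both the caption and the slide.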