Data on Learning to Speak

80,000 hours of video and 120,000 hours of audio recording. That is the massive amount of data collected by MIT Media Lab’s Human Speechome Project to analyze how one child learns to speak. You can see a visual of the data on Forbes.com to get a feel for how words are gained over time. Just clicking through the words and seeing the relationship between caregiver use and child use impresses me with the power of being able to collect such data.

According to the Forbes article “Speech in the Home”: “The Speechome team believes that this unique dataset may shed light on basic questions of language acquisition, at least as they pertain to one child. Why did he learn words in the order that he did? Why did he start putting certain words together into proto-sentences before others? In what contexts did he effectively use words? How long after he comprehends a word does he first use that word? How did the structure of everyday life at home influence language development? The research team at MIT is in the process of analyzing the audio and video recordings with the ultimate goal of addressing these questions.”