Capture speaker names and add them to transcriptions