Hi Daniel,

I think this normally happens when there are a lot of people in the room or the room is noisy. The cameras may be slow to find the person who is speaking, and if that person speaks quietly, it is a little more difficult for the Speaker Track algorithm.

Best...