Voice is a single source of audio (i.e. mono), so it’s typically recorded with mono mics. There can be several of them (a lav on the body plus a boom over the frame is the usual setup), but each source is mono and will indeed be mixed right down the middle unless the mix is meant to tell the viewer where the speaker is. For example, imagine you’re watching the main character from behind while they’re in their room using the computer, and you hear their mom talk to them off camera. The voice comes from one side, and in the next shot you see that the mom was standing on that side.
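To make "mixed down the middle" versus "coming from a side" concrete, here is a rough sketch of how a mono source can be placed in a stereo field. It assumes a constant-power (sin/cos) pan law, which is one common choice and not necessarily what any given mixer uses. The name `pan_mono` and the -1.0 to +1.0 pan range are made up for illustration.

```python
import math

def pan_mono(samples, pan=0.0):
    """Place a mono signal in a stereo field (hypothetical helper).

    pan = -1.0 is hard left, 0.0 is center, +1.0 is hard right.
    Assumes a constant-power pan law.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    # Map pan from [-1, 1] to an angle in [0, pi/2].
    angle = (pan + 1.0) * math.pi / 4.0
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    # Each mono sample becomes a (left, right) pair.
    return [(s * left_gain, s * right_gain) for s in samples]

# Dialogue mixed down the middle: both channels get the same signal.
centered = pan_mono([1.0, -0.5, 0.25], pan=0.0)

# Off-camera voice pushed toward the right side.
off_camera = pan_mono([1.0, -0.5, 0.25], pan=0.8)
```

At `pan=0.0` both channels get the same gain, cos(pi/4), or roughly 0.707, so the voice sounds like it sits dead center between the speakers.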
Another method home sound systems use is to boost the EQ in the range where voice sits (somewhere in the midrange), or to apply compression to reduce the dynamic range. For example, Sonos offers both options in its home theater line, calling them “Speech Enhancement” and “Night Sound” respectively.
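For the compression part, here is a minimal sketch of what "reducing the dynamic range" means. It is a bare static compressor with no attack or release smoothing, and it is not how Sonos or any other product actually implements it. The function name `compress` and the default threshold and ratio are assumptions picked for illustration.

```python
import math

def compress(samples, threshold_db=-20.0, ratio=4.0):
    """Static downward compressor (hypothetical helper, no attack/release).

    Samples are floats in [-1.0, 1.0]. Anything louder than threshold_db
    is turned down so its level rises only 1/ratio dB per input dB.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level == 0.0:
            out.append(0.0)  # silence stays silence (avoids log10(0))
            continue
        level_db = 20.0 * math.log10(level)
        if level_db > threshold_db:
            # Above the threshold: shrink the overshoot by the ratio.
            target_db = threshold_db + (level_db - threshold_db) / ratio
            gain = 10.0 ** ((target_db - level_db) / 20.0)
        else:
            gain = 1.0  # below the threshold: leave the sample alone
        out.append(s * gain)
    return out

# A loud explosion-level sample and a quiet dialogue-level sample.
squashed = compress([1.0, 0.05, -1.0])
```

The loud samples get pulled down while the quiet one passes through untouched, so you can turn the overall volume up to hear dialogue without the loud parts blowing you out of the room.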