For streaming, the SPS and PPS have to be transferred as well, otherwise the receiver cannot decode. When streaming with RTP or similar, these are relayed by the application layer, which gets them either from the encoder or from the start of the stream the encoder puts out. I don't know if it is possible to insert SPS/PPS at arbitrary points, in front of certain P frames, with x264 out of the box. So you may need this first I frame for the stream headers.
By the way, you should use more than 0 reference frames; something like 3 or 4.
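To illustrate the "get them from the start of the stream" part, here is a rough sketch of how an application layer could pull the SPS and PPS out of the encoder's output. It assumes the encoder writes an Annex B byte stream (NAL units separated by `00 00 01` start codes); the function names `split_nal_units` and `extract_parameter_sets` are made up for this example and are not part of x264 or any library.

```python
import re

# nal_unit_type values defined by the H.264 specification
NAL_SPS = 7
NAL_PPS = 8

# Annex B start code. The 4-byte form 00 00 00 01 is the same pattern
# with one extra leading zero byte, so matching 00 00 01 covers both.
_START_CODE = re.compile(b"\x00\x00\x01")


def split_nal_units(stream: bytes) -> list[bytes]:
    """Split an Annex B byte stream into NAL units (hypothetical helper)."""
    starts = [m.end() for m in _START_CODE.finditer(stream)]
    units = []
    for i, begin in enumerate(starts):
        # A unit ends where the next start code begins, or at end of stream.
        end = starts[i + 1] - 3 if i + 1 < len(starts) else len(stream)
        # A NAL unit never ends in a zero byte, so trailing zeros here
        # belong to the following 4-byte start code.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
    return units


def extract_parameter_sets(stream: bytes) -> tuple[list[bytes], list[bytes]]:
    """Return (SPS list, PPS list) found in the stream (hypothetical helper)."""
    sps, pps = [], []
    for unit in split_nal_units(stream):
        # The low 5 bits of the first byte are the nal_unit_type.
        nal_type = unit[0] & 0x1F
        if nal_type == NAL_SPS:
            sps.append(unit)
        elif nal_type == NAL_PPS:
            pps.append(unit)
    return sps, pps
```

You would run `extract_parameter_sets` once over the first chunk of encoder output and keep the result around, so it can be handed to clients that join later.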