为什么将音频流添加到 ffmpeg 的 libavcodec 输出容器会导致崩溃？

Why does adding audio stream to ffmpeg's libavcodec output container cause a crash?

本文关键字：崩溃输出 libavcodec 音频流添加 ffmpeg 为什么更新时间：2024-03-29

目前，我的项目正确地使用libavcodec来解码视频，其中每个帧都被处理(无论如何)并输出到新的视频。我从网上找到的例子中拼凑出了这个，它很有效。结果是一个完美的.mp4的操纵帧，减去音频。

我的问题是，当我试图将音频流添加到输出容器时，我在mux.c中遇到了无法解释的崩溃。它在static int compute_muxer_pkt_fields(AVFormatContext *s, AVStream *st, AVPacket *pkt)中。在尝试st->internal->priv_pts->val = pkt->dts;的情况下，priv_pts为nullptr。

我不记得版本号了，但这是2020年11月4日从git构建的ffmpeg。

我的MediaContentMgr比我这里的要大得多。我去掉了所有与帧操作有关的内容，所以如果我遗漏了什么，请告诉我，我会编辑。

添加时触发nullptr异常的代码在内联中调用

.h:

#ifndef _API_EXAMPLE_H
#define _API_EXAMPLE_H
#include <glad/glad.h>
#include <GLFW/glfw3.h>
#include "glm/glm.hpp"
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>
}
#include "shader_s.h"
class MediaContainerMgr {
public:
MediaContainerMgr(const std::string& infile, const std::string& vert, const std::string& frag, 
const glm::vec3* extents);
~MediaContainerMgr();
void render();
bool recording() { return m_recording; }
// Major thanks to "shi-yan" who helped make this possible:
// https://github.com/shi-yan/videosamples/blob/master/libavmp4encoding/main.cpp
bool init_video_output(const std::string& video_file_name, unsigned int width, unsigned int height);
bool output_video_frame(uint8_t* buf);
bool finalize_output();
private:
AVFormatContext*   m_format_context;
AVCodec*           m_video_codec;
AVCodec*           m_audio_codec;
AVCodecParameters* m_video_codec_parameters;
AVCodecParameters* m_audio_codec_parameters;
AVCodecContext*    m_codec_context;
AVFrame*           m_frame;
AVPacket*          m_packet;
uint32_t           m_video_stream_index;
uint32_t           m_audio_stream_index;

void init_rendering(const glm::vec3* extents);
int decode_packet();
// For writing the output video:
void free_output_assets();
bool                   m_recording;
AVOutputFormat*        m_output_format;
AVFormatContext*       m_output_format_context;
AVCodec*               m_output_video_codec;
AVCodecContext*        m_output_video_codec_context;
AVFrame*               m_output_video_frame;
SwsContext*            m_output_scale_context;
AVStream*              m_output_video_stream;

AVCodec*               m_output_audio_codec;
AVStream*              m_output_audio_stream;
AVCodecContext*        m_output_audio_codec_context;
};
#endif

还有，地狱般的.cpp：

#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>
#include "media_container_manager.h"
MediaContainerMgr::MediaContainerMgr(const std::string& infile, const std::string& vert, const std::string& frag,
const glm::vec3* extents) :
m_video_stream_index(-1),
m_audio_stream_index(-1),
m_recording(false),
m_output_format(nullptr),
m_output_format_context(nullptr),
m_output_video_codec(nullptr),
m_output_video_codec_context(nullptr),
m_output_video_frame(nullptr),
m_output_scale_context(nullptr),
m_output_video_stream(nullptr)
{
// AVFormatContext holds header info from the format specified in the container:
m_format_context = avformat_alloc_context();
if (!m_format_context) {
throw "ERROR could not allocate memory for Format Context";
}

// open the file and read its header. Codecs are not opened here.
if (avformat_open_input(&m_format_context, infile.c_str(), NULL, NULL) != 0) {
throw "ERROR could not open input file for reading";
}
printf("format %s, duration %lldus, bit_rate %lldn", m_format_context->iformat->name, m_format_context->duration, m_format_context->bit_rate);
//read avPackets (?) from the avFormat (?) to get stream info. This populates format_context->streams.
if (avformat_find_stream_info(m_format_context, NULL) < 0) {
throw "ERROR could not get stream info";
}
for (unsigned int i = 0; i < m_format_context->nb_streams; i++) {
AVCodecParameters* local_codec_parameters = NULL;
local_codec_parameters = m_format_context->streams[i]->codecpar;
printf("AVStream->time base before open coded %d/%dn", m_format_context->streams[i]->time_base.num, m_format_context->streams[i]->time_base.den);
printf("AVStream->r_frame_rate before open coded %d/%dn", m_format_context->streams[i]->r_frame_rate.num, m_format_context->streams[i]->r_frame_rate.den);
printf("AVStream->start_time %" PRId64 "n", m_format_context->streams[i]->start_time);
printf("AVStream->duration %" PRId64 "n", m_format_context->streams[i]->duration);
printf("duration(s): %lfn", (float)m_format_context->streams[i]->duration / m_format_context->streams[i]->time_base.den * m_format_context->streams[i]->time_base.num);
AVCodec* local_codec = NULL;
local_codec = avcodec_find_decoder(local_codec_parameters->codec_id);
if (local_codec == NULL) {
throw "ERROR unsupported codec!";
}
if (local_codec_parameters->codec_type == AVMEDIA_TYPE_VIDEO) {
if (m_video_stream_index == -1) {
m_video_stream_index = i;
m_video_codec = local_codec;
m_video_codec_parameters = local_codec_parameters;
}
m_height = local_codec_parameters->height;
m_width = local_codec_parameters->width;
printf("Video Codec: resolution %dx%dn", m_width, m_height);
}
else if (local_codec_parameters->codec_type == AVMEDIA_TYPE_AUDIO) {
if (m_audio_stream_index == -1) {
m_audio_stream_index = i;
m_audio_codec = local_codec;
m_audio_codec_parameters = local_codec_parameters;
}
printf("Audio Codec: %d channels, sample rate %dn", local_codec_parameters->channels, local_codec_parameters->sample_rate);
}
printf("tCodec %s ID %d bit_rate %lldn", local_codec->name, local_codec->id, local_codec_parameters->bit_rate);
}
m_codec_context = avcodec_alloc_context3(m_video_codec);
if (!m_codec_context) {
throw "ERROR failed to allocate memory for AVCodecContext";
}
if (avcodec_parameters_to_context(m_codec_context, m_video_codec_parameters) < 0) {
throw "ERROR failed to copy codec params to codec context";
}
if (avcodec_open2(m_codec_context, m_video_codec, NULL) < 0) {
throw "ERROR avcodec_open2 failed to open codec";
}
m_frame = av_frame_alloc();
if (!m_frame) {
throw "ERROR failed to allocate AVFrame memory";
}
m_packet = av_packet_alloc();
if (!m_packet) {
throw "ERROR failed to allocate AVPacket memory";
}
}
MediaContainerMgr::~MediaContainerMgr() {
avformat_close_input(&m_format_context);
av_packet_free(&m_packet);
av_frame_free(&m_frame);
avcodec_free_context(&m_codec_context);

glDeleteVertexArrays(1, &m_VAO);
glDeleteBuffers(1, &m_VBO);
}

bool MediaContainerMgr::advance_frame() {
while (true) {
if (av_read_frame(m_format_context, m_packet) < 0) {
// Do we actually need to unref the packet if it failed?
av_packet_unref(m_packet);
continue;
//return false;
}
else {
if (m_packet->stream_index == m_video_stream_index) {
//printf("AVPacket->pts %" PRId64 "n", m_packet->pts);
int response = decode_packet();
av_packet_unref(m_packet);
if (response != 0) {
continue;
//return false;
}
return true;
}
else {
printf("m_packet->stream_index: %dn", m_packet->stream_index);
printf("  m_packet->pts: %lldn", m_packet->pts);
printf("  mpacket->size: %dn", m_packet->size);
if (m_recording) {
int err = 0;
//err = avcodec_send_packet(m_output_video_codec_context, m_packet);
printf("  encoding error: %dn", err);
}
}
}
// We're done with the packet (it's been unpacked to a frame), so deallocate & reset to defaults:
/*
if (m_frame == NULL)
return false;
if (m_frame->data[0] == NULL || m_frame->data[1] == NULL || m_frame->data[2] == NULL) {
printf("WARNING: null frame data");
continue;
}
*/
}
}
int MediaContainerMgr::decode_packet() {
// Supply raw packet data as input to a decoder
// https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga58bc4bf1e0ac59e27362597e467efff3
int response = avcodec_send_packet(m_codec_context, m_packet);
if (response < 0) {
char buf[256];
av_strerror(response, buf, 256);
printf("Error while receiving a frame from the decoder: %sn", buf);
return response;
}
// Return decoded output data (into a frame) from a decoder
// https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga11e6542c4e66d3028668788a1a74217c
response = avcodec_receive_frame(m_codec_context, m_frame);
if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
return response;
} else if (response < 0) {
char buf[256];
av_strerror(response, buf, 256);
printf("Error while receiving a frame from the decoder: %sn", buf);
return response;
} else {
printf(
"Frame %d (type=%c, size=%d bytes) pts %lld key_frame %d [DTS %d]n",
m_codec_context->frame_number,
av_get_picture_type_char(m_frame->pict_type),
m_frame->pkt_size,
m_frame->pts,
m_frame->key_frame,
m_frame->coded_picture_number
);
}
return 0;
}

bool MediaContainerMgr::init_video_output(const std::string& video_file_name, unsigned int width, unsigned int height) {
if (m_recording)
return true;
m_recording = true;
advance_to(0L); // I've deleted the implmentation. Just seeks to beginning of vid. Works fine.
if (!(m_output_format = av_guess_format(nullptr, video_file_name.c_str(), nullptr))) {
printf("Cannot guess output format.n");
return false;
}
int err = avformat_alloc_output_context2(&m_output_format_context, m_output_format, nullptr, video_file_name.c_str());
if (err < 0) {
printf("Failed to allocate output context.n");
return false;
}
//TODO(P0): Break out the video and audio inits into their own methods.
m_output_video_codec = avcodec_find_encoder(m_output_format->video_codec);
if (!m_output_video_codec) {
printf("Failed to create video codec.n");
return false;
}
m_output_video_stream = avformat_new_stream(m_output_format_context, m_output_video_codec);
if (!m_output_video_stream) {
printf("Failed to find video format.n");
return false;
} 
m_output_video_codec_context = avcodec_alloc_context3(m_output_video_codec);
if (!m_output_video_codec_context) {
printf("Failed to create video codec context.n");
return(false);
}
m_output_video_stream->codecpar->codec_id = m_output_format->video_codec;
m_output_video_stream->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
m_output_video_stream->codecpar->width = width;
m_output_video_stream->codecpar->height = height;
m_output_video_stream->codecpar->format = AV_PIX_FMT_YUV420P;
// Use the same bit rate as the input stream.
m_output_video_stream->codecpar->bit_rate = m_format_context->streams[m_video_stream_index]->codecpar->bit_rate;
m_output_video_stream->avg_frame_rate = m_format_context->streams[m_video_stream_index]->avg_frame_rate;
avcodec_parameters_to_context(m_output_video_codec_context, m_output_video_stream->codecpar);
m_output_video_codec_context->time_base = m_format_context->streams[m_video_stream_index]->time_base;

//TODO(P1): Set these to match the input stream?
m_output_video_codec_context->max_b_frames = 2;
m_output_video_codec_context->gop_size = 12;
m_output_video_codec_context->framerate = m_format_context->streams[m_video_stream_index]->r_frame_rate;
//m_output_codec_context->refcounted_frames = 0;
if (m_output_video_stream->codecpar->codec_id == AV_CODEC_ID_H264) {
av_opt_set(m_output_video_codec_context, "preset", "ultrafast", 0);
} else if (m_output_video_stream->codecpar->codec_id == AV_CODEC_ID_H265) {
av_opt_set(m_output_video_codec_context, "preset", "ultrafast", 0);
} else {
av_opt_set_int(m_output_video_codec_context, "lossless", 1, 0);
}
avcodec_parameters_from_context(m_output_video_stream->codecpar, m_output_video_codec_context);
m_output_audio_codec = avcodec_find_encoder(m_output_format->audio_codec);
if (!m_output_audio_codec) {
printf("Failed to create audio codec.n");
return false;
}

我已经注释掉了下一行之后的所有音频流init，因为这是麻烦开始了。创建这个输出流会导致我提到的空引用。如果我取消注释下面的所有内容，我仍然得到null的deref。如果我评论掉这句话deref异常消失。(IOW，我注释了越来越多的代码，直到我发现是导致问题的导火索。)

我认为我在注释掉的其余代码中做错了什么，修复后，将修复nullptr并给我一个工作音频流。

m_output_audio_stream = avformat_new_stream(m_output_format_context, m_output_audio_codec);
if (!m_output_audio_stream) {
printf("Failed to find audio format.n");
return false;
}
/*
m_output_audio_codec_context = avcodec_alloc_context3(m_output_audio_codec);
if (!m_output_audio_codec_context) {
printf("Failed to create audio codec context.n");
return(false);
}
m_output_audio_stream->codecpar->codec_id = m_output_format->audio_codec;
m_output_audio_stream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
m_output_audio_stream->codecpar->format = m_format_context->streams[m_audio_stream_index]->codecpar->format;
m_output_audio_stream->codecpar->bit_rate = m_format_context->streams[m_audio_stream_index]->codecpar->bit_rate;
m_output_audio_stream->avg_frame_rate = m_format_context->streams[m_audio_stream_index]->avg_frame_rate;
avcodec_parameters_to_context(m_output_audio_codec_context, m_output_audio_stream->codecpar);
m_output_audio_codec_context->time_base = m_format_context->streams[m_audio_stream_index]->time_base;
*/
//TODO(P2): Free assets that have been allocated.
err = avcodec_open2(m_output_video_codec_context, m_output_video_codec, nullptr);
if (err < 0) {
printf("Failed to open codec.n");
return false;
}
if (!(m_output_format->flags & AVFMT_NOFILE)) {
err = avio_open(&m_output_format_context->pb, video_file_name.c_str(), AVIO_FLAG_WRITE);
if (err < 0) {
printf("Failed to open output file.");
return false;
}
}
err = avformat_write_header(m_output_format_context, NULL);
if (err < 0) {
printf("Failed to write header.n");
return false;
}
av_dump_format(m_output_format_context, 0, video_file_name.c_str(), 1);
return true;
}

//TODO(P2): make this a member. (Thanks to https://emvlo.wordpress.com/2016/03/10/sws_scale/)
void PrepareFlipFrameJ420(AVFrame* pFrame) {
for (int i = 0; i < 4; i++) {
if (i)
pFrame->data[i] += pFrame->linesize[i] * ((pFrame->height >> 1) - 1);
else
pFrame->data[i] += pFrame->linesize[i] * (pFrame->height - 1);
pFrame->linesize[i] = -pFrame->linesize[i];
}
}

这就是我们获取一个修改后的帧并将其写入输出容器的地方。这很好只要我们没有在输出容器中设置音频流。

bool MediaContainerMgr::output_video_frame(uint8_t* buf) {
int err;
if (!m_output_video_frame) {
m_output_video_frame = av_frame_alloc();
m_output_video_frame->format = AV_PIX_FMT_YUV420P;
m_output_video_frame->width = m_output_video_codec_context->width;
m_output_video_frame->height = m_output_video_codec_context->height;
err = av_frame_get_buffer(m_output_video_frame, 32);
if (err < 0) {
printf("Failed to allocate output frame.n");
return false;
}
}
if (!m_output_scale_context) {
m_output_scale_context = sws_getContext(m_output_video_codec_context->width, m_output_video_codec_context->height, 
AV_PIX_FMT_RGB24,
m_output_video_codec_context->width, m_output_video_codec_context->height, 
AV_PIX_FMT_YUV420P, SWS_BICUBIC, nullptr, nullptr, nullptr);
}
int inLinesize[1] = { 3 * m_output_video_codec_context->width };
sws_scale(m_output_scale_context, (const uint8_t* const*)&buf, inLinesize, 0, m_output_video_codec_context->height,
m_output_video_frame->data, m_output_video_frame->linesize);
PrepareFlipFrameJ420(m_output_video_frame);
//TODO(P0): Switch m_frame to be m_input_video_frame so I don't end up using the presentation timestamp from
//          an audio frame if I threadify the frame reading.
m_output_video_frame->pts = m_frame->pts;
printf("Output PTS: %d, time_base: %d/%dn", m_output_video_frame->pts,
m_output_video_codec_context->time_base.num, m_output_video_codec_context->time_base.den);
err = avcodec_send_frame(m_output_video_codec_context, m_output_video_frame);
if (err < 0) {
printf("  ERROR sending new video frame output: ");
switch (err) {
case AVERROR(EAGAIN):
printf("AVERROR(EAGAIN): %dn", err);
break;
case AVERROR_EOF:
printf("AVERROR_EOF: %dn", err);
break;
case AVERROR(EINVAL):
printf("AVERROR(EINVAL): %dn", err);
break;
case AVERROR(ENOMEM):
printf("AVERROR(ENOMEM): %dn", err);
break;
}
return false;
}
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = nullptr;
pkt.size = 0;
pkt.flags |= AV_PKT_FLAG_KEY;
int ret = 0;
if ((ret = avcodec_receive_packet(m_output_video_codec_context, &pkt)) == 0) {
static int counter = 0;
printf("pkt.key: 0x%08x, pkt.size: %d, counter:n", pkt.flags & AV_PKT_FLAG_KEY, pkt.size, counter++);
uint8_t* size = ((uint8_t*)pkt.data);
printf("sizes: %d %d %d %d %d %d %d %d %dn", size[0], size[1], size[2], size[2], size[3], size[4], size[5], size[6], size[7]);
av_interleaved_write_frame(m_output_format_context, &pkt);
}
printf("push: %dn", ret);
av_packet_unref(&pkt);
return true;
}
bool MediaContainerMgr::finalize_output() {
if (!m_recording)
return true;
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = nullptr;
pkt.size = 0;
for (;;) {
avcodec_send_frame(m_output_video_codec_context, nullptr);
if (avcodec_receive_packet(m_output_video_codec_context, &pkt) == 0) {
av_interleaved_write_frame(m_output_format_context, &pkt);
printf("final push:n");
} else {
break;
}
}
av_packet_unref(&pkt);
av_write_trailer(m_output_format_context);
if (!(m_output_format->flags & AVFMT_NOFILE)) {
int err = avio_close(m_output_format_context->pb);
if (err < 0) {
printf("Failed to close file. err: %dn", err);
return false;
}
}
return true;
}

编辑崩溃时的调用堆栈(我本应将其包含在原始问题中)：

avformat-58.dll!compute_muxer_pkt_fields(AVFormatContext * s, AVStream * st, AVPacket * pkt) Line 630   C
avformat-58.dll!write_packet_common(AVFormatContext * s, AVStream * st, AVPacket * pkt, int interleaved) Line 1122  C
avformat-58.dll!write_packets_common(AVFormatContext * s, AVPacket * pkt, int interleaved) Line 1186    C
avformat-58.dll!av_interleaved_write_frame(AVFormatContext * s, AVPacket * pkt) Line 1241   C
CamBot.exe!MediaContainerMgr::output_video_frame(unsigned char * buf) Line 553  C++
CamBot.exe!main() Line 240  C++

如果我将调用移动到avformat_write_header，使其紧接在音频流初始化之前，我仍然会崩溃，但发生在不同的地方。崩溃发生在movenc.c的6459线上，我们有：

/* Non-seekable output is ok if using fragmentation. If ism_lookahead
* is enabled, we don't support non-seekable output at all. */
if (!(s->pb->seekable & AVIO_SEEKABLE_NORMAL) &&  //  CRASH IS HERE
(!(mov->flags & FF_MOV_FLAG_FRAGMENT) || mov->ism_lookahead)) {
av_log(s, AV_LOG_ERROR, "muxer does not support non seekable outputn");
return AVERROR(EINVAL);
}

该异常是nullptr异常，其中s->pb为空。调用堆栈为：

avformat-58.dll!mov_init(AVFormatContext * s) Line 6459 C
avformat-58.dll!init_muxer(AVFormatContext * s, AVDictionary * * options) Line 407  C
[Inline Frame] avformat-58.dll!avformat_init_output(AVFormatContext *) Line 489 C
avformat-58.dll!avformat_write_header(AVFormatContext * s, AVDictionary * * options) Line 512   C
CamBot.exe!MediaContainerMgr::init_video_output(const std::string & video_file_name, unsigned int width, unsigned int height) Line 424  C++
CamBot.exe!main() Line 183  C++

请注意，您应该始终尝试提供一个自包含的最小工作示例，以便其他人更容易提供帮助。对于实际的代码、匹配的FFmpeg版本和触发分段故障的输入视频(可以肯定的是)，问题将是分析控制流以确定为什么没有分配st->internal->priv_pts。如果没有完整的场景，我必须报告做出可能与您的实际代码对应或不对应的假设。

根据你的描述，我试图通过克隆来复制这个问题https://github.com/FFmpeg/FFmpeg.git并从commit b52e0d95(2020年11月4日)创建一个新分支，以近似您的FFmpeg版本。

我使用提供的代码片段重新创建了您的场景

包括音频流的avformat_new_stream()调用
保留剩余的音频初始化注释
包括原始avformat_write_header()呼叫站点(顺序不变)

在这种情况下，使用MP4视频/音频输入的视频写入在avformat_write_header():中失败

[mp4 @ 0x2b39f40] sample rate not set 0

错误位置的调用堆栈：

#0  0x00007ffff75253d7 in raise () from /lib64/libc.so.6
#1  0x00007ffff7526ac8 in abort () from /lib64/libc.so.6
#2  0x000000000094feca in init_muxer (s=0x2b39f40, options=0x0) at libavformat/mux.c:309
#3  0x00000000009508f4 in avformat_init_output (s=0x2b39f40, options=0x0) at libavformat/mux.c:490
#4  0x0000000000950a10 in avformat_write_header (s=0x2b39f40, options=0x0) at libavformat/mux.c:514
[...]

在init_muxer()中，无条件地检查流参数中的采样率：

case AVMEDIA_TYPE_AUDIO:
if (par->sample_rate <= 0) {
av_log(s, AV_LOG_ERROR, "sample rate not set %dn", par->sample_rate); abort();
ret = AVERROR(EINVAL);
goto fail;
}

这种情况至少从2014-06-18年开始生效(没有再追溯)，现在仍然存在。对于2020年11月起的版本，检查必须处于活动状态，并且必须相应地设置参数。

如果我取消注释剩余的音频初始化，情况将保持不变(如预期)。因此，满足条件后，我添加了缺失的参数如下：

m_output_audio_stream->codecpar->sample_rate =
m_format_context->streams[m_audio_stream_index]->codecpar->sample_rate;

检查成功，avformat_write_header()成功，实际视频写入成功。

正如您在问题中所指出的，分段故障是由st->internal->priv_pts在此位置为NULL引起的：

#0  0x00000000009516db in compute_muxer_pkt_fields (s=0x2b39f40, st=0x2b3a580, pkt=0x7fffffffe2d0) at libavformat/mux.c:632
#1  0x0000000000953128 in write_packet_common (s=0x2b39f40, st=0x2b3a580, pkt=0x7fffffffe2d0, interleaved=1) at libavformat/mux.c:1125
#2  0x0000000000953473 in write_packets_common (s=0x2b39f40, pkt=0x7fffffffe2d0, interleaved=1) at libavformat/mux.c:1188
#3  0x0000000000953634 in av_interleaved_write_frame (s=0x2b39f40, pkt=0x7fffffffe2d0) at libavformat/mux.c:1243
[...]

在FFmpeg代码库中，priv_pts的分配由init_pts()为上下文引用的所有流处理。init_pts()有两个呼叫站点：

libavformat/mux.c:496:

if (s->oformat->init && ret) {
if ((ret = init_pts(s)) < 0)
return ret;

return AVSTREAM_INIT_IN_INIT_OUTPUT;
}

libavformat/mux.c:530:

if (!s->internal->streams_initialized) {
if ((ret = init_pts(s)) < 0)
goto fail;
}

在这两种情况下，呼叫都由avformat_write_header()触发(第一次通过avformat_init_output()间接触发，第二次直接触发)。根据控制流分析，不存在未分配priv_pts的成功案例。

考虑到我们的FFmpeg版本在行为方面兼容的可能性很高，我不得不假设1)必须为音频流提供采样率，并且1)在没有错误的情况下，priv_pts总是由avformat_write_header()分配。因此，脑海中浮现出两个可能的根本原因：

您的流不是音频流(不太可能；类型基于编解码器，而编解码器又基于输出文件扩展名-假设为mp4)
您没有调用avformat_write_header()(不太可能)，或者没有处理C++成员函数调用方中的错误(已检查avformat_write_header()的返回值，但我没有与C++成员函数的调用方对应的代码；您的实际代码可能与提供的代码有很大差异，因此这是可能的，也是从可用数据中得出的唯一合理结论)

解决方案：如果avformat_write_header()失败，请确保处理不会继续。通过添加音频流，avformat_write_header()开始失败，除非您设置流采样率。如果忽略该错误，则av_interleaved_write_frame()通过访问未分配的st->internal->priv_pts来触发分段故障。

如前所述，场景是不完整的。如果您确实调用了avformat_write_header()并在出现错误时停止处理(意味着您没有调用av_interleaved_write_frame())，则需要更多信息。就目前情况来看，这不太可能。为了进行进一步的分析，需要可执行输出(stdout、stderr)来查看跟踪和FFmpeg日志消息。如果这不能揭示新的信息，那么需要一个独立的最小工作示例和视频输入来获得所有的全貌。