Abstract: Current evaluations of AIGC tools mostly measure general-purpose technical capability, and the results produced by model evaluation toolkits, benchmark datasets, evaluation platforms, and other methods are inconsistent. These evaluations lack analysis specific to media application contexts and to elements such as ideological framework, disciplinary norms, and artistic aesthetics, which makes them insufficient for guiding the development of media business. The performance of AIGC tools in the media field therefore needs to be analyzed in specific business scenarios. Based on a survey of media workers’ use of AIGC tools, we set up an interdisciplinary evaluation team and selected 21 large models for subjective and objective analysis, with the aim of providing professional guidance for media workers’ efficient use of AIGC tools. We find that Tiangong AI 3.0, Wenxin Big Model 4.0 Turbo, and Doubao AI score high in text comprehension, text generation, and domain knowledge; GPT-4o and Dreaming AI perform well in machine vision and image generation; KIMI, GPT-4o, Stable Diffusion 2.0, Pika 2.0, and Jimeng AI perform outstandingly in the literary script, shooting outline, storyboard script, video generation, and automatic editing tasks, respectively; Tiangong AI 3.0 performs well in speech recognition and music recognition, Murf AI excels in speech synthesis, and Suno AI V4 excels in humanities and music. The large models also exhibit domain knowledge errors and have difficulty with generation in specific tasks. It is necessary to strengthen their training on core professional data, drive wider integration of the models with media application scenarios, and continuously improve their performance in media production.