Vision Tune: A Deep Learning Framework for Sentiment Driven Video, Image and Music Creation

Lukesh Rameshpant Kadu; Manoj Deshpande; Vijaykumar Pawar

doi:10.3844/jcssp.2026.1476.1483

Research Article Open Access

Lukesh Rameshpant Kadu¹, Manoj Deshpande¹ and Vijaykumar Pawar¹

¹ Department of Computer Engineering, AC Patil College of Engineering, Kharghar, Navi Mumbai, India

Abstract

Artificial intelligence has enabled powerful generative models for text, images, video, and music, yet most tools still operate independently without a unified, multi-modal workflow. This article proposes an integrated AI framework, Vision Tune, that consolidates these isolated capabilities into a single, sentiment-aware platform for end-to-end media creation. The system leverages deep learning and multi-scope AI models to automatically generate written content, images, videos, and music for both creative and analytical applications, while emphasizing scalability, modular design, and user-centric interaction. By supporting cross-domain media synthesis and sentiment-driven customization, the framework targets real-world use cases in marketing, education, entertainment, and content production, where coordinated multi-modal outputs can enhance engagement and productivity. Beyond unification, the work highlights how the proposed architecture advances current AI media pipelines by reducing tool fragmentation, enabling cross-modal consistency, and providing a foundation for future extensions such as real-time generation, personalization, and human-AI collaborative creation.

Journal of Computer Science

Volume 22 No. 4, 2026, 1476-1483

DOI: https://doi.org/10.3844/jcssp.2026.1476.1483

Submitted On: 7 September 2025 Published On: 29 April 2026

How to Cite: Kadu, L. R., Deshpande, M. & Pawar, V. (2026). Vision Tune: A Deep Learning Framework for Sentiment Driven Video, Image and Music Creation. Journal of Computer Science, 22(4), 1476-1483. https://doi.org/10.3844/jcssp.2026.1476.1483

Copyright: © 2026 Lukesh Rameshpant Kadu, Manoj Deshpande and Vijaykumar Pawar. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Artificial Intelligence
Text Generation
Image Synthesis
Video Production
Music Composition
Multi-Modal Media Generation