Enhancing Image Captioning and Auto-Tagging Through a FCLN with Faster R-CNN Integration
Journal Title: Information Dynamics and Applications - Year 2024, Vol 3, Issue 1
Abstract
Automated image captioning, the task of generating descriptive text for images, depends on combining Natural Language Processing (NLP) with computer vision techniques. This study introduces the Fully Convolutional Localization Network (FCLN), an approach that addresses localization and description jointly within a single forward pass. Because the network preserves spatial information, fine detail is not lost, and because both tasks are optimized together, training is streamlined. The FCLN is built on a Convolutional Neural Network (CNN) that extracts salient image features. Central to the architecture is a Localization Layer, which is responsible for precise object detection and supplies the regions used for caption generation. The architecture combines a region detection network, modeled on the Faster Region-based Convolutional Neural Network (Faster R-CNN), with a captioning network, which together produce contextually meaningful image captions. The Faster R-CNN framework provides region-based object detection, capturing the context of each object and the relationships between objects, while a Long Short-Term Memory (LSTM) network generates the captions. This integration improves caption accuracy, particularly in complex scenes. Evaluations conducted on the Microsoft Common Objects in Context (MS COCO) test server indicate that the model outperforms existing benchmarks, underscoring its efficacy in generating precise and context-rich image captions.
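Region detection networks in the Faster R-CNN family typically score many overlapping candidate boxes and prune them with non-maximum suppression (NMS) before passing the survivors on. The abstract does not state how the FCLN's Localization Layer selects its regions, so the following is only a minimal sketch of that generic pruning step, not the authors' implementation. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy non-maximum suppression.

    Visits boxes from highest to lowest score and keeps a box only if it
    does not overlap any already-kept box by more than iou_threshold.
    Returns the indices of the kept boxes in descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Hypothetical candidate regions: two near-duplicates and one separate box.
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(nms(candidates, confidences))  # prints [0, 2]
```

In a pipeline of the kind the abstract describes, each surviving region would then be handed to the captioning network; that step is omitted here because the abstract gives no detail on it.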
Authors and Affiliations
Shalaka Prasad Deore, Taibah Sohail Bagwan, Prachiti Sunil Bhukan, Harsheen Tejindersingh Rajpal, Shantanu Bharat Gade