AMBERT-DWPM: An Adaptive Masking and Dynamic Prototype Learning Framework for Few-Shot Text Classification

Journal Title: International Journal of Knowledge and Innovation Studies - Year 2025, Vol 3, Issue 1

Abstract

Transformer-based language models have demonstrated remarkable success in few-shot text classification; however, their effectiveness is often constrained by challenges such as high intraclass diversity and interclass similarity, which hinder the extraction of discriminative features. To address these limitations, a novel framework, Adaptive Masking Bidirectional Encoder Representations from Transformers with Dynamic Weighted Prototype Module (AMBERT-DWPM), is introduced, incorporating adaptive masking and dynamic weighted prototypical learning to enhance feature representation and classification performance. The standard BERT architecture is refined by integrating an adaptive masking mechanism based on Layered Integrated Gradients (LIG), enabling the model to dynamically emphasize salient text segments and improve feature discrimination. Additionally, a DWPM is designed to assign adaptive weights to support samples, mitigating inaccuracies in prototype construction caused by intraclass variability. Extensive evaluations conducted on six publicly available benchmark datasets demonstrate the superiority of AMBERT-DWPM over existing few-shot classification approaches. Notably, under the 5-shot setting on the DBpedia14 dataset, an accuracy of 0.978±0.004 is achieved, highlighting significant advancements in feature discrimination and generalization capabilities. These findings suggest that AMBERT-DWPM provides an efficient and robust solution for few-shot text classification, particularly in scenarios characterized by limited and complex textual data.
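The abstract does not give the DWPM weighting rule, so the following is only a minimal sketch of the general idea of a weighted prototype, not the authors' method. It assumes that each support embedding is weighted by a softmax over its cosine similarity to the unweighted class mean, so that atypical support samples contribute less to the prototype. The function names, the `temperature` parameter, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector has zero norm.
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu > 0 and nv > 0 else 0.0


def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def weighted_prototype(support, temperature=1.0):
    """Return (prototype, weights) for one class's support embeddings.

    Assumption: weights are a softmax over cosine similarity to the
    plain class mean. The paper's actual weighting may differ.
    """
    center = mean_vector(support)
    scores = [cosine(x, center) / temperature for x in support]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(center)
    proto = [sum(w * x[d] for w, x in zip(weights, support)) for d in range(dim)]
    return proto, weights


def classify(query, prototypes):
    # Assign the query to the class whose prototype is most similar.
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


if __name__ == "__main__":
    # Toy embeddings: the third sample of class "a" is an outlier.
    support_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    support_b = [[0.0, 1.0], [0.1, 0.9]]
    proto_a, weights_a = weighted_prototype(support_a, temperature=0.1)
    proto_b, _ = weighted_prototype(support_b, temperature=0.1)
    print("weights for class a:", weights_a)
    print(classify([1.0, 0.05], {"a": proto_a, "b": proto_b}))
```

In this toy run the outlier in class "a" receives the smallest weight, so the prototype stays close to the two typical samples. In a real pipeline the embeddings would come from the encoder rather than from hand-written vectors.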

Authors and Affiliations

Junyu Li, Jialin Ma, Ashim Khadka

Keywords

Few-shot text classification; Dynamic Weighted Prototype Module (DWPM); Adaptive Masking Bidirectional Encoder Representations from Transformers (AMBERT); Contrastive learning; Feature discrimination

EP ID EP762893
DOI https://doi.org/10.56578/ijkis030104