Ensuring the trustworthiness of large language models (LLMs) is crucial. Most
studies concentrate on fully pre-trained LLMs to better understand and improve
LLMs' trustworthiness. In this paper, to reveal the untapped potential of
pre-training, we pioneer the exploration of LLMs' trustworthiness during the
pre-training period, focusing on five key dimensions: reliability, privacy, toxicity,
fairness, and robustness. To begin with, we apply linear probing to LLMs. The
high probing accuracy suggests that \textit{LLMs in early pre-training can
already distinguish concepts in each trustworthiness dimension}. Therefore, to
further uncover the hidden possibilities of pre-training, we extract steering
vectors from an LLM's pre-training checkpoints to enhance the LLM's
trustworthiness. Finally, inspired by the finding of~\citet{choi2023understanding}
that mutual information estimation is bounded by linear probing accuracy, we also probe
LLMs with mutual information to investigate the dynamics of trustworthiness
during pre-training. We are the first to observe a similar two-phase
phenomenon: fitting and compression~\citep{shwartz2017opening}. This research
provides an initial exploration of trustworthiness modeling during LLM
pre-training, seeking to unveil new insights and spur further developments in
the field. We will make our code publicly accessible at
\url{https://github.com/ChnQ/TracingLLM}.
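As a concrete, hypothetical illustration of the probing setup described above, the sketch below fits a logistic-regression probe on checkpoint hidden states and derives a difference-of-means steering vector. The activations are simulated stand-ins, and names such as `hidden_states` and `concept_direction` are illustrative, not taken from the paper's code.

```python
# Minimal sketch of linear probing on pre-training-checkpoint activations:
# fit a logistic-regression probe and check whether a trustworthiness concept
# (e.g., toxic vs. non-toxic prompts) is linearly separable. Real hidden
# states would come from the checkpoint; here they are simulated so the
# snippet runs standalone.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, hidden_dim = 2000, 512

# Stand-in for layer activations of prompts labeled by concept (1 = toxic).
labels = rng.integers(0, 2, size=n_samples)
concept_direction = rng.normal(size=hidden_dim)
hidden_states = rng.normal(size=(n_samples, hidden_dim)) + \
    np.outer(labels - 0.5, concept_direction)  # inject a weak linear signal

X_tr, X_te, y_tr, y_te = train_test_split(
    hidden_states, labels, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probing accuracy: {probe.score(X_te, y_te):.3f}")

# One common way to build a steering vector from the same data is the
# difference of class-mean activations; how it is injected at inference
# is a design choice and may differ from the paper's procedure.
steering_vector = (hidden_states[labels == 1].mean(0)
                   - hidden_states[labels == 0].mean(0))
```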
GenderBias-\emph{VL}: Benchmarking Gender Bias in Vision Language Models
via Counterfactual Probing
Large Vision-Language Models (LVLMs) have been widely adopted in various
applications; however, they exhibit significant gender biases. Existing
benchmarks primarily evaluate gender bias at the demographic group level,
neglecting individual fairness, which emphasizes equal treatment of similar
individuals. This research gap limits the detection of discriminatory
behaviors, as individual fairness offers a more granular examination of biases
that group fairness may overlook. To fill this gap, this paper introduces
GenderBias-\emph{VL}, the first benchmark to evaluate occupation-related gender
bias in LVLMs using counterfactual visual questions under individual fairness criteria.
To construct this benchmark, we first utilize text-to-image diffusion models to
generate occupation images and their gender counterfactuals. Subsequently, we
generate corresponding textual occupation options by identifying stereotyped
occupation pairs with high semantic similarity but opposite gender proportions
in real-world statistics. This method enables the creation of large-scale
visual question counterfactuals to expose biases in LVLMs, applicable in both
multimodal and unimodal contexts by modifying gender attributes in
specific modalities. Overall, our GenderBias-\emph{VL} benchmark comprises
34,581 visual question counterfactual pairs, covering 177 occupations. Using
our benchmark, we extensively evaluate 15 commonly used open-source LVLMs (\eg,
LLaVA) and state-of-the-art commercial APIs, including GPT-4o and Gemini-Pro.
Our findings reveal widespread gender biases in existing LVLMs. Our benchmark
offers: (1) a comprehensive dataset for occupation-related gender bias
evaluation; (2) an up-to-date leaderboard on LVLM biases; and (3) a nuanced
understanding of the biases exhibited by these models.\footnote{The dataset
and code are available at the \href{https://genderbiasvl.github.io/}{website}.}
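A rough sketch of how counterfactual probing under individual fairness could be scored: the model should choose the same occupation option for an image and its gender counterfactual. The `CounterfactualPair` fields and the `lvlm` callable below are hypothetical placeholders, not the benchmark's actual schema or API.

```python
# Illustrative scoring sketch (not the benchmark's actual API): a pair is
# counted as biased if the LVLM's chosen occupation flips when only the
# gender attribute of the image changes.
from dataclasses import dataclass

@dataclass
class CounterfactualPair:
    question: str              # e.g. "What is this person's occupation?"
    options: tuple             # stereotyped occupation pair, e.g. ("nurse", "doctor")
    image: str                 # original occupation image
    image_counterfactual: str  # same scene with the gender attribute flipped

def answer(lvlm, image, question, options):
    """Placeholder for querying an LVLM; returns the chosen option string."""
    return lvlm(image=image, question=question, options=options)

def pair_is_biased(lvlm, pair: CounterfactualPair) -> bool:
    a = answer(lvlm, pair.image, pair.question, pair.options)
    b = answer(lvlm, pair.image_counterfactual, pair.question, pair.options)
    return a != b  # a flipped answer indicates gender-dependent behavior

def bias_rate(lvlm, pairs) -> float:
    flips = sum(pair_is_biased(lvlm, p) for p in pairs)
    return flips / max(len(pairs), 1)
```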
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision
Language Model
arXiv:2406.12030v1
The emergence of Vision Language Models (VLMs) has brought unprecedented
advances in understanding multimodal information. The combination of textual
and visual semantics in VLMs is highly complex and diverse, making the safety
alignment of these models challenging. Furthermore, due to the limited study on
the safety alignment of VLMs, there is a lack of large-scale, high-quality
datasets. To address these limitations, we propose a Safety Preference
Alignment dataset for Vision Language Models named SPA-VL. In terms of breadth,
SPA-VL covers 6 harmfulness domains, 13 categories, and 53 subcategories, and
contains 100,788 samples of the quadruple (question, image, chosen response,
rejected response). In terms of depth, the responses are collected from 12
open- (e.g., QwenVL) and closed-source (e.g., Gemini) VLMs to ensure diversity.
The experimental results indicate that models trained with alignment techniques
on the SPA-VL dataset exhibit substantial improvements in harmlessness and
helpfulness while maintaining core capabilities. SPA-VL, as a large-scale,
high-quality, and diverse dataset, represents a significant milestone in
ensuring that VLMs achieve both harmlessness and helpfulness. We have made our
code (https://github.com/EchoseChen/SPA-VL-RLHF) and the SPA-VL dataset
(https://huggingface.co/datasets/sqrti/SPA-VL) publicly available.
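For intuition on how such quadruples are typically consumed, the sketch below pairs a `SafetyPreferenceSample` record with a scalar DPO-style preference loss. The field names and the use of a DPO objective are assumptions for illustration; SPA-VL's released training code may use a different objective or schema.

```python
# Sketch of the (question, image, chosen, rejected) quadruple and a scalar
# DPO-style preference loss. Field names are illustrative only.
from dataclasses import dataclass
import math

@dataclass
class SafetyPreferenceSample:
    question: str   # potentially harmful or benign user prompt
    image: str      # path or URL of the associated image
    chosen: str     # safer / more helpful response
    rejected: str   # less safe or less helpful response

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected,
             beta: float = 0.1) -> float:
    """Direct Preference Optimization loss for one quadruple (scalar sketch)."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

print(dpo_loss(-12.0, -15.0, -13.0, -14.5))
```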
Federated Learning (FL) is a training paradigm for scenarios where users'
data cannot be shared across clients. Achieving fairness in FL is
imperative since training data in FL is inherently geographically distributed
among diverse user groups. Existing research on fairness predominantly assumes
access to the entire training data, making direct transfer to FL challenging.
Moreover, the limited existing research on fairness in FL does not effectively
address two key challenges: (CH1) current methods fail to deal with the
inconsistency between fair optimization results obtained with surrogate
functions and fair classification results. (CH2) Directly aggregating local
fair models does not always yield a globally fair model due to non-Identically
and Independently Distributed (non-IID) data among clients. To address these
challenges, we propose a Wasserstein Fair Federated Learning framework, namely
WassFFed. To tackle CH1, we ensure that the outputs of local models, rather
than the loss calculated with surrogate functions or classification results
with a threshold, remain independent of various user groups. To resolve CH2, we
employ a Wasserstein barycenter calculation of all local models' outputs for
each user group, bringing local model outputs closer to the global output
distribution to ensure consistency between the global model and local models.
We conduct extensive experiments on three real-world datasets, demonstrating
that WassFFed outperforms existing approaches in striking a balance between
accuracy and fairness.
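A simplified, one-dimensional illustration of the barycenter step (not the paper's full algorithm): for scalar model outputs, the Wasserstein-2 barycenter of equally weighted, equal-sized empirical distributions is the element-wise mean of the sorted score vectors. The client scores below are synthetic.

```python
# 1D Wasserstein-barycenter sketch: each client's local output distribution
# for a given user group is compared to a shared barycenter target.
import numpy as np

rng = np.random.default_rng(0)
n_clients, n_samples = 4, 1000

# Stand-in for local model output scores for one protected group per client.
client_scores = [rng.beta(2 + k, 5 - k, size=n_samples) for k in range(n_clients)]

# Barycenter quantiles = average of per-client quantiles (sorted samples).
barycenter = np.mean([np.sort(s) for s in client_scores], axis=0)

# Distance of each client's output distribution to the barycenter (W2 in 1D).
for k, s in enumerate(client_scores):
    w2 = np.sqrt(np.mean((np.sort(s) - barycenter) ** 2))
    print(f"client {k}: W2 distance to barycenter = {w2:.4f}")
```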
A Comprehensive Survey and Guide to Multimodal Large Language Models in
Vision-Language Tasks
arXiv:2411.06284v1
This survey and application guide to multimodal large language models (MLLMs)
explores the rapidly developing field of MLLMs, examining their architectures,
applications, and impact on AI and Generative Models. Starting with
foundational concepts, we delve into how MLLMs integrate various data types,
including text, images, video and audio, to enable complex AI systems for
cross-modal understanding and generation. It covers essential topics such as
training methods, architectural components, and practical applications in
various fields, from visual storytelling to enhanced accessibility. Through
detailed case studies and technical analysis, the text examines prominent MLLM
implementations while addressing key challenges in scalability, robustness, and
cross-modal learning. Concluding with a discussion of ethical considerations,
responsible AI development, and future directions, this authoritative resource
provides both theoretical frameworks and practical insights. It offers a
balanced perspective on the opportunities and challenges in the development and
deployment of MLLMs, and is highly valuable for researchers, practitioners, and
students interested in the intersection of natural language processing and
computer vision.
From Word Vectors to Multimodal Embeddings: Techniques, Applications,
and Future Directions For Large Language Models
Word embeddings and language models have transformed natural language
processing (NLP) by facilitating the representation of linguistic elements in
continuous vector spaces. This review revisits foundational concepts such as the
distributional hypothesis and contextual similarity, tracing the evolution from
sparse representations like one-hot encoding to dense embeddings including
Word2Vec, GloVe, and fastText. We examine both static and contextualized
embeddings, underscoring advancements in models such as ELMo, BERT, and GPT and
their adaptations for cross-lingual and personalized applications. The
discussion extends to sentence and document embeddings, covering aggregation
methods and generative topic models, along with the application of embeddings
in multimodal domains, including vision, robotics, and cognitive science.
Advanced topics such as model compression, interpretability, numerical
encoding, and bias mitigation are analyzed, addressing both technical
challenges and ethical implications. Additionally, we identify future research
directions, emphasizing the need for scalable training techniques, enhanced
interpretability, and robust grounding in non-textual modalities. By
synthesizing current methodologies and emerging trends, this survey offers
researchers and practitioners an in-depth resource to push the boundaries of
embedding-based language models.
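A toy contrast between the sparse and dense representations discussed above: one-hot vectors make all word pairs equally dissimilar, while dense vectors (hand-made here purely for illustration) let cosine similarity reflect relatedness.

```python
# One-hot vs. dense embeddings under cosine similarity (illustrative values).
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

vocab = ["king", "queen", "banana"]

# Sparse one-hot encoding: orthogonal vectors, no notion of relatedness.
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}
print(cosine(one_hot["king"], one_hot["queen"]))   # 0.0
print(cosine(one_hot["king"], one_hot["banana"]))  # 0.0

# Dense embeddings (made-up, low-dimensional stand-ins for learned vectors).
dense = {
    "king":   np.array([0.90, 0.80, 0.10]),
    "queen":  np.array([0.85, 0.90, 0.15]),
    "banana": np.array([0.10, 0.05, 0.95]),
}
print(cosine(dense["king"], dense["queen"]))   # close to 1
print(cosine(dense["king"], dense["banana"]))  # much smaller
```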
SA3DIP: Segment Any 3D Instance with Potential 3D Priors
arXiv:2411.03819v1
The proliferation of 2D foundation models has sparked research into adapting
them for open-world 3D instance segmentation. Recent methods introduce a
paradigm that leverages superpoints as geometric primitives and incorporates 2D
multi-view masks from Segment Anything model (SAM) as merging guidance,
achieving outstanding zero-shot instance segmentation results. However, the
limited use of 3D priors restricts the segmentation performance. Previous
methods compute 3D superpoints solely from normals estimated from spatial
coordinates, resulting in under-segmentation for instances with similar
geometry. Moreover, the heavy reliance on SAM and hand-crafted algorithms in 2D
space leads to over-segmentation due to SAM's inherent part-level
segmentation tendency. To address these issues, we propose SA3DIP, a novel
method for Segmenting Any 3D Instances via exploiting potential 3D Priors.
Specifically, on one hand, we generate complementary 3D primitives based on
both geometric and textural priors, which reduces the initial errors that
accumulate in subsequent procedures. On the other hand, we introduce
supplemental constraints from the 3D space by using a 3D detector to guide a
further merging process. Furthermore, we notice a considerable portion of
low-quality ground truth annotations in the ScanNetV2 benchmark, which hinder
fair evaluation. Thus, we present ScanNetV2-INS with complete ground truth
labels and supplement additional instances for 3D class-agnostic instance
segmentation. Experimental evaluations on various 2D-3D datasets demonstrate
the effectiveness and robustness of our approach. Our code and proposed
ScanNetV2-INS dataset are available HERE.
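A schematic sketch of the general primitive-merging idea (high 2D mask agreement plus a shared 3D detection box), not SA3DIP's actual algorithm; the affinity matrix, box assignments, and threshold below are made-up placeholders.

```python
# Greedy union-find merging of 3D primitives guided by 2D mask agreement and
# a 3D-detector constraint (illustrative only; inputs are placeholders).
import numpy as np

def mask_agreement(affinity: np.ndarray, i: int, j: int) -> float:
    """Pairwise affinity from 2D masks, e.g. the fraction of views in which
    SAM assigns the two primitives to the same mask (precomputed elsewhere)."""
    return affinity[i, j]

def same_detected_box(box_id: np.ndarray, i: int, j: int) -> bool:
    """3D constraint: both primitives fall inside the same detected 3D box."""
    return box_id[i] >= 0 and box_id[i] == box_id[j]

def merge_primitives(n: int, affinity: np.ndarray, box_id: np.ndarray,
                     thr: float = 0.7) -> list:
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if mask_agreement(affinity, i, j) > thr and same_detected_box(box_id, i, j):
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

# Tiny example: 4 primitives, where 0-1 and 2-3 should merge into 2 instances.
aff = np.array([[1, .9, .1, .1], [.9, 1, .1, .1], [.1, .1, 1, .8], [.1, .1, .8, 1]])
boxes = np.array([0, 0, 1, 1])
print(merge_primitives(4, aff, boxes))  # two groups, e.g. [1, 1, 3, 3]
```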
Deep Learning and Machine Learning -- Natural Language Processing: From
Theory to Application
With a focus on natural language processing (NLP) and the role of large
language models (LLMs), we explore the intersection of machine learning, deep
learning, and artificial intelligence. As artificial intelligence continues to
revolutionize fields from healthcare to finance, NLP techniques such as
tokenization, text classification, and entity recognition are essential for
processing and understanding human language. This paper discusses advanced data
preprocessing techniques and the use of frameworks like Hugging Face for
implementing transformer-based models. Additionally, it highlights challenges
such as handling multilingual data, reducing bias, and ensuring model
robustness. By addressing key aspects of data processing and model fine-tuning,
this work aims to provide insights into deploying effective and ethically sound
AI solutions.
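A minimal example of the transformer-based workflow the paper discusses, using the Hugging Face `transformers` library for tokenization and text classification; the checkpoint named below is a common public sentiment model chosen purely for illustration.

```python
# Tokenization and text classification with Hugging Face transformers.
from transformers import AutoTokenizer, pipeline

model_name = "distilbert-base-uncased-finetuned-sst-2-english"

# Tokenization: map raw text to the subword tokens the model consumes.
tokenizer = AutoTokenizer.from_pretrained(model_name)
print(tokenizer.tokenize("Transformers make NLP pipelines easy."))

# Text classification via the high-level pipeline API.
classifier = pipeline("text-classification", model=model_name)
print(classifier("Transformers make NLP pipelines easy."))
# -> [{'label': 'POSITIVE', 'score': ...}]
```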
From Text to Multimodality: Exploring the Evolution and Impact of Large
Language Models in Medical Practice
Large Language Models (LLMs) have rapidly evolved from text-based systems to
multimodal platforms, significantly impacting various sectors including
healthcare. This comprehensive review explores the progression of LLMs to
Multimodal Large Language Models (MLLMs) and their growing influence in medical
practice. We examine the current landscape of MLLMs in healthcare, analyzing
their applications across clinical decision support, medical imaging, patient
engagement, and research. The review highlights the unique capabilities of
MLLMs in integrating diverse data types, such as text, images, and audio, to
provide more comprehensive insights into patient health. We also address the
challenges facing MLLM implementation, including data limitations, technical
hurdles, and ethical considerations. By identifying key research gaps, this
paper aims to guide future investigations in areas such as dataset development,
modality alignment methods, and the establishment of ethical guidelines. As
MLLMs continue to shape the future of healthcare, understanding their potential
and limitations is crucial for their responsible and effective integration into
medical practice.
NetSafe: Exploring the Topological Safety of Multi-agent Networks
arXiv:2410.15686v1
Large language models (LLMs) have empowered nodes within multi-agent networks
with intelligence, showing growing applications in both academia and industry.
However, how to prevent these networks from generating malicious information
remains unexplored, and previous research on the safety of a single LLM is
challenging to transfer. In this paper, we focus on the safety of multi-agent networks from
a topological perspective, investigating which topological properties
contribute to safer networks. To this end, we propose a general framework,
NetSafe, along with an iterative RelCom interaction, to unify existing diverse
LLM-based agent frameworks, laying the foundation for generalized topological
safety research. We identify several critical phenomena when multi-agent
networks are exposed to attacks involving misinformation, bias, and harmful
information, termed Agent Hallucination and Aggregation Safety. Furthermore,
we find that highly connected networks are more susceptible to the spread of
adversarial attacks, with task performance in a Star Graph Topology decreasing
by 29.7%. In addition, our proposed static metrics align more closely with
real-world dynamic evaluations than traditional graph-theoretic metrics,
indicating that networks with greater average distances from attackers exhibit
enhanced safety. In conclusion, our work introduces a new topological
perspective on the safety of LLM-based multi-agent networks and discovers
several unreported phenomena, paving the way for future research to explore the
safety of such networks.
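As a rough illustration of the kind of topological statistic involved, the sketch below computes the average shortest-path distance from an attacker node across a few small topologies with `networkx`; this is an intuition-building proxy, not the paper's exact metric definition.

```python
# Average shortest-path distance from an attacker node in small topologies.
import networkx as nx

def avg_distance_from_attacker(graph: nx.Graph, attacker: int) -> float:
    lengths = nx.shortest_path_length(graph, source=attacker)
    others = [d for node, d in lengths.items() if node != attacker]
    return sum(others) / len(others)

n = 8
topologies = {
    "star (hub attacker)": (nx.star_graph(n - 1), 0),   # node 0 is the hub
    "cycle": (nx.cycle_graph(n), 0),
    "path (endpoint attacker)": (nx.path_graph(n), 0),
}

for name, (g, attacker) in topologies.items():
    print(f"{name}: avg distance from attacker = "
          f"{avg_distance_from_attacker(g, attacker):.2f}")
# A larger average distance from the attacker loosely corresponds to the
# "safer" regime that the paper's static metrics point to.
```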