Simulating realistic behaviors of traffic agents is pivotal for efficiently
validating the safety of autonomous driving systems. Existing data-driven
simulators primarily use an encoder-decoder architecture to encode the
historical trajectories before decoding the future. However, the heterogeneity
between encoders and decoders complicates the models, and the manual separation
of historical and future trajectories leads to low data utilization. Given
these limitations, we propose BehaviorGPT, a homogeneous and fully
autoregressive Transformer designed to simulate the sequential behavior of
multiple agents. Crucially, our approach discards the traditional separation
between "history" and "future" by modeling each time step as the "current" one
for motion generation, leading to a simpler, more parameter- and data-efficient
agent simulator. We further introduce the Next-Patch Prediction Paradigm (NP3)
to mitigate the negative effects of autoregressive modeling, in which models
are trained to reason at the patch level of trajectories and capture long-range
spatial-temporal interactions. Despite having merely 3M model parameters,
BehaviorGPT won first place in the 2024 Waymo Open Sim Agents Challenge with a
realism score of 0.7473 and a minADE score of 1.4147, demonstrating its
exceptional performance in traffic agent simulation.
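To make the Next-Patch Prediction Paradigm (NP3) concrete, here is a minimal sketch of patch-level autoregressive modeling: consecutive timesteps are grouped into patches, a causally masked Transformer reasons over patch tokens, and the training target is the next patch rather than the next step, which shortens the autoregressive horizon. Every module name and hyperparameter below is an illustrative assumption, not BehaviorGPT's actual architecture.

```python
# Minimal sketch of next-patch prediction over trajectories, assuming
# patches of `patch_size` consecutive timesteps; NOT BehaviorGPT's code.
import torch
import torch.nn as nn

class NextPatchPredictor(nn.Module):
    def __init__(self, patch_size=10, state_dim=2, d_model=64):
        super().__init__()
        self.patch_size = patch_size
        # Embed a whole patch (patch_size timesteps) into one token.
        self.embed = nn.Linear(patch_size * state_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        # Decode each token back into a full patch of future motion.
        self.head = nn.Linear(d_model, patch_size * state_dim)

    def forward(self, traj):  # traj: [batch, T, state_dim], T % patch_size == 0
        b, t, d = traj.shape
        patches = traj.reshape(b, t // self.patch_size, self.patch_size * d)
        tokens = self.embed(patches)
        # Causal mask: each patch token attends only to earlier patches, so
        # every step is treated as the "current" one -- no history/future split.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        out = self.backbone(tokens, mask=mask)
        return self.head(out)  # prediction of the *next* patch at each position

model = NextPatchPredictor()
traj = torch.randn(8, 40, 2)          # 8 agents, 40 timesteps, (x, y)
pred = model(traj)                    # [8, 4 patches, 20]
target = traj.reshape(8, 4, 20)
loss = nn.functional.mse_loss(pred[:, :-1], target[:, 1:])  # shift by one patch
```

Because the same causal objective applies at every position, every timestep in a logged trajectory contributes a training signal, which is the data-efficiency argument the abstract makes.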
Grade Like a Human: Rethinking Automated Assessment with Large Language
Models
arXiv:2405.19694v1
While large language models (LLMs) have been used for automated grading, they
have not yet achieved the same level of performance as humans, especially when
it comes to grading complex questions. Existing research on this topic focuses
on a particular step in the grading procedure: grading using predefined
rubrics. However, grading is a multifaceted procedure that encompasses other
crucial steps, such as grading rubrics design and post-grading review. There
has been a lack of systematic research exploring the potential of LLMs to
enhance the entire grading process.
In this paper, we propose an LLM-based grading system that addresses the
entire grading procedure, including the following key components: 1) Developing
grading rubrics that not only consider the questions but also the student
answers, which can more accurately reflect students' performance. 2) Under the
guidance of grading rubrics, providing accurate and consistent scores for each
student, along with customized feedback. 3) Conducting post-grading review to
better ensure accuracy and fairness. Additionally, we collected a new dataset
named OS from a university operating system course and conducted extensive
experiments on both our new dataset and the widely used Mohler dataset.
Experiments demonstrate the effectiveness of our proposed approach, providing
some new insights for developing automated grading systems based on LLMs.
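A minimal sketch of the three-stage pipeline the abstract describes: rubric design informed by actual student answers, rubric-guided scoring with feedback, and post-grading review. The `call_llm` helper and all prompts are hypothetical placeholders, not the paper's implementation.

```python
# Hypothetical three-stage grading pipeline; `call_llm` stands in for any
# chat-completion API and is NOT the paper's actual interface.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def design_rubric(question: str, sample_answers: list[str]) -> str:
    # Stage 1: rubrics consider the question AND real student answers,
    # so criteria reflect how students actually respond.
    answers = "\n---\n".join(sample_answers)
    return call_llm(f"Write a grading rubric for:\n{question}\n"
                    f"Ground it in these sample answers:\n{answers}")

def grade(question: str, rubric: str, answer: str) -> str:
    # Stage 2: rubric-guided, consistent scoring with customized feedback.
    return call_llm(f"Question: {question}\nRubric: {rubric}\n"
                    f"Answer: {answer}\nReturn a score and feedback.")

def review(question: str, rubric: str, answer: str, verdict: str) -> str:
    # Stage 3: post-grading review to catch inaccurate or unfair scores.
    return call_llm(f"Re-examine this grading for accuracy and fairness.\n"
                    f"Question: {question}\nRubric: {rubric}\n"
                    f"Answer: {answer}\nInitial grading: {verdict}")
```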
Ultra-marginal Feature Importance: Learning from Data with Causal
Guarantees
arXiv:2204.09938v5
Scientists frequently prioritize learning from data rather than training the
best possible model; however, research in machine learning often prioritizes
the latter. Marginal contribution feature importance (MCI) was developed to
break this trend by providing a useful framework for quantifying the
relationships in data. In this work, we aim to improve upon the theoretical
properties, performance, and runtime of MCI by introducing ultra-marginal
feature importance (UMFI), which uses dependence removal techniques from the AI
fairness literature as its foundation. We first propose axioms for feature
importance methods that seek to explain the causal and associative
relationships in data, and we prove that UMFI satisfies these axioms under
basic assumptions. We then show on real and simulated data that UMFI performs
better than MCI, especially in the presence of correlated interactions and
unrelated features, while partially learning the structure of the causal graph
and reducing the exponential runtime of MCI to super-linear.
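A toy sketch of the UMFI idea under one possible dependence-removal choice, linear regression residuals; the paper draws on AI-fairness preprocessing more generally, so treat every detail here as an assumption rather than the authors' method.

```python
# Toy ultra-marginal feature importance with linear dependence removal.
# Residualization is one illustrative preprocessing choice, not
# necessarily the paper's preferred technique.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def remove_dependence(X_rest, x_f):
    # Strip the part of every remaining feature that is linearly
    # predictable from feature f.
    reg = LinearRegression().fit(x_f.reshape(-1, 1), X_rest)
    return X_rest - reg.predict(x_f.reshape(-1, 1))

def umfi(X, y, f):
    rest = np.delete(X, f, axis=1)
    rest_clean = remove_dependence(rest, X[:, f])
    with_f = np.column_stack([rest_clean, X[:, f]])
    # Importance = performance gain from adding f to the cleaned feature set.
    base = cross_val_score(RandomForestRegressor(), rest_clean, y, cv=3).mean()
    full = cross_val_score(RandomForestRegressor(), with_f, y, cv=3).mean()
    return full - base

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=300)
print([round(umfi(X, y, f), 3) for f in range(4)])
```

Preprocessing the complement set once, instead of averaging marginal contributions over subsets as MCI does, is what removes the exponential enumeration from the runtime.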
When AI Eats Itself: On the Caveats of AI Autophagy
arXiv:2405.09597v3
Generative Artificial Intelligence (AI) technologies and large models are
producing realistic outputs across various domains, such as images, text,
speech, and music. Creating these advanced generative models requires
significant resources, particularly large and high-quality datasets. To
minimise training expenses, many algorithm developers use data created by the
models themselves as a cost-effective training solution. However, not all
synthetic data effectively improve model performance, necessitating a strategic
balance in the use of real versus synthetic data to optimise outcomes.
The once well-controlled integration of real and synthetic
data is now becoming uncontrollable. The widespread and unregulated dissemination
of synthetic data online leads to the contamination of datasets traditionally
compiled through web scraping, now mixed with unlabeled synthetic data. This
trend, known as the AI autophagy phenomenon, suggests a future where generative
AI systems may increasingly consume their own outputs without discernment,
raising concerns about model performance, reliability, and ethical
implications. What will happen if generative AI continuously consumes itself
without discernment? What measures can we take to mitigate the potential
adverse effects? To address these research questions, this study examines the
existing literature, delving into the consequences of AI autophagy, analyzing
the associated risks, and exploring strategies to mitigate its impact. Our aim
is to provide a comprehensive perspective on this phenomenon, advocating for a
balanced approach that promotes the sustainable development of generative AI
technologies in the era of large models.
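The self-consumption loop is easy to demonstrate with a toy experiment: repeatedly refit a distribution to samples drawn from the previous generation's fit and watch the estimates drift. This illustrates the general phenomenon only, not any experiment from the survey.

```python
# Toy illustration of AI autophagy: a Gaussian repeatedly refit to its
# own samples. Finite-sample estimation noise compounds across
# generations, so the fitted variance drifts and tends to collapse.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 0.0, 1.0, 200      # generation-0 "real data" distribution

for gen in range(10):
    samples = rng.normal(mu, sigma, size=n)    # generate with current model
    mu, sigma = samples.mean(), samples.std()  # retrain on own output
    print(f"gen {gen}: mu={mu:+.3f} sigma={sigma:.3f}")
```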
Flexible Fairness-Aware Learning via Inverse Conditional Permutation
arXiv:2404.05678v3
Equalized odds, as a popular notion of algorithmic fairness, aims to ensure
that sensitive variables, such as race and gender, do not unfairly influence
the algorithm's prediction when conditioning on the true outcome. Despite rapid
advancements, current research primarily focuses on equalized odds violations
caused by a single sensitive attribute, leaving the challenge of simultaneously
accounting for multiple attributes largely unaddressed. We bridge this gap by
introducing an in-processing fairness-aware learning approach, FairICP, which
integrates adversarial learning with a novel inverse conditional permutation
scheme. FairICP offers a theoretically justified, flexible, and efficient
scheme to promote equalized odds under fairness conditions described by complex
and multidimensional sensitive attributes. The efficacy and adaptability of our
method are demonstrated through both simulation studies and empirical analyses
of real-world datasets.
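One way to picture the conditional-permutation ingredient: shuffle sensitive attributes only among samples that share the same true outcome, which preserves P(A | Y) while breaking any extra dependence on the prediction. The snippet below is a simplified discrete-Y stand-in for the general idea, not FairICP's inverse conditional permutation scheme itself.

```python
# Simplified conditional permutation for a discrete outcome: permute the
# sensitive attribute A within each outcome group, so the permuted copy
# keeps P(A | Y) but is otherwise independent of everything else.
import numpy as np

def conditional_permutation(A, y, rng):
    A_perm = A.copy()
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        A_perm[idx] = A[rng.permutation(idx)]
    return A_perm

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)
A = y + rng.normal(scale=0.5, size=1000)       # A correlated with y
A_perm = conditional_permutation(A, y, rng)
# Group-level structure survives; sample-level pairing is destroyed.
print(np.corrcoef(A, y)[0, 1], np.corrcoef(A_perm, y)[0, 1])
```

FairICP's actual scheme handles complex, multidimensional sensitive attributes via an inverse conditional permutation coupled with adversarial learning; the point here is only the conditioning-on-Y idea.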
Towards Heterogeneous Long-tailed Learning: Benchmarking, Metrics, and
Toolbox
Long-tailed data distributions pose challenges for a variety of domains like
e-commerce, finance, biomedical science, and cyber security, where the
performance of machine learning models is often dominated by head categories
while tail categories are inadequately learned. This work aims to provide a
systematic view of long-tailed learning with regard to three pivotal angles:
(A1) the characterization of data long-tailedness, (A2) the data complexity of
various domains, and (A3) the heterogeneity of emerging tasks. We develop
HeroLT, a comprehensive long-tailed learning benchmark integrating 18
state-of-the-art algorithms, 10 evaluation metrics, and 17 real-world datasets
across 6 tasks and 4 data modalities. With these novel angles and extensive
experiments (315 in total), HeroLT enables effective and fair evaluation of
newly proposed methods against existing baselines on varying dataset types.
Finally, we conclude by highlighting the significant applications of
long-tailed learning and identifying several promising future directions. For
accessibility and reproducibility, we open-source our benchmark HeroLT and
corresponding results at https://github.com/SSSKJ/HeroLT.
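To ground angle (A1), here is a small generic helper, unrelated to HeroLT's own code, that characterizes how long-tailed a label distribution is via two common statistics: the imbalance ratio (largest class count over smallest) and the Gini coefficient of class frequencies.

```python
# Generic characterization of label long-tailedness (not HeroLT code):
# imbalance ratio and the Gini coefficient of class frequencies
# (0 = perfectly balanced, approaching 1 = extremely long-tailed).
import numpy as np

def long_tail_stats(labels):
    counts = np.sort(np.bincount(labels))        # ascending class counts
    imbalance_ratio = counts[-1] / counts[0]
    n, cum = len(counts), np.cumsum(counts)
    gini = 1 - 2 * np.sum(cum / cum[-1]) / n + 1 / n
    return imbalance_ratio, gini

rng = np.random.default_rng(0)
balanced = rng.integers(0, 10, size=10_000)
tailed = rng.zipf(2.0, size=10_000) % 10         # heavy-tailed labels
print(long_tail_stats(balanced))  # ratio near 1, Gini near 0
print(long_tail_stats(tailed))    # large ratio, high Gini
```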
V2X-Assisted Distributed Computing and Control Framework for Connected
and Automated Vehicles under Ramp Merging Scenario
This paper has been submitted to an IEEE journal. The source code has been
released at: https://git...
This paper investigates distributed computing and cooperative control of
connected and automated vehicles (CAVs) in a ramp merging scenario within a
transportation cyber-physical system. First, a centralized cooperative
trajectory planning problem is formulated subject to safety constraints and
traffic performance requirements in the ramp merging scenario, where the
trajectories of all vehicles are jointly optimized. To remove the reliance on
a central controller and reduce computation time, a distributed solution to
this problem, implemented among CAVs through Vehicle-to-Everything (V2X)
communication, is proposed. Unlike existing methods, ours distributes the
computational task among CAVs and carries out parallel solving through V2X
communication. Then,
a multi-vehicle model predictive control (MPC) problem aimed at maximizing
system stability and minimizing control input is formulated based on the
solution of the first problem, subject to strict safety constraints and input
limits. Due to these complex constraints, this problem becomes
high-dimensional, centralized, and non-convex. To solve it in a short time, a
decomposition and convex reformulation method, namely distributed cooperative
iterative model predictive control (DCIMPC), is proposed. This method leverages
the communication capability of CAVs to decompose the problem, making full use
of the computational resources on vehicles to achieve fast solutions and
distributed control. These two problems, together with their corresponding
solution methods, form the systematic framework of V2X-assisted distributed
computing and control. Simulations have been conducted to evaluate the
framework's convergence, safety, and solving speed. Additional experiments are
conducted to validate the performance of DCIMPC. The results show that our
method can greatly improve computation speed without sacrificing system
performance.
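A toy sketch of the decomposition idea: each vehicle repeatedly re-optimizes only its own trajectory while treating its neighbors' latest broadcast plans as fixed, and all vehicles iterate in parallel until the joint plan settles. This is a generic Jacobi-style illustration under simplified one-dimensional dynamics, not the paper's DCIMPC algorithm.

```python
# Toy Jacobi-style distributed trajectory optimization: each vehicle
# smooths its own 1-D position profile while softly penalizing proximity
# to the other vehicles' last broadcast plans. Generic illustration only.
import numpy as np

T, n_veh, target_gap = 30, 3, 10.0
# Initial plans: constant-speed profiles with different starting offsets.
plans = np.array([np.linspace(o, o + 60, T) for o in (0.0, 8.0, 16.0)])

def local_cost_grad(x, others):
    grad = np.zeros_like(x)
    # Smoothness (comfort) term on the vehicle's own profile.
    grad[1:-1] += 2 * (2 * x[1:-1] - x[:-2] - x[2:])
    for y in others:                     # soft inter-vehicle spacing term
        gap = x - y
        close = np.abs(gap) < target_gap
        grad += np.where(close, -2 * np.sign(gap) * (target_gap - np.abs(gap)), 0.0)
    return grad

for it in range(200):                    # parallel (Jacobi) iterations
    new_plans = plans.copy()
    for i in range(n_veh):               # each vehicle solves locally...
        others = [plans[j] for j in range(n_veh) if j != i]
        new_plans[i] = plans[i] - 0.05 * local_cost_grad(plans[i], others)
    plans = new_plans                    # ...then "broadcasts" via V2X

print(np.diff(plans[:, -1]))             # final inter-vehicle gaps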
Beyond Efficiency: A Systematic Survey of Resource-Efficient Large
Language Models
The burgeoning field of Large Language Models (LLMs), exemplified by
sophisticated models like OpenAI's ChatGPT, represents a significant
advancement in artificial intelligence. These models, however, bring forth
substantial challenges in the high consumption of computational, memory,
energy, and financial resources, especially in environments with limited
resource capabilities. This survey aims to systematically address these
challenges by reviewing a broad spectrum of techniques designed to enhance the
resource efficiency of LLMs. We categorize methods along two axes: their
optimization focus (computational, memory, energy, financial, or network
resources) and their applicability across various stages of an LLM's
lifecycle, including architecture design, pretraining, finetuning, and system
design. Additionally,
the survey introduces a nuanced categorization of resource efficiency
techniques by their specific resource types, which uncovers the intricate
relationships and mappings between various resources and corresponding
optimization techniques. A standardized set of evaluation metrics and datasets
is also presented to facilitate consistent and fair comparisons across
different models and techniques. By offering a comprehensive overview of the
current state of the art and identifying open research avenues, this survey
serves as a foundational reference for researchers and practitioners, aiding
them in
developing more sustainable and efficient LLMs in a rapidly evolving landscape.
Towards Fair Graph Representation Learning in Social Networks
arXiv:2410.11493v2
With the widespread use of Graph Neural Networks (GNNs) for representation
learning from network data, the fairness of GNN models has attracted
considerable attention lately. Fair GNNs aim to ensure that node
representations can be
accurately classified, but not easily associated with a specific group.
Existing advanced approaches essentially enhance the generalisation of node
representations in combination with data augmentation strategies, and do not
directly impose fairness constraints on GNNs. In this work, we identify
that a fundamental reason for the unfairness of GNNs in social network learning
is the phenomenon of social homophily, i.e., users in the same group are more
inclined to congregate. The message-passing mechanism of GNNs can cause users
in the same group to have similar representations due to social homophily,
leading model predictions to establish spurious correlations with sensitive
attributes. Motivated by this observation, we propose a method called Equity-Aware
GNN (EAGNN) towards fair graph representation learning. Specifically, to ensure
that model predictions are independent of sensitive attributes while
maintaining prediction performance, we introduce constraints for fair
representation learning based on three principles: sufficiency, independence,
and separation. We theoretically demonstrate that our EAGNN method can
effectively achieve group fairness. Extensive experiments on three datasets
with varying levels of social homophily illustrate that our EAGNN method
achieves state-of-the-art performance across two fairness metrics and
offers competitive effectiveness.
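To make the three principles concrete, here is a simplified sketch that penalizes correlation-based dependence statistics corresponding to independence (prediction independent of A), separation (prediction independent of A given Y), and a sufficiency-style check (Y independent of A given the score). Correlation is a deliberately crude stand-in for whatever dependence estimators EAGNN actually uses; every detail below is an assumption.

```python
# Crude penalties for the three fairness principles, using correlation as
# a stand-in dependence measure. Illustrative only -- not EAGNN's losses.
import torch

def corr(u, v):
    u = (u - u.mean()) / (u.std() + 1e-8)
    v = (v - v.mean()) / (v.std() + 1e-8)
    return (u * v).mean()

def fairness_penalty(pred, sens, y):
    # Independence: predictions should not correlate with A.
    independence = corr(pred, sens) ** 2
    # Separation: the same, but within each true-outcome group.
    separation = torch.zeros(())
    for label in y.unique():
        m = y == label
        if m.sum() > 1:
            separation = separation + corr(pred[m], sens[m]) ** 2
    # Sufficiency (diagnostic only; hard bins block gradients): given the
    # score bin, the true label should not correlate with A.
    sufficiency = torch.zeros(())
    bins = (pred.detach() * 4).long().clamp(max=3)
    for b in range(4):
        m = bins == b
        if m.sum() > 1:
            sufficiency = sufficiency + corr(y[m].float(), sens[m]) ** 2
    return independence + separation + sufficiency

pred = torch.rand(256, requires_grad=True)   # stand-in GNN outputs
sens = torch.randint(0, 2, (256,)).float()
y = torch.randint(0, 2, (256,))
penalty = fairness_penalty(pred, sens, y)    # add to the task loss
```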
Class-RAG: Content Moderation with Retrieval Augmented Generation
Robust content moderation classifiers are essential for the safety of
Generative AI systems. Content moderation, or safety classification, is
notoriously ambiguous: differences between safe and unsafe inputs are often
extremely subtle, making it difficult for classifiers (and indeed, even humans)
to properly distinguish violating vs. benign samples without further context or
explanation. Furthermore, as these technologies are deployed across various
applications and audiences, scaling risk discovery and mitigation through
continuous model fine-tuning becomes increasingly challenging and costly. To
address these challenges, we propose a Classification approach employing
Retrieval-Augmented Generation (Class-RAG). Class-RAG extends the capability of
its base LLM through access to a retrieval library which can be dynamically
updated to enable semantic hotfixing for immediate, flexible risk mitigation.
Compared to traditional fine-tuned models, Class-RAG demonstrates flexibility
and transparency in decision-making. As evidenced by empirical studies,
Class-RAG outperforms them on classification and is more robust to adversarial
attacks. Moreover, our findings suggest that Class-RAG performance scales with
retrieval library size, indicating that increasing the library size is a viable
and low-cost approach to improve content moderation.
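A minimal sketch of the retrieval-augmented classification loop: embed the input, pull the nearest labeled examples from an updatable library, and let the LLM classify with those examples as context; adding a labeled example is the "semantic hotfix". All function names, the stand-in embedding, and the prompt are assumptions, not the paper's implementation.

```python
# Minimal retrieval-augmented safety classification sketch (assumed
# design, not the paper's code). The library is an in-memory list, so
# "hotfixing" = appending a labeled example -- no fine-tuning needed.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in embedding; swap in a real sentence encoder.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.normal(size=64)

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

library: list[tuple[str, str, np.ndarray]] = []  # (text, label, embedding)

def hotfix(text: str, label: str) -> None:
    library.append((text, label, embed(text)))   # immediate risk mitigation

def classify(text: str, k: int = 4) -> str:
    q = embed(text)
    # Rank library entries by cosine similarity to the query.
    scored = sorted(library, key=lambda e: -(q @ e[2]) /
                    (np.linalg.norm(q) * np.linalg.norm(e[2]) + 1e-8))
    context = "\n".join(f"[{lbl}] {t}" for t, lbl, _ in scored[:k])
    return call_llm(f"Nearest labeled examples:\n{context}\n\n"
                    f"Classify as safe or unsafe:\n{text}")
```

Because the library is data rather than weights, growing it is the cheap scaling lever the abstract highlights.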