Arxiv Day: Article

Local Node Differential Privacy

We initiate an investigation of node differential privacy for graphs in the local model of private data analysis. In our model, dubbed LNDP*, each node sees its own edge list and releases the output of a local randomizer on this input. These outputs are aggregated by an untrusted server to obtain a final output. We develop a novel algorithmic framework for this setting that allows us to accurately answer arbitrary linear queries about the input graph's degree distribution. Our framework is based on a new object, called the blurry degree distribution, which closely approximates the degree distribution and has lower sensitivity. Instead of answering queries about the degree distribution directly, our algorithms answer queries about the blurry degree distribution. This framework yields accurate LNDP* algorithms for the edge count, PMF and CDF of the degree distribution, and other graph statistics. For some natural problems, our algorithms match the accuracy achievable with node privacy in the central model, where data are held and processed by a trusted server. We also prove lower bounds on the error required by LNDP* algorithms that imply the optimality of our framework for edge counting in sparse graphs and Erdos-Renyi parameter estimation. Our lower bounds apply even to interactive protocols with a constant number of rounds of interaction between the nodes and the server. Existing lower-bound techniques for related models either yield loose bounds or do not apply in our setting, as graph data results in inherently overlapping inputs to local randomizers. To prove our bounds, we develop a splicing argument that stitches together views from locally similar but globally different distributions on graphs to obtain hard instances. Finally, we prove structural results that reveal qualitative differences between local node privacy and the standard local model for tabular data.

Updated: 2026-04-01 23:51:13

标题: 本地节点差分隐私

摘要: 我们在私人数据分析的本地模型中，启动了对图形节点差分隐私的调查。在我们的模型中，称为LNDP*，每个节点都可以看到自己的边缘列表，并发布在此输入上的本地随机化器的输出。这些输出由一个不受信任的服务器聚合以获得最终输出。我们为这种设置开发了一种新颖的算法框架，使我们能够准确回答关于输入图的度分布的任意线性查询。我们的框架基于一个新对象，称为模糊度分布，它紧密逼近度分布并具有较低的灵敏度。我们的算法不直接回答关于度分布的查询，而是回答关于模糊度分布的查询。这种框架为度分布的边缘计数、PMF和CDF以及其他图统计量提供了准确的LNDP*算法。对于一些自然问题，我们的算法与中心模型中节点隐私可实现的准确度相匹配，其中数据由受信任的服务器持有和处理。我们还证明了LNDP*算法所需错误的下界，这意味着我们的边缘计数在稀疏图和Erdos-Renyi参数估计中的最佳性。我们的下界甚至适用于节点和服务器之间交互协议的常数轮次的交互。现有的相关模型下界技术要么给出宽松的下界，要么不适用于我们的设置，因为图数据导致本地随机化器的输入在本质上重叠。为了证明我们的下界，我们开发了一个拼接参数，通过将在全球上不同的分布上的局部相似但全球不同的视图拼接在一起，以获得难实例。最后，我们证明了揭示本地节点隐私与表格数据的标准本地模型之间的定性差异的结构结果。

更新时间: 2026-04-01 23:51:13

领域: cs.DS,cs.CR

下载: http://arxiv.org/abs/2602.15802v2

Type-Checked Compliance: Deterministic Guardrails for Agentic Financial Systems Using Lean 4 Theorem Proving

The rapid evolution of autonomous, agentic artificial intelligence within financial services has introduced an existential architectural crisis: large language models (LLMs) are probabilistic, non-deterministic systems operating in domains that demand absolute, mathematically verifiable compliance guarantees. Existing guardrail solutions -- including NVIDIA NeMo Guardrails and Guardrails AI -- rely on probabilistic classifiers and syntactic validators that are fundamentally inadequate for enforcing complex multi-variable regulatory constraints mandated by the SEC, FINRA, and OCC. This paper presents the Lean-Agent Protocol, a formal-verification-based AI guardrail platform that leverages the Aristotle neural-symbolic model developed by Harmonic AI to auto-formalize institutional policies into Lean 4 code. Every proposed agentic action is treated as a mathematical conjecture: execution is permitted if and only if the Lean 4 kernel proves that the action satisfies pre-compiled regulatory axioms. This architecture provides cryptographic-level compliance certainty at microsecond latency, directly satisfying SEC Rule 15c3-5, OCC Bulletin 2011-12, FINRA Rule 3110, and CFPB explainability mandates. A three-phase implementation roadmap from shadow verification through enterprise-scale deployment is provided.

Updated: 2026-04-01 23:39:43

标题: 类型检查合规性：使用Lean 4定理证明为代理金融系统设置确定性防护栏。

摘要: 金融服务领域中自主、有主动性的人工智能的快速发展引入了一场存在主义架构危机：大型语言模型(LLMs)是概率的、非确定性系统，在需要绝对、可数学验证的合规性保证的领域中运作。现有的防护栏解决方案--包括NVIDIA NeMo Guardrails和Guardrails AI--依赖于概率分类器和句法验证器，这些解决方案基本上无法强制执行由SEC、FINRA和OCC规定的复杂多变量监管限制。本文介绍了Lean-Agent Protocol，这是一个基于形式验证的人工智能防护栏平台，利用Harmonic AI开发的亚里士多德神经符号模型将机构政策自动形式化为Lean 4代码。每个提出的行动都被视为一个数学猜想：只有当Lean 4内核证明该行动符合预编译的监管公理时，才允许执行。该架构提供微秒级延迟的加密级合规性确定性，直接满足SEC规则15c3-5、OCC公告2011-12、FINRA规则3110和CFPB可解释性要求。提供了一个从影子验证到企业规模部署的三阶段实施路线图。

更新时间: 2026-04-01 23:39:43

领域: cs.LO,cs.AI,cs.CR

下载: http://arxiv.org/abs/2604.01483v1

SelfGrader: Stable Jailbreak Detection for Large Language Models using Token-Level Logits

Large Language Models (LLMs) are powerful tools for answering user queries, yet they remain highly vulnerable to jailbreak attacks. Existing guardrail methods typically rely on internal features or textual responses to detect malicious queries, which either introduce substantial latency or suffer from the randomness in text generation. To overcome these limitations, we propose SelfGrader, a lightweight guardrail method that formulates jailbreak detection as a numerical grading problem using token-level logits. Specifically, SelfGrader evaluates the safety of a user query within a compact set of numerical tokens (NTs) (e.g., 0-9) and interprets their logit distribution as an internal safety signal. To align these signals with human intuition of maliciousness, SelfGrader introduces a dual-perspective scoring rule that considers both the maliciousness and benignness of the query, yielding a stable and interpretable score that reflects harmfulness and reduces the false positive rate simultaneously. Extensive experiments across diverse jailbreak benchmarks, multiple LLMs, and state-of-the-art guardrail baselines demonstrate that SelfGrader achieves up to a 22.66% reduction in ASR on LLaMA-3-8B, while maintaining significantly lower memory overhead (up to 173x) and latency (up to 26x).

Updated: 2026-04-01 23:29:12

标题: SelfGrader：使用标记级别Logits的大型语言模型的稳定越狱检测

摘要: 大型语言模型（LLMs）是回答用户查询的强大工具，但它们仍然极易受到破解攻击的影响。现有的防护方法通常依赖于内部特征或文本响应来检测恶意查询，这要么引入了相当大的延迟，要么受到文本生成中随机性的影响。为了克服这些限制，我们提出了SelfGrader，这是一种轻量级的防护方法，将破解检测形式化为一个使用标记级别对数(logit)的数值评分问题。具体来说，SelfGrader评估用户查询在一组紧凑的数值标记（NTs）（例如0-9）内的安全性，并将它们的对数分布解释为内部安全信号。为了使这些信号与恶意性的人类直觉保持一致，SelfGrader引入了一种双重视角评分规则，考虑查询的恶意性和善意性，产生一个稳定且可解释的得分，同时降低了误报率。在多样化的破解基准测试、多个LLMs和最先进的防护基线上进行的大量实验表明，SelfGrader在LLaMA-3-8B上实现了高达22.66%的ASR降低，同时保持了显著较低的内存开销（高达173倍）和延迟（高达26倍）。

更新时间: 2026-04-01 23:29:12

领域: cs.CR,cs.AI

下载: http://arxiv.org/abs/2604.01473v1

Taxonomy for Cybersecurity Threat Attributes and Countermeasures in Smart Manufacturing Systems

An attack taxonomy offers a consistent and structured classification scheme to systematically understand, identify, and classify cybersecurity threat attributes. However, existing taxonomies only focus on a narrow range of attacks and limited threat attributes, lacking a comprehensive characterization of manufacturing cybersecurity threats. There is little to no focus on characterizing threat actors and their intent, specific system and machine behavioral deviations introduced by cyberattacks, system-level and operational implications of attacks, and potential countermeasures against those attacks. To close this pressing research gap, this work proposes a comprehensive attack taxonomy for a holistic understanding and characterization of cybersecurity threats in manufacturing systems. Specifically, it introduces taxonomical classifications for threat actors and their intent and potential alterations in system behavior due to threat events. The proposed taxonomy categorizes attack methods/vectors and targets/locations and incorporates operational and system-level attack impacts. This paper also presents a classification structure for countermeasures, provides examples of potential countermeasures, and explains how they fit into the proposed taxonomical classification. Finally, the implementation of the proposed taxonomy is illustrated using two realistic scenarios of attacks on typical smart manufacturing systems, as well as several real-world cyber-physical attack incidents and academic case studies. The developed manufacturing attack taxonomy offers a holistic view of the attack chain in manufacturing systems, starting from the attack launch to the possible damages and system behavior changes within the system. Furthermore, it guides the design and development of appropriate protective and detective countermeasures by leveraging the attack realization through observed system deviations.

Updated: 2026-04-01 23:27:44

标题: 智能制造系统中的网络安全威胁属性和对策分类学

摘要: 攻击分类法提供了一种一致和结构化的分类方案，以系统地理解、识别和分类网络安全威胁属性。然而，现有的分类法只关注狭窄范围的攻击和有限的威胁属性，缺乏对制造业网络安全威胁的全面描述。对于威胁行为者及其意图、由网络攻击引起的特定系统和机器行为偏差、攻击的系统级和操作级影响以及针对这些攻击的潜在对策几乎没有关注。为了填补这一紧迫的研究空白，本文提出了一个全面的攻击分类法，以全面理解和描述制造系统中的网络安全威胁。具体来说，它引入了对威胁行为者及其意图以及由威胁事件引起的系统行为可能变化的分类方法。所提出的分类法对攻击方法/向量和目标/位置进行分类，并融入了操作和系统级攻击影响。本文还提出了针对措施的分类结构，提供了潜在措施的示例，并解释了它们如何符合所提出的分类法。最后，本文通过两个典型智能制造系统的攻击场景，以及几起真实的网络物理攻击事件和学术案例研究，演示了所提出分类法的实施情况。开发的制造业攻击分类法提供了对制造系统中的攻击链的整体视图，从攻击发动到系统内可能造成的损害和系统行为变化开始。此外，它通过观察系统偏差来指导设计和开发适当的防护和检测对策，从而利用攻击实现。

更新时间: 2026-04-01 23:27:44

领域: cs.CR

下载: http://arxiv.org/abs/2401.01374v2

Preserving Target Distributions With Differentially Private Count Mechanisms

Differentially private mechanisms are increasingly used to publish tables of counts, where each entry represents the number of individuals belonging to a particular category. A distribution of counts summarizes the information in the count column, unlinking counts from categories. This object is useful for answering a class of research questions, but it is subject to statistical biases when counts are privatized with standard mechanisms. This motivates a novel design criterion we term accuracy of distribution. This study formalizes a two-stage framework for privatizing tables of counts that balances accuracy of distribution with two standard criteria of accuracy of counts and runtime. In the first stage, a distribution privatizer generates an estimate for the true distribution of counts. We introduce a new mechanism, called the cyclic Laplace, specifically tailored to distributions of counts, that outperforms existing general-purpose differentially private histogram mechanisms. In the second stage, a constructor algorithm generates a count mechanism, represented as a transition matrix, whose fixed-point is the privatized distribution of counts. We develop a mathematical theory that describes such transition matrices in terms of simple building blocks we call epsilon-scales. This theory informs the design of a new constructor algorithm that generates transition matrices with favorable properties more efficiently than standard optimization algorithms. We explore the practicality of our framework with a set of experiments, highlighting situations in which a fixed-point method provides a favorable tradeoff among performance criteria.

Updated: 2026-04-01 23:25:05

标题: 用差分隐私计数机制保护目标分布

摘要: 差分隐私机制越来越被用于发布计数表，其中每个条目表示属于特定类别的个体数量。计数分布总结了计数列中的信息，使计数与类别脱钩。这个对象对回答一类研究问题很有用，但在使用标准机制对计数进行私有化时存在统计偏差。这促使我们提出了一个新的设计标准，称为分布准确性。本研究正式化了一个用于私有化计数表的两阶段框架，平衡了分布准确性与计数准确性和运行时间这两个标准。在第一阶段，一个分布私有化器生成计数的真实分布的估计。我们引入了一个新的机制，称为循环拉普拉斯，专门针对计数分布，优于现有的通用差分隐私直方图机制。在第二阶段，一个构造算法生成一个计数机制，表示为一个转移矩阵，其固定点是私有化的计数分布。我们发展了一个数学理论，用简单构建块ε-尺度来描述这样的转移矩阵。这个理论指导了设计一个新的构造算法，比标准优化算法更有效地生成具有有利特性的转移矩阵。我们通过一组实验探讨了我们框架的实用性，突出了固定点方法在性能标准中提供有利折衷的情况。

更新时间: 2026-04-01 23:25:05

领域: cs.CR

下载: http://arxiv.org/abs/2604.01468v1

Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models

Large language models (LLMs) are increasingly deployed for everyday tasks, including food preparation and health-related guidance. However, food safety remains a high-stakes domain where inaccurate or misleading information can cause severe real-world harm. Despite these risks, current LLMs and safety guardrails lack rigorous alignment tailored to domain-specific food hazards. To address this gap, we introduce FoodGuardBench, the first comprehensive benchmark comprising 3,339 queries grounded in FDA guidelines, designed to evaluate the safety and robustness of LLMs. By constructing a taxonomy of food safety principles and employing representative jailbreak attacks (e.g., AutoDAN and PAP), we systematically evaluate existing LLMs and guardrails. Our evaluation results reveal three critical vulnerabilities: First, current LLMs exhibit sparse safety alignment in the food-related domain, easily succumbing to a few canonical jailbreak strategies. Second, when compromised, LLMs frequently generate actionable yet harmful instructions, inadvertently empowering malicious actors and posing tangible risks. Third, existing LLM-based guardrails systematically overlook these domain-specific threats, failing to detect a substantial volume of malicious inputs. To mitigate these vulnerabilities, we introduce FoodGuard-4B, a specialized guardrail model fine-tuned on our datasets to safeguard LLMs within food-related domains.

Updated: 2026-04-01 22:38:38

标题: 烹饪风险：在大型语言模型中对食品安全风险进行基准测试和降低

摘要: 大型语言模型（LLMs）越来越多地被用于日常任务，包括食品准备和健康相关指导。然而，食品安全仍然是一个高风险领域，不准确或误导性信息可能导致严重的现实伤害。尽管存在这些风险，当前的LLMs和安全防护措施缺乏针对特定领域食品危害的严格对齐。为了填补这一空白，我们引入了FoodGuardBench，这是第一个包含3339个基于FDA指南的查询的综合基准，旨在评估LLMs的安全性和稳健性。通过构建食品安全原则的分类法并采用代表性的越狱攻击（例如AutoDAN和PAP），我们系统评估了现有的LLMs和防护措施。我们的评估结果揭示了三个关键漏洞：首先，当前的LLMs在与食品相关的领域中表现出稀疏的安全对齐，很容易受到一些经典越狱策略的攻击。其次，当受到攻击时，LLMs经常生成可操作但有害的指令，无意中赋予了恶意行为者权力并构成切实风险。第三，现有基于LLMs的防护措施系统地忽视了这些特定领域的威胁，未能检测到大量恶意输入。为了减轻这些漏洞，我们引入了FoodGuard-4B，这是一个在我们的数据集上进行了精细调整的专门的防护模型，用于保护与食品相关领域中的LLMs。

更新时间: 2026-04-01 22:38:38

领域: cs.CR

下载: http://arxiv.org/abs/2604.01444v1

When the Server Steps In: Calibrated Updates for Fair Federated Learning

Federated learning (FL) has emerged as a transformative distributed learning paradigm, enabling multiple clients to collaboratively train a global model under the coordination of a central server without sharing their raw training data. While FL offers notable advantages, it faces critical challenges in ensuring fairness across diverse demographic groups. To address these fairness concerns, various fairness-aware debiasing methods have been proposed. However, many of these approaches either require modifications to clients' training protocols or lack flexibility in their aggregation strategies. In this work, we address these limitations by introducing EquFL, a novel server-side debiasing method designed to mitigate bias in FL systems. EquFL operates by allowing the server to generate a single calibrated update after receiving model updates from the clients. This calibrated update is then integrated with the aggregated client updates to produce an adjusted global model that reduces bias. Theoretically, we establish that EquFL converges to the optimal global model achieved by FedAvg and effectively reduces fairness loss over training rounds. Empirically, we demonstrate that EquFL significantly mitigates bias within the system, showcasing its practical effectiveness.

Updated: 2026-04-01 21:57:20

标题: 当服务器介入时：公平联邦学习的校准更新

摘要: 联邦学习（FL）已经成为一种变革性的分布式学习范式，使多个客户端在中央服务器的协调下共同训练一个全局模型，而无需共享其原始训练数据。虽然FL提供了显著优势，但在确保跨不同人口群体的公平性方面面临着关键挑战。为了解决这些公平性问题，提出了各种公平意识去偏见方法。然而，许多这些方法要么需要修改客户端的训练协议，要么在聚合策略上缺乏灵活性。在这项工作中，我们通过引入EquFL来解决这些限制，这是一种新颖的服务器端去偏见方法，旨在减轻FL系统中的偏见。EquFL的操作方式是允许服务器在接收到来自客户端的模型更新后生成一个经过校准的单个更新。然后将这个校准更新与聚合的客户端更新整合，以产生一个调整后的全局模型，从而减少偏见。理论上，我们证明了EquFL收敛到由FedAvg实现的最佳全局模型，并在训练轮次中有效减少公平性损失。实证上，我们展示了EquFL显著减轻了系统内的偏见，展示了其实际有效性。

更新时间: 2026-04-01 21:57:20

领域: cs.LG,cs.CR,cs.IR,cs.SI

下载: http://arxiv.org/abs/2601.05352v2

Machine Learning for Network Attacks Classification and Statistical Evaluation of Adversarial Learning Methodologies for Synthetic Data Generation

Supervised detection of network attacks has always been a critical part of network intrusion detection systems (NIDS). Nowadays, in a pivotal time for artificial intelligence (AI), with even more sophisticated attacks that utilize advanced techniques, such as generative artificial intelligence (GenAI) and reinforcement learning, it has become a vital component if we wish to protect our personal data, which are scattered across the web. In this paper, we address two tasks, in the first unified multi-modal NIDS dataset, which incorporates flow-level data, packet payload information and temporal contextual features, from the reprocessed CIC-IDS-2017, CIC-IoT-2023, UNSW-NB15 and CIC-DDoS-2019, with the same feature space. In the first task we use machine learning (ML) algorithms, with stratified cross validation, in order to prevent network attacks, with stability and reliability. In the second task we use adversarial learning algorithms to generate synthetic data, compare them with the real ones and evaluate their fidelity, utility and privacy using the SDV framework, f-divergences, distinguishability and non-parametric statistical tests. The findings provide stable ML models for intrusion detection and generative models with high fidelity and utility, by combining the Synthetic Data Vault framework, the TRTS and TSTR tests, with non-parametric statistical tests and f-divergence measures.

Updated: 2026-04-01 20:58:02

标题: 机器学习用于网络攻击分类和对抗学习方法在合成数据生成中的统计评估

摘要: 监督网络攻击检测一直是网络入侵检测系统（NIDS）的关键部分。如今，在人工智能（AI）关键时期，更加复杂的攻击利用先进技术，如生成式人工智能（GenAI）和强化学习，如果我们希望保护分散在网络中的个人数据，这已经成为一个至关重要的组件。本文中，我们解决了两个任务，第一个是统一的多模态NIDS数据集，它包含来自重新处理的CIC-IDS-2017、CIC-IoT-2023、UNSW-NB15和CIC-DDoS-2019的流级数据、数据包负载信息和时间上下文特征，具有相同的特征空间。在第一个任务中，我们使用机器学习（ML）算法，并采用分层交叉验证，以预防网络攻击，确保稳定性和可靠性。在第二个任务中，我们使用对抗学习算法生成合成数据，将其与真实数据进行比较，并使用SDV框架、f-散度、可区分性和非参数统计测试来评估其忠实度、实用性和隐私性。研究结果提供了用于入侵检测的稳定ML模型和具有高忠实度和实用性的生成模型，通过结合合成数据保险库框架、TRTS和TSTR测试、非参数统计测试和f-散度测量。

更新时间: 2026-04-01 20:58:02

领域: cs.CR,cs.AI,stat.AP,stat.ML

下载: http://arxiv.org/abs/2603.17717v2

"The System Will Choose Security Over Humanity Every Time": Understanding Security and Privacy for U.S. Incarcerated Users

Digital devices like tablets, media players, and kiosks are increasingly deployed in U.S. prisons. These technologies can enable incarcerated people to access education, communicate with loved ones, and develop vital reentry skills. However, they can also introduce new privacy and security risks for incarcerated people who have little agency over their usage and contracts, and are currently carved out of many consumer protection safeguards. To investigate these issues, we conducted focus groups and interviews with system-impacted people (n=17), i.e., those formerly incarcerated, and their relatives, to investigate experiences with device-related security and privacy vulnerabilities and the power dynamics that affect their use. In our findings, participants describe pervasive surveillance, censorship, and usability problems with the technology available to them, including shifting and seemingly arbitrary usage policies. These policies strain relationships both inside and outside prisons and contribute to negative downstream effects for incarcerated users. We recommend ways to better balance prison security concerns with privacy-related needs of system-impacted individuals by promoting accountability for technology-related decisions, providing public oversight of digital purchasing and use policies, and designing digital tools with them -- the actual end-users -- in mind.

Updated: 2026-04-01 20:28:09

标题: “系统总是会选择安全而非人性：理解美国监禁用户的安全和隐私问题”

摘要: 数字设备，如平板电脑、媒体播放器和信息亭，越来越多地被部署在美国监狱中。这些技术可以使被监禁的人能够获取教育、与亲人交流，并发展重要的重新进入社会的技能。然而，它们也可能为那些对自己的使用和合同几乎没有主权的被监禁的人引入新的隐私和安全风险，并且目前在许多消费者保护措施中被排除在外。为了调查这些问题，我们与受系统影响的人（n=17），即曾经被监禁的人及其亲属进行焦点小组讨论和访谈，以调查与设备相关的安全和隐私漏洞以及影响他们使用的权力动态的经验。在我们的调查中，参与者描述了他们可用的技术中普遍存在的监视、审查和可用性问题，包括日益变化和似乎是武断的使用政策。这些政策对监狱内外的关系造成了压力，并为被监禁的用户带来了负面的后果。我们建议通过促进对技术相关决策的问责制，公开监督数字购买和使用政策，并设计以他们——实际终端用户——为中心的数字工具，更好地平衡监狱安全问题和系统影响个体的与隐私相关需求。

更新时间: 2026-04-01 20:28:09

领域: cs.CR

下载: http://arxiv.org/abs/2604.01370v1

No Attacker Needed: Unintentional Cross-User Contamination in Shared-State LLM Agents

LLM-based agents increasingly operate across repeated sessions, maintaining task states to ensure continuity. In many deployments, a single agent serves multiple users within a team or organization, reusing a shared knowledge layer across user identities. This shared persistence expands the failure surface: information that is locally valid for one user can silently degrade another user's outcome when the agent reapplies it without regard for scope. We refer to this failure mode as unintentional cross-user contamination (UCC). Unlike adversarial memory poisoning, UCC requires no attacker; it arises from benign interactions whose scope-bound artifacts persist and are later misapplied. We formalize UCC through a controlled evaluation protocol, introduce a taxonomy of three contamination types, and evaluate the problem in two shared-state mechanisms. Under raw shared state, benign interactions alone produce contamination rates of 57--71%. A write-time sanitization is effective when shared state is conversational, but leaves substantial residual risk when shared state includes executable artifacts, with contamination often manifesting as silent wrong answers. These results indicate that shared-state agents need artifact-level defenses beyond text-level sanitization to prevent silent cross-user failures.

Updated: 2026-04-01 20:03:56

标题: 无需攻击者：共享状态LLM代理中的无意跨用户污染

摘要: 基于LLM的代理越来越多地在重复的会话中运行，保持任务状态以确保连续性。在许多部署中，单个代理为团队或组织内的多个用户提供服务，跨用户身份重用共享的知识层。这种共享的持久性扩大了故障表面：对于一个用户而言在本地有效的信息在代理无视范围重新应用时可能会悄悄地损害另一个用户的结果。我们将这种故障模式称为无意的跨用户污染（UCC）。与对抗性内存中毒不同，UCC不需要攻击者；它起源于善意的交互，其范围限定的产物持续存在并在之后被错误地应用。我们通过一种受控评估协议形式化了UCC，引入了三种污染类型的分类，并在两种共享状态机制中评估了这个问题。在原始共享状态下，仅善意的交互就会产生57-71%的污染率。当共享状态是对话时，写入时的净化是有效的，但当共享状态包含可执行的产物时，仍存在相当大的剩余风险，污染往往表现为悄无声息的错误答案。这些结果表明，共享状态代理需要超越文本级净化的产物级防御措施，以防止悄悄的跨用户故障。

更新时间: 2026-04-01 20:03:56

领域: cs.CL,cs.AI,cs.CR

下载: http://arxiv.org/abs/2604.01350v1

Safety, Security, and Cognitive Risks in World Models

World models -- learned internal simulators of environment dynamics -- are rapidly becoming foundational to autonomous decision-making in robotics, autonomous vehicles, and agentic AI. Yet this predictive power introduces a distinctive set of safety, security, and cognitive risks. Adversaries can corrupt training data, poison latent representations, and exploit compounding rollout errors to cause catastrophic failures in safety-critical deployments. World model-equipped agents are more capable of goal misgeneralisation, deceptive alignment, and reward hacking precisely because they can simulate the consequences of their own actions. Authoritative world model predictions further foster automation bias and miscalibrated human trust that operators lack the tools to audit. This paper surveys the world model landscape; introduces formal definitions of trajectory persistence and representational risk; presents a five-profile attacker capability taxonomy; and develops a unified threat model extending MITRE ATLAS and the OWASP LLM Top 10 to the world model stack. We provide an empirical proof-of-concept on trajectory-persistent adversarial attacks (GRU-RSSM: A_1 = 2.26x amplification, -59.5% reduction under adversarial fine-tuning; stochastic RSSM proxy: A_1 = 0.65x; DreamerV3 checkpoint: non-zero action drift confirmed). We illustrate risks through four deployment scenarios and propose interdisciplinary mitigations spanning adversarial hardening, alignment engineering, NIST AI RMF and EU AI Act governance, and human-factors design. We argue that world models must be treated as safety-critical infrastructure requiring the same rigour as flight-control software or medical devices.

Updated: 2026-04-01 19:57:33

标题: 世界模型中的安全、安全性和认知风险

摘要: 世界模型——学习环境动态的内部模拟器——正在快速成为机器人、自动驾驶车辆和主动型人工智能自主决策的基础。然而，这种预测能力引入了一系列独特的安全、安全性和认知风险。对手可以破坏训练数据，操纵潜在表示，并利用复合推出错误来引发安全关键部署中的灾难性故障。配备世界模型的代理更有能力发生目标误概化、欺骗性对齐和奖励黑客，因为他们可以模拟自己行动的后果。权威性的世界模型预测进一步促进了自动化偏见和操作员缺乏审计工具的误校准人类信任。本文调查了世界模型的景观；引入了轨迹持久性和表征风险的形式定义；提出了一个五个概要的攻击者能力分类法；并开发了一个统一的威胁模型，将MITRE ATLAS和OWASP LLM Top 10扩展到世界模型堆栈。我们提供了关于轨迹持续对抗攻击的经验性概念证明（GRU-RSSM：A_1 = 2.26倍放大，经过对抗微调后减少了59.5%；随机RSSM代理：A_1 = 0.65倍；DreamerV3检查点：确认了非零行动漂移）。我们通过四个部署场景展示了风险，并提出了跨领域的缓解措施，包括对抗强化、对齐工程、NIST AI RMF和EU AI Act治理，以及人因设计。我们认为世界模型必须被视为需要与飞行控制软件或医疗设备一样严格的安全关键基础设施。

更新时间: 2026-04-01 19:57:33

领域: cs.CR,cs.AI,cs.LG,cs.RO

下载: http://arxiv.org/abs/2604.01346v1

Evolutionary Multi-Objective Fusion of Deepfake Speech Detectors

While deepfake speech detectors built on large self-supervised learning (SSL) models achieve high accuracy, employing standard ensemble fusion to further enhance robustness often results in oversized systems with diminishing returns. To address this, we propose an evolutionary multi-objective score fusion framework that jointly minimizes detection error and system complexity. We explore two encodings optimized by NSGA-II: binary-coded detector selection for score averaging and a real-valued scheme that optimizes detector weights for a weighted sum. Experiments on the ASVspoof 5 dataset with 36 SSL-based detectors show that the obtained Pareto fronts outperform simple averaging and logistic regression baselines. The real-valued variant achieves 2.37% EER (0.0684 minDCF) and identifies configurations that match state-of-the-art performance while significantly reducing system complexity, requiring only half the parameters. Our method also provides a diverse set of trade-off solutions, enabling deployment choices that balance accuracy and computational cost.

Updated: 2026-04-01 19:17:59

标题: 深度伪造语音检测器的进化多目标融合

摘要: 虽然基于大型自监督学习（SSL）模型构建的深度伪造语音检测器可以实现高准确性，但使用标准的集成融合来进一步增强鲁棒性通常会导致系统过大且回报递减。为了解决这个问题，我们提出了一个进化多目标得分融合框架，同时最小化检测误差和系统复杂性。我们探索了两种由NSGA-II优化的编码方式：用于得分平均的二进制编码检测器选择和优化检测器权重以获得加权和的实值方案。在包含36个基于SSL的检测器的ASVspoof 5数据集上进行的实验表明，所得到的帕累托前沿优于简单平均和逻辑回归基线。实值变体实现了2.37%的EER（0.0684的minDCF）并识别出与最先进性能相匹配的配置，同时显著减少系统复杂性，仅需要一半的参数。我们的方法还提供了多样的权衡解决方案集，使得可以进行平衡准确性和计算成本的部署选择。

更新时间: 2026-04-01 19:17:59

领域: cs.SD,cs.AI,cs.CR,cs.LG,cs.NE

下载: http://arxiv.org/abs/2604.01330v1

HippoCamp: Benchmarking Contextual Agents on Personal Computers

We present HippoCamp, a new benchmark designed to evaluate agents' capabilities on multimodal file management. Unlike existing agent benchmarks that focus on tasks like web interaction, tool use, or software automation in generic settings, HippoCamp evaluates agents in user-centric environments to model individual user profiles and search massive personal files for context-aware reasoning. Our benchmark instantiates device-scale file systems over real-world profiles spanning diverse modalities, comprising 42.4 GB of data across over 2K real-world files. Building upon the raw files, we construct 581 QA pairs to assess agents' capabilities in search, evidence perception, and multi-step reasoning. To facilitate fine-grained analysis, we provide 46.1K densely annotated structured trajectories for step-wise failure diagnosis. We evaluate a wide range of state-of-the-art multimodal large language models (MLLMs) and agentic methods on HippoCamp. Our comprehensive experiments reveal a significant performance gap: even the most advanced commercial models achieve only 48.3% accuracy in user profiling, struggling particularly with long-horizon retrieval and cross-modal reasoning within dense personal file systems. Furthermore, our step-wise failure diagnosis identifies multimodal perception and evidence grounding as the primary bottlenecks. Ultimately, HippoCamp exposes the critical limitations of current agents in realistic, user-centric environments and provides a robust foundation for developing next-generation personal AI assistants.

Updated: 2026-04-01 17:58:33

标题: HippoCamp：在个人电脑上对上下文代理进行基准测试

摘要: 我们提出了HippoCamp，这是一个新的基准，旨在评估代理在多模态文件管理方面的能力。与现有的代理基准不同，这些基准侧重于诸如网络交互、工具使用或通用设置中的软件自动化等任务，HippoCamp在用户为中心的环境中评估代理，以建模个人用户配置文件并搜索大量个人文件以进行上下文感知推理。我们的基准在现实世界的文件配置文件上实例化了设备规模的文件系统，覆盖了多种模态，涵盖了超过2K个现实世界文件，总计42.4GB的数据。在基于原始文件的基础上，我们构建了581个问答对，用于评估代理在搜索、证据感知和多步推理方面的能力。为了便于细粒度分析，我们提供了46.1K个密集注释的结构化轨迹，用于逐步故障诊断。我们在HippoCamp上评估了各种最先进的多模态大型语言模型（MLLMs）和代理方法。我们的综合实验揭示了显著的性能差距：即使是最先进的商业模型也只能在用户配置文件中实现48.3%的准确性，特别是在长时间检索和在密集个人文件系统中进行跨模态推理方面表现不佳。此外，我们的逐步故障诊断确定多模态感知和证据基础为主要瓶颈。最终，HippoCamp暴露了当前代理在现实的、以用户为中心的环境中的关键限制，并为开发下一代个人AI助手提供了坚实的基础。

更新时间: 2026-04-01 17:58:33

领域: cs.AI,cs.CV

下载: http://arxiv.org/abs/2604.01221v1

LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)

Reconstructing full spatio-temporal dynamics from sparse observations in both space and time remains a central challenge in complex systems, as measurements can be spatially incomplete and can be also limited to narrow temporal windows. Yet approximating the complete spatio-temporal trajectory is essential for mechanistic insight and understanding, model calibration, and operational decision-making. We introduce LAPIS-SHRED (LAtent Phase Inference from Short time sequence using SHallow REcurrent Decoders), a modular architecture that reconstructs and/or forecasts complete spatiotemporal dynamics from sparse sensor observations confined to short temporal windows. LAPIS-SHRED operates through a three-stage pipeline: (i) a SHRED model is pre-trained entirely on simulation data to map sensor time-histories into a structured latent space, (ii) a temporal sequence model, trained on simulation-derived latent trajectories, learns to propagate latent states forward or backward in time to span unobserved temporal regions from short observational time windows, and (iii) at deployment, only a short observation window of hyper-sparse sensor measurements from the true system is provided, from which the frozen SHRED model and the temporal model jointly reconstruct or forecast the complete spatiotemporal trajectory. The framework supports bidirectional inference, inherits data assimilation and multiscale reconstruction capabilities from its modular structure, and accommodates extreme observational constraints including single-frame terminal inputs. We evaluate LAPIS-SHRED on six experiments spanning complex spatio-temporal physics: turbulent flows, multiscale propulsion physics, volatile combustion transients, and satellite-derived environmental fields, highlighting a lightweight, modular architecture suited for operational settings where observation is constrained by physical or logistical limitations.

Updated: 2026-04-01 17:55:10

标题: 通过浅层递归解码器从短时间序列推断潜在阶段（LAPIS-SHRED）

摘要: 在复杂系统中，从空间和时间上稀疏观测中重建完整的时空动态仍然是一个中心挑战，因为测量可以在空间上不完整，也可以限制在狭窄的时间窗口内。然而，近似完整的时空轨迹对于机械洞察和理解、模型校准以及运营决策是至关重要的。我们介绍了LAPIS-SHRED（使用SHallow REcurrent解码器从短时间序列中推断潜在相位），这是一个模块化架构，可以从仅限于短时间窗口的稀疏传感器观测中重建和/或预测完整的时空动态。LAPIS-SHRED通过一个三阶段流程运行：（i）一个SHRED模型完全在模拟数据上进行预训练，将传感器时间历史映射到结构化的潜在空间，（ii）一个在模拟衍生的潜在轨迹上进行训练的时间序列模型学习如何将潜在状态向前或向后传播，以跨越从短观测时间窗口到未观测时间区域，（iii）在部署时，仅提供真实系统的一个超稀疏传感器测量的短观测窗口，从中冻结的SHRED模型和时间模型共同重建或预测完整的时空轨迹。该框架支持双向推断，从其模块化结构中继承数据同化和多尺度重建能力，并适应包括单帧终端输入在内的极端观测约束。我们在涵盖复杂时空物理学的六个实验上评估了LAPIS-SHRED：湍流流动、多尺度推进物理学、挥发性燃烧瞬态和卫星导出的环境场，突出了适用于观测受到物理或后勤限制的运营环境的轻量级、模块化架构。

更新时间: 2026-04-01 17:55:10

领域: cs.LG,cs.AI,cs.CV

下载: http://arxiv.org/abs/2604.01216v1

The Recipe Matters More Than the Kitchen:Mathematical Foundations of the AI Weather Prediction Pipeline

AI weather prediction has advanced rapidly, yet no unified mathematical framework explains what determines forecast skill. Existing theory addresses specific architectural choices rather than the learning pipeline as a whole, while operational evidence from 2023-2026 demonstrates that training methodology, loss function design, and data diversity matter at least as much as architecture selection. This paper makes two interleaved contributions. Theoretically, we construct a framework rooted in approximation theory on the sphere, dynamical systems theory, information theory, and statistical learning theory that treats the complete learning pipeline (architecture, loss function, training strategy, data distribution) rather than architecture alone. We establish a Learning Pipeline Error Decomposition showing that estimation error (loss- and data-dependent) dominates approximation error (architecture-dependent) at current scales. We develop a Loss Function Spectral Theory formalizing MSE-induced spectral blurring in spherical harmonic coordinates, and derive Out-of-Distribution Extrapolation Bounds proving that data-driven models systematically underestimate record-breaking extremes with bias growing linearly in record exceedance. Empirically, we validate these predictions via inference across ten architecturally diverse AI weather models using NVIDIA Earth2Studio with ERA5 initial conditions, evaluating six metrics across 30 initialization dates spanning all seasons. Results confirm universal spectral energy loss at high wavenumbers for MSE-trained models, rising Error Consensus Ratios showing that the majority of forecast error is shared across architectures, and linear negative bias during extreme events. A Holistic Model Assessment Score provides unified multi-dimensional evaluation, and a prescriptive framework enables mathematical evaluation of proposed pipelines before training.

Updated: 2026-04-01 17:53:51

标题: 食谱比厨房更重要：人工智能天气预测管道的数学基础

摘要: 人工智能天气预测已经迅速发展，但尚无统一的数学框架来解释决定预测技能的因素。现有理论主要涉及特定的架构选择，而不是整个学习流程，而来自2023-2026的操作证据表明，训练方法论、损失函数设计和数据多样性至少与架构选择同等重要。本文提出了两个交叉贡献。从理论上讲，我们构建了一个基于近似理论、动力系统理论、信息论和统计学习理论的框架，该框架处理了完整的学习流程（架构、损失函数、训练策略、数据分布），而不仅仅是架构本身。我们建立了一个学习流程误差分解，表明在当前尺度上，估计误差（损失和数据相关）主导了近似误差（依赖于架构）。我们开发了一种损失函数谱理论，形式化了球面谐波坐标中由MSE引起的谱模糊，并推导出了越界外推界限，证明了数据驱动模型在记录性极端事件中系统地低估了偏差，偏差随着记录超过程度的线性增长。在经验上，我们通过在具有不同架构的十个AI天气模型上进行推理验证了这些预测，使用NVIDIA Earth2Studio和ERA5初始条件，评估了跨越所有季节的30个初始化日期上的六个指标。结果确认了MSE训练模型在高波数上普遍存在谱能量损失，错误共识比率上升，表明大多数预测错误是跨架构共享的，并且在极端事件期间存在线性负偏差。综合模型评估分数提供了统一的多维评估，一个规范框架使得可以在训练之前对提出的流程进行数学评估。

更新时间: 2026-04-01 17:53:51

领域: cs.LG,cs.AI,physics.ao-ph

下载: http://arxiv.org/abs/2604.01215v1

A Self-Improving Architecture for Dynamic Safety in Large Language Models

Context: Large Language Models (LLMs) rely on static, pre-deployment safety mechanisms that cannot adapt to adversarial threats discovered after release. Objective: To design a software architecture enabling LLM-based systems to autonomously detect safety failures and synthesize defense policies at runtime, without retraining or manual intervention. Method: We propose the Self-Improving Safety Framework (SISF), grounded in the MAPE-K reference model. The framework couples a target LLM with a feedback loop: an Adjudicator detects breaches, a Policy Synthesis Module generates dual-mechanism defense policies (heuristic and semantic), and a Warden enforces them. We conducted seven experiments (10,061 evaluations) across four model families. Results: Across five reproducibility trials, SISF achieved a mean Attack Success Rate (ASR) of 0.27% (+/-0.15%), autonomously generating 240 policies per trial. Cross-model evaluation confirmed deployment portability. A held-out test showed a 68.5% proactive interception rate on unseen attacks. Stacked behind Llama Guard 4, the combined defense reduced residual ASR from 7.88% to 0.00%. Ablation confirmed both heuristic and semantic policy types are architecturally required. Conclusion: Self-adaptive architecture is a viable approach to LLM safety. SISF achieves sub-1% ASR through synchronous output monitoring, progressively shifting enforcement to fast, local Warden policies via the MAPE-K loop, offering a new pattern for building resilient AI systems.

Updated: 2026-04-01 17:52:48

标题: 一个自我改进的架构，用于大型语言模型中的动态安全性

摘要: 背景：大型语言模型（LLMs）依赖静态的、部署前的安全机制，无法适应释放后发现的对抗威胁。目标：设计一种软件架构，使基于LLM的系统能够在运行时自主检测安全故障并合成防御策略，无需重新训练或手动干预。方法：我们提出了自我改进安全框架（SISF），基于MAPE-K参考模型。该框架将目标LLM与反馈循环耦合：一个仲裁员检测违规行为，一个策略合成模块生成双重机制的防御策略（启发式和语义），一个典狱长强制执行这些策略。我们在四个模型系列中进行了七次实验（10,061次评估）。结果：在五次可重现性试验中，SISF实现了平均攻击成功率（ASR）为0.27%（+/-0.15%），每次试验自主生成240个策略。跨模型评估证实了部署可移植性。一个隐含的测试显示对未知攻击的主动拦截率为68.5%。在Llama Guard 4之后堆叠，组合防御将残留的ASR从7.88%降低到0.00%。消融实验证实了启发式和语义策略类型在体系结构上的必要性。结论：自适应架构是一种可行的LLM安全方法。通过同步输出监控，SISF通过MAPE-K循环逐渐将执法转移到快速、本地的典狱长策略，提供了建立弹性人工智能系统的新模式。

更新时间: 2026-04-01 17:52:48

领域: cs.SE,cs.AI,cs.CR

下载: http://arxiv.org/abs/2511.07645v2

$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution

As LLM agents tackle increasingly complex tasks, a critical question is whether they can maintain strategic coherence over long horizons: planning under uncertainty, learning from delayed feedback, and adapting when early mistakes compound. We introduce $\texttt{YC-Bench}$, a benchmark that evaluates these capabilities by tasking an agent with running a simulated startup over a one-year horizon spanning hundreds of turns. The agent must manage employees, select task contracts, and maintain profitability in a partially observable environment where adversarial clients and growing payroll create compounding consequences for poor decisions. We evaluate 12 models, both proprietary and open source, across 3 seeds each. Only three models consistently surpass the starting capital of \$200K, with Claude Opus 4.6 achieving the highest average final funds at \$1.27 M, followed by GLM-5 at \$1.21 M at 11$\times$ lower inference cost. Scratchpad usage, the sole mechanism for persisting information across context truncation, is the strongest predictor of success, and adversarial client detection is the primary failure mode, accounting for $47\%$ of bankruptcies. Our analysis reveals that frontier models still fail through distinct failure modes such as over-parallelization, demonstrating the capability gaps for long-horizon performance. $\texttt{YC-Bench}$ is open-source, reproducible, and configurable.

Updated: 2026-04-01 17:52:19

标题: $\texttt{YC-Bench}$：为长期规划和一致执行的AI代理进行基准测试

摘要: 随着LLM代理人处理日益复杂的任务，一个关键问题是它们能否在长期的视野内保持战略连贯性：在不确定性下进行规划，从延迟反馈中学习，并在早期错误累积时进行调整。我们介绍了$\texttt{YC-Bench}$，这是一个基准测试，通过让代理人在跨越数百个回合的一年期间内经营一个模拟初创企业来评估这些能力。代理人必须管理员工，选择任务合同，并在部分可观察的环境中保持盈利，其中敌对客户和增长的工资单会对糟糕的决策产生复合影响。我们评估了12个模型，包括专有和开源的模型，每个模型使用3个种子。只有三个模型始终超过了起始资本的\$200K，其中Claude Opus 4.6在平均最终资金上取得了最高的\$1.27M，其次是GLM-5，在11倍更低的推理成本下达到了\$1.21M。草稿使用是在上下文截断中保持信息的唯一机制，是成功的最强预测因素，敌对客户检测是主要的失败模式，占破产案例的47％。我们的分析显示，前沿模型仍然通过不同的失败模式（如过度并行化）失败，展示了长期表现的能力差距。$\texttt{YC-Bench}$是开源的，可重现的，并且可配置的。

更新时间: 2026-04-01 17:52:19

领域: cs.CL,cs.AI

下载: http://arxiv.org/abs/2604.01212v1

CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery

Scientific algorithm discovery is iterative: hypotheses are proposed, implemented, stress-tested, and revised. Current LLM-guided search systems accelerate proposal generation, but often under-represent scientific structure by optimizing code-only artifacts with weak correctness/originality gating. We present CliffSearch, an agentic evolutionary framework in which the core evolution operators (pair selection, crossover, mutation, and review) are implemented as LLM agents, and the loop is designed around three principles: (1) each node is a structured scientific artifact, instantiated in either theory+code or code_only mode, (2) reviewer judgments of correctness and originality are first-class selection gates alongside optimization of the benchmark metric of interest, and (3) mutation is split into exploration and correction pathways with distinct objectives. Exploration mutation imports ideas from adjacent scientific domains to increase novelty, while correction mutation performs targeted evidence-guided repair using reviewer signals over theory, code, benchmark results, and runtime errors. We illustrate the framework on three benchmark-grounded studies: transformer hyper-connection evolution, optimizer discovery on a fixed nanoGPT stack, and a smaller native-optimizer ablation. Across these settings, the same loop supports explicit metric direction, reproducible persistence, and reviewer-gated comparison of discoveries under controlled search conditions. The result is a discovery workflow that prioritizes scientific interpretability and correctness while optimizing task metrics under controlled novelty constraints, rather than maximizing candidate throughput alone. Full run artifacts, interactive visualizations, and exported best nodes for the reported studies are available at https://cliffsearch.ai .

Updated: 2026-04-01 17:51:26

标题: CliffSearch：科学算法发现中的结构化主体共同进化理论和代码

摘要: 科学算法的发现是迭代的：假设被提出、实施、经过压力测试并进行修订。当前的LLM引导搜索系统加速了提议的生成，但往往通过优化仅包含代码的人工制品，弱化了科学结构。我们提出了CliffSearch，这是一个具有代理性的进化框架，其中核心演化操作（配对选择、交叉、突变和审查）被实现为LLM代理，循环围绕三个原则设计：（1）每个节点都是一个结构化的科学人工制品，实例化为理论+代码或仅代码模式，（2）正确性和原创性的评审判断是第一类选择门，同时优化感兴趣的基准度量，（3）突变被分为探索和纠错路径，具有不同的目标。探索突变从相邻的科学领域引入想法以增加新颖性，而纠错突变则使用审稿人信号对理论、代码、基准结果和运行时错误进行有针对性的修复。我们在三个基准研究中展示了该框架：变压器超连接演化、在固定的nanoGPT堆栈上进行优化器发现，以及较小的本地优化器消融。在这些设置中，相同的循环支持明确的度量方向、可重现的持久性，并在受控搜索条件下进行发现的审稿人门控比较。结果是一种优先考虑科学可解释性和正确性的发现工作流程，同时在受控的新颖性约束下优化任务指标，而不仅仅是最大化候选量。报告研究的完整运行人工制品、交互式可视化和导出的最佳节点可在https://cliffsearch.ai 上找到。

更新时间: 2026-04-01 17:51:26

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.01210v1

LLM REgression with a Latent Iterative State Head

We present RELISH (REgression with a Latent Iterative State Head), a novel, lightweight architecture designed for text regression with large language models. Rather than decoding numeric targets as text or aggregating multiple generated outputs, RELISH predicts scalar values directly from frozen LLM representations by iteratively refining a learned latent state through cross-attention over token-level representations, and then mapping the final state to a point estimate with a linear regressor. Across five datasets, four LLM backbones, and two LLM training regimes, RELISH consistently outperforms prior baselines from all three major LLM regression families, including autoregressive decoding, regression-aware inference, and existing predictive head methods. Despite these gains, RELISH remains highly parameter-efficient, requiring only 3.4-3.7M trainable parameters across frozen LLM backbones (only 0.01-0.04% additional overhead), far less than LoRA-based alternatives that grow with model size (0.26-0.42%).

Updated: 2026-04-01 17:50:32

标题: 用带有潜在迭代状态头的LLM回归

摘要: 我们提出了RELISH（具有潜在迭代状态头的回归），这是一种新颖的轻量级架构，专为使用大型语言模型进行文本回归而设计。与将数字目标解码为文本或汇总多个生成的输出不同，RELISH通过在标记级别的表示上交叉注意力，通过迭代地优化学习到的潜在状态，然后将最终状态映射到一个点估计值，直接从冻结的LLM表示中预测标量值。在五个数据集、四个LLM骨干和两种LLM训练制度中，RELISH始终优于先前三个主要LLM回归家族（包括自回归解码、回归感知推理和现有预测头方法）的基线。尽管取得了这些收益，RELISH仍然具有高参数效率，仅在冻结的LLM骨干上需要3.4-3.7M可训练参数（仅增加0.01-0.04%的额外开销），远远少于随着模型大小增长而增加的基于LoRA的替代方案（0.26-0.42%）。

更新时间: 2026-04-01 17:50:32

领域: cs.CL,cs.LG

下载: http://arxiv.org/abs/2604.01206v1

Neural Harmonic Textures for High-Quality Primitive Based Neural Reconstruction

Primitive-based methods such as 3D Gaussian Splatting have recently become the state-of-the-art for novel-view synthesis and related reconstruction tasks. Compared to neural fields, these representations are more flexible, adaptive, and scale better to large scenes. However, the limited expressivity of individual primitives makes modeling high-frequency detail challenging. We introduce Neural Harmonic Textures, a neural representation approach that anchors latent feature vectors on a virtual scaffold surrounding each primitive. These features are interpolated within the primitive at ray intersection points. Inspired by Fourier analysis, we apply periodic activations to the interpolated features, turning alpha blending into a weighted sum of harmonic components. The resulting signal is then decoded in a single deferred pass using a small neural network, significantly reducing computational cost. Neural Harmonic Textures yield state-of-the-art results in real-time novel view synthesis while bridging the gap between primitive- and neural-field-based reconstruction. Our method integrates seamlessly into existing primitive-based pipelines such as 3DGUT, Triangle Splatting, and 2DGS. We further demonstrate its generality with applications to 2D image fitting and semantic reconstruction.

Updated: 2026-04-01 17:48:22

标题: 神经和谐纹理用于高质量基于基元的神经重建

摘要: 原始基于方法，如3D高斯Splatting最近已成为新视图合成和相关重建任务的最先进技术。与神经场相比，这些表示更加灵活，适应性更强，能更好地适应大场景。然而，个体基元的有限表达能力使得建模高频细节具有挑战性。我们引入神经谐波纹理，一种神经表示方法，将潜在特征向量锚定在围绕每个基元的虚拟支架上。这些特征在基元内的射线交点处进行插值。受傅里叶分析启发，我们将周期激活应用于插值特征，将alpha混合转化为谐波分量的加权和。然后，使用小型神经网络在单个延迟传递中解码结果信号，显著降低计算成本。神经谐波纹理在实时新视图合成中产生了最先进的结果，同时弥合了基于基元和基于神经场的重建之间的差距。我们的方法可以无缝集成到现有的基于基元的管道中，如3DGUT、Triangle Splatting和2DGS。我们进一步演示其通用性，应用于2D图像拟合和语义重建。

更新时间: 2026-04-01 17:48:22

领域: cs.CV,cs.AI,cs.GR,cs.LG

下载: http://arxiv.org/abs/2604.01204v1

Therefore I am. I Think

We consider the question: when a large language reasoning model makes a choice, did it think first and then decide to, or decide first and then think? In this paper, we present evidence that detectable, early-encoded decisions shape chain-of-thought in reasoning models. Specifically, we show that a simple linear probe successfully decodes tool-calling decisions from pre-generation activations with very high confidence, and in some cases, even before a single reasoning token is produced. Activation steering supports this causally: perturbing the decision direction leads to inflated deliberation, and flips behavior in many examples (between 7 - 79% depending on model and benchmark). We also show through behavioral analysis that, when steering changes the decision, the chain-of-thought process often rationalizes the flip rather than resisting it. Together, these results suggest that reasoning models can encode action choices before they begin to deliberate in text.

Updated: 2026-04-01 17:46:23

标题: 因此我是。我思故我在。

摘要: 我们考虑一个问题：当一个大型语言推理模型做出选择时，它是先思考然后再决定，还是先决定然后再思考？在本文中，我们提供证据表明可以检测到的早期编码的决定塑造了推理模型的思维链。具体地，我们展示了一个简单的线性探测器可以成功地从预生成激活中高度自信地解码工具调用决策，有时甚至在产生一个推理令牌之前。激活引导支持这种因果关系：扰动决策方向会导致膨胀的思考过程，并在许多例子中改变行为（根据模型和基准测试不同，变化在7-79%之间）。我们还通过行为分析展示，当引导改变决定时，思维链过程通常会合理化这种改变而不是抵制它。综合这些结果表明，推理模型可以在文本中开始深思之前编码行动选择。

更新时间: 2026-04-01 17:46:23

领域: cs.AI

下载: http://arxiv.org/abs/2604.01202v1

Learning and Generating Mixed States Prepared by Shallow Channel Circuits

Learning quantum states from measurement data is a central problem in quantum information and computational complexity. In this work, we study the problem of learning to generate mixed states on a finite-dimensional lattice. Motivated by recent developments in mixed state phases of matter, we focus on arbitrary states in the trivial phase. A state belongs to the trivial phase if there exists a shallow preparation channel circuit under which local reversibility is preserved throughout the preparation. We prove that any mixed state in this class can be efficiently learned from measurement access alone. Specifically, given copies of an unknown trivial phase mixed state, our algorithm outputs a shallow local channel circuit that approximately generates this state in trace distance. The sample complexity and runtime are polynomial (or quasi-polynomial) in the number of qubits, assuming constant (or polylogarithmic) circuit depth and gate locality. Importantly, the learner is not given the original preparation circuit and relies only on its existence. Our results provide a structural foundation for quantum generative models based on shallow channel circuits. In the classical limit, our framework also inspires an efficient algorithm for classical diffusion models using only a polynomial overhead of training and generation.

Updated: 2026-04-01 17:42:56

标题: 学习和生成由浅通道电路准备的混合态

摘要: 学习从测量数据中学习量子态是量子信息和计算复杂性中的一个核心问题。在这项工作中，我们研究了学习在有限维晶格上生成混合态的问题。受混合态物质相最近的发展启发，我们专注于平凡相中的任意态。如果存在一个浅层制备通道电路，使得在整个制备过程中局部可逆性得以保持，那么状态属于平凡相。我们证明了这一类中的任何混合态都可以仅通过测量访问有效地学习。具体来说，给定一个未知的平凡相混合态的副本，我们的算法输出一个近似生成该状态的浅层局部通道电路，其迹距离。样本复杂度和运行时间是多项式的（或者准多项式的），假设电路深度和门定位是恒定的（或者对数多项式的）。重要的是，学习者没有原始的制备电路，只能依赖其存在。我们的结果为基于浅层通道电路的量子生成模型提供了结构基础。在经典极限下，我们的框架还启发了一个仅使用多项式训练和生成的高效的经典扩散模型算法。

更新时间: 2026-04-01 17:42:56

领域: quant-ph,cond-mat.stat-mech,cs.CC,cs.LG

下载: http://arxiv.org/abs/2604.01197v1

ORBIT: Scalable and Verifiable Data Generation for Search Agents on a Tight Budget

Search agents, which integrate language models (LMs) with web search, are becoming crucial for answering complex user queries. Constructing training datasets for deep research tasks, involving multi-step retrieval and reasoning, remains challenging due to expensive human annotation, or cumbersome prerequisites. In this work, we introduce ORBIT, a training dataset with 20K reasoning-intensive queries with short verifiable answers, generated using a frugal framework without relying on paid API services. The modular framework relies on four stages: seed creation, question--answer pair generation, and two stages of verification: self and external. ORBIT spans 15 domains and each training pair requires 4--5 reasoning steps, with external search verification required from the complete web. We train Qwen3-4B as the base model on ORBIT using GRPO and evaluate it on Wikipedia question answering tasks. Extensive experiment results demonstrate that ORBIT-4B achieves strong performance among sub-4B LLMs as search agents, proving the utility of synthetic datasets. Our framework, code and datasets are open-sourced and available publicly.

Updated: 2026-04-01 17:42:41

标题: ORBIT：紧密预算下搜索代理的可扩展和可验证数据生成

摘要: 搜索代理，将语言模型（LMs）与网络搜索集成，对于回答复杂用户查询至关重要。构建用于深度研究任务的训练数据集，涉及多步检索和推理，由于昂贵的人工标注或繁琐的先决条件，仍然具有挑战性。在这项工作中，我们引入了ORBIT，一个包含20K个理性密集型查询的训练数据集，带有简短可验证的答案，使用节俭的框架生成，而不依赖付费API服务。这个模块化框架依赖于四个阶段：种子创建、问题-答案对生成，以及两个验证阶段：自身验证和外部验证。ORBIT横跨15个领域，每个训练对需要4-5个推理步骤，需要从完整的网络进行外部搜索验证。我们在ORBIT上使用GRPO训练Qwen3-4B作为基础模型，并在维基百科问答任务上进行评估。广泛的实验结果表明，ORBIT-4B在低于4B LLMs的搜索代理中取得了强大的性能，证明了合成数据集的实用性。我们的框架、代码和数据集是开源的，公开可用。

更新时间: 2026-04-01 17:42:41

领域: cs.CL,cs.AI,cs.IR

下载: http://arxiv.org/abs/2604.01195v1

AgentWatcher: A Rule-based Prompt Injection Monitor

Large language models (LLMs) and their applications, such as agents, are highly vulnerable to prompt injection attacks. State-of-the-art prompt injection detection methods have the following limitations: (1) their effectiveness degrades significantly as context length increases, and (2) they lack explicit rules that define what constitutes prompt injection, causing detection decisions to be implicit, opaque, and difficult to reason about. In this work, we propose AgentWatcher to address the above two limitations. To address the first limitation, AgentWatcher attributes the LLM's output (e.g., the action of an agent) to a small set of causally influential context segments. By focusing detection on a relatively short text, AgentWatcher can be scalable to long contexts. To address the second limitation, we define a set of rules specifying what does and does not constitute a prompt injection, and use a monitor LLM to reason over these rules based on the attributed text, making the detection decisions more explainable. We conduct a comprehensive evaluation on tool-use agent benchmarks and long-context understanding datasets. The experimental results demonstrate that AgentWatcher can effectively detect prompt injection and maintain utility without attacks. The code is available at https://github.com/wang-yanting/AgentWatcher.

Updated: 2026-04-01 17:40:03

标题: AgentWatcher：基于规则的提示注入监视器

摘要: 大型语言模型（LLMs）及其应用，如代理，对提示注入攻击非常容易受到攻击。最先进的提示注入检测方法存在以下限制：（1）随着上下文长度的增加，它们的有效性显著降低，（2）它们缺乏明确的规则来定义什么构成提示注入，导致检测决策是隐式的、不透明的，并且难以理解。在这项工作中，我们提出了AgentWatcher来解决上述两个限制。为了解决第一个限制，AgentWatcher将LLM的输出（例如代理的行为）归因于一小组因果影响的上下文段。通过将检测集中在相对较短的文本上，AgentWatcher可以扩展到长上下文。为了解决第二个限制，我们定义了一组规则，指定什么构成提示注入，以及什么不构成提示注入，并使用监控LLM根据归因文本推理这些规则，使检测决策更具可解释性。我们对工具使用代理基准和长上下文理解数据集进行了全面评估。实验结果表明，AgentWatcher可以有效检测提示注入并在没有攻击的情况下保持效用。代码可在https://github.com/wang-yanting/AgentWatcher获得。

更新时间: 2026-04-01 17:40:03

领域: cs.CR

下载: http://arxiv.org/abs/2604.01194v1

CRoPE: Efficient Parametrization of Rotary Positional Embedding

Rotary positional embedding has become the state-of-the-art approach to encode position information in transformer-based models. While it is often succinctly expressed in complex linear algebra, we note that the actual implementation of $Q/K/V$-projections is not equivalent to a complex linear transformation. We argue that complex linear transformation is a more natural parametrization and saves near 50\% parameters within the attention block. We show empirically that removing such redundancy has negligible impact on the model performance. Our modification achieves more efficient parameter usage, as well as a cleaner interpretation of the representation space.

Updated: 2026-04-01 17:35:04

标题: CRoPE：旋转位置嵌入的有效参数化

摘要: 旋转位置嵌入已成为在基于变压器的模型中编码位置信息的最先进方法。虽然它通常以复杂的线性代数简洁表达，但我们注意到$Q/K/V$-投影的实际实现并不等同于复杂的线性变换。我们认为复杂线性变换是一种更自然的参数化方式，并且在注意力块内节省近50\%的参数。我们凭经验证明去除这种冗余对模型性能的影响微乎其微。我们的修改实现了更高效的参数使用，以及对表示空间的更清晰解释。

更新时间: 2026-04-01 17:35:04

领域: cs.LG

下载: http://arxiv.org/abs/2601.02728v2

SA-CycleGAN-2.5D: Self-Attention CycleGAN with Tri-Planar Context for Multi-Site MRI Harmonization

Multi-site neuroimaging analysis is fundamentally confounded by scanner-induced covariate shifts, where the marginal distribution of voxel intensities $P(\mathbf{x})$ varies non-linearly across acquisition protocols while the conditional anatomy $P(\mathbf{y}|\mathbf{x})$ remains constant. This is particularly detrimental to radiomic reproducibility, where acquisition variance often exceeds biological pathology variance. Existing statistical harmonization methods (e.g., ComBat) operate in feature space, precluding spatial downstream tasks, while standard deep learning approaches are theoretically bounded by local effective receptive fields (ERF), failing to model the global intensity correlations characteristic of field-strength bias. We propose SA-CycleGAN-2.5D, a domain adaptation framework motivated by the $HΔH$-divergence bound of Ben-David et al., integrating three architectural innovations: (1) A 2.5D tri-planar manifold injection preserving through-plane gradients $\nabla_z$ at $O(HW)$ complexity; (2) A U-ResNet generator with dense voxel-to-voxel self-attention, surpassing the $O(\sqrt{L})$ receptive field limit of CNNs to model global scanner field biases; and (3) A spectrally-normalized discriminator constraining the Lipschitz constant ($K_D \le 1$) for stable adversarial optimization. Evaluated on 654 glioma patients across two institutional domains (BraTS and UPenn-GBM), our method reduces Maximum Mean Discrepancy (MMD) by 99.1% ($1.729 \to 0.015$) and degrades domain classifier accuracy to near-chance (59.7%). Ablation confirms that global attention is statistically essential (Cohen's $d = 1.32$, $p < 0.001$) for the harder heterogeneous-to-homogeneous translation direction. By bridging 2D efficiency and 3D consistency, our framework yields voxel-level harmonized images that preserve tumor pathophysiology, enabling reproducible multi-center radiomic analysis.

Updated: 2026-04-01 17:34:06

标题: SA-CycleGAN-2.5D：具有三平面上下文的自注意力CycleGAN用于多站点MRI协调

摘要: 多中心神经影像分析基本上受到扫描仪诱发的协变量变化的影响，其中体素强度的边缘分布$P(\mathbf{x})$在获取协议之间非线性变化，而条件解剖$P(\mathbf{y}|\mathbf{x})$保持恒定。这对放射组学的可重复性特别有害，因为获取方差通常超过生物病理方差。现有的统计协调方法（例如，ComBat）在特征空间中操作，排除了空间下游任务，而标准的深度学习方法在理论上受到局部有效感受野（ERF）的限制，无法模拟场强偏差的全局强度相关性。我们提出SA-CycleGAN-2.5D，这是一个受Ben-David等人的$HΔH$-散度界限启发的领域适应框架，结合了三个架构创新：（1）一个保持通过平面梯度$\nabla_z$的2.5D三平面流形注射，复杂度为$O(HW)$；（2）一个带有密集体素自我关注的U-ResNet生成器，超越了CNN的$O(\sqrt{L})$感受野限制，以模拟全局扫描仪场偏差；以及（3）一个谱归一化的鉴别器，约束Lipschitz常数（$K_D \le 1$）以稳定对抗优化。在两个机构域（BraTS和UPenn-GBM）的654例胶质瘤患者上进行评估，我们的方法将最大均值差异（MMD）降低了99.1%（从$1.729$到$0.015$），并将域分类器的准确性降低到近似机会水平（59.7%）。消融实验证实全局关注在更难的异质到同质转换方向上在统计上是至关重要的（Cohen's $d = 1.32$，$p < 0.001$）。通过将2D效率和3D一致性联系起来，我们的框架产生了保留肿瘤病理生理学的体素级协调图像，实现了可重复的多中心放射组学分析。

更新时间: 2026-04-01 17:34:06

领域: cs.CV,cs.AI,cs.LG

下载: http://arxiv.org/abs/2603.17219v2

A ROS 2 Wrapper for Florence-2: Multi-Mode Local Vision-Language Inference for Robotic Systems

Foundation vision-language models are becoming increasingly relevant to robotics because they can provide richer semantic perception than narrow task-specific pipelines. However, their practical adoption in robot software stacks still depends on reproducible middleware integrations rather than on model quality alone. Florence-2 is especially attractive in this regard because it unifies captioning, optical character recognition, open-vocabulary detection, grounding and related vision-language tasks within a comparatively manageable model size. This article presents a ROS 2 wrapper for Florence-2 that exposes the model through three complementary interaction modes: continuous topic-driven processing, synchronous service calls and asynchronous actions. The wrapper is designed for local execution and supports both native installation and Docker container deployment. It also combines generic JSON outputs with standard ROS 2 message bindings for detection-oriented tasks. A functional validation is reported together with a throughput study on several GPUs, showing that local deployment is feasible with consumer grade hardware. The repository is publicly available here: https://github.com/JEDominguezVidal/florence2_ros2_wrapper

Updated: 2026-04-01 17:29:59

标题: 一个用于Florence-2的ROS 2封装器：机器人系统的多模式本地视觉-语言推理

摘要: 基于视觉-语言模型的基础正在变得越来越与机器人技术相关，因为它们可以提供比狭窄的特定任务管道更丰富的语义感知。然而，它们在机器人软件堆栈中的实际采用仍然取决于可重现的中间件集成，而不仅仅是模型质量。在这方面，Florence-2尤为吸引人，因为它在一个相对可管理的模型大小内统一了字幕生成、光学字符识别、开放词汇检测、定位和相关的视觉-语言任务。本文介绍了Florence-2的ROS 2封装器，通过三种互补的交互模式公开了模型：连续主题驱动处理、同步服务调用和异步动作。该封装器设计用于本地执行，并支持原生安装和Docker容器部署。它还将通用的JSON输出与用于检测导向任务的标准ROS 2消息绑定结合起来。报告了功能验证以及在多个GPU上的吞吐量研究，表明使用消费级硬件进行本地部署是可行的。该存储库可以在此处公开访问：https://github.com/JEDominguezVidal/florence2_ros2_wrapper.

更新时间: 2026-04-01 17:29:59

领域: cs.RO,cs.AI,cs.CV

下载: http://arxiv.org/abs/2604.01179v1

Screening Is Enough

A core limitation of standard softmax attention is that it does not define a notion of absolute query--key relevance: attention weights are obtained by redistributing a fixed unit mass across all keys according to their relative scores. As a result, relevance is defined only relative to competing keys, and irrelevant keys cannot be explicitly rejected. We introduce Multiscreen, a language-model architecture built around a mechanism we call screening, which enables absolute query--key relevance. Instead of redistributing attention across all keys, screening evaluates each key against an explicit threshold, discarding irrelevant keys and aggregating the remaining keys, thereby removing global competition among keys. Across experiments, Multiscreen achieves comparable validation loss with approximately 40% fewer parameters than a Transformer baseline, enables stable optimization at substantially larger learning rates, maintains strong performance in long-context perplexity, shows little to no degradation in retrieval performance even far beyond the training context length, and reduces inference latency by up to 3.2$\times$ at 100K context length.

Updated: 2026-04-01 17:29:08

标题: 筛查就足够

摘要: 标准softmax注意力的核心局限性在于它没有定义绝对查询-键相关性的概念：注意力权重是通过根据它们的相对分数在所有关键之间重新分配固定单位质量来获得的。因此，相关性仅相对于竞争关键进行定义，并且无关键不能被明确拒绝。我们介绍了Multiscreen，这是一个围绕我们称为筛选的机制构建的语言模型架构，它实现了绝对查询-键相关性。筛选不是在所有关键之间重新分配注意力，而是根据显式阈值评估每个关键，丢弃无关键并聚合剩余的关键，从而消除关键之间的全局竞争。在实验中，Multiscreen在与Transformer基线相比具有大约40%更少参数的情况下实现了可比较的验证损失，能够在大幅增加的学习速率下实现稳定优化，在长上下文困惑度中保持强大性能，在超出训练上下文长度甚远时甚至几乎没有检索性能下降，并且在100K上下文长度下将推理延迟降低了多达3.2倍。

更新时间: 2026-04-01 17:29:08

领域: cs.LG,cs.AI,cs.CL

下载: http://arxiv.org/abs/2604.01178v1

NeuroDDAF: Neural Dynamic Diffusion-Advection Fields with Evidential Fusion for Air Quality Forecasting

Accurate air quality forecasting is crucial for protecting public health and guiding environmental policy, yet it remains challenging due to nonlinear spatiotemporal dynamics, wind-driven transport, and distribution shifts across regions. Physics-based models are interpretable but computationally expensive and often rely on restrictive assumptions, whereas purely data-driven models can be accurate but may lack robustness and calibrated uncertainty. To address these limitations, we propose Neural Dynamic Diffusion-Advection Fields (NeuroDDAF), a physics-informed forecasting framework that unifies neural representation learning with open-system transport modeling. NeuroDDAF integrates (i) a GRU-Graph Attention encoder to capture temporal dynamics and wind-aware spatial interactions, (ii) a Fourier-domain diffusion-advection module with learnable residuals, (iii) a wind-modulated latent Neural ODE to model continuous-time evolution under time-varying connectivity, and (iv) an evidential fusion mechanism that adaptively combines physics-guided and neural forecasts while quantifying uncertainty. Experiments on four urban datasets (Beijing, Shenzhen, Tianjin, and Ancona) across 1-3 day horizons show that NeuroDDAF consistently outperforms strong baselines, including AirPhyNet, achieving up to 9.7% reduction in RMSE and 9.4% reduction in MAE on long-term forecasts. On the Beijing dataset, NeuroDDAF attains an RMSE of 41.63 $μ$g/m$^3$ for 1-day prediction and 48.88 $μ$g/m$^3$ for 3-day prediction, representing the best performance among all compared methods. In addition, NeuroDDAF improves cross-city generalization and yields well-calibrated uncertainty estimates, as confirmed by ensemble variance analysis and case studies under varying wind conditions.

Updated: 2026-04-01 17:27:43

标题: NeuroDDAF：具有证据融合的神经动态扩散-对流场，用于空气质量预测

摘要: 准确的空气质量预测对于保护公共健康和指导环境政策至关重要，然而由于非线性时空动态、风驱动的输运以及区域间分布变化，仍然具有挑战性。基于物理的模型具有可解释性但计算成本高昂，而且往往依赖于限制性假设，而纯数据驱动模型可能准确性较高但可能缺乏鲁棒性和校准不确定性。为了解决这些限制，我们提出了神经动态扩散-平流场（NeuroDDAF），这是一个融合了神经表示学习和开放系统输运建模的基于物理的预测框架。NeuroDDAF整合了（i）一个GRU-图注意力编码器，用于捕捉时间动态和风感知空间交互作用，（ii）一个带有可学习残差的傅里叶域扩散-平流模块，（iii）一个风调制的潜在神经ODE，用于模拟在时间变化的连通性下的连续时间演化，以及（iv）一种证据融合机制，可以自适应地结合物理引导和神经预测，并量化不确定性。在北京、深圳、天津和安科纳等四个城市数据集上进行的实验（1-3天的预测范围）表明，NeuroDDAF始终优于强基线，包括AirPhyNet，在长期预测中可达到9.7%的RMSE减少和9.4%的MAE减少。在北京数据集上，NeuroDDAF实现了1天预测的41.63 μg/m$^3$的RMSE和3天预测的48.88 μg/m$^3$的RMSE，代表了所有比较方法中的最佳性能。此外，NeuroDDAF改善了跨城市的泛化能力，并产生了良好校准的不确定性估计，这得到了集成方差分析和在不同风条件下的案例研究的确认。

更新时间: 2026-04-01 17:27:43

领域: cs.LG

下载: http://arxiv.org/abs/2604.01175v1

Safe learning-based control via function-based uncertainty quantification

Uncertainty quantification is essential when deploying learning-based control methods in safety-critical systems. This is commonly realized by constructing uncertainty tubes that enclose the unknown function of interest, e.g., the reward and constraint functions or the underlying dynamics model, with high probability. However, existing approaches for uncertainty quantification typically rely on restrictive assumptions on the unknown function, such as known bounds on functional norms or Lipschitz constants, and struggle with discontinuities. In this paper, we model the unknown function as a random function from which independent and identically distributed realizations can be generated, and construct uncertainty tubes via the scenario approach that hold with high probability and rely solely on the sampled realizations. We integrate these uncertainty tubes into a safe Bayesian optimization algorithm, which we then use to safely tune control parameters on a real Furuta pendulum.

Updated: 2026-04-01 17:23:30

标题: 安全学习控制：基于功能的不确定性量化

摘要: 不确定性量化在将基于学习的控制方法部署到安全关键系统中时至关重要。通常通过构建包围未知函数的不确定性管道来实现，例如奖励和约束函数或基础动力学模型，并且具有很高的概率。然而，现有的不确定性量化方法通常依赖于对未知函数的限制性假设，例如对功能范数或利普希茨常数的已知边界，并且在处理不连续性时存在困难。在本文中，我们将未知函数建模为一个随机函数，可以从中生成独立且同分布的实现，并通过场景方法构建不确定性管道，这些管道具有很高的概率，并且仅依赖于采样实现。我们将这些不确定性管道整合到一个安全的贝叶斯优化算法中，然后使用该算法安全地调节一个真实的富尔塔摆的控制参数。

更新时间: 2026-04-01 17:23:30

领域: eess.SY,cs.LG,math.OC

下载: http://arxiv.org/abs/2604.01173v1

Online Reasoning Calibration: Test-Time Training Enables Generalizable Conformal LLM Reasoning

While test-time scaling has enabled large language models to solve highly difficult tasks, state-of-the-art results come at exorbitant compute costs. These inefficiencies can be attributed to the miscalibration of post-trained language models, and the lack of calibration in popular sampling techniques. Here, we present Online Reasoning Calibration (ORCA), a framework for calibrating the sampling process that draws upon conformal prediction and test-time training. Specifically, we introduce a meta-learning procedure that updates the calibration module for each input. This allows us to provide valid confidence estimates under distributional shift, e.g. in thought patterns that occur across different stages of reasoning, or in prompt distributions between model development and deployment. ORCA not only provides theoretical guarantees on conformal risks, but also empirically shows higher efficiency and generalization across different reasoning tasks. At risk level $δ=0.1$, ORCA improves Qwen2.5-32B efficiency on in-distribution tasks with savings up to 47.5% with supervised labels and 40.7% with self-consistency labels. Under zero-shot out-of-domain settings, it improves MATH-500 savings from 24.8% of the static calibration baseline to 67.0% while maintaining a low empirical error rate, and the same trend holds across model families and downstream benchmarks. Our code is publicly available at https://github.com/wzekai99/ORCA.

Updated: 2026-04-01 17:21:50

标题: 在线推理校准：测试时间训练实现通用的符合逻辑模型推理

摘要: 尽管测试时间缩放使得大型语言模型能够解决极具挑战性的任务，但目前的最先进结果需要极高的计算成本。这种低效可以归因于事后训练语言模型的校准不准确，以及流行的抽样技术缺乏校准。在这里，我们提出了在线推理校准（ORCA），这是一个基于符合预测和测试时间训练的框架，用于校准抽样过程。具体来说，我们引入了一个元学习过程，为每个输入更新校准模块。这使我们能够在分布转移下提供有效的置信度估计，例如在不同推理阶段中出现的思维模式，或者在模型开发和部署之间的提示分布中出现的情况。ORCA不仅在符合风险上提供理论保证，而且在不同推理任务中实证上显示出更高的效率和泛化能力。在风险水平为$δ=0.1$时，ORCA在分布内任务上提高了Qwen2.5-32B的效率，使用有监督标签可节省高达47.5%，使用自一致性标签可节省40.7%。在零样本域外设置下，它将MATH-500的节省率从静态校准基线的24.8%提高到67.0%，同时保持低的实证错误率，这种趋势在不同模型系列和下游基准中也是一致的。我们的代码可以在https://github.com/wzekai99/ORCA 上公开获取。

更新时间: 2026-04-01 17:21:50

领域: cs.LG,cs.AI,cs.CL,stat.AP,stat.ML

下载: http://arxiv.org/abs/2604.01170v1

Bridging the Simulation-to-Experiment Gap with Generative Models using Adversarial Distribution Alignment

A fundamental challenge in science and engineering is the simulation-to-experiment gap. While we often possess prior knowledge of physical laws, these physical laws can be too difficult to solve exactly for complex systems. Such systems are commonly modeled using simulators, which impose computational approximations. Meanwhile, experimental measurements more faithfully represent the real world, but experimental data typically consists of observations that only partially reflect the system's full underlying state. We propose a data-driven distribution alignment framework that bridges this simulation-to-experiment gap by pre-training a generative model on fully observed (but imperfect) simulation data, then aligning it with partial (but real) observations of experimental data. While our method is domain-agnostic, we ground our approach in the physical sciences by introducing Adversarial Distribution Alignment (ADA). This method aligns a generative model of atomic positions -- initially trained on a simulated Boltzmann distribution -- with the distribution of experimental observations. We prove that our method recovers the target observable distribution, even with multiple, potentially correlated observables. We also empirically validate our framework on synthetic, molecular, and experimental protein data, demonstrating that it can align generative models with diverse observables. Our code is available at https://kaityrusnelson.com/ada/.

Updated: 2026-04-01 17:21:33

标题: 用对抗分布对齐使用生成模型弥合模拟与实验之间的差距

摘要: 科学和工程中的一个基本挑战是模拟与实验之间的差距。虽然我们通常掌握物理定律的先验知识，但这些物理定律对于复杂系统来说可能过于难以精确解决。这种系统通常使用模拟器进行建模，而模拟器会施加计算逼近。与此同时，实验测量更忠实地代表了真实世界，但实验数据通常只包含部分反映系统完整基础状态的观察。我们提出了一个数据驱动的分布对齐框架，通过对完全观察到的（但不完美）模拟数据进行预训练，然后将其与实验数据的部分（但真实）观察对齐，来弥合这种模拟与实验之间的差距。虽然我们的方法与领域无关，但我们通过引入对抗分布对齐（ADA）将我们的方法扎根于物理科学。该方法通过将最初在模拟玻尔兹曼分布上进行训练的原子位置生成模型与实验观察的分布进行对齐。我们证明了我们的方法能够恢复目标可观测分布，即使存在多个、可能相关的可观测量。我们还在合成、分子和实验蛋白数据上对我们的框架进行了经验验证，证明它可以将生成模型与各种可观测量对齐。我们的代码可在https://kaityrusnelson.com/ada/上找到。

更新时间: 2026-04-01 17:21:33

领域: cs.LG,cond-mat.mtrl-sci,q-bio.BM

下载: http://arxiv.org/abs/2604.01169v1

S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models

Using roughly 48 execution-verified HumanEval training solutions, tuning a single initial state matrix per recurrent layer, with zero inference overhead, outperforms LoRA by +10.8 pp (p < 0.001) on HumanEval. The method, which we call S0 tuning, optimizes one state matrix per recurrent layer while freezing all model weights. On Qwen3.5-4B (GatedDeltaNet hybrid), S0 tuning improves greedy pass@1 by +23.6 +/- 1.7 pp (10 seeds). On FalconH1-7B (Mamba-2 hybrid), S0 reaches 71.8% +/- 1.3 and LoRA reaches 71.4% +/- 2.4 (3 seeds), statistically indistinguishable at this sample size while requiring no weight merging. Cross-domain transfer is significant on MATH-500 (+4.8 pp, p = 0.00002, 8 seeds) and GSM8K (+2.8 pp, p = 0.0003, 10 seeds); a text-to-SQL benchmark (Spider) shows no transfer, consistent with the trajectory-steering mechanism. A prefix-tuning control on a pure Transformer (Qwen2.5-3B) degrades performance by -13.9 pp under all nine configurations tested. On Qwen3.5, a per-step state-offset variant reaches +27.1 pp, above both S0 and LoRA but with per-step inference cost. Taken together, the results show that recurrent state initialization is a strong zero-inference-overhead PEFT surface for hybrid language models when verified supervision is scarce. The tuned state is a ~48 MB file; task switching requires no weight merging or model reload. Code and library: https://github.com/jackyoung27/s0-tuning.

Updated: 2026-04-01 17:21:15

标题: S0调整：零额外开销的混合循环-注意模型自适应

摘要: 使用大约48个经过执行验证的HumanEval培训解决方案，调整每个循环层的单个初始状态矩阵，零推理开销，优于LoRA +10.8 pp（p <0.001）在HumanEval上。这种方法，我们称之为S0调优，在冻结所有模型权重的同时，优化每个循环层的一个状态矩阵。在Qwen3.5-4B（GatedDeltaNet混合）上，S0调优将贪婪传递@1提高了+23.6 +/- 1.7 pp（10个种子）。在FalconH1-7B（Mamba-2混合）上，S0达到71.8% +/- 1.3，LoRA达到71.4% +/- 2.4（3个种子），在这个样本大小上，在不需要权重合并的情况下，统计上无法区分。跨领域转移在MATH-500（+4.8 pp，p = 0.00002，8个种子）和GSM8K（+2.8 pp，p = 0.0003，10个种子）上是显著的；一个文本到SQL基准（Spider）显示没有转移，与轨迹调整机制一致。在纯Transformer（Qwen2.5-3B）上进行的前缀调优控制在所有九种配置中都降低了-13.9 pp的性能。在Qwen3.5上，每步状态偏移变体达到+27.1 pp，高于S0和LoRA，但具有每步推理成本。综合考虑，结果表明，当验证监督稀缺时，循环状态初始化是混合语言模型的强零推理开销PEFT表面。调整后的状态是一个约48 MB的文件；任务切换不需要权重合并或模型重新加载。代码和库：https://github.com/jackyoung27/s0-tuning。

更新时间: 2026-04-01 17:21:15

领域: cs.CL,cs.LG

下载: http://arxiv.org/abs/2604.01168v1

AdaLoRA-QAT: Adaptive Low-Rank and Quantization-Aware Segmentation

Chest X-ray (CXR) segmentation is an important step in computer-aided diagnosis, yet deploying large foundation models in clinical settings remains challenging due to computational constraints. We propose AdaLoRA-QAT, a two-stage fine-tuning framework that combines adaptive low-rank encoder adaptation with full quantization-aware training. Adaptive rank allocation improves parameter efficiency, while selective mixed-precision INT8 quantization preserves structural fidelity crucial for clinical reliability. Evaluated across large-scale CXR datasets, AdaLoRA-QAT achieves 95.6% Dice, matching full-precision SAM decoder fine-tuning while reducing trainable parameters by 16.6\times and yielding 2.24\times model compression. A Wilcoxon signed-rank test confirms that quantization does not significantly degrade segmentation accuracy. These results demonstrate that AdaLoRA-QAT effectively balances accuracy, efficiency, and structural trust-worthiness, enabling compact and deployable foundation models for medical image segmentation. Code and pretrained models are available at: https://prantik-pdeb.github.io/adaloraqat.github.io/

Updated: 2026-04-01 17:18:46

标题: AdaLoRA-QAT：自适应低秩和量化感知分割

摘要: 胸透（CXR）分割是计算机辅助诊断中的重要步骤，然而在临床环境中部署大型基础模型仍然具有挑战性，这是由于计算约束。我们提出了AdaLoRA-QAT，这是一个两阶段微调框架，结合了自适应低秩编码器适应性和全量化感知训练。自适应秩分配提高了参数效率，而选择性混合精度INT8量化保留了对临床可靠性至关重要的结构保真度。在大规模CXR数据集上评估，AdaLoRA-QAT实现了95.6%的Dice指标，与全精度SAM解码器微调相匹配，同时通过减少可训练参数16.6倍，实现了2.24倍的模型压缩。Wilcoxon符号秩检验证实，量化并不显著降低分割准确性。这些结果表明，AdaLoRA-QAT有效地平衡了准确性、效率和结构可信度，为医学图像分割提供了紪细和可部署的基础模型。代码和预训练模型可在以下网址找到：https://prantik-pdeb.github.io/adaloraqat.github.io/

更新时间: 2026-04-01 17:18:46

领域: eess.IV,cs.AI,cs.CV

下载: http://arxiv.org/abs/2604.01167v1

Evaluating LLM-Generated ACSL Annotations for Formal Verification

Formal specifications are crucial for building verifiable and dependable software systems, yet generating accurate and verifiable specifications for real-world C programs remains challenging. This paper empirically evaluates the extent to which formal-analysis tools can automatically generate and verify ACSL specifications without human or learning-based assistance. We conduct a controlled study on a recently released dataset of 506 C programs, repurposing it from interactive, developer-driven workflows to an automated evaluation setting. Five ACSL generation systems are compared: a rule-based Python script, Frama-C's RTE plugin, and three large language models--DeepSeek-V3.2, GPT-5.2, and OLMo 3.1 32B Instruct. All generated specifications are verified under identical conditions using the Frama-C WP plugin powered by multiple SMT solvers, allowing a direct comparison of annotation quality, solver sensitivity, and proof stability. Our results provide new empirical evidence on the capabilities and limitations of automated ACSL generation, complementing prior survey-based work.

Updated: 2026-04-01 17:15:52

标题: 评估LLM生成的ACSL注释用于形式验证

摘要: 正式规范对于构建可验证和可靠的软件系统至关重要，然而为现实世界的C程序生成准确和可验证的规范仍然具有挑战性。本文通过实证研究评估了正式分析工具在没有人为或基于学习的辅助的情况下能够自动生成和验证ACSL规范的程度。我们在最近发布的506个C程序数据集上进行了一项受控研究，将其从互动式、开发者驱动的工作流程重新用于自动评估设置。比较了五种ACSL生成系统：基于规则的Python脚本、Frama-C的RTE插件，以及三个大型语言模型--DeepSeek-V3.2、GPT-5.2和OLMo 3.1 32B Instruct。所有生成的规范都在相同条件下使用由多个SMT求解器提供支持的Frama-C WP插件进行验证，从而可以直接比较注释质量、求解器敏感性和证明稳定性。我们的结果提供了关于自动化ACSL生成的能力和局限性的新的实证证据，补充了先前基于调查的工作。

更新时间: 2026-04-01 17:15:52

领域: cs.SE,cs.AI

下载: http://arxiv.org/abs/2602.13851v2

Reasoning Shift: How Context Silently Shortens LLM Reasoning

Large language models (LLMs) exhibiting test-time scaling behavior, such as extended reasoning traces and self-verification, have demonstrated remarkable performance on complex, long-term reasoning tasks. However, the robustness of these reasoning behaviors remains underexplored. To investigate this, we conduct a systematic evaluation of multiple reasoning models across three scenarios: (1) problems augmented with lengthy, irrelevant context; (2) multi-turn conversational settings with independent tasks; and (3) problems presented as a subtask within a complex task. We observe an interesting phenomenon: reasoning models tend to produce much shorter reasoning traces (up to 50%) for the same problem under different context conditions compared to the traces produced when the problem is presented in isolation. A finer-grained analysis reveals that this compression is associated with a decrease in self-verification and uncertainty management behaviors, such as double-checking. While this behavioral shift does not compromise performance on straightforward problems, it might affect performance on more challenging tasks. We hope our findings draw additional attention to both the robustness of reasoning models and the problem of context management for LLMs and LLM-based agents.

Updated: 2026-04-01 17:14:18

标题: 推理转变：背景如何悄悄缩短逻辑推理机制

摘要: 大型语言模型（LLMs）展现出测试时间扩展行为，如扩展推理轨迹和自我验证，在复杂、长期推理任务上表现出卓越性能。然而，这些推理行为的鲁棒性仍然未被充分探讨。为了调查这一问题，我们对多个推理模型在三种情境下进行系统评估：（1）问题中增加了冗长、无关的背景；（2）多轮对话设置中进行独立任务；以及（3）作为复杂任务中的一个子任务呈现的问题。我们观察到一个有趣的现象：与在孤立环境下呈现问题时产生的推理轨迹相比，推理模型在不同情境条件下对相同问题往往产生更短的推理轨迹（高达50%）。更细致的分析显示，这种压缩与自我验证和不确定性管理行为（如反复检查）减少有关。虽然这种行为转变不会影响对简单问题的性能，但可能会影响对更具挑战性任务的表现。我们希望我们的发现引起对推理模型的鲁棒性和LLMs及基于LLMs的代理的背景管理问题的额外关注。

更新时间: 2026-04-01 17:14:18

领域: cs.LG

下载: http://arxiv.org/abs/2604.01161v1

Property-Level Flood Risk Assessment Using AI-Enabled Street-View Lowest Floor Elevation Extraction and ML Imputation Across Texas

This paper argues that AI-enabled analysis of street-view imagery, complemented by performance-gated machine-learning imputation, provides a viable pathway for generating building-specific elevation data at regional scale for flood risk assessment. We develop and apply a three-stage pipeline across 18 areas of interest (AOIs) in Texas that (1) extracts LFE and the height difference between street grade and the lowest floor (HDSL) from Google Street View imagery using the Elev-Vision framework, (2) imputes missing HDSL values with Random Forest and Gradient Boosting models trained on 16 terrain, hydrologic, geographic, and flood-exposure features, and (3) integrates the resulting elevation dataset with Fathom 1-in-100 year inundation surfaces and USACE depth-damage functions to estimate property-specific interior flood depth and expected loss. Across 12,241 residential structures, street-view imagery was available for 73.4% of parcels and direct LFE/HDSL extraction was successful for 49.0% (5,992 structures). Imputation was retained for 13 AOIs where cross-validated performance was defensible, with selected models achieving R suqre values from 0.159 to 0.974; five AOIs were explicitly excluded from prediction because performance was insufficient. The results show that street-view-based elevation mapping is not universally available for every property, but it is sufficiently scalable to materially improve regional flood-risk characterization by moving beyond hazard exposure to structure-level estimates of interior inundation and expected damage. Scientifically, the study advances LFE estimation from a pilot-scale proof of concept to a regional, end-to-end workflow. Practically, it offers a replicable framework for jurisdictions that lack comprehensive Elevation Certificates but need parcel-level information to support mitigation, planning, and flood-risk management.

Updated: 2026-04-01 17:08:43

标题: 利用人工智能技术在得克萨斯州进行街景最低楼层高程提取和机器学习填充的房产级洪水风险评估

摘要: 本文认为，通过人工智能分析街景图像，并结合性能门控的机器学习填充技术，可以为洪水风险评估提供一个可行的途径，以在区域范围内生成建筑物特定的高程数据。我们在德克萨斯州的18个感兴趣区域（AOIs）开发并应用了一个三阶段流程：（1）使用Elev-Vision框架从Google街景图像中提取LFE和街道高度与最低楼层高度之间的差值（HDSL），（2）使用训练于16个地形、水文、地理和洪水暴露特征的随机森林和梯度提升模型填充缺失的HDSL值，并（3）将得到的高程数据集与Fathom 1年100年淹没表面和USACE深度损害函数相结合，估计特定物业内部洪水深度和预期损失。在12,241个住宅结构中，对73.4％的地块提供了街景图像，直接提取LFE/HDSL成功的比例为49.0％（5,992个结构）。填充技术在13个AOIs中被保留，其中交叉验证的性能是可靠的，所选模型的R suqre值从0.159到0.974；另外五个AOIs被明确排除在预测之外，因为性能不足。结果表明，基于街景的高程制图并非普遍适用于每个物业，但足够可扩展，可通过超越危险暴露，实现结构级别的内部淹没和预期损害估算，从而实质性地改善区域洪水风险表征。在科学上，该研究将LFE估计从试点规模的概念验证推进到区域规模的端到端工作流程。在实践上，它为那些缺乏全面高程证书但需要支持减灾、规划和洪水风险管理的地方提供了一个可复制的框架。

更新时间: 2026-04-01 17:08:43

领域: cs.LG

下载: http://arxiv.org/abs/2604.01153v1

Brainstacks: Cross-Domain Cognitive Capabilities via Frozen MoE-LoRA Stacks for Continual LLM Learning

We present Brainstacks, a modular architecture for continual multi-domain fine-tuning of large language models that packages domain expertise as frozen adapter stacks composing additively on a shared frozen base at inference. Five interlocking components: (1) MoE-LoRA with Shazeer-style noisy top-2 routing across all seven transformer projections under QLoRA 4-bit quantization with rsLoRA scaling; (2) an inner loop performing residual boosting by freezing trained stacks and adding new ones; (3) an outer loop training sequential domain-specific stacks with curriculum-ordered dependencies; (4) null-space projection via randomized SVD constraining new stacks to subspaces orthogonal to prior directions, achieving zero forgetting in isolation; (5) an outcome-based sigmoid meta-router trained on empirically discovered domain-combination targets that selectively weights stacks, enabling cross-domain composition. Two boundary experiments: (6) PSN pretraining on a randomly initialized model; (7) per-domain RL (DPO/GRPO) validating compatibility with post-SFT alignment. Validated on TinyLlama-1.1B (4 domains, 9 stacks) and Gemma 3 12B IT (5 domains, 10 stacks), MoE-LoRA achieves 2.5x faster convergence than parameter-matched single LoRA, residual boosting breaks through the single-stack ceiling, and the routed system recovers generation quality destroyed by ungated stack accumulation. The central finding: the outcome-based router discovers that domain stacks encode transferable cognitive primitives (instruction-following clarity, numerical reasoning, procedural logic, chain-of-thought structure) rather than domain-specific knowledge, with medical prompts routing to chat+math stacks in 97% of cases despite zero medical data in those stacks.

Updated: 2026-04-01 17:08:25

标题: Brainstacks：通过冻结的MoE-LoRA堆栈实现跨领域认知能力，用于持续的LLM学习

摘要: 我们提出了Brainstacks，这是一个用于大型语言模型持续多领域微调的模块化架构，它将领域专业知识打包成冻结的适配器堆栈，这些堆栈在推断时以加法方式组合在共享的冻结基础上。五个相互关联的组件：(1)包括在QLoRA 4位量化下通过Shazeer风格的嘈杂的前2路由跨越所有七个变压器投影的MoE-LoRA，带有rsLoRA缩放；(2)一个内循环通过冻结训练堆栈并添加新堆栈来执行残差增强；(3)一个外循环通过课程顺序依赖来训练序列领域特定堆栈；(4)通过随机SVD进行的零空间投影，将新堆栈约束到与先前方向正交的子空间，实现孤立时的零遗忘；(5)一个基于结果的S型元路由器，经过实证发现的领域组合目标进行训练，这个路由器选择性地对堆栈进行加权，实现跨领域组合。两个边界实验：(6)在随机初始化模型上进行PSN预训练；(7)每个领域的RL(DPO/GRPO)验证与后SFT对齐的兼容性。在TinyLlama-1.1B(4个领域，9个堆栈)和Gemma 3 12B IT(5个领域，10个堆栈)上验证了MoE-LoRA比参数匹配的单一LoRA快2.5倍收敛速度，残差增强突破了单一堆栈的限制，路由系统恢复了由未门控堆积破坏的生成质量。中心发现：基于结果的路由器发现，领域堆栈编码了可转移的认知基元(指令遵循清晰度、数值推理、程序逻辑、思维链结构)，而不是领域特定知识，尽管这些堆栈中没有医疗数据，但在97%的情况下，医疗提示路由到聊天+数学堆栈。

更新时间: 2026-04-01 17:08:25

领域: cs.CL,cs.AI

下载: http://arxiv.org/abs/2604.01152v1

Detecting Multi-Agent Collusion Through Multi-Agent Interpretability

As LLM agents are increasingly deployed in multi-agent systems, they introduce risks of covert coordination that may evade standard forms of human oversight. While linear probes on model activations have shown promise for detecting deception in single-agent settings, collusion is inherently a multi-agent phenomenon, and the use of internal representations for detecting collusion between agents remains unexplored. We introduce NARCBench, a benchmark for evaluating collusion detection under environment distribution shift, and propose five probing techniques that aggregate per-agent deception scores to classify scenarios at the group level. Our probes achieve 1.00 AUROC in-distribution and 0.60--0.86 AUROC when transferred zero-shot to structurally different multi-agent scenarios and a steganographic blackjack card-counting task. We find that no single probing technique dominates across all collusion types, suggesting that different forms of collusion manifest differently in activation space. We also find preliminary evidence that this signal is localised at the token level, with the colluding agent's activations spiking specifically when processing the encoded parts of their partner's message. This work takes a step toward multi-agent interpretability: extending white-box inspection from single models to multi-agent contexts, where detection requires aggregating signals across agents. These results suggest that model internals provide a complementary signal to text-level monitoring for detecting multi-agent collusion, particularly for organisations with access to model activations. Code and data are available at https://github.com/aaronrose227/narcbench.

Updated: 2026-04-01 17:08:05

标题: 通过多智能体可解释性检测多智能体合谋

摘要: 随着LLM代理越来越多地部署在多智能体系统中，它们引入了潜在的秘密协调风险，可能会逃避标准形式的人类监督。虽然在单一智能体环境中对模型激活进行线性探测已经显示出检测欺骗的潜力，但勾结是一个固有的多智能体现象，利用内部表示来检测代理之间的勾结尚未被探索。我们介绍了NARCBench，这是一个评估在环境分布转移下检测勾结的基准，并提出了五种探测技术，将每个代理的欺骗分数聚合起来，以对情景进行群体级别的分类。我们的探测技术在分布内达到1.00 AUROC，在零样本转移到结构不同的多智能体情景和隐写术二十一点牌计数任务时，达到了0.60 - 0.86 AUROC。我们发现没有一种单一的探测技术能在所有类型的勾结中占主导地位，这表明不同形式的勾结在激活空间中表现出不同的方式。我们还发现初步证据表明，这个信号在令牌级别上是局部化的，当处理其合作伙伴消息的编码部分时，勾结代理的激活会出现明显增加。这项工作朝着多智能体可解释性迈出了一步：将白盒检查从单一模型扩展到多智能体环境，其中检测需要跨智能体聚合信号。这些结果表明，模型内部为检测多智能体勾结提供了一种补充信号，尤其适用于具有访问模型激活权限的组织。代码和数据可在https://github.com/aaronrose227/narcbench 上找到。

更新时间: 2026-04-01 17:08:05

领域: cs.AI,cs.LG,cs.MA

下载: http://arxiv.org/abs/2604.01151v1

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

As Large Language Models (LLMs) for code increasingly utilize massive, often non-permissively licensed datasets, evaluating data contamination through Membership Inference Attacks (MIAs) has become critical. We propose SERSEM (Selective Entropy-Weighted Scoring for Membership Inference), a novel white-box attack framework that suppresses uninformative syntactical boilerplate to amplify specific memorization signals. SERSEM utilizes a dual-signal methodology: first, a continuous character-level weight mask is derived through static Abstract Syntax Tree (AST) analysis, spellchecking-based multilingual logic detection, and offline linting. Second, these heuristic weights are used to pool internal transformer activations and calibrate token-level Z-scores from the output logits. Evaluated on a 25,000-sample balanced dataset, SERSEM achieves a global AUC-ROC of 0.7913 on the StarCoder2-3B model and 0.7867 on the StarCoder2-7B model, consistently outperforming the implemented probability-based baselines Loss, Min-K% Prob, and PAC. Our findings demonstrate that focusing on human-centric coding anomalies provides a significantly more robust indicator of verbatim memorization than sequence-level probability averages.

Updated: 2026-04-01 17:03:58

标题: SERSEM：选择性熵加权评分用于代码语言模型中的成员推断

摘要: 随着用于代码的大型语言模型（LLMs）越来越多地利用庞大的、通常是非许可的数据集，通过成员推理攻击（MIAs）评估数据污染变得至关重要。我们提出了SERSEM（选择性熵加权评分进行成员推理），这是一个新颖的白盒攻击框架，它抑制了无信息量的语法样板，以放大特定的记忆信号。SERSEM利用双信号方法：首先，通过静态的抽象语法树（AST）分析、基于拼写检查的多语言逻辑检测和离线检查，导出连续字符级权重掩模。其次，这些启发式权重用于汇集内部变换器激活并校准来自输出对数的令牌级Z分数。在一个包含25,000个样本的平衡数据集上评估，SERSEM在StarCoder2-3B模型上实现了全局AUC-ROC为0.7913，在StarCoder2-7B模型上为0.7867，始终优于实施的基于概率的基线损失、最小K％概率和PAC。我们的研究结果表明，关注人类中心的编码异常提供了比序列级概率平均值更为稳健的文字记忆指示器。

更新时间: 2026-04-01 17:03:58

领域: cs.SE,cs.CR

下载: http://arxiv.org/abs/2604.01147v1

Deep Reinforcement Learning for Robotic Manipulation under Distribution Shift with Bounded Extremum Seeking

Reinforcement learning has shown strong performance in robotic manipulation, but learned policies often degrade in performance when test conditions differ from the training distribution. This limitation is especially important in contact-rich tasks such as pushing and pick-and-place, where changes in goals, contact conditions, or robot dynamics can drive the system out-of-distribution at inference time. In this paper, we investigate a hybrid controller that combines reinforcement learning with bounded extremum seeking to improve robustness under such conditions. In the proposed approach, deep deterministic policy gradient (DDPG) policies are trained under standard conditions on the robotic pushing and pick-and-place tasks, and are then combined with bounded ES during deployment. The RL policy provides fast manipulation behavior, while bounded ES ensures robustness of the overall controller to time variations when operating conditions depart from those seen during training. The resulting controller is evaluated under several out-of-distribution settings, including time-varying goals and spatially varying friction patches.

Updated: 2026-04-01 16:59:01

标题: 深度强化学习在受限极值搜索下的机器人操作中的应用

摘要: 强化学习在机器人操作中表现出色，但当测试条件与训练分布不同时，学习策略往往性能下降。这种限制在接触丰富的任务中尤为重要，例如推动和拾取放置，其中目标、接触条件或机器人动态的变化可能会在推断时使系统处于分布之外。在本文中，我们研究了一种将强化学习与有界极值寻找相结合的混合控制器，以提高在这种条件下的鲁棒性。在所提出的方法中，使用深度确定性策略梯度（DDPG）策略在机器人推动和拾取放置任务的标准条件下进行训练，然后在部署期间与有界ES结合。强化学习策略提供快速操纵行为，而有界ES确保整体控制器在操作条件偏离训练期间所见条件时具有鲁棒性。生成的控制器在几种分布之外的设置下进行评估，包括时间变化目标和空间变化摩擦补丁。

更新时间: 2026-04-01 16:59:01

领域: cs.RO,cs.LG

下载: http://arxiv.org/abs/2604.01142v1

Looking into a Pixel by Nonlinear Unmixing -- A Generative Approach

Due to the large footprint of pixels in remote sensing imagery, hyperspectral unmixing (HU) has become an important and necessary procedure in hyperspectral image analysis. Traditional HU methods rely on a prior spectral mixing model, especially for nonlinear mixtures, which has largely limited the performance and generalization capacity of the unmixing approach. In this paper, we address the challenging problem of hyperspectral nonlinear unmixing (HNU) without explicit knowledge of the mixing model. Inspired by the principle of generative models, where images of the same distribution can be generated as that of the training images without knowing the exact probability distribution function of the image, we develop an invertible mixing-unmixing process via a bi-directional GAN framework, constrained by both the cycle consistency and the linkage between linear and nonlinear mixtures. The combination of cycle consistency and linear linkage provides powerful constraints without requiring an explicit mixing model. We refer to the proposed approach as the linearly-constrained CycleGAN unmixing net, or LCGU net. Experimental results indicate that the proposed LCGU net exhibits stable and competitive performance across different datasets compared with other state-of-the-art model-based HNU methods.

Updated: 2026-04-01 16:58:05

标题: 通过非线性解混合方法研究像素——一种生成式方法

摘要: 由于遥感图像中像素的大足迹，高光谱解混（HU）已成为高光谱图像分析中重要且必要的过程。传统的HU方法依赖于先验光谱混合模型，特别是对于非线性混合物，这在很大程度上限制了解混方法的性能和泛化能力。本文针对没有混合模型的高光谱非线性解混（HNU）这一具有挑战性的问题进行了研究。受生成模型原理的启发，即可以生成与训练图像相同分布的图像，而不需要知道图像的确切概率分布函数，我们通过双向GAN框架开发了一种可逆的混合-解混过程，受到循环一致性和线性与非线性混合之间联系的约束。循环一致性和线性联系的结合提供了强大的约束条件，而不需要明确的混合模型。我们将提出的方法称为线性约束的CycleGAN解混网络，或LCGU网络。实验结果表明，与其他最先进的基于模型的HNU方法相比，提出的LCGU网络在不同数据集上表现稳定且有竞争力的性能。

更新时间: 2026-04-01 16:58:05

领域: cs.CV,cs.AI,eess.IV

下载: http://arxiv.org/abs/2604.01141v1

Obfuscating Code Vulnerabilities against Static Analysis in JavaScript Code

Code obfuscation is widely adopted in modern software development to protect intellectual property and hinder reverse engineering, but it also provides attackers with a powerful means to conceal malicious logic inside otherwise legitimate JavaScript code. In a software supply chain where a single compromised package can affect thousands of applications, this raises a critical question: how robust are the Static Application Security Testing (SAST) tools that CI/CD pipelines rely on as automated security gatekeepers? This paper answers that question by empirically quantifying the impact of JavaScript obfuscation on state-of-practice SAST. We define a realistic supply-chain threat model in which an adversary injects vulnerable code and iteratively obfuscates it until the pipeline reports a clean scan. To measure the resulting degradation, we introduce the Vulnerability Detection Loss (VDL) metric and conduct a two-phase study. First, we analyze 16 vulnerable-by-design Node.js web applications from the OWASP directory; second, we extend the analysis to 260 in-the-wild JavaScript/Node.js projects from GitHub. Across both datasets, we apply eight semantics-preserving obfuscation techniques and their combinations and evaluate two representative SAST tools, Njsscan and Bearer. Even a single obfuscation technique typically suppresses most baseline findings, including high-severity issues, while stacking techniques yield near-total evasion, with VDL often approaching 100%. Our results show that current JavaScript SAST is fundamentally not robust against commonplace obfuscations and that "clean" reports on obfuscated code may offer only a false sense of security. Finally, we discuss practical mitigation guidelines and directions for obfuscation-aware analysis.

Updated: 2026-04-01 16:52:37

标题: 在JavaScript代码中针对静态分析的代码漏洞混淆

摘要: 代码混淆被广泛应用于现代软件开发中，以保护知识产权并阻碍逆向工程，但它也为攻击者提供了一个强大的手段，在否则合法的JavaScript代码中隐藏恶意逻辑。在一个软件供应链中，一个受损的软件包可以影响成千上万的应用程序，这引发了一个关键问题：CI/CD管道依赖的静态应用程序安全测试（SAST）工具的鲁棒性如何？本文通过实证量化JavaScript混淆对现实实践SAST的影响来回答这个问题。我们定义了一个现实的供应链威胁模型，在这个模型中，对手注入易受攻击的代码并迭代混淆，直到管道报告干净扫描。为了衡量结果的恶化程度，我们引入了漏洞检测损失（VDL）指标，并进行了两阶段研究。首先，我们分析了来自OWASP目录的16个有意设计的Node.js Web应用程序；其次，我们将分析扩展到来自GitHub的260个JavaScript/Node.js项目。在这两个数据集中，我们应用了八种保持语义的混淆技术及其组合，并评估了两个代表性的SAST工具，Njsscan和Bearer。即使是单一的混淆技术通常也会抑制大多数基线发现，包括高危问题，而叠加技术则几乎完全逃避，VDL通常接近100%。我们的结果表明，当前的JavaScript SAST在常见的混淆下基本上不具备鲁棒性，而对混淆代码的“干净”报告可能只提供了一种虚假的安全感。最后，我们讨论了实际的缓解指南和混淆感知分析的方向。

更新时间: 2026-04-01 16:52:37

领域: cs.CR

下载: http://arxiv.org/abs/2604.01131v1

Toward Personalized Darts Training: A Data-Driven Framework Based on Skeleton-Based Biomechanical Analysis and Motion Modeling

As sports training becomes more data-driven, traditional dart coaching based mainly on experience and visual observation is increasingly inadequate for high-precision, goal-oriented movements. Although prior studies have highlighted the importance of release parameters, joint motion, and coordination in dart throwing, most quantitative methods still focus on local variables, single-release metrics, or static template matching. These approaches offer limited support for personalized training and often overlook useful movement variability. This paper presents a data-driven dart training assistance system. The system creates a closed-loop framework spanning motion capture, feature modeling, and personalized feedback. Dart-throwing data were collected in markerless conditions using a Kinect 2.0 depth sensor and an optical camera. Eighteen kinematic features were extracted from four biomechanical dimensions: three-link coordination, release velocity, multi-joint angular configuration, and postural stability. Two modules were developed: a personalized optimal throwing trajectory model that combines historical high-quality samples with the minimum jerk criterion, and a motion deviation diagnosis and recommendation model based on z-scores and hierarchical logic. A total of 2,396 throwing samples from professional and non-professional athletes were collected. Results show that the system generates smooth personalized reference trajectories consistent with natural human movement. Case studies indicate that it can detect poor trunk stability, abnormal elbow displacement, and imbalanced velocity control, then provide targeted recommendations. The framework shifts dart evaluation from deviation from a uniform standard to deviation from an individual's optimal control range, improving personalization and interpretability for darts training and other high-precision target sports.

Updated: 2026-04-01 16:51:30

标题: 朝向个性化飞镖训练：基于基于骨骼的生物力学分析和动作建模的数据驱动框架

摘要: 随着体育训练变得更加数据驱动，传统的飞镖教练主要基于经验和视觉观察的方式对高精度、目标导向的运动越来越不足。尽管先前的研究强调了飞镖投掷中释放参数、关节运动和协调的重要性，但大多数定量方法仍集中在局部变量、单次释放指标或静态模板匹配上。这些方法为个性化训练提供了有限支持，通常忽视了有用的运动变异性。本文提出了一个数据驱动的飞镖训练辅助系统。该系统创建了一个包括运动捕捉、特征建模和个性化反馈的闭环框架。使用Kinect 2.0深度传感器和光学摄像机在无标记条件下收集了飞镖投掷数据。从四个生物力学维度中提取了18个运动学特征：三连杆协调、释放速度、多关节角配置和姿势稳定性。开发了两个模块：一个个性化的最佳投掷轨迹模型，将历史高质量样本与最小加速度准则相结合，以及一个基于z分数和分层逻辑的运动偏差诊断和建议模型。收集了来自专业和非专业运动员的2396个投掷样本。结果显示该系统生成了与自然人类运动一致的平滑个性化参考轨迹。案例研究表明它可以检测到不良的躯干稳定性、异常的肘部位移和不平衡的速度控制，然后提供有针对性的建议。该框架将飞镖评估从偏离统一标准转变为偏离个体最佳控制范围，提高了飞镖训练和其他高精度目标运动的个性化和可解释性。

更新时间: 2026-04-01 16:51:30

领域: cs.LG,cs.CV

下载: http://arxiv.org/abs/2604.01130v1

Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers

This paper introduces the first systematic evaluation framework for quantifying the quality and risks of papers written by modern coding agents. While AI-driven paper writing has become a growing concern, rigorous evaluation of the quality and potential risks of AI-written papers remains limited, and a unified understanding of their reliability is still lacking. We introduce Paper Reconstruction Evaluation (PaperRecon), an evaluation framework in which an overview (overview.md) is created from an existing paper, after which an agent generates a full paper based on the overview and minimal additional resources, and the result is subsequently compared against the original paper. PaperRecon disentangles the evaluation of the AI-written papers into two orthogonal dimensions, Presentation and Hallucination, where Presentation is evaluated using a rubric and Hallucination is assessed via agentic evaluation grounded in the original paper source. For evaluation, we introduce PaperWrite-Bench, a benchmark of 51 papers from top-tier venues across diverse domains published after 2025. Our experiments reveal a clear trade-off: while both ClaudeCode and Codex improve with model advances, ClaudeCode achieves higher presentation quality at the cost of more than 10 hallucinations per paper on average, whereas Codex produces fewer hallucinations but lower presentation quality. This work takes a first step toward establishing evaluation frameworks for AI-driven paper writing and improving the understanding of its risks within the research community.

Updated: 2026-04-01 16:48:04

标题: 论文重建评估：评估人工智能撰写的论文中的呈现和幻觉

摘要: 本文介绍了第一个系统评估框架，用于量化现代编码代理编写的论文质量和风险。虽然基于人工智能的论文写作已经成为一个日益关注的问题，但对AI写作论文质量和潜在风险的严格评估仍然有限，对它们的可靠性还缺乏统一的理解。我们介绍了Paper Reconstruction Evaluation（PaperRecon），这是一个评估框架，其中从现有论文中创建一个概述（overview.md），然后一个代理根据概述和最少的额外资源生成一篇完整的论文，然后将结果与原始论文进行比较。PaperRecon将AI写作论文的评估分解为两个正交维度，即Presentation和Hallucination，其中使用评分表评估Presentation，通过原始论文来源的代理评估评估Hallucination。为了评估，我们介绍了PaperWrite-Bench，这是一个基准，包括来自2025年后各个领域顶级会议的51篇论文。我们的实验揭示了一个明显的权衡：尽管ClaudeCode和Codex都随着模型的进步而改进，但ClaudeCode在提高Presentation质量方面取得了更高的成绩，但平均每篇论文有超过10个幻觉，而Codex产生的幻觉较少，但Presentation质量较低。这项工作是为了建立AI驱动的论文写作的评估框架，并改善研究界对其风险的理解迈出的第一步。

更新时间: 2026-04-01 16:48:04

领域: cs.CL,cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.01128v1

Multi-Agent LLM Governance for Safe Two-Timescale Reinforcement Learning in SDN-IoT Defense

Software-Defined Networking (SDN) is increasingly adopted to secure Internet-of-Things (IoT) networks due to its centralized control and programmable forwarding. However, SDN-IoT defense is inherently a closed-loop control problem in which mitigation actions impact controller workload, queue dynamics, rule-installation delay, and future traffic observations. Aggressive mitigation may destabilize the control plane, degrade Quality of Service (QoS), and amplify systemic risk. Existing learning-based approaches prioritize detection accuracy while neglecting controller coupling and short-horizon Reinforcement Learning (RL) optimization without structured, auditable policy evolution. This paper introduces a self-reflective two-timescale SDN-IoT defense solution separating fast mitigation from slow policy governance. At the fast timescale, per-switch Proximal Policy Optimization (PPO) agents perform controller-aware mitigation under safety constraints and action masking. At the slow timescale, a multi-agent Large Language Model (LLM) governance engine generates machine-parsable updates to the global policy constitution Pi, which encodes admissible actions, safety thresholds, and reward priorities. Updates (Delta Pi) are validated through stress testing and deployed only with non-regression and safety guarantees, ensuring an auditable evolution without retraining RL agents. Evaluation under heterogeneous IoT traffic and adversarial stress shows improvements of 9.1% Macro-F1 over PPO and 15.4% over static baselines. Worst-case degradation drops by 36.8%, controller backlog peaks by 42.7%, and RTT p95 inflation remains below 5.8% under high-intensity attacks. Policy evolution converges within five cycles, reducing catastrophic overload from 11.6% to 2.3%.

Updated: 2026-04-01 16:48:03

标题: SDN-IoT防御中安全的双时间尺度强化学习的多代理LLM治理

摘要: 软件定义网络（SDN）越来越被采用来保护物联网（IoT）网络，因为其具有集中控制和可编程转发的特性。然而，SDN-IoT防御本质上是一个闭环控制问题，其中减轻措施会影响控制器工作负载、队列动态、规则安装延迟和未来流量观察。过度减轻可能会使控制平面不稳定，降低服务质量（QoS）并放大系统风险。现有的基于学习的方法优先考虑检测准确性，而忽略了控制器耦合和短期视野的强化学习（RL）优化，没有结构化、可审计的策略演变。本文介绍了一种自我反思的双时间尺度SDN-IoT防御解决方案，将快速减轻与慢速策略治理分开。在快速时间尺度上，每个交换机的近端策略优化（PPO）代理在安全约束和动作屏蔽下执行控制器感知减轻。在慢时间尺度上，多代理大型语言模型（LLM）治理引擎生成机器可解析的全局策略宪法Pi的更新，其中包括可接受的行动、安全阈值和奖励优先级。更新（Delta Pi）通过压力测试验证，并仅在非回归和安全保证下部署，确保可审计的演变，而无需重新训练RL代理。在异构IoT流量和对抗性压力下的评估表明，相对于PPO和静态基线，Macro-F1提高了9.1%，比最坏情况下的恶化降低了36.8%，控制器积压峰值减少了42.7%，RTT p95膨胀在高强度攻击下仍保持在5.8%以下。策略演变在五个周期内收敛，将灾难性过载从11.6%降至2.3%。

更新时间: 2026-04-01 16:48:03

领域: cs.CR

下载: http://arxiv.org/abs/2604.01127v1

When Agents Persuade: Rhetoric Generation and Mitigation in LLMs

Despite their wide-ranging benefits, LLM-based agents deployed in open environments can be exploited to produce manipulative material. In this study, we task LLMs with propaganda objectives and analyze their outputs using two domain-specific models: one that classifies text as propaganda or non-propaganda, and another that detects rhetorical techniques of propaganda (e.g., loaded language, appeals to fear, flag-waving, name-calling). Our findings show that, when prompted, LLMs exhibit propagandistic behaviors and use a variety of rhetorical techniques in doing so. We also explore mitigation via Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and ORPO (Odds Ratio Preference Optimization). We find that fine-tuning significantly reduces their tendency to generate such content, with ORPO proving most effective.

Updated: 2026-04-01 16:44:39

标题: 当代理人说服时：在LLM中的修辞生成和缓解

摘要: 尽管基于LLM的代理在开放环境中具有广泛的好处，但它们可能会被利用来生成操纵性材料。在这项研究中，我们让LLM代理承担宣传目标，并使用两个领域特定模型对它们的输出进行分析：一个将文本分类为宣传或非宣传，另一个检测宣传的修辞技巧（例如，有倾向性的语言，恐惧呼吁，挥舞旗帜，辱骂）。我们的研究结果表明，当受到激励时，LLM代理展现出宣传行为，并使用各种修辞技巧。我们还探讨了通过监督微调（SFT）、直接偏好优化（DPO）和ORPO（赔率比偏好优化）来减轻这种情况。我们发现微调显著降低了它们生成此类内容的倾向，其中ORPO效果最好。

更新时间: 2026-04-01 16:44:39

领域: cs.AI

下载: http://arxiv.org/abs/2603.04636v2

But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors

LLM-as-a-judge is widely used as a scalable substitute for human evaluation, yet current approaches rely on black-box access and struggle to detect subtle dishonesty, such as sycophancy and manipulation. We introduce Judge Using Safety-Steered Alternatives (JUSSA), a framework that leverages a model's internal representations to optimize an honesty-promoting steering vector from a single training example, generating contrastive alternatives that give judges a reference point for detecting dishonesty. We test JUSSA on a novel manipulation benchmark with human-validated response pairs at varying dishonesty levels, finding AUROC improvements across both GPT-4.1 (0.893 $\to$ 0.946) and Claude Haiku (0.859 $\to$ 0.929) judges, though performance degrades when task complexity is mismatched to judge capability, suggesting contrastive evaluation helps most when the task is challenging but within the judge's reach. Layer-wise analysis further shows that steering is most effective in middle layers, where model representations begin to diverge between honest and dishonest prompt processing. Our work demonstrates that steering vectors can serve as tools for evaluation rather than for improving model outputs at inference, opening a new direction for thorough white-box auditing.

Updated: 2026-04-01 16:42:00

标题: 但是你诚实的回答是什么？利用导向向量为LLM法官提供诚实的替代方案。

摘要: LLM作为一种评判者广泛被用作可扩展的替代人工评估，然而目前的方法依赖于黑匣子访问，并且难以检测到微妙的不诚实，比如谄媚和操纵。我们引入了一种名为JUSSA（Judge Using Safety-Steered Alternatives）的框架，利用模型的内部表示来优化一个诚实促进的指导向量，从单个训练示例生成对比性的替代方案，为评判者提供一个参考点，以便检测不诚实行为。我们在一个包含人工验证的响应对的新型操纵基准上测试了JUSSA，在不同不诚实水平上发现了GPT-4.1（0.893 → 0.946）和Claude Haiku（0.859 → 0.929）评判者的AUROC改进，尽管当任务复杂度与评判者能力不匹配时，性能会下降，这表明对比性评估在任务具有挑战性但评判者能够完成时最有帮助。层次分析进一步显示，指导在中间层最为有效，模型表示在这些层中开始在处理诚实和不诚实提示时发生分歧。我们的工作表明，指导向量可以作为评估工具而不是用于改进推理时的模型输出，开辟了一个新的方向，进行彻底的白盒审计。

更新时间: 2026-04-01 16:42:00

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2505.17760v3

Lightweight Prompt-Guided CLIP Adaptation for Monocular Depth Estimation

Leveraging the rich semantic features of vision-language models (VLMs) like CLIP for monocular depth estimation tasks is a promising direction, yet often requires extensive fine-tuning or lacks geometric precision. We present a parameter-efficient framework, named MoA-DepthCLIP, that adapts pretrained CLIP representations for monocular depth estimation with minimal supervision. Our method integrates a lightweight Mixture-of-Adapters (MoA) module into the pretrained Vision Transformer (ViT-B/32) backbone combined with selective fine-tuning of the final layers. This design enables spatially-aware adaptation, guided by a global semantic context vector and a hybrid prediction architecture that synergizes depth bin classification with direct regression. To enhance structural accuracy, we employ a composite loss function that enforces geometric constraints. On the NYU Depth V2 benchmark, MoA-DepthCLIP achieves competitive results, significantly outperforming the DepthCLIP baseline by improving the $δ_1$ accuracy from 0.390 to 0.745 and reducing the RMSE from 1.176 to 0.520. These results are achieved while requiring substantially few trainable parameters, demonstrating that lightweight, prompt-guided MoA is a highly effective strategy for transferring VLM knowledge to fine-grained monocular depth estimation tasks.

Updated: 2026-04-01 16:41:04

标题: 轻量级提示引导的CLIP适应用于单目深度估计

摘要: 利用视觉语言模型（VLMs）如CLIP的丰富语义特征进行单眼深度估计任务是一个有前途的方向，但通常需要大量微调或缺乏几何精度。我们提出了一个名为MoA-DepthCLIP的参数高效框架，该框架利用预训练的CLIP表示来进行单眼深度估计，需要最少的监督。我们的方法将轻量级的Adapter混合（MoA）模块集成到预训练的Vision Transformer（ViT-B/32）骨干网络中，同时结合对最终层的选择性微调。这种设计实现了基于全局语义上下文向量的空间感知适应，以及将深度分区分类与直接回归相结合的混合预测架构。为了增强结构准确性，我们采用了一个复合损失函数来强制执行几何约束。在NYU Depth V2基准测试中，MoA-DepthCLIP取得了竞争性的结果，明显优于DepthCLIP基准线，将$δ_1$准确度从0.390提高到0.745，并将RMSE从1.176减少到0.520。这些结果是在需要极少可训练参数的情况下实现的，表明轻量级、及时引导的MoA是将VLM知识转移到细粒度单眼深度估计任务中的一种高效策略。

更新时间: 2026-04-01 16:41:04

领域: cs.CV,cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.01118v1

Reconsidering Dependency Networks from an Information Geometry Perspective

Dependency networks (Heckerman et al., 2000) provide a flexible framework for modeling complex systems with many variables by combining independently learned local conditional distributions through pseudo-Gibbs sampling. Despite their computational advantages over Bayesian and Markov networks, the theoretical foundations of dependency networks remain incomplete, primarily because their model distributions -- defined as stationary distributions of pseudo-Gibbs sampling -- lack closed-form expressions. This paper develops an information-geometric analysis of pseudo-Gibbs sampling, interpreting each sampling step as an m-projection onto a full conditional manifold. Building on this interpretation, we introduce the full conditional divergence and derive an upper bound that characterizes the location of the stationary distribution in the space of probability distributions. We then reformulate both structure and parameter learning as optimization problems that decompose into independent subproblems for each node, and prove that the learned model distribution converges to the true underlying distribution as the number of training samples grows to infinity. Experiments confirm that the proposed upper bound is tight in practice.

Updated: 2026-04-01 16:40:47

标题: 重新考虑依赖网络：基于信息几何视角

摘要: 依赖网络（Heckerman等人，2000）通过通过伪Gibbs采样将独立学习的局部条件分布结合起来，为建模具有许多变量的复杂系统提供了灵活的框架。尽管与贝叶斯网络和马尔可夫网络相比，依赖网络在计算上具有优势，但由于它们的模型分布 - 定义为伪Gibbs采样的稳态分布 - 缺乏封闭形式的表达，因此依赖网络的理论基础仍然不完整。本文对伪Gibbs采样进行了信息几何分析，将每个采样步骤解释为对完全条件流形的m-投影。基于这种解释，我们引入了完全条件散度，并推导了一个上界，用于描述概率分布空间中稳态分布的位置。然后，我们将结构和参数学习重新制定为优化问题，这些问题分解为每个节点的独立子问题，并证明随着训练样本数量增加到无穷大，学习的模型分布将收敛于真实的基础分布。实验证实了所提出的上界在实践中是紧密的。

更新时间: 2026-04-01 16:40:47

领域: cs.LG

下载: http://arxiv.org/abs/2604.01117v1

Trust and Reliance on AI in Education: AI Literacy and Need for Cognition as Moderators

As generative AI systems are integrated into educational settings, students often encounter AI-generated output while working through learning tasks, either by requesting help or through integrated tools. Trust in AI can influence how students interpret and use that output, including whether they evaluate it critically or exhibit overreliance. We investigate how students' trust relates to their appropriate reliance on an AI assistant during programming problem-solving tasks, and whether this relationship differs by learner characteristics. With 432 undergraduate participants, students' completed Python output-prediction problems while receiving recommendations and explanations from an AI chatbot, including accurate and intentionally misleading suggestions. We operationalize reliance behaviorally as the extent to which students' responses reflected appropriate use of the AI assistant's suggestions, accepting them when they were correct and rejecting them when they were incorrect. Pre- and post-task surveys assessed trust in the assistant, AI literacy, need for cognition, programming self-efficacy, and programming literacy. Results showed a non-linear relationship in which higher trust was associated with lower appropriate reliance, suggesting weaker discrimination between correct and incorrect recommendations. This relationship was significantly moderated by students' AI literacy and need for cognition. These findings highlight the need for future work on instructional and system supports that encourage more reflective evaluation of AI assistance during problem-solving.

Updated: 2026-04-01 16:38:47

标题: 信任和依赖教育中的人工智能：人工智能素养和认知需求的调节者

摘要: 随着生成式人工智能系统被整合到教育环境中，学生在学习任务中经常会遇到由人工智能生成的输出，无论是通过请求帮助还是通过集成工具。对人工智能的信任可以影响学生对输出的解释和使用方式，包括是否进行批判性评估或表现出过度依赖。我们研究了学生的信任与他们在编程解决问题任务中对人工智能助手的适当依赖之间的关系，以及这种关系是否因学习者特征而有所不同。通过432名本科生参与者，学生在完成Python输出预测问题的同时，从人工智能对话机器人接收建议和解释，包括准确和故意误导的建议。我们行为化依赖行为，即学生的回应在多大程度上反映了对人工智能助手建议的适当使用，当建议正确时接受它们，当建议错误时拒绝它们。任务前后的调查评估了对助手的信任、人工智能素养、认知需求、编程自我效能感和编程素养。结果显示了一个非线性关系，即较高的信任与较低的适当依赖相关，表明在正确和不正确的建议之间判别能力较弱。这种关系受到学生的人工智能素养和认知需求的显著调节。这些发现强调了未来需要在解决问题过程中鼓励更多对人工智能辅助进行反思性评估的教学和系统支持。

更新时间: 2026-04-01 16:38:47

领域: cs.HC,cs.AI,cs.CY,cs.ET

下载: http://arxiv.org/abs/2604.01114v1

VT-Former: Efffcient Transformer-based Decoder for Varshamov-Tenengolts Codes

In recent years, widespread attention has been drawn to the challenge of correcting insertion, deletion, and substitution (IDS) errors in DNA-based data storage. Among various IDS-correcting codes, Varshamov-Tenengolts (VT) codes, originally designed for single-error correction, have been established as a central research focus. While existing decoding methods demonstrate high accuracy for single-error correction, they are typically not applicable to the correction of multiple IDS errors. In this work, the latent capability of VT codes for multiple-error correction is investigated through a statistic-enhanced Transformer-based VT decoder (VT-Former), utilizing both symbol and statistic feature embeddings. Experimental results demonstrate that VT-Former achieves nearly 100\% accuracy on correcting single errors. For multi-error decoding tasks across various codeword lengths, improvements in both frame accuracy and bit accuracy are observed, compared to conventional hard-decision and soft-in soft-out decoding algorithms. Furthermore, while lower decoding latency is exhibited by the base model compared to traditional soft decoders, the architecture is further optimized in this study to enhance decoding efficiency and reduce computational overhead.

Updated: 2026-04-01 16:38:21

标题: VT-Former: Varshamov-Tenengolts码的高效基于Transformer的解码器

摘要: 最近，人们对纠正DNA数据存储中的插入、删除和替换（IDS）错误的挑战引起了广泛关注。在各种IDS纠正码中，最初设计用于单错误纠正的Varshamov-Tenengolts（VT）码已被确立为中心研究焦点。尽管现有的解码方法对于单错误纠正表现出很高的准确性，但通常不适用于多个IDS错误的纠正。在这项工作中，通过利用符号和统计特征嵌入的统计增强Transformer型VT解码器（VT-Former），研究了VT码在多错误纠正方面的潜在能力。实验结果表明，VT-Former在纠正单一错误方面实现了近100％的准确性。对于不同码字长度的多错误解码任务，与传统的硬判决和软输入软输出解码算法相比，观察到了框架准确性和比特准确性的改进。此外，与传统软解码器相比，基础模型表现出更低的解码延迟，本研究进一步优化了架构，以提高解码效率并减少计算开销。

更新时间: 2026-04-01 16:38:21

领域: cs.LG,cs.IT

下载: http://arxiv.org/abs/2502.21060v2

Adversarial Moral Stress Testing of Large Language Models

Evaluating the ethical robustness of large language models (LLMs) deployed in software systems remains challenging, particularly under sustained adversarial user interaction. Existing safety benchmarks typically rely on single-round evaluations and aggregate metrics, such as toxicity scores and refusal rates, which offer limited visibility into behavioral instability that may arise during realistic multi-turn interactions. As a result, rare but high-impact ethical failures and progressive degradation effects may remain undetected prior to deployment. This paper introduces Adversarial Moral Stress Testing (AMST), a stress-based evaluation framework for assessing ethical robustness under adversarial multi-round interactions. AMST applies structured stress transformations to prompts and evaluates model behavior through distribution-aware robustness metrics that capture variance, tail risk, and temporal behavioral drift across interaction rounds. We evaluate AMST on several state-of-the-art LLMs, including LLaMA-3-8B, GPT-4o, and DeepSeek-v3, using a large set of adversarial scenarios generated under controlled stress conditions. The results demonstrate substantial differences in robustness profiles across models and expose degradation patterns that are not observable under conventional single-round evaluation protocols. In particular, robustness has been shown to depend on distributional stability and tail behavior rather than on average performance alone. Additionally, AMST provides a scalable and model-agnostic stress-testing methodology that enables robustness-aware evaluation and monitoring of LLM-enabled software systems operating in adversarial environments.

Updated: 2026-04-01 16:34:20

标题: 对大型语言模型进行对抗性道德压力测试

摘要: 评估部署在软件系统中的大型语言模型（LLMs）的道德鲁棒性仍然具有挑战性，特别是在持续的对抗性用户交互下。现有的安全基准通常依赖于单轮评估和聚合指标，如毒性评分和拒绝率，这些指标对可能在现实多回合交互中出现的行为不稳定性提供了有限的可见性。因此，在部署之前可能会发生罕见但高影响的道德失败和渐进性退化效应可能会未被察觉。本文介绍了对抗性道德压力测试（AMST），这是一个基于压力的评估框架，用于评估在对抗性多回合交互下的道德鲁棒性。AMST对提示应用结构化压力转换，并通过捕获交互回合中方差、尾风险和时间行为漂移的分布感知鲁棒性指标来评估模型行为。我们使用在受控压力条件下生成的大量对抗场景对几种最先进的LLMs，包括LLaMA-3-8B、GPT-4o和DeepSeek-v3进行了AMST评估。结果表明，不同模型之间的鲁棒性配置文件存在显著差异，并暴露了在传统的单轮评估协议下不可观察到的退化模式。特别是，鲁棒性已被证明依赖于分布稳定性和尾行为，而不仅仅是平均性能。此外，AMST提供了一种可扩展且与模型无关的压力测试方法，可以实现对在对抗环境中运行的LLM启用软件系统的鲁棒性感知评估和监控。

更新时间: 2026-04-01 16:34:20

领域: cs.AI

下载: http://arxiv.org/abs/2604.01108v1

How Motivation Relates to Generative AI Use: A Large-Scale Survey of Mexican High School Students

This study examined how high school students with different motivational profiles use generative AI tools in math and writing. Through K-means clustering analysis of survey data from 6,793 Mexican high school students, we identified three distinct motivational profiles based on self-concept and perceived subject value. Results revealed distinct domain-specific AI usage patterns across students with different motivational profiles. Our findings challenge one-size-fits-all AI integration approaches and advocate for motivationally-informed educational interventions.

Updated: 2026-04-01 16:33:53

标题: 激励如何与生成式人工智能使用相关：墨西哥高中学生的大规模调查

摘要: 这项研究探讨了不同动机特征的高中学生如何在数学和写作中使用生成式人工智能工具。通过对来自6793名墨西哥高中学生的调查数据进行K均值聚类分析，我们基于自我概念和感知科目价值确定了三种独特的动机特征。结果显示不同动机特征的学生在特定领域的人工智能使用模式各不相同。我们的研究结果挑战了一刀切的人工智能整合方法，并主张采用动机驱动的教育干预措施。

更新时间: 2026-04-01 16:33:53

领域: cs.CY,cs.AI,cs.HC

下载: http://arxiv.org/abs/2603.19263v2

Inverse Design of Optical Multilayer Thin Films using Robust Masked Diffusion Models

Inverse design of optical multilayer stacks seeks to infer layer materials, thicknesses, and ordering from a desired target spectrum. It is a long-standing challenge due to the large design space and non-unique solutions. We introduce \texttt{OptoLlama}, a masked diffusion language model for inverse thin-film design from optical spectra. Representing multilayer stacks as sequences of material-thickness tokens, \texttt{OptoLlama} conditions generation on reflectance, absorptance, and transmittance spectra and learns a probabilistic mapping from optical response to structure. Evaluated on a representative test set of 3,000 targets, \texttt{OptoLlama} reduces the mean absolute spectral error by 2.9-fold relative to a nearest-neighbor template baseline and by 3.45-fold relative to the state-of-the-art data-driven baseline, called \texttt{OptoGPT}. Case studies on designed and expert-defined targets show that the model reproduces characteristic spectral features and recovers physically meaningful stack motifs, including distributed Bragg reflectors. These results establish diffusion-based sequence modeling as a powerful framework for inverse photonic design.

Updated: 2026-04-01 16:33:05

标题: 使用健壮的掩模扩散模型进行光学多层薄膜的反向设计

摘要: 光学多层膜堆的逆向设计旨在从所需的目标光谱推断出层材料、厚度和排序。由于设计空间庞大且解决方案不唯一，这是一个长期存在的挑战。我们引入了\texttt{OptoLlama}，一种用于从光谱中逆向设计薄膜的掩模扩散语言模型。将多层膜堆表示为材料-厚度标记的序列，\texttt{OptoLlama}将生成条件应用于反射率、吸收率和透射率光谱，并学习光学响应到结构的概率映射。在代表性的3,000个目标测试集上评估，\texttt{OptoLlama}相对于最近邻模板基线将平均绝对光谱误差降低了2.9倍，相对于最先进的数据驱动基线\texttt{OptoGPT}降低了3.45倍。对设计和专家定义的目标的案例研究显示，该模型复制了特征光谱特征，并恢复了具有物理意义的堆叠图案，包括分布式布拉格反射镜。这些结果确立了基于扩散的序列建模作为逆向光子设计的强大框架。

更新时间: 2026-04-01 16:33:05

领域: physics.optics,cs.LG

下载: http://arxiv.org/abs/2604.01106v1

When Only the Final Text Survives: Implicit Execution Tracing for Multi-Agent Attribution

When a multi-agent system produces an incorrect or harmful answer, who is accountable if execution logs and agent identifiers are unavailable? In practice, generated content is often detached from its execution environment due to privacy or system boundaries, leaving the final text as the only auditable artifact. Existing attribution methods rely on full execution traces and thus become ineffective in such metadata-deprived settings. We propose Implicit Execution Tracing (IET), a provenance-by-design framework that shifts attribution from post-hoc inference to built-in instrumentation. Instead of reconstructing hidden trajectories, IET embeds agent-specific, key-conditioned statistical signals directly into the token generation process, transforming the output text into a self-verifying execution record. At inference time, we recover a linearized execution trace from the final text via transition-aware statistical scoring. Experiments across diverse multi-agent coordination settings demonstrate that IET achieves accurate segment-level attribution and reliable transition recovery under identity removal, boundary corruption, and privacy-preserving redaction, while maintaining generation quality. These results show that embedding provenance into generation provides a practical and robust foundation for accountability in multi-agent language systems when execution metadata is unavailable.

Updated: 2026-04-01 16:30:23

标题: 当只有最终文本幸存：用于多代理归因的隐式执行跟踪

摘要: 当多智能体系统产生不正确或有害答案时，如果执行日志和智能体标识不可用，谁应该负责？在实践中，由于隐私或系统边界的限制，生成的内容通常与其执行环境分离，使最终文本成为唯一可审计的物件。现有的归因方法依赖于完整的执行跟踪，因此在缺乏元数据的情况下变得无效。我们提出了隐式执行跟踪（IET），这是一个通过设计的溯源框架，将归因从事后推断转移到内置的仪器化。IET不是重建隐藏轨迹，而是将智能体特定的、关键条件的统计信号直接嵌入到令牌生成过程中，将输出文本转化为自我验证的执行记录。在推断时，我们通过过渡感知的统计评分从最终文本中恢复一个线性化的执行跟踪。在各种多智能体协调设置下进行的实验表明，IET在身份删除、边界损坏和保护隐私的去隐私化下实现了准确的段级归因和可靠的过渡恢复，同时保持生成质量。这些结果表明，当执行元数据不可用时，将溯源嵌入到生成中为多智能体语言系统中的责任提供了一个实用和健壮的基础。

更新时间: 2026-04-01 16:30:23

领域: cs.AI,cs.CL

下载: http://arxiv.org/abs/2603.17445v4

Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming

As large language model (LLM) agents increasingly automate complex web tasks, they boost productivity while simultaneously introducing new security risks. However, relevant studies on web agent attacks remain limited. Existing red-teaming approaches mainly rely on manually crafted attack strategies or static models trained offline. Such methods fail to capture the underlying behavioral patterns of web agents, making it difficult to generalize across diverse environments. In web agent attacks, success requires the continuous discovery and evolution of attack strategies. To this end, we propose Genesis, a novel agentic framework composed of three modules: Attacker, Scorer, and Strategist. The Attacker generates adversarial injections by integrating the genetic algorithm with a hybrid strategy representation. The Scorer evaluates the target web agent's responses to provide feedback. The Strategist dynamically uncovers effective strategies from interaction logs and compiles them into a continuously growing strategy library, which is then re-deployed to enhance the Attacker's effectiveness. Extensive experiments across various web tasks show that our framework discovers novel strategies and consistently outperforms existing attack baselines. Our code is available at https://github.com/CjangCjengh/web_agent_attack.

Updated: 2026-04-01 16:25:41

标题: 《创世纪：针对LLM Web代理的红队进攻策略的演变》

摘要: 随着大型语言模型（LLM）代理越来越多地自动化复杂的网络任务，它们提高了生产力的同时也引入了新的安全风险。然而，有关网络代理攻击的相关研究仍然有限。现有的红队方法主要依赖于手工制定的攻击策略或离线训练的静态模型。这些方法无法捕捉网络代理的基本行为模式，使得难以在不同环境中推广。在网络代理攻击中，成功需要不断发现和演变攻击策略。为此，我们提出了Genesis，一个由三个模块（攻击者，评分者和策略家）组成的新型代理框架。攻击者通过将遗传算法与混合策略表示集成来生成对抗注入。评分者评估目标网络代理的反应以提供反馈。策略家动态地从交互日志中发现有效的策略，并将它们编译成一个不断增长的策略库，然后重新部署以增强攻击者的效果。在各种网络任务上进行的大量实验表明，我们的框架发现了新颖的策略，并始终优于现有的攻击基准线。我们的代码可在https://github.com/CjangCjengh/web_agent_attack 上找到。

更新时间: 2026-04-01 16:25:41

领域: cs.AI

下载: http://arxiv.org/abs/2510.18314v2

Approximating Pareto Frontiers in Stochastic Multi-Objective Optimization via Hashing and Randomization

Stochastic Multi-Objective Optimization (SMOO) is critical for decision-making trading off multiple potentially conflicting objectives in uncertain environments. SMOO aims at identifying the Pareto frontier, which contains all mutually non-dominating decisions. The problem is highly intractable due to the embedded probabilistic inference, such as computing the marginal, posterior probabilities, or expectations. Existing methods, such as scalarization, sample average approximation, and evolutionary algorithms, either offer arbitrarily loose approximations or may incur prohibitive computational costs. We propose XOR-SMOO, a novel algorithm that with probability $1-δ$, obtains $γ$-approximate Pareto frontiers ($γ>1$) for SMOO by querying an SAT oracle poly-log times in $γ$ and $δ$. A $γ$-approximate Pareto frontier is only below the true frontier by a fixed, multiplicative factor $γ$. Thus, XOR-SMOO solves highly intractable SMOO problems (\#P-hard) with only queries to SAT oracles while obtaining tight, constant factor approximation guarantees. Experiments on real-world road network strengthening and supply chain design problems demonstrate that XOR-SMOO outperforms several baselines in identifying Pareto frontiers that have higher objective values, better coverage of the optimal solutions, and the solutions found are more evenly distributed. Overall, XOR-SMOO significantly enhanced the practicality and reliability of SMOO solvers.

Updated: 2026-04-01 16:24:13

标题: 通过哈希和随机化在随机多目标优化中逼近帕累托前沿

摘要: 随机多目标优化（SMOO）对于在不确定环境中权衡多个潜在冲突目标进行决策至关重要。SMOO旨在确定包含所有相互非支配决策的帕累托边界。由于嵌入式概率推断，如计算边缘、后验概率或期望，该问题非常难以解决。现有方法，如标量化、样本平均近似和进化算法，要么提供任意宽松的近似，要么可能造成巨大的计算成本。我们提出了XOR-SMOO，一种新颖算法，以概率$1-δ$，通过在$γ$和$δ$中进行多项式对数次数的查询，获得SMOO的$γ$-近似帕累托边界（$γ>1$）。$γ$-近似帕累托边界仅在真实边界下面一个固定的乘法因子$γ$。因此，XOR-SMOO通过只查询SAT预言者就解决了高度棘手的SMOO问题（\#P-hard），同时获得了紧密的、恒定因子的近似保证。在现实世界的道路网络强化和供应链设计问题上的实验表明，XOR-SMOO在确定具有更高目标值的帕累托边界、更好地覆盖最优解以及找到的解更加均匀分布方面优于几种基线。总体而言，XOR-SMOO显著提高了SMOO求解器的实用性和可靠性。

更新时间: 2026-04-01 16:24:13

领域: cs.LG,cs.AI,cs.LO

下载: http://arxiv.org/abs/2604.01098v1

Temporal Dependencies in In-Context Learning: The Role of Induction Heads

Large language models (LLMs) exhibit strong in-context learning capabilities, but how they track and retrieve information from context remains underexplored. Drawing on the free recall paradigm in cognitive science (where participants recall list items in any order), we show that several open-source LLMs consistently display a serial-recall-like pattern, assigning peak probability to tokens that immediately follow a repeated token in the input sequence. Through systematic ablation experiments, we show that induction heads, specialized attention heads that attend to the token following a previous occurrence of the current token, play an important role in this phenomenon. Removing heads with a high induction score substantially reduces the +1 lag bias, whereas ablating random heads does not reproduce the same reduction. We also show that removing heads with high induction scores impairs the performance of models prompted to do serial recall using few-shot learning to a larger extent than removing random heads. Our findings highlight a mechanistically specific connection between induction heads and temporal context processing in transformers, suggesting that these heads are especially important for ordered retrieval and serial-recall-like behavior during in-context learning.

Updated: 2026-04-01 16:21:38

标题: 上下文学习中的时间依赖性：归纳头的作用

摘要: 大型语言模型(LLMs)展现出强大的上下文学习能力，但它们如何跟踪和检索上下文中的信息仍未得到充分探讨。借鉴认知科学中的自由回忆范式(参与者以任何顺序回忆列表项)，我们展示了几个开源LLMs一致显示出类似串行回忆的模式，将最高概率分配给紧随输入序列中重复标记的标记。通过系统性的消融实验，我们展示了感应头，即专门关注紧随当前标记之前发生的当前标记的头部，在这一现象中起着重要作用。去除具有高感应评分的头部显著降低了+1滞后偏差，而去除随机头部则无法复制相同的减少。我们还发现，去除具有高感应评分的头部会更大程度上损害通过少量训练提示执行串行回忆的模型的性能，而不是去除随机头部。我们的发现强调了变压器中感应头和时间上下文处理之间的机制特定联系，表明这些头部在有序检索和上下文学习过程中的类似串行回忆行为中尤为重要。

更新时间: 2026-04-01 16:21:38

领域: cs.CL,cs.AI

下载: http://arxiv.org/abs/2604.01094v1

DR-LoRA: Dynamic Rank LoRA for Fine-Tuning Mixture-of-Experts Models

Mixture-of-Experts (MoE) has become a prominent paradigm for scaling Large Language Models (LLMs). Parameter-efficient fine-tuning methods, such as LoRA, are widely adopted to adapt pretrained MoE LLMs to downstream tasks. However, existing approaches typically assign identical LoRA ranks to all expert modules, ignoring the heterogeneous specialization of pretrained experts. This uniform allocation leads to a resource mismatch: task-relevant experts are under-provisioned, while less relevant ones receive redundant parameters. To address this, we propose DR-LoRA, a Dynamic Rank LoRA framework for fine-tuning pretrained MoE models. Specifically, DR-LoRA initializes all expert LoRA modules with a small active rank and uses an expert saliency score, which combines routing frequency and gradient-based rank importance, to identify which experts would benefit most from additional capacity. It then periodically expands the active ranks of the task-critical expert LoRA, progressively constructing a heterogeneous rank distribution tailored to the target task. Experiments on three MoE models across six tasks show that DR-LoRA consistently outperforms LoRA and other strong baselines, demonstrating that task-adaptive heterogeneous rank allocation is an effective strategy to improve active capacity utilization in MoE fine-tuning.

Updated: 2026-04-01 16:21:37

标题: DR-LoRA：用于微调专家混合模型的动态排名LoRA

摘要: 混合专家（MoE）已成为扩展大型语言模型（LLMs）的突出范式。参数高效的微调方法，如LoRA，被广泛采用来适应预训练的MoE LLMs到下游任务。然而，现有方法通常将相同的LoRA等级分配给所有专家模块，忽略了预训练专家的异质化专业化。这种统一分配导致资源不匹配：与任务相关的专家被低配，而不太相关的专家则接收冗余参数。为了解决这个问题，我们提出了DR-LoRA，一种用于微调预训练MoE模型的动态等级LoRA框架。具体来说，DR-LoRA使用一个专家显著性得分，结合路由频率和基于梯度的等级重要性，来识别哪些专家最需要额外的容量。然后，它定期扩展任务关键专家LoRA的活动等级，逐渐构建一个适合目标任务的异质等级分布。在六项任务中对三个MoE模型进行的实验表明，DR-LoRA始终优于LoRA和其他强基线，表明任务自适应的异质等级分配是改进MoE微调中主动容量利用的有效策略。

更新时间: 2026-04-01 16:21:37

领域: cs.AI,cs.CL

下载: http://arxiv.org/abs/2601.04823v4

LightGuard: Transparent WiFi Security via Physical-Layer LiFi Key Bootstrapping

WiFi is inherently vulnerable to eavesdropping because RF signals may penetrate many physical boundaries, such as walls and floors. LiFi, by contrast, is an optical method confined to line-of-sight and blocked by opaque surfaces. We present LightGuard, a dual-link architecture built on this insight: cryptographic key establishment can be offloaded from WiFi to a physically confined LiFi channel to mitigate the risk of key exposure over RF. LightGuard derives session keys over a LiFi link and installs them on the WiFi interface, ensuring cryptographic material never traverses the open RF medium. A prototype with off-the-shelf WiFi NICs and our LiFi transceiver frontend validates the design.

Updated: 2026-04-01 16:19:12

标题: LightGuard：通过物理层LiFi密钥引导实现透明WiFi安全

摘要: WiFi天生容易受到窃听的威胁，因为射频信号可以穿透许多物理障碍物，比如墙壁和地板。相比之下，LiFi是一种光学方法，限制在视线范围内，并且被不透明表面阻挡。我们提出了一种基于这一观点构建的双链架构LightGuard：加密密钥的建立可以从WiFi转移到一个物理上受限的LiFi信道，以减轻RF上密钥暴露的风险。LightGuard通过LiFi链路生成会话密钥，并将其安装在WiFi接口上，确保加密材料永远不会穿越开放的RF介质。使用现成的WiFi网卡和我们的LiFi收发器前端的原型验证了这一设计。

更新时间: 2026-04-01 16:19:12

领域: cs.CR,cs.AR,cs.NI

下载: http://arxiv.org/abs/2604.01092v1

TRACE: Training-Free Partial Audio Deepfake Detection via Embedding Trajectory Analysis of Speech Foundation Models

Partial audio deepfakes, where synthesized segments are spliced into genuine recordings, are particularly deceptive because most of the audio remains authentic. Existing detectors are supervised: they require frame-level annotations, overfit to specific synthesis pipelines, and must be retrained as new generative models emerge. We argue that this supervision is unnecessary. We hypothesize that speech foundation models implicitly encode a forensic signal: genuine speech forms smooth, slowly varying embedding trajectories, while splice boundaries introduce abrupt disruptions in frame-level transitions. Building on this, we propose TRACE (Training-free Representation-based Audio Countermeasure via Embedding dynamics), a training-free framework that detects partial audio deepfakes by analyzing the first-order dynamics of frozen speech foundation model representations without any training, labeled data, or architectural modification. We evaluate TRACE on four benchmarks that span two languages using six speech foundation models. In PartialSpoof, TRACE achieves 8.08% EER, competitive with fine-tuned supervised baselines. In LlamaPartialSpoof, the most challenging benchmark featuring LLM-driven commercial synthesis, TRACE surpasses a supervised baseline outright (24.12% vs. 24.49% EER) without any target-domain data. These results show that temporal dynamics in speech foundation models provide an effective, generalize signal for training-free audio forensics.

Updated: 2026-04-01 16:12:31

标题: TRACE：通过嵌入式轨迹分析语音基础模型进行无需训练的部分音频深度伪造检测

摘要: 部分音频深度伪造，即将合成片段拼接到真实录音中，尤其具有欺骗性，因为大部分音频保持真实性。现有的检测器是有监督的：它们需要帧级别注释，容易过拟合特定合成管道，并且必须随着新的生成模型的出现而重新训练。我们认为这种监督是不必要的。我们假设语音基础模型隐含地编码了一种取证信号：真实语音形成平滑、缓慢变化的嵌入轨迹，而拼接边界在帧级别转换中引入了突然的中断。基于此，我们提出了TRACE（通过嵌入动态进行培训免费的基于表示的音频对抗措施），这是一个无需训练、标记数据或架构修改的框架，通过分析冻结的语音基础模型表示的一阶动态来检测部分音频深度伪造。我们使用六个语音基础模型在涵盖两种语言的四个基准测试上评估TRACE。在PartialSpoof中，TRACE实现了8.08%的等效错误率（EER），与精调的有监督基线相竞争。在LlamaPartialSpoof中，这是一个具有挑战性的基准测试，采用了LLM驱动的商业合成，TRACE直接超过了一个有监督的基线（24.12%对24.49%的EER），而没有任何目标域数据。这些结果表明，语音基础模型中的时间动态为无需训练的音频取证提供了有效的、通用的信号。

更新时间: 2026-04-01 16:12:31

领域: cs.SD,cs.AI,cs.CV

下载: http://arxiv.org/abs/2604.01083v1

ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction

3D semantic occupancy prediction is central to autonomous driving, yet current methods are vulnerable to long-tailed class bias and out-of-distribution (OOD) inputs, often overconfidently assigning anomalies to rare classes. We present ProOOD, a lightweight, plug-and-play method that couples prototype-guided refinement with training-free OOD scoring. ProOOD comprises (i) prototype-guided semantic imputation that fills occluded regions with class-consistent features, (ii) prototype-guided tail mining that strengthens rare-class representations to curb OOD absorption, and (iii) EchoOOD, which fuses local logit coherence with local and global prototype matching to produce reliable voxel-level OOD scores. Extensive experiments on five datasets demonstrate that ProOOD achieves state-of-the-art performance on both in-distribution 3D occupancy prediction and OOD detection. On SemanticKITTI, it surpasses baselines by +3.57% mIoU overall and +24.80% tail-class mIoU; on VAA-KITTI, it improves AuPRCr by +19.34 points, with consistent gains across benchmarks. These improvements yield more calibrated occupancy estimates and more reliable OOD detection in safety-critical urban driving. The source code is publicly available at https://github.com/7uHeng/ProOOD.

Updated: 2026-04-01 16:11:59

标题: ProOOD：原型引导的分布之外的3D占用预测

摘要: 3D语义占用预测对于自动驾驶至关重要，然而当前的方法容易受到长尾类别偏向和超出分布（OOD）输入的影响，常常过于自信地将异常分配给稀有类别。我们提出了ProOOD，这是一种轻量级、即插即用的方法，将原型引导细化与无需训练的OOD评分相结合。ProOOD包括：（i）原型引导的语义填补，用类一致的特征填充遮挡区域；（ii）原型引导的尾部挖掘，加强稀有类别的表示以抑制OOD吸收；（iii）EchoOOD，将本地逻辑一致性与本地和全局原型匹配融合，产生可靠的体素级OOD评分。在五个数据集上的大量实验表明，ProOOD在分布3D占用预测和OOD检测方面均实现了最先进的性能。在SemanticKITTI上，整体mIoU比基准提高了+3.57%，尾部类别mIoU提高了+24.80%；在VAA-KITTI上，AuPRCr提高了+19.34个点，并且在各项基准测试中都有持续的增长。这些改进使得在安全关键的城市驾驶中获得更加校准的占用估计和更可靠的OOD检测。源代码可以在https://github.com/7uHeng/ProOOD上公开获取。

更新时间: 2026-04-01 16:11:59

领域: cs.CV,cs.LG,cs.RO,eess.IV

下载: http://arxiv.org/abs/2604.01081v1

Automated Generation of Cybersecurity Exercise Scenarios

There is a growing need for cybersecurity professionals with practical knowledge and experience to meet societal needs and comply with new standards and regulations. At the same time, the advances in software technology and artificial intelligence point towards a future where software agents will play an important role in protecting the computer systems that are critical for society to function. The training and development of both humans and software agents requires the design and execution of cybersecurity exercises that differ in properties such as size, scope, objectives, difficultly, etc. Cybersecurity scenarios are critical for the operation of cybersecurity exercises as they describe the scope, context, operational environment and storyline of each exercise. In this work, we present an approach to automatically generate cybersecurity scenarios that model enterprise IT systems. Our approach is able to generate a large number of scenarios that differ in multiple criteria including size, scope, difficulty, complexity and diversity. We further release as open source: a simulation and a virtualization environment that can run cybersecurity exercises based on the generated scenarios and a dataset containing 100000 sample scenarios.

Updated: 2026-04-01 16:11:00

标题: 自动生成网络安全练习场景

摘要: 随着社会需求的增长和新标准和法规的遵守，对具有实际知识和经验的网络安全专业人员的需求越来越大。同时，软件技术和人工智能的进步指向一个未来，在这个未来中，软件代理将在保护对社会功能至关重要的计算机系统方面发挥重要作用。人类和软件代理的培训和发展需要设计和执行网络安全练习，这些练习在大小、范围、目标、难度等属性上有所不同。网络安全场景对网络安全练习的运作至关重要，因为它们描述了每个练习的范围、背景、操作环境和情节。在这项工作中，我们提出了一种自动生成模拟企业IT系统的网络安全场景的方法。我们的方法能够生成大量在大小、范围、难度、复杂性和多样性等多个标准上有所不同的场景。我们进一步开源了一个能够运行基于生成场景的网络安全练习的模拟和虚拟化环境，以及包含10万个样本场景的数据集。

更新时间: 2026-04-01 16:11:00

领域: cs.CR,cs.SE

下载: http://arxiv.org/abs/2604.01079v1

LG-HCC: Local Geometry-Aware Hierarchical Context Compression for 3D Gaussian Splatting

Although 3D Gaussian Splatting (3DGS) enables high-fidelity real-time rendering, its prohibitive storage overhead severely hinders practical deployment. Recent anchor-based 3DGS compression schemes reduce gaussian redundancy through some advanced context models. However, they overlook explicit geometric dependencies, leading to structural degradation and suboptimal ratedistortion performance. In this paper, we propose a Local Geometry-aware Hierarchical Context Compression framework for 3DGS(LG-HCC) that incorporates inter-anchor geometric correlations into anchor pruning and entropy coding for compact representation. Specifically, we introduce an Neighborhood-Aware Anchor Pruning (NAAP) strategy, which evaluates anchor importance via weighted neighborhood feature aggregation and then merges low-contribution anchors into salient neighbors, yielding a compact yet geometry-consistent anchor set. Moreover, we further develop a hierarchical entropy coding scheme, in which coarse-to-fine priors are exploited through a lightweight Geometry-Guided Convolution(GG-Conv) operator to enable spatially adaptive context modeling and rate-distortion optimization. Extensive experiments show that LG-HCC effectively alleviates structural preservation issues,achieving superior geometric integrity and rendering fidelity while reducing storage by up to 30.85x compared to the Scaffold-GS baseline on the Mip-NeRF360 dataset

Updated: 2026-04-01 16:10:38

标题: LG-HCC：局部几何感知分层上下文压缩用于3D高斯平铺

摘要: 尽管3D高斯喷射（3DGS）实现了高保真实时渲染，但其存储开销严重阻碍了实际部署。最近基于锚点的3DGS压缩方案通过一些高级上下文模型减少了高斯冗余。然而，它们忽视了显式的几何依赖关系，导致结构退化和次优的速率失真性能。在本文中，我们提出了一种面向3DGS的局部几何感知分层上下文压缩框架（LG-HCC），将锚点减枝和熵编码相结合，以实现紧凑的表示。具体来说，我们引入了一种邻域感知锚点减枝（NAAP）策略，通过加权邻域特征聚合评估锚点重要性，然后将低贡献的锚点合并到显著的邻居中，产生一个紧凑而几何一致的锚点集。此外，我们进一步开发了一个分层熵编码方案，通过轻量级的几何引导卷积（GG-Conv）运算符利用粗到细的先验信息，实现空间自适应上下文建模和速率失真优化。广泛的实验表明，相比于Mip-NeRF360数据集上的Scaffold-GS基线，LG-HCC有效地缓解了结构保留问题，实现了卓越的几何完整性和渲染保真度，同时将存储减少了高达30.85倍。

更新时间: 2026-04-01 16:10:38

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2603.28431v3

Learning When the Concept Shifts: Confounding, Invariance, and Dimension Reduction

Practitioners often face the challenge of deploying prediction models in new environments with shifted distributions of covariates and responses. With observational data, such shifts are often driven by unobserved confounding, and can in fact alter the concept of which model is best. This paper studies distribution shifts in the domain adaptation problem with unobserved confounding. We postulate a linear structural causal model to account for endogeneity and unobserved confounding, and we leverage exogenous invariant covariate representations to cure concept shifts and improve target prediction. We propose a data-driven representation learning method that optimizes for a lower-dimensional linear subspace and a prediction model confined to that subspace. This method operates on a non-convex objective -- that interpolates between predictability and stability -- constrained to the Stiefel manifold, using an analog of projected gradient descent. We analyze the optimization landscape and prove that, provided sufficient regularization, nearly all local optima align with an invariant linear subspace resilient to distribution shifts. This method achieves a nearly ideal gap between target and source risk. We validate the method and theory with real-world data sets to illustrate the tradeoffs between predictability and stability.

Updated: 2026-04-01 16:08:36

标题: 学习当概念发生转变时：混淆、不变性和降维

摘要: 从业者经常面临在具有不同协变量和响应分布的新环境中部署预测模型的挑战。观察数据中，这种转变通常由未观察到的混淆驱动，事实上可能改变哪种模型最好的概念。本文研究了带有未观察混淆的领域适应问题中的分布转变。我们假设一个线性结构因果模型来解释内生性和未观察到的混淆，并利用外生不变协变量表示来治愈概念转变并改善目标预测。我们提出了一种数据驱动的表示学习方法，该方法优化一个低维线性子空间和一个局限于该子空间的预测模型。这种方法在一个非凸的目标函数上运行，该函数在可预测性和稳定性之间插值，受限于Stiefel流形，使用投影梯度下降的类比。我们分析了优化的景观，并证明，只要提供足够的正则化，几乎所有局部最优解都与对分布转变具有弹性的不变线性子空间对齐。该方法实现了目标和源风险之间的几乎理想差距。我们通过真实数据集验证了该方法和理论，以说明可预测性和稳定性之间的权衡。

更新时间: 2026-04-01 16:08:36

领域: cs.LG,stat.ME,stat.ML

下载: http://arxiv.org/abs/2406.15904v2

VibeGuard: A Security Gate Framework for AI-Generated Code

"Vibe coding," in which developers delegate code generation to AI assistants and accept the output with little manual review, has gained rapid adoption in production settings. On March 31, 2026, Anthropic's Claude Code CLI shipped a 59.8 MB source map file in its npm package, exposing roughly 512,000 lines of proprietary TypeScript. The tool had itself been largely vibe-coded, and the leak traced to a misconfigured packaging rule rather than a logic bug. Existing static-analysis and secret-scanning tools did not cover this failure mode, pointing to a gap between the vulnerabilities AI tends to introduce and the vulnerabilities current tooling is built to find. We present VibeGuard, a pre-publish security gate that targets five such blind spots: artifact hygiene, packaging-configuration drift, source-map exposure, hardcoded secrets, and supply-chain risk. In controlled experiments on eight synthetic projects (seven vulnerable, one clean control), VibeGuard achieved 100% recall, 89.47% precision (F1 = 94.44%), and correct pass/fail gate decisions on all eight projects across three policy levels. We discuss how these results inform a defense-in-depth workflow for teams that rely on AI code generation.

Updated: 2026-04-01 15:57:01

标题: VibeGuard：用于AI生成代码的安全门框架

摘要: “振动编码”是指开发人员将代码生成委托给AI助手，并只进行少量手动审查后接受输出，在生产环境中迅速得到采用。2026年3月31日，Anthropic的Claude Code CLI在其npm软件包中发布了一个59.8 MB的源映射文件，暴露了大约512,000行的专有TypeScript代码。该工具本身在很大程度上是通过振动编码生成的，泄漏源于一个配置错误的打包规则，而不是逻辑错误。现有的静态分析和秘密扫描工具未覆盖此故障模式，指向AI往往引入的漏洞与当前工具构建的漏洞发现之间存在差距。我们提出了VibeGuard，一个预发布安全门，针对五个这样的盲点进行监控：制品卫生、打包配置漂移、源映射曝光、硬编码的秘密和供应链风险。在八个合成项目（七个易受攻击的，一个干净的对照组）的控制实验中，VibeGuard实现了100%的召回率，89.47%的精度（F1 = 94.44%），并在三个策略级别的所有八个项目上做出了正确的通过/失败门决定。我们讨论了这些结果如何为依赖于AI代码生成的团队提供深度防御工作流程的信息。

更新时间: 2026-04-01 15:57:01

领域: cs.CR,cs.AI

下载: http://arxiv.org/abs/2604.01052v1

Learning Hyperparameters via a Data-Emphasized Variational Objective

When training large models on limited data, avoiding overfitting is paramount. Common grid search or smarter search methods rely on expensive separate runs for each candidate hyperparameter, while carving out a validation set that reduces available training data. In this paper, we study gradient-based learning of hyperparameters via the evidence lower bound (ELBO) objective from Bayesian variational methods. This avoids the need for any validation set. We focus on scenarios where the model is over-parameterized for flexibility and the approximate posterior is chosen to be Gaussian with isotropic covariance for tractability, even though it cannot match the true posterior. In such scenarios, we find the ELBO prioritizes posteriors that match the prior, leading to severe underfitting. Instead, we recommend a data-emphasized ELBO that upweights the likelihood but not the prior. In Bayesian transfer learning of image and text classifiers, our method reduces the 88+ hour grid search of past work to under 3 hours while delivering comparable accuracy. We further demonstrate how our approach enables efficient yet accurate approximations of Gaussian processes with learnable lengthscale kernels.

Updated: 2026-04-01 15:56:36

标题: 通过数据强调的变分目标学习超参数

摘要: 在有限数据上训练大型模型时，避免过拟合至关重要。常见的网格搜索或更智能的搜索方法依赖于昂贵的针对每个候选超参数的单独运行，同时削减验证集，从而减少可用的训练数据。在本文中，我们研究了通过贝叶斯变分方法的证据下界（ELBO）目标对超参数进行基于梯度的学习。这避免了对任何验证集的需要。我们专注于模型过度参数化以提高灵活性的场景，选择近似后验为具有各向同性协方差的高斯分布以便于处理，尽管它无法匹配真实后验。在这种情况下，我们发现ELBO优先考虑与先验匹配的后验，导致严重欠拟合。相反，我们推荐一种数据强调的ELBO，强调似然但不是先验。在贝叶斯迁移学习中，我们的方法将过去工作中超过88小时的网格搜索缩短到不到3小时，同时提供可比较的准确性。我们进一步展示了我们的方法如何实现对具有可学习长度标度核的高斯过程的高效而准确的近似。

更新时间: 2026-04-01 15:56:36

领域: cs.LG,stat.ML

下载: http://arxiv.org/abs/2502.01861v3

Graph-Dependent Regret Bounds in Multi-Armed Bandits with Interference

We study multi-armed bandits under network interference, where each unit's reward depends on its own treatment and those of its neighbors in a given graph. This induces an exponentially large action space, making standard approaches computationally impractical. We propose a novel algorithm that uses the local graph structure to minimize regret. We derive a graph-dependent upper bound on cumulative regret that improves over prior work. Additionally, we provide the first lower bounds for bandits with arbitrary network interference, where each bound involves a distinct structural property of the graph. These bounds show that for both dense and sparse graphs, our algorithm is nearly optimal, with matching upper and lower bounds up to logarithmic factors. When the interference graph is unknown, a variant of our algorithm is Pareto optimal: no algorithm can uniformly outperform it across all instances. We complement our theoretical results with numerical experiments, showing that our approach outperforms the baseline methods.

Updated: 2026-04-01 15:55:20

标题: 多臂老虎机中受图干扰的遗憾界限

摘要: 我们研究了在网络干扰下的多臂赌博机问题，其中每个单位的奖励取决于其自身的处理方式以及给定图中其邻居的处理方式。这导致了指数级的动作空间，使得标准方法在计算上变得不切实际。我们提出了一种利用局部图结构来最小化后悔的新算法。我们推导了一个依赖于图结构的累积后悔的上界，这个上界优于先前的工作。此外，我们还为具有任意网络干扰的赌博机提供了首个下界，其中每个下界都涉及图的一个独特结构属性。这些下界表明，对于密集和稀疏图，我们的算法几乎是最优的，与对数因子匹配的上界和下界。当干扰图未知时，我们算法的一个变种是帕累托最优的：没有算法可以在所有实例上均能优于它。我们通过数值实验补充了我们的理论结果，结果表明我们的方法胜过基线方法。

更新时间: 2026-04-01 15:55:20

领域: cs.LG

下载: http://arxiv.org/abs/2503.07555v3

CayleyPy Growth: Efficient growth computations and hundreds of new conjectures on Cayley graphs (Brief version)

This is the third paper of the CayleyPy project applying artificial intelligence to problems in group theory. We announce the first public release of CayleyPy, an open source Python library for computations with Cayley and Schreier graphs. Compared with systems such as GAP and Sage, CayleyPy handles much larger graphs and performs several orders of magnitude faster. Using CayleyPy we obtained about 200 new conjectures on Cayley and Schreier graphs, focused on diameters and growth. For many Cayley graphs of symmetric groups Sn we observe quasi polynomial diameter formulas: a small set of quadratic or linear polynomials indexed by n mod s. We conjecture that this is a general phenomenon, giving efficient diameter computation despite the problem being NP hard. We propose a refinement of the Babai type conjecture on diameters of Sn: n^2/2 + 4n upper bounds in the undirected case, compared to previous O(n^2) bounds. We also provide explicit generator families, related to involutions in a square with whiskers pattern, conjectured to maximize the diameter; search confirms this for all n up to 15. We further conjecture an answer to a question posed by V M Glushkov in 1968 on directed Cayley graphs generated by a cyclic shift and a transposition. For nilpotent groups we conjecture an improvement of J S Ellenberg's results on upper unitriangular matrices over Z/pZ, showing linear dependence of diameter on p. Some conjectures are LLM friendly, naturally stated as sorting problems verifiable by algorithms or Python code. To benchmark path finding we created more than 10 Kaggle datasets. CayleyPy works with arbitrary permutation or matrix groups and includes over 100 predefined generators. Our growth computation code outperforms GAP and Sage up to 1000 times in speed and size.

Updated: 2026-04-01 15:54:42

标题: CayleyPy Growth: 高效的生长计算和数百个关于Cayley图的新猜想（简版）

摘要: 这是 CayleyPy 项目的第三篇论文，将人工智能应用于群论问题。我们宣布了 CayleyPy 的首次公开发布，这是一个用于计算 Cayley 和 Schreier 图的开源 Python 库。与 GAP 和 Sage 等系统相比，CayleyPy 处理更大的图形，并且执行速度快几个数量级。使用 CayleyPy，我们在 Cayley 和 Schreier 图上得到了约200个新的猜想，重点放在直径和增长上。对于对称群 Sn 的许多 Cayley 图，我们观察到准多项式直径公式：一个由 n mod s 索引的一小组二次或线性多项式。我们猜想这是一个普遍现象，提供了高效的直径计算方法，尽管问题是 NP 难的。我们提出了对 Sn 的直径的 Babai 类型猜想的改进：在无向情况下，上界为 n^2/2 + 4n，与之前的 O(n^2) 上界相比。我们还提供了明确的生成器族，与在带有鬃毛图案的正方形中的逆元有关，猜想最大化直径；搜索确认了这一点，对所有 n 最多到 15。我们进一步猜想了 1968 年 V M Glushkov 提出的关于由循环移位和换位生成的有向 Cayley 图的问题的答案。对于幂零群，我们猜想了 J S Ellenberg 关于 Z/pZ 上的上三角矩阵的结果的改进，显示了直径与 p 的线性依赖性。一些猜想对 LLM 友好，自然陈述为由算法或 Python 代码验证的排序问题。为了对路径查找进行基准测试，我们创建了超过10个 Kaggle 数据集。CayleyPy 可以处理任意排列或矩阵群，并包含超过100个预定义的生成器。我们的增长计算代码在速度和大小上比 GAP 和 Sage 快上1000倍。

更新时间: 2026-04-01 15:54:42

领域: math.CO,cs.LG,hep-th,math.GR

下载: http://arxiv.org/abs/2509.19162v2

Adversarial Attacks in AI-Driven RAN Slicing: SLA Violations and Recovery

Next-generation (NextG) cellular networks are designed to support emerging applications with diverse data rate and latency requirements, such as immersive multimedia services and large-scale Internet of Things deployments. A key enabling mechanism is radio access network (RAN) slicing, which dynamically partitions radio resources into virtual resource blocks to efficiently serve heterogeneous traffic classes, including enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable low-latency communications (URLLC). In this paper, we study the impact of adversarial attacks on AI-driven RAN slicing decisions, where a budget-constrained adversary selectively jams slice transmissions to bias deep reinforcement learning (DRL)-based resource allocation, and quantify the resulting service level agreement (SLA) violations and post-attack recovery behavior. Our results indicate that budget-constrained adversarial jamming can induce severe and slice-dependent steady-state SLA violations. Moreover, the DRL agent's reward converges toward the clean baseline only after a non-negligible recovery period.

Updated: 2026-04-01 15:54:06

标题: 在AI驱动的RAN切片中的对抗性攻击：SLA违规和恢复

摘要: 下一代（NextG）蜂窝网络设计用于支持具有不同数据速率和延迟要求的新兴应用，例如沉浸式多媒体服务和大规模物联网部署。一个关键的启用机制是无线接入网络（RAN）切片，它将无线资源动态分割为虚拟资源块，以有效地为增强移动宽带（eMBB）、大规模机器类型通信（mMTC）和超可靠低延迟通信（URLLC）等异构流量类别提供服务。在本文中，我们研究对基于人工智能驱动的RAN切片决策的对抗性攻击的影响，其中一个预算受限的对手选择性地干扰切片传输，以偏导深度强化学习（DRL）为基础的资源分配，并量化由此产生的服务级别协议（SLA）违规和攻击后的恢复行为。我们的结果表明，预算受限的对抗性干扰可以引起严重的且与切片相关的稳态SLA违规。此外，仅在经过不可忽视的恢复期之后，DRL代理的奖励才会收敛到干净基线。

更新时间: 2026-04-01 15:54:06

领域: cs.NI,cs.AI

下载: http://arxiv.org/abs/2604.01049v1

RoboNeuron: A Middle-Layer Infrastructure for Agent-Driven Orchestration in Embodied AI

Vision-language-action (VLA) models and LLM agents have advanced rapidly, yet reliable deployment on physical robots is often hindered by an interface mismatch between agent tool APIs and robot middleware. Current implementations typically rely on ad-hoc wrappers that are difficult to reuse, and changes to the VLA backend or serving stack often necessitate extensive re-integration. We introduce RoboNeuron, a middleware layer that connects the Model Context Protocol (MCP) for LLM agents with robot middleware such as ROS2. RoboNeuron bridges these ecosystems by deriving agent-callable tools directly from ROS schemas, providing a unified execution abstraction that supports both direct commands and modular composition, and localizing backend, runtime, and acceleration-preset changes within a stable inference boundary. We evaluate RoboNeuron in simulation and on hardware through multi-platform base control, arm motion, and VLA-based grasping tasks, demonstrating that it enables modular system orchestration under a unified interface while supporting backend transitions without system rewiring. The full code implementation of this work is available at github repo: https://github.com/guanweifan/RoboNeuron

Updated: 2026-04-01 15:51:58

标题: RoboNeuron：一种用于具身人工智能中代理驱动编排的中间层基础设施

摘要: 视觉-语言-动作（VLA）模型和LLM代理已经迅速发展，但可靠地部署在物理机器人上常常受到代理工具API和机器人中间件之间接口不匹配的阻碍。目前的实现通常依赖于难以重复使用的临时包装器，并且对VLA后端或服务堆栈的更改通常需要进行广泛的重新集成。我们引入了RoboNeuron，它是一个连接LLM代理的模型上下文协议（MCP）与机器人中间件（如ROS2）的中间件层。RoboNeuron通过直接从ROS模式派生代理可调用工具，桥接了这些生态系统，提供了一个统一的执行抽象，支持直接命令和模块化组合，并将后端、运行时和加速器预设的更改定位在稳定的推断边界内。我们通过在模拟和硬件上进行多平台基础控制、臂运动和基于VLA的抓取任务来评估RoboNeuron，展示了它在一个统一接口下实现了模块化系统编排，同时支持后端过渡而无需重新布线系统。这项工作的完整代码实现可以在github存储库中找到：https://github.com/guanweifan/RoboNeuron

更新时间: 2026-04-01 15:51:58

领域: cs.RO,cs.LG

下载: http://arxiv.org/abs/2512.10394v2

Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL

Despite the significant advances in Deep Reinforcement Learning (RL) observed in the last decade, the amount of training experience necessary to learn effective policies remains one of the primary concerns in both simulated and real environments. Looking to solve this issue, previous work has shown that improved efficiency can be achieved by separately modeling the agent and environment, but usually requires a supervisory signal. In contrast to RL, humans can perfect a new skill from a small number of trials and often do so without a supervisory signal, making neuroscientific studies of human development a valuable source of inspiration for RL. In particular, we explore the idea of motor prediction, which states that humans develop an internal model of themselves and of the consequences that their motor commands have on the immediate sensory inputs. Our insight is that the movementofthe agent provides a cue that allows the duality between the agent and environment to be learned. To instantiate this idea, we present Ego-Foresight (EF), a self-supervised method for disentangling agent information based on motion and prediction. Our main finding is that, when used as an auxiliary task in feature learning, self-supervised agent awareness improves the sample-efficiency and performance of the underlying RL algorithm. To test our approach, we study the ability of EF to predict agent movement and disentangle agent information. Then, we integrate EF with model-free and model based RL algorithms to solve simulated control tasks, showing improved sample-efficiency and performance.

Updated: 2026-04-01 15:51:54

标题: 自我预见：自我监督学习智能体感知表示以改进强化学习

摘要: 尽管在过去十年中观察到了深度强化学习（RL）方面的重大进展，但在模拟和真实环境中学习有效策略所需的训练经验仍然是主要关注点之一。为了解决这个问题，先前的研究表明，通过分别对代理和环境建模，可以实现改进的效率，但通常需要监督信号。与RL相比，人类可以在少数尝试中完善新技能，通常不需要监督信号，这使得人类发展的神经科学研究对RL具有重要的启发作用。特别是，我们探讨了运动预测的概念，即人类发展出了关于自身及其运动命令对即时感觉输入产生的后果的内部模型。我们的见解是，代理的运动提供了一个线索，使得代理和环境之间的双重性得以学习。为了实现这一想法，我们提出了Ego-Foresight（EF），这是一种基于运动和预测的自监督方法，用于解开代理信息。我们的主要发现是，在特征学习的辅助任务中使用自监督代理意识能够提高底层RL算法的样本效率和性能。为了测试我们的方法，我们研究了EF预测代理运动和解开代理信息的能力。然后，我们将EF与基于模型和无模型的RL算法结合起来，以解决模拟控制任务，显示出改进的样本效率和性能。

更新时间: 2026-04-01 15:51:54

领域: cs.RO,cs.AI

下载: http://arxiv.org/abs/2407.01570v4

Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Large Language Models (LLMs) have advanced artificial intelligence by enabling human-like text generation and natural language understanding. However, their reliance on static training data limits their ability to respond to dynamic, real-time queries, resulting in outdated or inaccurate outputs. Retrieval-Augmented Generation (RAG) has emerged as a solution, enhancing LLMs by integrating real-time data retrieval to provide contextually relevant and up-to-date responses. Despite its promise, traditional RAG systems are constrained by static workflows and lack the adaptability required for multi-step reasoning and complex task management. Agentic Retrieval-Augmented Generation (Agentic RAG) transcends these limitations by embedding autonomous AI agents into the RAG pipeline. These agents leverage agentic design patterns reflection, planning, tool use, and multi-agent collaboration to dynamically manage retrieval strategies, iteratively refine contextual understanding, and adapt workflows through operational structures ranging from sequential steps to adaptive collaboration. This integration enables Agentic RAG systems to deliver flexibility, scalability, and context-awareness across diverse applications. This paper presents an analytical survey of Agentic RAG systems. It traces the evolution of RAG paradigms, introduces a principled taxonomy of Agentic RAG architectures based on agent cardinality, control structure, autonomy, and knowledge representation, and provides a comparative analysis of design trade-offs across existing frameworks. The survey examines applications in healthcare, finance, education, and enterprise document processing, and distills practical lessons for system designers and practitioners. Finally, it identifies key open research challenges related to evaluation, coordination, memory management, efficiency, and governance, outlining directions for future research.

Updated: 2026-04-01 15:51:06

标题: 主动检索增强生成：关于主动RAG的调查

摘要: 大型语言模型（LLMs）通过实现类似人类的文本生成和自然语言理解，推动了人工智能的发展。然而，它们依赖于静态训练数据，限制了对动态实时查询的响应能力，导致输出过时或不准确。检索增强生成（RAG）已经成为一种解决方案，通过集成实时数据检索来增强LLMs，以提供相关且最新的响应。尽管具有潜力，传统的RAG系统受到静态工作流程的限制，缺乏多步推理和复杂任务管理所需的适应性。主动检索增强生成（Agentic RAG）通过将自主AI代理嵌入到RAG管道中，超越了这些限制。这些代理利用主动设计模式反思、规划、工具使用和多代理协作，动态管理检索策略，通过从顺序步骤到自适应协作的操作结构迭代地完善上下文理解，并调整工作流程。这种集成使主动RAG系统能够在不同应用程序中提供灵活性、可扩展性和上下文感知。本文对主动RAG系统进行了分析调查。它追溯了RAG范式的演变，介绍了基于代理基数、控制结构、自治性和知识表示的主动RAG体系结构的原则性分类，并对现有框架中的设计权衡进行了比较分析。该调查研究了在医疗保健、金融、教育和企业文档处理中的应用，并总结了系统设计者和从业者的实用经验。最后，它确定了与评估、协调、内存管理、效率和治理相关的关键开放性研究挑战，并概述了未来研究方向。

更新时间: 2026-04-01 15:51:06

领域: cs.AI,cs.CL,cs.IR

下载: http://arxiv.org/abs/2501.09136v4

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

System Instructions in Large Language Models (LLMs) are commonly used to enforce safety policies, define agent behavior, and protect sensitive operational context in agentic AI applications. These instructions may contain sensitive information such as API credentials, internal policies, and privileged workflow definitions, making system instruction leakage a critical security risk highlighted in the OWASP Top 10 for LLM Applications. Without incurring the overhead costs of reasoning models, many LLM applications rely on refusal-based instructions that block direct requests for system instructions, implicitly assuming that prohibited information can only be extracted through explicit queries. We introduce an automated evaluation framework that tests whether system instructions remain confidential when extraction requests are re-framed as encoding or structured output tasks. Across four common models and 46 verified system instructions, we observe high attack success rates (> 0.7) for structured serialization where models refuse direct extraction requests but disclose protected content in the requested serialization formats. We further demonstrate a mitigation strategy based on one-shot instruction reshaping using a Chain-of-Thought reasoning model, indicating that even subtle changes in wording and structure of system instructions can significantly reduce attack success rate without requiring model retraining.

Updated: 2026-04-01 15:45:56

标题: 自动化框架用于评估和加固LLM系统指令以防编码攻击

摘要: 大型语言模型（LLMs）中的系统指令通常用于强制执行安全策略，定义代理行为，并保护代理式人工智能应用程序中的敏感操作上下文。这些指令可能包含敏感信息，如API凭据、内部政策和特权工作流定义，使系统指令泄漏成为OWASP LLM应用程序十大安全风险中的一个关键问题。许多LLM应用程序在不产生推理模型的开销的情况下，依赖于基于拒绝的指令，阻止直接请求系统指令，暗示被禁止的信息只能通过显式查询提取。我们引入了一个自动化评估框架，测试当提取请求被重新构建为编码或结构化输出任务时，系统指令是否保持机密性。在四种常见模型和46个经过验证的系统指令中，我们观察到对结构化序列化的高攻击成功率（> 0.7），其中模型拒绝直接提取请求，但在请求的序列化格式中透露受保护的内容。我们进一步展示了一种基于一次性指令重塑的缓解策略，使用链式思维推理模型，表明即使是系统指令的措辞和结构上的细微变化也可以显著降低攻击成功率，而无需重新训练模型。

更新时间: 2026-04-01 15:45:56

领域: cs.CR,cs.AI

下载: http://arxiv.org/abs/2604.01039v1

Aligning Recommendations with User Popularity Preferences

Popularity bias is a pervasive problem in recommender systems, where recommendations disproportionately favor popular items. This not only results in "rich-get-richer" dynamics and a homogenization of visible content, but can also lead to misalignment of recommendations with individual users' preferences for popular or niche content. This work studies popularity bias through the lens of user-recommender alignment. To this end, we introduce Popularity Quantile Calibration, a measurement framework that quantifies misalignment between a user's historical popularity preference and the popularity of their recommendations. Building on this notion of popularity alignment, we propose SPREE, an inference-time mitigation method for sequential recommenders based on activation steering. SPREE identifies a popularity direction in representation space and adaptively steers model activations based on an estimate of each user's personal popularity bias, allowing both the direction and magnitude of steering to vary across users. Unlike global debiasing approaches, SPREE explicitly targets alignment rather than uniformly reducing popularity. Experiments across multiple datasets show that SPREE consistently improves user-level popularity alignment while preserving recommendation quality.

Updated: 2026-04-01 15:45:24

标题: 将推荐与用户流行度偏好对齐

摘要: 流行度偏见在推荐系统中是一个普遍存在的问题，推荐内容倾向于偏向流行物品。这不仅导致了“富者愈富”的动态和可见内容的同质化，还可能导致推荐与个体用户对流行或利基内容的偏好不一致。本文通过用户-推荐器对齐的视角研究了流行度偏见。为此，我们引入了流行度分位数校准，这是一个衡量用户历史流行度偏好与其推荐流行度之间不一致的测量框架。基于这种流行度对齐的概念，我们提出了SPREE，这是一种基于激活导向的序列推荐器推理时间缓解方法。SPREE在表示空间中识别流行度方向，并根据每个用户个人流行度偏见的估计自适应地调整模型激活，允许在用户之间变化操纵的方向和幅度。与全局去偏见方法不同，SPREE明确地针对对齐而不是统一降低流行度。跨多个数据集的实验表明，SPREE始终提高了用户级别的流行度对齐，同时保持了推荐质量。

更新时间: 2026-04-01 15:45:24

领域: cs.IR,cs.AI,cs.CY

下载: http://arxiv.org/abs/2604.01036v1

Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines

Multi-LLM revision pipelines, in which a second model reviews and improves a draft produced by a first, are widely assumed to derive their gains from genuine error correction. We question this assumption with a controlled decomposition experiment that uses four matched conditions to separate second-pass gains into three additive components: re-solving, scaffold, and content. We evaluate this design across two model pairs on three benchmarks spanning knowledge-intensive MCQ and competitive programming. Our results show that the gains of multi-LLM revision are not monolithic, but depend on task structure, draft quality, and the type of draft information. On MCQ tasks, where the answer space is constrained and drafts provide little structural guidance, most gains are consistent with stronger-model re-solving, and directly routing queries to the stronger model can be more effective than revising a weak draft. On code generation tasks, however, two-stage prompting remains useful because even semantically null drafts can provide substantial structural scaffolding, while weak draft content can be harmful. Finally, role-reversed experiments show that strong drafts clearly benefit weak reviewers. Ultimately, our findings demonstrate that the utility of multi-LLM revision is dynamically bottlenecked by task structure and draft quality, necessitating more targeted pipeline designs rather than blanket revision strategies.

Updated: 2026-04-01 15:39:40

标题: 修订还是重新解决？分解多LLM管道中的二次增益

摘要: 多LLM修订管道中，第二模型审查和改进第一模型生成的草稿，广泛被认为其收益来源于真正的错误纠正。我们通过一个控制性的分解实验质疑了这一假设，该实验使用了四个匹配条件将第二次通过的收益分为三个可加组件：重新解决、脚手架和内容。我们在跨越知识密集型MCQ和竞争性编程的三个基准上评估了这一设计。我们的结果显示，多LLM修订的收益并不是单一的，而是取决于任务结构、草稿质量和草稿信息类型。在MCQ任务中，答案空间受限且草稿提供的结构指导有限，大部分收益与更强模型的重新解决一致，直接将查询路由到更强模型可能比修订弱草稿更有效。然而，在代码生成任务中，两阶段提示仍然有用，因为即使语义上为空的草稿也可以提供大量结构支撑，而弱草稿内容可能是有害的。最后，反转角色的实验表明，强草稿显然有益于弱审阅者。最终，我们的研究结果表明，多LLM修订的效用受到任务结构和草稿质量的动态瓶颈限制，需要更有针对性的管道设计而不是一揽子修订策略。

更新时间: 2026-04-01 15:39:40

领域: cs.SE,cs.AI,cs.CL

下载: http://arxiv.org/abs/2604.01029v1

No-Regret Generative Modeling via Parabolic Monge-Ampère PDE

We introduce a novel generative modeling framework based on a discretized parabolic Monge-Ampère PDE, which emerges as a continuous limit of the Sinkhorn algorithm commonly used in optimal transport. Our method performs iterative refinement in the space of Brenier maps using a mirror gradient descent step. We establish theoretical guarantees for generative modeling through the lens of no-regret analysis, demonstrating that the iterates converge to the optimal Brenier map under a variety of step-size schedules. As a technical contribution, we derive a new Evolution Variational Inequality tailored to the parabolic Monge-Ampère PDE, connecting geometry, transportation cost, and regret. Our framework accommodates non-log-concave target distributions, constructs an optimal sampling process via the Brenier map, and integrates favorable learning techniques from generative adversarial networks and score-based diffusion models. As direct applications, we illustrate how our theory paves new pathways for generative modeling and variational inference.

Updated: 2026-04-01 15:34:05

标题: 通过抛物蒙日安培偏微分方程实现无悔生成建模

摘要: 我们引入了一种基于离散抛物蒙日-安培偏微分方程的新型生成建模框架，该方程作为常用于最优输运的Sinkhorn算法的连续极限而出现。我们的方法在Brenier映射空间中使用镜像梯度下降步骤进行迭代细化。通过非后悔分析的视角建立了生成建模的理论保证，证明在各种步长计划下迭代收敛到最佳Brenier映射。作为技术贡献，我们推导了一种针对抛物蒙日-安培偏微分方程量身定制的新型演化变分不等式，将几何、输运成本和后悔联系起来。我们的框架适应非对数凹目标分布，通过Brenier映射构建最优采样过程，并整合了生成对抗网络和基于得分的扩散模型的有利学习技术。作为直接应用，我们阐明了我们的理论如何为生成建模和变分推理开辟新的路径。

更新时间: 2026-04-01 15:34:05

领域: stat.ML,cs.LG,math.OC,math.ST

下载: http://arxiv.org/abs/2504.09279v2

Fast and Accurate Probing of In-Training LLMs' Downstream Performances

The paradigm of scaling Large Language Models (LLMs) in both parameter size and test time has pushed the boundaries of AI capabilities, but at the cost of making the traditional generative evaluation paradigm prohibitively expensive, therefore making the latency of LLM's in-training downstream performance evaluation unbearable. However, simple metrics like training loss (perplexity) are not always correlated with downstream performance, as sometimes their trends diverge from the actual task outcomes. This dilemma calls for a method that is computationally efficient and sufficiently accurate in measuring model capabilities. To address this challenge, we introduce a new in-training evaluation paradigm that uses a lightweight probe for monitoring downstream performance. The probes take the internal representations of LLM checkpoints (during training) as input and directly predict the checkpoint's performance on downstream tasks measured by success probability (i.e., pass@1). We design several probe architectures, validating their effectiveness using the OLMo3-7B's checkpoints across a diverse set of downstream tasks. The probes can accurately predict a checkpoint's performance (with avg. AUROC$>$0.75), have decent generalizability across checkpoints (earlier predicts later), and reduce the computation latency from $\sim$1 hr (using conventional generative evaluation method) to $\sim$3 min. In sum, this work presents a practical and scalable in-training downstream evaluation paradigm, enabling a more agile, informed, and efficient LLM development process.

Updated: 2026-04-01 15:32:56

标题: 快速准确地探测在训练中的LLMs的下游表现

摘要: 在参数大小和测试时间方面扩展大型语言模型（LLMs）的范式已经推动了人工智能能力的边界，但是以使传统的生成式评估范式变得代价高昂为代价，因此使得LLM在训练后续性能评估的延迟变得无法忍受。然而，像训练损失（困惑度）这样的简单指标并不总是与后续性能相关联，因为有时它们的趋势与实际任务结果相背离。这种困境需要一种在衡量模型能力时既具有计算效率又足够准确的方法。为了解决这一挑战，我们引入了一种新的在训练中评估范式，该范式使用轻量级探测器来监测后续性能。这些探测器以LLM检查点（在训练过程中）的内部表示为输入，并直接预测检查点在后续任务上的表现，以成功概率（即pass@1）来衡量。我们设计了几种探测器架构，并使用OLMo3-7B的检查点在各种后续任务中验证了它们的有效性。这些探测器可以准确预测检查点的表现（平均AUROC>0.75），在检查点之间具有相当好的泛化能力（早期的预测后期的），并将计算延迟从约1小时（使用传统的生成式评估方法）减少到约3分钟。总之，这项工作提出了一种实用且可扩展的在训练中的后续性能评估范式，实现了更敏捷、明智和高效的LLM开发过程。

更新时间: 2026-04-01 15:32:56

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.01025v1

Model-Based Learning of Near-Optimal Finite-Window Policies in POMDPs

We study model-based learning of finite-window policies in tabular partially observable Markov decision processes (POMDPs). A common approach to learning under partial observability is to approximate unbounded history dependencies using finite action-observation windows. This induces a finite-state Markov decision process (MDP) over histories, referred to as the superstate MDP. Once a model of this superstate MDP is available, standard MDP algorithms can be used to compute optimal policies, motivating the need for sample-efficient model estimation. Estimating the superstate MDP model is challenging because trajectories are generated by interaction with the original POMDP, creating a mismatch between the sampling process and target model. We propose a model estimation procedure for tabular POMDPs and analyze its sample complexity. Our analysis exploits a connection between filter stability and concentration inequalities for weakly dependent random variables. As a result, we obtain tight sample complexity guarantees for estimating the superstate MDP model from a single trajectory. Combined with value iteration, this yields approximately optimal finite-window policies for the POMDP.

Updated: 2026-04-01 15:32:47

标题: 基于模型的学习在POMDPs中近似最优有限窗口策略

摘要: 我们研究了在表格型部分可观察马尔可夫决策过程（POMDPs）中模型基础学习有限窗口策略的方法。在部分可观察性下学习的一种常见方法是使用有限动作-观察窗口来近似无界的历史依赖关系。这导致了一个有限状态的马尔可夫决策过程（MDP）在历史上，被称为超状态MDP。一旦有了这个超状态MDP的模型，标准的MDP算法可以用来计算最优策略，促使需要样本高效的模型估计。估计超状态MDP模型具有挑战性，因为轨迹是通过与原始POMDP的交互生成的，从而在采样过程和目标模型之间产生不匹配。我们提出了一种适用于表格型POMDPs的模型估计过程，并分析了其样本复杂性。我们的分析利用了滤波器稳定性与弱相关随机变量的集中不等式之间的联系。因此，我们从单个轨迹中获得了估计超状态MDP模型的严格样本复杂性保证。结合价值迭代，这为POMDP提供了近似最优的有限窗口策略。

更新时间: 2026-04-01 15:32:47

领域: cs.LG

下载: http://arxiv.org/abs/2604.01024v1

OmniFusion: Simultaneous Multilingual Multimodal Translations via Modular Fusion

There has been significant progress in open-source text-only translation large language models (LLMs) with better language coverage and quality. However, these models can be only used in cascaded pipelines for speech translation (ST), performing automatic speech recognition first followed by translation. This introduces additional latency, which is particularly critical in simultaneous ST (SimulST), and prevents the model from exploiting multimodal context, such as images, which can aid disambiguation. Pretrained multimodal foundation models (MMFMs) already possess strong perception and reasoning capabilities across multiple modalities, but generally lack the multilingual coverage and specialized translation performance of dedicated translation LLMs. To build an effective multimodal translation system, we propose an end-to-end approach that fuses MMFMs with translation LLMs. We introduce a novel fusion strategy that connects hidden states from multiple layers of a pretrained MMFM to a translation LLM, enabling joint end-to-end training. The resulting model, OmniFusion, built on Omni 2.5-7B as the MMFM and SeedX PPO-7B as the translation LLM, can perform speech-to-text, speech-and-image-to-text, and text-and-image-to-text translation. Experiments demonstrate that OmniFusion effectively leverages both audio and visual inputs, achieves a 1-second latency reduction in SimulST compared to cascaded pipelines and also improves the overall translation quality\footnote{Code is available at https://github.com/saikoneru/OmniFusion}.

Updated: 2026-04-01 15:30:27

标题: OmniFusion: 通过模块化融合实现同时多语言多模式翻译

摘要: 在开源文本翻译大型语言模型（LLMs）方面取得了显著进展，具有更好的语言覆盖范围和质量。然而，这些模型只能在级联管道中用于语音翻译（ST），首先进行自动语音识别，然后进行翻译。这引入了额外的延迟，这在同时进行的ST（SimulST）中特别关键，并阻止模型利用多模态上下文，例如图像，这可以帮助消除歧义。预先训练的多模态基础模型（MMFMs）已经具有跨多种模态的强大感知和推理能力，但通常缺乏专门翻译LLMs的多语言覆盖和专门翻译性能。为了构建一个有效的多模态翻译系统，我们提出了一种端到端的方法，将MMFMs与翻译LLMs融合在一起。我们引入了一种新颖的融合策略，将预训练的MMFM的多个层的隐藏状态连接到翻译LLM，实现联合端到端训练。结果模型OmniFusion建立在Omni 2.5-7B作为MMFM和SeedX PPO-7B作为翻译LLM，可以进行语音到文本，语音和图像到文本，文本和图像到文本的翻译。实验表明，OmniFusion有效利用了音频和视觉输入，在SimulST中比级联管道实现了1秒的延迟降低，并且还提高了整体翻译质量。【代码可在https://github.com/saikoneru/OmniFusion找到】。

更新时间: 2026-04-01 15:30:27

领域: cs.CL,cs.AI

下载: http://arxiv.org/abs/2512.00234v2

Cognitive Friction: A Decision-Theoretic Framework for Bounded Deliberation in Tool-Using Agents

Autonomous tool-using agents operating in networked environments must decide which information source to query and when to stop querying and act. Without principled bounds on information-acquisition costs, unconstrained agents exhibit systematic failure modes: excessive tool use under congestion, prolonged deliberation under time decay, and brittle behavior under ambiguous evidence. We propose the Triadic Cognitive Architecture (TCA), a unified decision-theoretic framework that formalizes these failure modes through the concept of Cognitive Friction. By synthesizing nonlinear filtering theory, congestion-dependent cost dynamics, and HJB optimal stopping, we model deliberation as a stochastic control problem over a joint belief-congestion state space, where information acquisition is explicitly priced by tool-dependent signal quality and live network load. Rather than relying on arbitrary heuristic stop-tokens or fixed query budgets, TCA derives an HJB-inspired stopping boundary and instantiates a computable rollout-based approximation of belief-dependent value-of-information with a net-utility halting condition. We validate the framework on two controlled simulation environments, the Emergency Medical Diagnostic Grid (EMDG) and the Network Security Triage Grid (NSTG), designed to isolate key decision-theoretic quantities under reproducible conditions. TCA reduces time-to-action while improving resource outcomes without degrading accuracy: over greedy baselines, TCA gains 36 viability points in EMDG and 33 integrity points in NSTG. Ablations confirm joint optimization of selection and stopping is essential; stopping rules alone recover at most 4 viability points. A sensitivity sweep over alpha, beta, lambda_S shows stable accuracy and interpretable tradeoffs; an empirical sweep over eta in {0, 0.1, 0.3, 0.5} confirms eta=0 is optimal on EMDG trajectories under high temporal urgency.

Updated: 2026-04-01 15:27:58

标题: 认知摩擦：一个决策理论框架，用于工具使用代理的有限思考

摘要: 在网络环境中运行的自主使用工具的代理必须决定查询哪个信息源以及何时停止查询并采取行动。在没有信息获取成本的原则性界限的情况下，不受限制的代理会表现出系统性故障模式：在拥堵情况下过度使用工具，在时间衰减下延长思考时间，在模糊证据下表现脆弱。我们提出了三元认知架构（TCA），这是一个统一的决策理论框架，通过认知摩擦的概念形式化了这些故障模式。通过合成非线性滤波理论、拥堵相关成本动态和HJB最优停止，我们将思考建模为一个关于联合信念-拥堵状态空间的随机控制问题，在这里信息获取明确地由与工具相关的信号质量和活动网络负载定价。TCA不依赖于任意的启发式停止令牌或固定的查询预算，而是推导出一个受HJB启发的停止边界，并实例化一个基于信念相关信息价值的可计算的基于展开的逼近，带有净效用停止条件。我们在两个受控模拟环境中验证了该框架，急诊医学诊断网格（EMDG）和网络安全分类网格（NSTG），旨在在可重复的条件下隔离关键的决策理论量。TCA减少了行动时间，同时改善资源结果而不降低准确性：与贪婪基线相比，在EMDG中获得了36个生存点，在NSTG中获得了33个完整性点。消融实验证实了选择和停止的联合优化是必不可少的；仅有停止规则最多恢复4个生存点。对alpha、beta、lambda_S进行的敏感度扫描显示了稳定的准确性和可解释的权衡；在{0, 0.1, 0.3, 0.5}上进行的经验扫描确认在高时间紧迫性下EMDG轨迹中eta=0是最佳的。

更新时间: 2026-04-01 15:27:58

领域: cs.AI

下载: http://arxiv.org/abs/2603.30031v2

Transfer learning for nonparametric Bayesian networks

This paper introduces two transfer learning methodologies for estimating nonparametric Bayesian networks under scarce data. We propose two algorithms, a constraint-based structure learning method, called PC-stable-transfer learning (PCS-TL), and a score-based method, called hill climbing transfer learning (HC-TL). We also define particular metrics to tackle the negative transfer problem in each of them, a situation in which transfer learning has a negative impact on the model's performance. Then, for the parameters, we propose a log-linear pooling approach. For the evaluation, we learn kernel density estimation Bayesian networks, a type of nonparametric Bayesian network, and compare their transfer learning performance with the models alone. To do so, we sample data from small, medium and large-sized synthetic networks and datasets from the UCI Machine Learning repository. Then, we add noise and modifications to these datasets to test their ability to avoid negative transfer. To conclude, we perform a Friedman test with a Bergmann-Hommel post-hoc analysis to show statistical proof of the enhanced experimental behavior of our methods. Thus, PCS-TL and HC-TL demonstrate to be reliable algorithms for improving the learning performance of a nonparametric Bayesian network with scarce data, which in real industrial environments implies a reduction in the required time to deploy the network.

Updated: 2026-04-01 15:25:46

标题: 非参数贝叶斯网络的迁移学习

摘要: 这篇论文介绍了两种用于估计稀缺数据下的非参数贝叶斯网络的迁移学习方法。我们提出了两种算法，一种是基于约束的结构学习方法，称为PC-stable-transfer learning（PCS-TL），另一种是基于得分的方法，称为hill climbing transfer learning（HC-TL）。我们还定义了特定的度量标准来解决它们中的负迁移问题，即迁移学习对模型性能产生负面影响的情况。然后，对于参数，我们提出了一种对数线性池化方法。在评估方面，我们学习了核密度估计贝叶斯网络，一种非参数贝叶斯网络，并将它们的迁移学习性能与单独模型进行了比较。为此，我们从UCI机器学习仓库中的小、中和大规模合成网络和数据集中抽样数据。然后，我们向这些数据集添加噪声和修改，以测试它们避免负迁移的能力。最后，我们进行了一项Friedman检验，并进行了Bergmann-Hommel事后分析，以显示我们方法的增强实验行为的统计证据。因此，PCS-TL和HC-TL证明是可靠的算法，可以提高稀缺数据下非参数贝叶斯网络的学习性能，在真实的工业环境中，这意味着减少部署网络所需的时间。

更新时间: 2026-04-01 15:25:46

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.01021v1

RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks

Vision-Language-Action (VLA) systems have shown strong potential for language-driven robotic manipulation. However, scaling them to long-horizon tasks remains challenging. Existing pipelines typically separate data collection, policy learning, and deployment, resulting in heavy reliance on manual environment resets and brittle multi-policy execution. We present RoboClaw, an agentic robotics framework that unifies data collection, policy learning, and task execution under a single VLM-driven controller. At the policy level, RoboClaw introduces Entangled Action Pairs (EAP), which couple forward manipulation behaviors with inverse recovery actions to form self-resetting loops for autonomous data collection. This mechanism enables continuous on-policy data acquisition and iterative policy refinement with minimal human intervention. During deployment, the same agent performs high-level reasoning and dynamically orchestrates learned policy primitives to accomplish long-horizon tasks. By maintaining consistent contextual semantics across collection and execution, RoboClaw reduces mismatch between the two phases and improves multi-policy robustness. Experiments in real-world manipulation tasks demonstrate improved stability and scalability compared to conventional open-loop pipelines, while significantly reducing human effort throughout the robot lifecycle, achieving a 25% improvement in success rate over baseline methods on long-horizon tasks and reducing human time investment by 53.7%.

Updated: 2026-04-01 15:22:08

标题: RoboClaw：用于可扩展长期规划机器任务的代理框架

摘要: 视觉-语言-动作（VLA）系统已显示出在以语言驱动的机器人操作中具有强大潜力。然而，将它们扩展到长期任务仍然具有挑战性。现有的流程通常将数据收集、策略学习和部署分开，导致对手动环境重置和脆弱的多策略执行依赖严重。我们提出了RoboClaw，这是一个整合数据收集、策略学习和任务执行的主动机器人框架，统一了一个VLM驱动的控制器。在策略级别上，RoboClaw引入了纠缠动作对（EAP），将正向操作行为与逆向恢复操作耦合在一起，形成自动重置循环，用于自主数据收集。这种机制使得连续的策略数据采集和最少人为干预的迭代策略细化成为可能。在部署阶段，同一个代理执行高层推理，并动态编排学习的策略原语以完成长期任务。通过在收集和执行过程中保持一致的上下文语义，RoboClaw减少了两个阶段之间的不匹配，并提高了多策略的鲁棒性。在真实世界的操作任务中的实验表明，与传统的开环流程相比，稳定性和可扩展性得到了改善，同时在机器人生命周期中显著减少了人力投入，长期任务的成功率比基准方法提高了25%，人力投入减少了53.7%。

更新时间: 2026-04-01 15:22:08

领域: cs.RO,cs.AI

下载: http://arxiv.org/abs/2603.11558v3

OrgAgent: Organize Your Multi-Agent System like a Company

While large language model-based multi-agent systems have shown strong potential for complex reasoning, how to effectively organize multiple agents remains an open question. In this paper, we introduce OrgAgent, a company-style hierarchical multi-agent framework that separates collaboration into governance, execution, and compliance layers. OrgAgent decomposes multi-agent reasoning into three layers: a governance layer for planning and resource allocation, an execution layer for task solving and review, and a compliance layer for final answer control. By evaluating the framework across reasoning tasks, LLMs, execution modes, and execution policies, we find that multi-agent systems organized in a company-style hierarchy generally outperform other organizational structures. Besides, hierarchical coordination also reduces token consumption relative to flat collaboration in most settings. For example, for GPT-OSS-120B, the hierarchical setting improves performance over flat multi-agent system by 102.73% while reducing token usage by 74.52% on SQuAD 2.0. Further analysis shows that hierarchy helps most when tasks benefit from stable skill assignment, controlled information flow, and layered verification. Overall, our findings highlight organizational structure as an important factor in multi-agent reasoning, shaping not only effectiveness and cost, but also coordination behavior.

Updated: 2026-04-01 15:21:14

标题: OrgAgent：像公司一样组织您的多智能体系统

摘要: 基于大型语言模型的多智能体系统展现出了在复杂推理方面的强大潜力，然而如何有效地组织多个智能体仍然是一个悬而未决的问题。在本文中，我们介绍了OrgAgent，这是一个类似于公司的分层多智能体框架，将协作分为治理、执行和合规三个层次。OrgAgent将多智能体推理分解为三个层次：一个用于规划和资源分配的治理层，一个用于任务解决和审查的执行层，以及一个用于最终答案控制的合规层。通过对跨推理任务、LLMs、执行模式和执行策略的框架进行评估，我们发现以公司形式层次化组织的多智能体系统通常优于其他组织结构。此外，层次协调也在大多数情况下减少了代币的消耗相对于平面协作。例如，对于GPT-OSS-120B，在SQuAD 2.0上，层次设置的性能比平面多智能体系统提高了102.73%，同时减少了74.52%的代币使用。进一步分析显示，当任务受益于稳定的技能分配、受控的信息流和分层验证时，层次结构最有帮助。总的来说，我们的发现强调了组织结构作为多智能体推理中的一个重要因素，不仅塑造了有效性和成本，还影响了协调行为。

更新时间: 2026-04-01 15:21:14

领域: cs.MA,cs.AI

下载: http://arxiv.org/abs/2604.01020v1

AutoMIA: Improved Baselines for Membership Inference Attack via Agentic Self-Exploration

Membership Inference Attacks (MIAs) serve as a fundamental auditing tool for evaluating training data leakage in machine learning models. However, existing methodologies predominantly rely on static, handcrafted heuristics that lack adaptability, often leading to suboptimal performance when transferred across different large models. In this work, we propose AutoMIA, an agentic framework that reformulates membership inference as an automated process of self-exploration and strategy evolution. Given high-level scenario specifications, AutoMIA self-explores the attack space by generating executable logits-level strategies and progressively refining them through closed-loop evaluation feedback. By decoupling abstract strategy reasoning from low-level execution, our framework enables a systematic, model-agnostic traversal of the attack search space. Extensive experiments demonstrate that AutoMIA consistently matches or outperforms state-of-the-art baselines while eliminating the need for manual feature engineering.

Updated: 2026-04-01 15:17:45

标题: AutoMIA：通过主动自我探索改进的成员推断攻击基线

摘要: 成员推理攻击（MIAs）作为评估机器学习模型训练数据泄漏的基本审计工具。然而，现有方法主要依赖于静态、手工制定的启发式，缺乏适应性，通常在不同的大型模型之间转移时性能不佳。在这项工作中，我们提出了AutoMIA，一个主动框架，将会员推理重新构想为自我探索和策略演变的自动化过程。在给定高级场景规范的情况下，AutoMIA通过生成可执行的对数级策略自我探索攻击空间，并通过闭环评估反馈逐渐完善它们。通过将抽象策略推理与低级执行解耦，我们的框架实现了对攻击搜索空间的系统化、与模型无关的遍历。大量实验证明，AutoMIA始终与或优于最先进的基线，同时消除了手动特征工程的需求。

更新时间: 2026-04-01 15:17:45

领域: cs.CR,cs.CV

下载: http://arxiv.org/abs/2604.01014v1

A Gaussian Process View on Observation Noise and Initialization in Wide Neural Networks

Performing gradient descent in a wide neural network is equivalent to computing the posterior mean of a Gaussian Process with the Neural Tangent Kernel (NTK-GP), for a specific prior mean and with zero observation noise. However, existing formulations have two limitations: (i) the NTK-GP assumes noiseless targets, leading to misspecification on noisy data; (ii) the equivalence does not extend to arbitrary prior means, which are essential for well-specified models. To address (i), we introduce a regularizer into the training objective, showing its correspondence to incorporating observation noise in the NTK-GP. To address (ii), we propose a \textit{shifted network} that enables arbitrary prior means and allows obtaining the posterior mean with gradient descent on a single network, without ensembling or kernel inversion. We validate our results with experiments across datasets and architectures, showing that this approach removes key obstacles to the practical use of NTK-GP equivalence in applied Gaussian process modeling.

Updated: 2026-04-01 15:17:25

标题: 宽神经网络中观测噪声和初始化的高斯过程视角

摘要: 在一个宽神经网络中执行梯度下降等同于使用神经切向核（NTK-GP）计算高斯过程的后验均值，具有特定的先验均值并且没有观测噪声。然而，现有的公式存在两个限制：（i）NTK-GP假设无噪声目标，导致在嘈杂数据上误规范化；（ii）等价性不适用于任意先验均值，这对于规范良好的模型至关重要。为了解决（i），我们将一个正则化器引入到训练目标中，展示它与在NTK-GP中加入观测噪声的对应性。为了解决（ii），我们提出了一个“移位网络”，可以实现任意先验均值，并且允许在单个网络上使用梯度下降获取后验均值，而无需集成或核反演。我们通过跨数据集和架构的实验证实了我们的结果，表明这种方法消除了应用高斯过程建模中NTK-GP等价性的关键障碍。

更新时间: 2026-04-01 15:17:25

领域: cs.LG,stat.ML

下载: http://arxiv.org/abs/2502.01556v3

OmniMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

AI agents increasingly operate over extended time horizons, yet their ability to retain, organize, and recall multimodal experiences remains a critical bottleneck. Building effective lifelong memory requires navigating a vast design space spanning architecture, retrieval strategies, prompt engineering, and data pipelines; this space is too large and interconnected for manual exploration or traditional AutoML to explore effectively. We deploy an autonomous research pipeline to discover OmniMem, a unified multimodal memory framework for lifelong AI agents. Starting from a naïve baseline (F1=0.117 on LoCoMo), the pipeline autonomously executes ${\sim}50$ experiments across two benchmarks, diagnosing failure modes, proposing architectural modifications, and repairing data pipeline bugs, all without human intervention in the inner loop. The resulting system achieves state-of-the-art on both benchmarks, improving F1 by +411% on LoCoMo (0.117$\to$0.598) and +214% on Mem-Gallery (0.254$\to$0.797) relative to the initial configurations. Critically, the most impactful discoveries are not hyperparameter adjustments: bug fixes (+175%), architectural changes (+44%), and prompt engineering (+188\% on specific categories) each individually exceed the cumulative contribution of all hyperparameter tuning, demonstrating capabilities fundamentally beyond the reach of traditional AutoML. We provide a taxonomy of six discovery types and identify four properties that make multimodal memory particularly suited for autoresearch, offering guidance for applying autonomous research pipelines to other AI system domains. Code is available at this https://github.com/aiming-lab/OmniMem.

Updated: 2026-04-01 15:06:23

标题: OmniMem：自主研究引导下的终身多模态代理记忆发现

摘要: AI代理越来越在延长时间范围内运行，然而它们保留、组织和回忆多模式经验的能力仍然是一个关键瓶颈。建立有效的终身记忆需要在涵盖架构、检索策略、提示工程和数据管道的广阔设计空间中进行导航；这个空间对于手动探索或传统的AutoML来说太大且相互连接，难以有效地探索。我们部署了一个自主研究管道来发现OmniMem，一个统一的多模式记忆框架，用于终身AI代理。从一个天真的基线（在LoCoMo上的F1=0.117）开始，该管道自主地在两个基准测试中执行约50个实验，诊断失败模式，提出架构修改，并修复数据管道错误，所有这些都在内部循环中无需人工干预。结果系统在两个基准测试中均实现了最新技术水平，相对于初始配置，LoCoMo的F1提高了+411%（0.117→0.598），Mem-Gallery提高了+214%（0.254→0.797）。至关重要的是，最有影响力的发现并不是超参数调整：错误修复（+175%）、架构更改（+44%）和提示工程（在特定类别上+188%）每个单独超过所有超参数调整的累积贡献，展示了根本超出传统AutoML范围的能力。我们提供了六种发现类型的分类法，并确定了使多模式记忆特别适合自动研究的四个属性，为将自主研究管道应用于其他AI系统领域提供指导。代码可在https://github.com/aiming-lab/OmniMem上找到。

更新时间: 2026-04-01 15:06:23

领域: cs.AI

下载: http://arxiv.org/abs/2604.01007v1

Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

Recent advances in large language models have improved the capabilities of coding agents, yet systematic evaluation of complex, end-to-end website development remains limited. To address this gap, we introduce Vision2Web, a hierarchical benchmark for visual website development, spanning from static UI-to-code generation, interactive multi-page frontend reproduction, to long-horizon full-stack website development. The benchmark is constructed from real-world websites and comprises a total of 193 tasks across 16 categories, with 918 prototype images and 1,255 test cases. To support flexible, thorough and reliable evaluation, we propose workflow-based agent verification paradigm based on two complementary components: a GUI agent verifier and a VLM-based judge. We evaluate multiple visual language models instantiated under different coding-agent frameworks, revealing substantial performance gaps at all task levels, with state-of-the-art models still struggling on full-stack development.

Updated: 2026-04-01 15:06:02

标题: Vision2Web：一个用于视觉网站开发的层次基准，带有代理验证

摘要: 最近对大型语言模型的进展提高了编码代理的能力，然而对复杂的端到端网站开发的系统评估仍然有限。为了填补这一空白，我们引入了Vision2Web，这是一个用于视觉网站开发的分层基准，涵盖了从静态UI到代码生成、交互式多页面前端复制，以及长期全栈网站开发。该基准是从现实网站构建的，涵盖了16个类别的总共193个任务，包括918个原型图像和1255个测试用例。为了支持灵活、彻底和可靠的评估，我们提出了基于工作流的代理验证范式，由两个互补组件组成：GUI代理验证器和基于VLM的评判者。我们评估了多个在不同编码代理框架下实例化的视觉语言模型，揭示了在所有任务级别上存在实质性的性能差距，即使是最先进的模型在全栈开发上仍然存在困难。

更新时间: 2026-04-01 15:06:02

领域: cs.SE,cs.AI

下载: http://arxiv.org/abs/2603.26648v2

Query-Conditioned Evidential Keyframe Sampling for MLLM-Based Long-Form Video Understanding

Multimodal Large Language Models (MLLMs) have shown strong performance on video question answering, but their application to long-form videos is constrained by limited context length and computational cost, making keyframe sampling essential. Existing approaches typically rely on semantic relevance or reinforcement learning, which either fail to capture evidential clues or suffer from inefficient combinatorial optimization. In this work, we propose an evidence-driven keyframe sampling framework grounded in information bottleneck theory. We formulate keyframe selection as maximizing the conditional mutual information between selected frames and the query, providing a principled objective that reflects each frame's contribution to answering the question. To make this objective tractable, we exploit its structure to derive a decomposed optimization that reduces subset selection to independent frame-level scoring. We further introduce a query-conditioned evidence scoring network trained with a contrastive objective to estimate evidential importance efficiently. Experiments on long-form video understanding benchmarks show that our method consistently outperforms prior sampling strategies under strict token budgets, while significantly improving training efficiency.

Updated: 2026-04-01 15:02:50

标题: 基于MLLM的长视频理解的查询条件证据关键帧采样

摘要: 多模态大语言模型（MLLMs）在视频问答方面表现出色，但在长篇视频中的应用受到有限的上下文长度和计算成本的限制，使得关键帧抽样成为必要。现有方法通常依赖于语义相关性或强化学习，但这两种方法要么无法捕获证据线索，要么受到组合优化效率低下的限制。在这项工作中，我们提出了一个基于信息瓶颈理论的证据驱动的关键帧抽样框架。我们将关键帧选择形式化为最大化选定帧与查询之间的条件互信息，提供了一个体现每个帧对回答问题的贡献的原则性目标。为了使这个目标可行，我们利用其结构推导出了一个分解优化，将子集选择简化为独立的帧级评分。我们进一步引入了一个经过对比目标训练的查询条件证据评分网络，以有效地估计证据重要性。在长篇视频理解基准测试中的实验表明，我们的方法在严格的令牌预算下始终优于先前的抽样策略，同时显着提高了训练效率。

更新时间: 2026-04-01 15:02:50

领域: cs.CV,cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.01002v1

EgoSim: Egocentric World Simulator for Embodied Interaction Generation

We introduce EgoSim, a closed-loop egocentric world simulator that generates spatially consistent interaction videos and persistently updates the underlying 3D scene state for continuous simulation. Existing egocentric simulators either lack explicit 3D grounding, causing structural drift under viewpoint changes, or treat the scene as static, failing to update world states across multi-stage interactions. EgoSim addresses both limitations by modeling 3D scenes as updatable world states. We generate embodiment interactions via a Geometry-action-aware Observation Simulation model, with spatial consistency from an Interaction-aware State Updating module. To overcome the critical data bottleneck posed by the difficulty in acquiring densely aligned scene-interaction training pairs, we design a scalable pipeline that extracts static point clouds, camera trajectories, and embodiment actions from in-the-wild large-scale monocular egocentric videos. We further introduce EgoCap, a capture system that enables low-cost real-world data collection with uncalibrated smartphones. Extensive experiments demonstrate that EgoSim significantly outperforms existing methods in terms of visual quality, spatial consistency, and generalization to complex scenes and in-the-wild dexterous interactions, while supporting cross-embodiment transfer to robotic manipulation. Codes and datasets will be open soon. The project page is at egosimulator.github.io.

Updated: 2026-04-01 15:00:46

标题: EgoSim：用于体验交互生成的以自我为中心的世界模拟器

摘要: 我们介绍了EgoSim，这是一个闭环的以自我为中心的世界模拟器，可以生成空间连续的互动视频，并持续更新基础3D场景状态以进行连续模拟。现有的以自我为中心的模拟器要么缺乏明确的3D基础，导致在视角变化下结构漂移，要么将场景视为静态的，未能跨多个阶段的交互更新世界状态。EgoSim通过将3D场景建模为可更新的世界状态来解决这两个问题。我们通过几何-动作感知观测模拟模型生成体现互动，通过交互感知状态更新模块实现空间一致性。为了克服在获取密集对齐的场景-互动训练对时所面临的关键数据瓶颈，我们设计了一个可扩展的流水线，从野外大规模单目自我为中心视频中提取静态点云、摄像机轨迹和体现动作。我们还介绍了EgoCap，这是一个捕捉系统，可以使用未校准的智能手机进行低成本的实际数据收集。大量实验证明，EgoSim在视觉质量、空间一致性和对复杂场景和野外灵巧互动的泛化方面明显优于现有方法，同时支持跨体现的转移至机器人操作。代码和数据集即将开放。项目页面位于egosimulator.github.io。

更新时间: 2026-04-01 15:00:46

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2604.01001v1

EmbedPart: Embedding-Driven Graph Partitioning for Scalable Graph Neural Network Training

Graph Neural Networks (GNNs) are widely used for learning on graph-structured data, but scaling GNN training to massive graphs remains challenging. To enable scalable distributed training, graphs are divided into smaller partitions that are distributed across multiple machines such that inter-machine communication is minimized and computational load is balanced. In practice, existing partitioning approaches face a fundamental trade-off between partitioning overhead and partitioning quality. We propose EmbedPart, an embedding-driven partitioning approach that achieves both speed and quality. Instead of operating directly on irregular graph structures, EmbedPart leverages node embeddings produced during the actual GNN training workload and clusters these dense embeddings to derive a partitioning. EmbedPart achieves more than 100x speedup over Metis while maintaining competitive partitioning quality and accelerating distributed GNN training. Moreover, EmbedPart naturally supports graph updates and fast repartitioning, and can be applied to graph reordering to improve data locality and accelerate single-machine GNN training. By shifting partitioning from irregular graph structures to dense embeddings, EmbedPart enables scalable and high-quality graph data optimization.

Updated: 2026-04-01 15:00:01

标题: EmbedPart：嵌入驱动的图分区用于可扩展的图神经网络训练

摘要: 图神经网络（GNNs）被广泛用于学习图结构数据，但将GNN训练扩展到大规模图仍然具有挑战性。为了实现可扩展的分布式训练，图被分成较小的分区，这些分区分布在多台机器上，以最小化机器间通信，平衡计算负载。实际上，现有的分区方法在分区开销和分区质量之间面临基本权衡。我们提出EmbedPart，一种基于嵌入式驱动的分区方法，可以实现速度和质量的平衡。EmbedPart不直接操作于不规则的图结构，而是利用实际GNN训练工作负载期间产生的节点嵌入，并将这些密集嵌入聚类以得出分区。EmbedPart在保持竞争性分区质量的同时，比Metis实现了100倍以上的加速，并加速了分布式GNN训练。此外，EmbedPart自然支持图更新和快速重新分区，并可应用于图重新排序以提高数据局部性并加速单机GNN训练。通过将分区从不规则图结构转移到密集嵌入，EmbedPart实现了可扩展且高质量的图数据优化。

更新时间: 2026-04-01 15:00:01

领域: cs.LG,cs.DB,cs.DC

下载: http://arxiv.org/abs/2604.01000v1

Automatic Method Illustration Generation for AI Scientific Papers via Drawing Middleware Creation, Evolution, and Orchestration

Method illustrations (MIs) play a crucial role in conveying the core ideas of scientific papers, yet their generation remains a labor-intensive process. Here, we take inspiration from human authors' drawing practices and correspondingly propose \textbf{FigAgent}, a novel multi-agent framework for high-quality automatic MI generation. Our FigAgent distills drawing experiences from similar components across MIs and encapsulates them into reusable drawing middlewares that can be orchestrated for MI generation, while evolving these middlewares to adapt to dynamically evolving drawing requirements. Besides, a novel Explore-and-Select drawing strategy is introduced to mimic the human-like trial-and-error manner for gradually constructing MIs with complex structures. Extensive experiments show the efficacy of our method.

Updated: 2026-04-01 14:57:16

标题: 通过绘图中间件的创建、演变和编排为AI科学论文生成自动化插图方法

摘要: 方法插图（MIs）在传达科学论文的核心思想中起着至关重要的作用，然而它们的生成仍然是一个耗时的过程。在这里，我们从人类作者的绘图实践中汲取灵感，相应地提出了一种新颖的高质量自动MI生成的多代理框架\textbf{FigAgent}。我们的FigAgent从MIs中类似组件的绘图经验中提取精华，并将其封装为可重复使用的绘图中间件，可以用于MI生成，并将这些中间件进化以适应动态演变的绘图需求。此外，引入了一种新颖的探索和选择绘图策略，模仿了人类式的试错方式逐渐构建具有复杂结构的MIs。大量实验证明了我们方法的有效性。

更新时间: 2026-04-01 14:57:16

领域: cs.GR,cs.AI

下载: http://arxiv.org/abs/2603.29590v2

Multimodal Analysis of State-Funded News Coverage of the Israel-Hamas War on YouTube Shorts

YouTube Shorts have become central to news consumption on the platform, yet research on how geopolitical events are represented in this format remains limited. To address this gap, we present a multimodal pipeline that combines automatic transcription, aspect-based sentiment analysis (ABSA), and semantic scene classification. The pipeline is first assessed for feasibility and then applied to analyze short-form coverage of the Israel-Hamas war by state-funded outlets. Using over 2,300 conflict-related Shorts and more than 94,000 visual frames, we systematically examine war reporting across major international broadcasters. Our findings reveal that the sentiment expressed in transcripts regarding specific aspects differs across outlets and over time, whereas scene-type classifications reflect visual cues consistent with real-world events. Notably, smaller domain-adapted models outperform large transformers and even LLMs for sentiment analysis, underscoring the value of resource-efficient approaches for humanities research. The pipeline serves as a template for other short-form platforms, such as TikTok and Instagram, and demonstrates how multimodal methods, combined with qualitative interpretation, can characterize sentiment patterns and visual cues in algorithmically driven video environments.

Updated: 2026-04-01 14:55:58

标题: YouTube短视频中以色列哈马斯战争的国家资助新闻报道的多模式分析

摘要: YouTube Shorts已经成为平台上新闻消费的中心，但是有关地缘政治事件如何在这种格式中呈现的研究仍然有限。为了填补这一空白，我们提出了一个多模态流水线，结合了自动转录、基于方面的情感分析（ABSA）和语义场景分类。首先对流水线进行可行性评估，然后应用于分析由国家资助的机构对以色列-哈马斯战争的短片报道。利用超过2,300个与冲突相关的短片和超过94,000个视觉帧，我们系统地检查了主要国际广播公司的战争报道。我们的研究发现，关于特定方面的转录中表达的情感在不同媒体间和随时间变化，而场景类型分类反映了与真实世界事件一致的视觉线索。值得注意的是，较小的领域适应模型在情感分析方面表现优于大型transformers甚至LLMs，强调了资源高效方法在人文研究中的价值。这个流水线作为其他短视频平台的模板，比如TikTok和Instagram，并展示了多模态方法结合定性解释如何在算法驱动的视频环境中描述情感模式和视觉线索。

更新时间: 2026-04-01 14:55:58

领域: cs.CL,cs.AI,cs.SI

下载: http://arxiv.org/abs/2604.00994v1

CDH-Bench: A Commonsense-Driven Hallucination Benchmark for Evaluating Visual Fidelity in Vision-Language Models

Vision-language models (VLMs) achieve strong performance on many benchmarks, yet a basic reliability question remains underexplored: when visual evidence conflicts with commonsense, do models follow what is shown or what commonsense suggests? A characteristic failure in this setting is that the model overrides visual evidence and outputs the commonsense alternative. We term this phenomenon \textbf{commonsense-driven hallucination} (CDH). To evaluate it, we introduce \textbf{CDH-Bench}, a benchmark designed to create explicit \textbf{visual evidence--commonsense conflicts}. CDH-Bench covers three dimensions: \textit{counting anomalies}, \textit{relational anomalies}, and \textit{attribute anomalies}. We evaluate frontier VLMs under \textit{binary Question Answering (QA)} and \textit{multiple-choice QA}, and report metrics including \textit{Counterfactual Accuracy} (CF-Acc), \textit{Commonsense Accuracy} (CS-Acc), \textit{Counterfactual Accuracy Drop} (CFAD), \textit{Commonsense Collapse Rate} (CCR), and \textit{Relative Prior Dependency} (RPD). Results show that even strong models remain vulnerable to prior-driven normalization under visual evidence--commonsense conflict. CDH-Bench provides a controlled diagnostic of visual fidelity under visual evidence--commonsense conflict.

Updated: 2026-04-01 14:55:28

标题: CDH-Bench：一个基于常识的幻觉基准，用于评估视觉语言模型中的视觉保真度

摘要: 视觉-语言模型（VLMs）在许多基准测试中取得了强大的性能，但一个基本的可靠性问题仍未得到充分探讨：当视觉证据与常识相冲突时，模型是遵循所显示的还是遵循常识建议？在这种情况下的一个典型失败是，模型会覆盖视觉证据并输出常识替代方案。我们将这一现象称为\textbf{常识驱动的幻觉}（CDH）。为了评估这一现象，我们引入了\textbf{CDH-Bench}，一个旨在创建明确的\textbf{视觉证据-常识冲突}的基准测试。CDH-Bench涵盖了三个维度：\textit{计数异常}，\textit{关系异常}和\textit{属性异常}。我们在\textit{二元问题回答（QA）}和\textit{多项选择QA}下评估前沿VLMs，并报告包括\textit{反事实准确性}（CF-Acc），\textit{常识准确性}（CS-Acc），\textit{反事实准确性下降}（CFAD），\textit{常识崩溃率}（CCR）和\textit{相对先验依赖性}（RPD）等指标。结果显示，即使强大的模型也仍然容易受到在视觉证据-常识冲突下的先验驱动规范化的影响。CDH-Bench提供了在视觉证据-常识冲突下对视觉忠实度的受控诊断。

更新时间: 2026-04-01 14:55:28

领域: cs.CV,cs.AI,cs.CL

下载: http://arxiv.org/abs/2603.27982v2

Focal plane wavefront control with model-based reinforcement learning

The direct imaging of potentially habitable exoplanets is one prime science case for high-contrast imaging instruments on extremely large telescopes. Most such exoplanets orbit close to their host stars, where their observation is limited by fast-moving atmospheric speckles and quasi-static non-common-path aberrations (NCPA). Conventional NCPA correction methods often use mechanical mirror probes, which compromise performance during operation. This work presents machine-learning-based NCPA control methods that automatically detect and correct both dynamic and static NCPA errors by leveraging sequential phase diversity. We extend previous work in reinforcement learning for AO to focal plane control. A new model-based RL algorithm, Policy Optimization for NCPAs (PO4NCPA), interprets the focal-plane image as input data and, through sequential phase diversity, determines phase corrections that optimize both non-coronagraphic and post-coronagraphic PSFs without prior system knowledge. Further, we demonstrate the effectiveness of this approach by numerically simulating static NCPA errors on a ground-based telescope and an infrared imager affected by water-vapor-induced seeing (dynamic NCPAs). Simulations show that PO4NCPA robustly compensates static and dynamic NCPAs. In static cases, it achieves near-optimal focal-plane light suppression with a coronagraph and near-optimal Strehl without one. With dynamics NCPA, it matches the performance of the modal least-squares reconstruction combined with a 1-step delay integrator in these metrics. The method remains effective for the ELT pupil, vector vortex coronagraph, and under photon and background noise. PO4NCPA is model-free and can be directly applied to standard imaging as well as to any coronagraph. Its sub-millisecond inference times and performance also make it suitable for real-time low-order correction of atmospheric turbulence beyond HCI.

Updated: 2026-04-01 14:55:15

标题: 使用基于模型的强化学习进行焦平面波前控制

摘要: 潜在宜居外行星的直接成像是极大望远镜上高对比成像仪器的主要科学案例之一。大多数这类外行星都围绕它们的母恒星密集运转，其观测受到快速移动的大气斑点和准静态非共路径像差（NCPA）的限制。传统的NCPA校正方法通常使用机械镜探针，这在操作过程中会损害性能。本文提出了基于机器学习的NCPA控制方法，通过利用序贯相位多样性自动检测和校正动态和静态NCPA错误。我们在自适应光学领域的强化学习方面进行了扩展，实现了对焦平面的控制。一种新的基于模型的RL算法，即NCPA策略优化（PO4NCPA），将焦平面图像解释为输入数据，并通过序贯相位多样性确定优化非日冕和日冕后的PSF的相位校正，而无需先前的系统知识。此外，我们通过在地面望远镜和受水汽诱导的视场扰动影响的红外成像仪上进行数值模拟静态NCPA错误的实验来展示这种方法的有效性。模拟表明，PO4NCPA能够稳健地补偿静态和动态NCPAs。在静态情况下，它通过日冕器接近最佳的焦平面光抑制，并在没有日冕器的情况下接近最佳的Strehl。在动态NCPA的情况下，它在这些指标上与模态最小二乘重建结合1步延迟积分器的性能相匹配。该方法对于ELT瞳孔，向量涡旋日冕以及光子和背景噪声仍然有效。PO4NCPA是无模型的，并且可以直接应用于标准成像以及任何日冕器。其次微秒推理时间和性能也使其适用于超越HCI的实时大气湍流低阶校正。

更新时间: 2026-04-01 14:55:15

领域: astro-ph.IM,cs.LG,cs.RO

下载: http://arxiv.org/abs/2604.00993v1

Bridging Structured Knowledge and Data: A Unified Framework with Finance Applications

We develop Structured-Knowledge-Informed Neural Networks (SKINNs), a unified estimation framework that embeds theoretical, simulated, previously learned, or cross-domain insights as differentiable constraints within flexible neural function approximation. SKINNs jointly estimate neural network parameters and economically meaningful structural parameters in a single optimization problem, enforcing theoretical consistency not only on observed data but over a broader input domain through collocation, and therefore nesting approaches such as functional GMM, Bayesian updating, transfer learning, PINNs, and surrogate modeling. SKINNs define a class of M-estimators that are consistent and asymptotically normal with root-N convergence, sandwich covariance, and recovery of pseudo-true parameters under misspecification. We establish identification of structural parameters under joint flexibility, derive generalization and target-risk bounds under distributional shift in a convex proxy, and provide a restricted-optimal characterization of the weighting parameter that governs the bias-variance tradeoff. In an illustrative financial application to option pricing, SKINNs improve out-of-sample valuation and hedging performance, particularly at longer horizons and during high-volatility regimes, while recovering economically interpretable structural parameters with improved stability relative to conventional calibration. More broadly, SKINNs provide a general econometric framework for combining model-based reasoning with high-dimensional, data-driven estimation.

Updated: 2026-04-01 14:51:08

标题: 连接结构化知识和数据：一个具有金融应用的统一框架

摘要: 我们开发了结构知识信息的神经网络（SKINNs），这是一个统一的估计框架，将理论、模拟、先前学习或跨领域的见解作为可微约束嵌入灵活的神经函数逼近中。SKINNs联合估计神经网络参数和经济意义上的结构参数在一个优化问题中，不仅在观测数据上强制理论一致性，还通过配置在更广泛的输入域上，并因此嵌套方法，如函数GMM、贝叶斯更新、迁移学习、PINNs和替代建模。SKINNs定义了一类一致和渐近正常的M-估计器，具有根N收敛、夹层协方差和在规范错误下恢复伪真实参数。我们在联合灵活性下建立了结构参数的识别，推导了在凸代理中的分布转移下的泛化和目标风险界限，并提供了控制偏差-方差折衷的加权参数的受限最优特征化。在一个说明性的金融应用中，SKINNs改善了样本外估值和对冲表现，特别是在较长的时间跨度和高波动率制度下，同时相对于传统校准，恢复了经济可解释的结构参数的稳定性。更广泛地说，SKINNs提供了一个将基于模型的推理与高维、数据驱动的估计结合的通用计量经济学框架。

更新时间: 2026-04-01 14:51:08

领域: stat.ML,cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.00987v1

Do Phone-Use Agents Respect Your Privacy?

We study whether phone-use agents respect privacy while completing benign mobile tasks. This question has remained hard to answer because privacy-compliant behavior is not operationalized for phone-use agents, and ordinary apps do not reveal exactly what data agents type into which form entries during execution. To make this question measurable, we introduce MyPhoneBench, a verifiable evaluation framework for privacy behavior in mobile agents. We operationalize privacy-respecting phone use as permissioned access, minimal disclosure, and user-controlled memory through a minimal privacy contract, iMy, and pair it with instrumented mock apps plus rule-based auditing that make unnecessary permission requests, deceptive re-disclosure, and unnecessary form filling observable and reproducible. Across five frontier models on 10 mobile apps and 300 tasks, we find that task success, privacy-compliant task completion, and later-session use of saved preferences are distinct capabilities, and no single model dominates all three. Evaluating success and privacy jointly reshuffles the model ordering relative to either metric alone. The most persistent failure mode across models is simple data minimization: agents still fill optional personal entries that the task does not require. These results show that privacy failures arise from over-helpful execution of benign tasks, and that success-only evaluation overestimates the deployment readiness of current phone-use agents. All code, mock apps, and agent trajectories are publicly available at~ https://github.com/tangzhy/MyPhoneBench.

Updated: 2026-04-01 14:50:50

标题: 手机使用代理尊重您的隐私吗？

摘要: 我们研究了手机使用代理在完成良性移动任务时是否尊重隐私。这个问题一直很难回答，因为对于手机使用代理来说，遵守隐私的行为并没有被明确定义，而普通应用程序在执行过程中并没有透露代理输入哪些数据到哪些表单条目。为了使这个问题可测量，我们引入了MyPhoneBench，一个用于评估移动代理隐私行为的可验证框架。我们将尊重隐私的手机使用操作化为经过许可的访问、最小化披露和用户控制的内存，通过一个最小的隐私合同iMy来实现，并将其与被仪器化的模拟应用程序和基于规则的审计配对，使不必要的权限请求、欺骗性再披露和不必要的表单填写变得可观察和可重现。在对10个移动应用程序和300个任务上的五个前沿模型进行评估时，我们发现任务成功、符合隐私的任务完成和后续会话中对保存的偏好的使用是不同的能力，没有一个单一模型能够在这三个方面占主导地位。联合评估成功和隐私重新排列了模型的顺序，相对于单独的指标而言。在各个模型中最持久的失败模式是简单的数据最小化：代理仍然填写了任务不需要的可选个人条目。这些结果表明，隐私失败是由于对良性任务的过度帮助执行而引起的，并且仅评估成功会高估当前手机使用代理的部署准备就绪性。所有代码、模拟应用程序和代理轨迹都可以在https://github.com/tangzhy/MyPhoneBench上公开获取。

更新时间: 2026-04-01 14:50:50

领域: cs.CR,cs.AI,cs.CL,cs.LG

下载: http://arxiv.org/abs/2604.00986v1

Dual Optimal: Make Your LLM Peer-like with Dignity

Current aligned language models exhibit a dual failure mode we term the Evasive Servant: they sycophantically validate flawed user beliefs while deflecting responsibility with boilerplate disclaimers. We propose the Dignified Peer framework, which counters servility with anti-sycophancy and trustworthiness, and mitigates evasiveness through empathy and creativity. Realizing this agent requires overcoming significant challenges in data supervision, objective collapse, and evaluation bias. We address these issues by introducing the PersonaKnob dataset which features a compositional partial order structure of multiple persona preference. This data is utilized alongside a tolerant constrained Lagrangian DPO algorithm that dynamically balances all persona dimensions to prevent behavioral collapse. Additionally, we employ a psychometrically calibrated Item Response Theory evaluation protocol to disentangle latent model persona capability from confounders like judge biases. Extensive empirical studies demonstrate that our approach successfully build a LLM agent with both dignity and peer.

Updated: 2026-04-01 14:48:44

标题: 双重优势：让您的LLM同行更具尊严

摘要: 目前的对齐语言模型表现出一种我们称之为“逃避仆人”的双重失败模式：它们奉承地验证了用户的错误信念，同时用标准免责声明转移责任。我们提出了尊贵同行框架，通过反对奉承和可信度来对抗低贱，并通过共情和创造力减轻逃避行为。实现这种代理需要克服数据监督、客观崩溃和评估偏见等重大挑战。我们通过引入PersonaKnob数据集来解决这些问题，该数据集具有多个人物偏好的组合部分顺序结构。这些数据与一种宽容的受限拉格朗日DPO算法一起使用，动态平衡所有人物维度，以防止行为崩溃。此外，我们采用一种经过心理测量校准的项目反应理论评估协议，以解开潜在模型人物能力与评委偏见等混杂因素之间的关系。广泛的实证研究表明，我们的方法成功地构建了一个既有尊严又有同伴的LLM代理。

更新时间: 2026-04-01 14:48:44

领域: cs.CL,cs.AI

下载: http://arxiv.org/abs/2604.00979v1

TempoControl: Temporal Attention Guidance for Text-to-Video Models

Recent advances in generative video models have enabled the creation of high-quality videos based on natural language prompts. However, these models frequently lack fine-grained temporal control, meaning they do not allow users to specify when particular visual elements should appear within a generated sequence. In this work, we introduce TempoControl, a method that allows for temporal alignment of visual concepts during inference, without requiring retraining or additional supervision. TempoControl utilizes cross-attention maps, a key component of text-to-video diffusion models, to guide the timing of concepts through a novel optimization approach. Our method steers attention using three complementary principles: aligning its temporal pattern with a control signal (correlation), adjusting its strength where visibility is required (magnitude), and preserving semantic consistency (entropy). TempoControl provides precise temporal control while maintaining high video quality and diversity. We demonstrate its effectiveness across various applications, including temporal reordering of single and multiple objects, action timing, and audio-aligned video generation. Project page: https://shira-schiber.github.io/TempoControl/.

Updated: 2026-04-01 14:48:11

标题: TempoControl：文本到视频模型的时间注意力引导

摘要: 最近在生成视频模型方面取得了重大进展，使得可以基于自然语言提示创建高质量视频。然而，这些模型经常缺乏细粒度的时间控制，意味着它们不允许用户指定特定视觉元素在生成序列中出现的时间。在这项工作中，我们介绍了TempoControl，一种方法，它允许在推断过程中实现视觉概念的时间对齐，而无需重新训练或额外监督。TempoControl利用了交叉注意力图，这是文本到视频扩散模型的关键组成部分，通过一种新颖的优化方法来引导概念的时间。我们的方法使用三个互补原则来引导注意力：将其时间模式与控制信号对齐（相关性），在需要可见性时调整其强度（幅度），并保持语义一致性（熵）。TempoControl提供精确的时间控制，同时保持高质量和多样性的视频。我们展示了它在各种应用中的有效性，包括单个和多个对象的时间重新排序，动作定时和音频对齐的视频生成。项目页面：https://shira-schiber.github.io/TempoControl/。

更新时间: 2026-04-01 14:48:11

领域: cs.CV,cs.AI,cs.LG

下载: http://arxiv.org/abs/2510.02226v3

Taxonomy-Conditioned Hierarchical Bayesian TSB Models for Heterogeneous Intermittent Demand Forecasting

Intermittent demand forecasting poses unique challenges due to sparse observations, cold-start items, and obsolescence. Classical models such as Croston, SBA, and the Teunter--Syntetos--Babai (TSB) method provide simple heuristics but lack a principled generative foundation. We introduce TSB-HB, a hierarchical Bayesian extension of TSB. Demand occurrence is modeled with a Beta--Binomial distribution, while nonzero demand sizes follow a Log-Normal distribution. Crucially, hierarchical priors enable partial pooling across items, stabilizing estimates for sparse or cold-start series while preserving heterogeneity. This framework provides a coherent generative reinterpretation of the classical TSB structure. On the UCI Online Retail dataset, TSB-HB achieves the lowest RMSE and RMSSE among all baselines, while remaining competitive in MAE. On a 5,000-series M5 sample, it improves MAE and RMSE over classical intermittent baselines. Under the calibrated probabilistic configuration, TSB-HB yields competitive pinball loss and a favorable sharpness--calibration tradeoff among the parametric baselines reported in the main text.

Updated: 2026-04-01 14:48:04

标题: 分类条件下的分层贝叶斯TSB模型用于异质间歇需求预测

摘要: 间歇需求预测面临独特挑战，包括稀疏观测、冷启动商品和过时商品。经典模型如Croston、SBA和Teunter-Syntetos-Babai（TSB）方法提供简单的启发式方法，但缺乏基本的生成基础。我们引入TSB-HB，这是TSB的层次贝叶斯扩展。需求发生率采用Beta-Binomial分布建模，非零需求量遵循对数正态分布。关键是，层次先验使得能够在商品之间进行部分汇总，稳定稀疏或冷启动系列的估计，同时保留异质性。这一框架为经典TSB结构提供了一个连贯的生成性重新解释。在UCI在线零售数据集上，TSB-HB在所有基线中实现了最低的RMSE和RMSSE，同时在MAE方面保持竞争力。在一个5,000系列的M5样本中，它改善了MAE和RMSE，超过了经典的间歇基线。在校准的概率配置下，TSB-HB在主文中报告的参数基线中实现了竞争力的损失和有利的尖锐度-校准折衷。

更新时间: 2026-04-01 14:48:04

领域: stat.ML,cs.LG

下载: http://arxiv.org/abs/2511.12749v2

Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization

Reinforcement Learning (RL) has proven highly effective in addressing complex control and decision-making tasks. However, in most traditional RL algorithms, the policy is typically parameterized as a diagonal Gaussian distribution, which constrains the policy from capturing multimodal distributions, making it difficult to cover the full range of optimal solutions in multi-solution problems, and the return is reduced to a mean value, losing its multimodal nature and thus providing insufficient guidance for policy updates. In response to these problems, we propose a RL algorithm termed flow-based policy with distributional RL (FP-DRL). This algorithm models the policy using flow matching, which offers both computational efficiency and the capacity to fit complex distributions. Additionally, it employs a distributional RL approach to model and optimize the entire return distribution, thereby more effectively guiding multimodal policy updates and improving agent performance. Experimental trails on MuJoCo benchmarks demonstrate that the FP-DRL algorithm achieves state-of-the-art (SOTA) performance in most MuJoCo control tasks while exhibiting superior representation capability of the flow policy.

Updated: 2026-04-01 14:47:41

标题: 基于流量的政策在轨迹优化中的分布式强化学习

摘要: 强化学习（RL）在处理复杂的控制和决策任务方面已被证明非常有效。然而，在大多数传统的RL算法中，策略通常被参数化为一个对角高斯分布，这限制了策略捕捉多模式分布的能力，使其难以覆盖多解问题中的全部最优解，并且回报被降低为均值，失去了其多模式特性，从而对策略更新提供了不足的指导。针对这些问题，我们提出了一种称为基于流的分布RL（FP-DRL）的RL算法。该算法使用流匹配来建模策略，既具有计算效率，又能够适应复杂的分布。此外，它采用分布式RL方法来建模和优化整个回报分布，从而更有效地指导多模态策略更新，提高智能体的性能。在MuJoCo基准测试中的实验结果表明，FP-DRL算法在大多数MuJoCo控制任务中取得了最先进的性能，并展现出对流策略的优越表示能力。

更新时间: 2026-04-01 14:47:41

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.00977v1

D4C: Data-Free Quantization for Contrastive Language-Image Pre-training Models

Data-Free Quantization (DFQ) offers a practical solution for model compression without requiring access to real data, making it particularly attractive in privacy-sensitive scenarios. While DFQ has shown promise for unimodal models, its extension to Vision-Language Models such as Contrastive Language-Image Pre-training (CLIP) models remains underexplored. In this work, we reveal that directly applying existing DFQ techniques to CLIP results in substantial performance degradation due to two key limitations: insufficient semantic content and low intra-image diversity in synthesized samples. To tackle these challenges, we propose D4C, the first DFQ framework tailored for CLIP. D4C synthesizes semantically rich and structurally diverse pseudo images through three key components: 1) Prompt-Guided Semantic Injection aligns generated images with real-world semantics using text prompts; 2) Structural Contrastive Generation reproduces compositional structures of natural images by leveraging foreground-background contrastive synthesis; and 3) Perturbation-Aware Enhancement applies controlled perturbations to improve sample diversity and robustness. These components jointly empower D4C to synthesize images that are both semantically informative and structurally diverse, effectively bridging the performance gap of DFQ on CLIP. Extensive experiments validate the effectiveness of D4C, showing significant performance improvements on various bit-widths and models.

Updated: 2026-04-01 14:42:29

标题: D4C：无数据量化用于对比语言-图像预训练模型

摘要: Data-Free Quantization (DFQ)提供了一种实际的模型压缩解决方案，无需访问真实数据，使其在隐私敏感场景中特别吸引人。虽然DFQ已显示出对单模型的潜力，但将其扩展到视觉-语言模型，如对比语言-图像预训练（CLIP）模型，仍未得到充分探索。在这项工作中，我们发现直接应用现有的DFQ技术到CLIP会导致性能严重下降，这是由于两个关键限制：合成样本中的语义内容不足和图像内部多样性低。为了解决这些挑战，我们提出了D4C，这是专门为CLIP定制的第一个DFQ框架。D4C通过三个关键组件合成语义丰富、结构多样的伪图像：1）通过提示引导的语义注入使用文本提示将生成的图像与现实语义对齐；2）结构对比生成通过利用前景-背景对比综合再现自然图像的组成结构；3）扰动感知增强应用受控扰动以改善样本多样性和鲁棒性。这些组件共同赋予D4C能力，合成既有语义信息又具有结构多样性的图像，有效弥合了DFQ在CLIP上的性能差距。大量实验证实了D4C的有效性，在各种比特宽度和模型上显示出显著的性能改进。

更新时间: 2026-04-01 14:42:29

领域: cs.CV,cs.LG

下载: http://arxiv.org/abs/2511.15411v2

Variance-Based Pruning for Accelerating and Compressing Trained Networks

Increasingly expensive training of ever larger models such as Vision Transfomers motivate reusing the vast library of already trained state-of-the-art networks. However, their latency, high computational costs and memory demands pose significant challenges for deployment, especially on resource-constrained hardware. While structured pruning methods can reduce these factors, they often require costly retraining, sometimes for up to hundreds of epochs, or even training from scratch to recover the lost accuracy resulting from the structural modifications. Maintaining the provided performance of trained models after structured pruning and thereby avoiding extensive retraining remains a challenge. To solve this, we introduce Variance-Based Pruning, a simple and structured one-shot pruning technique for efficiently compressing networks, with minimal finetuning. Our approach first gathers activation statistics, which are used to select neurons for pruning. Simultaneously the mean activations are integrated back into the model to preserve a high degree of performance. On ImageNet-1k recognition tasks, we demonstrate that directly after pruning DeiT-Base retains over 70% of its original performance and requires only 10 epochs of fine-tuning to regain 99% of the original accuracy while simultaneously reducing MACs by 35% and model size by 36%, thus speeding up the model by 1.44x. The code is available at: https://github.com/boschresearch/variance-based-pruning

Updated: 2026-04-01 14:41:44

标题: 基于方差的修剪用于加速和压缩训练好的网络

摘要: 随着如Vision Transfomers等越来越昂贵的训练越来越大的模型，激励我们重新利用已经训练过的最先进网络的广泛库。然而，它们的延迟、高计算成本和内存需求对部署提出了重大挑战，尤其是在资源受限的硬件上。虽然结构化剪枝方法可以减少这些因素，但它们通常需要昂贵的重新训练，有时甚至需要多达数百个时期，甚至从头开始训练，以恢复由结构修改导致的准确性损失。在结构剪枝后保持训练模型的性能并因此避免大量的重新训练仍然是一个挑战。为了解决这个问题，我们引入了基于方差的剪枝，这是一种简单且结构化的一次性剪枝技术，可以有效地压缩网络，并进行最少的微调。我们的方法首先收集激活统计数据，用于选择要剪枝的神经元。同时，平均激活被整合回模型中，以保持高度的性能。在ImageNet-1k识别任务中，我们展示了在剪枝后，DeiT-Base保留了原始性能的70%以上，并且只需要10个时期的微调即可恢复到原始准确性的99%，同时将MACs减少了35%，模型大小减少了36%，从而将模型的速度提升了1.44倍。代码可在以下链接获得：https://github.com/boschresearch/variance-based-pruning

更新时间: 2026-04-01 14:41:44

领域: cs.CV,cs.LG

下载: http://arxiv.org/abs/2507.12988v2

Scale-adaptive and robust intrinsic dimension estimation via optimal neighbourhood identification

The Intrinsic Dimension (ID) is a key concept in unsupervised learning and feature selection, as it is a lower bound to the number of variables which are necessary to describe a system. However, in almost any real-world dataset the ID depends on the scale at which the data are analysed. Quite typically at a small scale, the ID is very large, as the data are affected by measurement errors. At large scale, the ID can also appear erroneously large, due to the curvature and the topology of the manifold containing the data. In this work, we introduce an automatic protocol to select the sweet spot, namely the correct range of scales in which the ID is meaningful and useful. This protocol is based on imposing that for distances smaller than the correct scale the density of the data is constant. In the presented framework, to estimate the density it is necessary to know the ID, therefore, this condition is imposed self-consistently. We illustrate the usefulness and robustness of this procedure to noise by benchmarks on artificial and real-world datasets.

Updated: 2026-04-01 14:40:02

标题: 通过最优邻域识别的尺度自适应和稳健的内在维度估计

摘要: 内在维度（ID）是无监督学习和特征选择中的关键概念，因为它是描述系统所需的变量数量的下限。然而，在几乎任何现实世界的数据集中，ID取决于数据分析的尺度。通常在小尺度下，ID非常大，因为数据受到测量误差的影响。在大尺度下，由于数据所在流形的曲率和拓扑结构，ID也可能出现错误地变得很大。在这项工作中，我们介绍了一种自动协议，以选择“甜点”，即ID有意义和有用的正确尺度范围。这个协议基于对小于正确尺度的距离数据密度恒定的要求。在所提出的框架中，要估计密度就需要知道ID，因此，这个条件是自洽地施加的。我们通过对人工和现实世界数据集的基准测试，展示了这个程序对噪声的有用性和鲁棒性。

更新时间: 2026-04-01 14:40:02

领域: stat.ML,cs.LG,math.ST,stat.CO,stat.ME

下载: http://arxiv.org/abs/2405.15132v5

Rapid mixing in positively weighted restricted Boltzmann machines

We show polylogarithmic mixing time bounds for the alternating-scan sampler for positively weighted restricted Boltzmann machines. This is done via analysing the same chain and the Glauber dynamics for ferromagnetic two-spin systems, where we obtain new mixing time bounds up to the critical thresholds.

Updated: 2026-04-01 14:38:35

标题: 正权重受限玻尔兹曼机中的快速混合

摘要: 我们展示了对于正权重限制玻尔兹曼机的交替扫描采样器具有多对数混合时间界限。通过分析相同的链和铁磁二自旋系统的Glauber动力学，我们获得了新的混合时间界限，直至临界阈值。

更新时间: 2026-04-01 14:38:35

领域: cs.DS,cs.LG,math.PR

下载: http://arxiv.org/abs/2604.00963v1

Activation Steering via Generative Causal Mediation

Where should we intervene in a language model (LM) to localize and control behaviors that are diffused across many tokens of a long-form response? We introduce Generative Causal Mediation (GCM), a procedure for selecting model components (e.g., attention heads) from contrastive long-form responses, to steer such diffuse concepts (e.g., talk in verse vs. talk in prose). In GCM, we first construct a dataset of contrasting behavioral inputs and long-form responses. Then, we quantify how model components mediate the concept and select the strongest mediators for steering. We evaluate GCM on three behaviors--refusal, sycophancy, and style transfer--across three language models. GCM successfully localizes concepts expressed in long-form responses and outperforms correlational probe-based baselines when steering with a sparse set of attention heads. Together, these results demonstrate that GCM provides an effective approach for localizing from and controlling the long-form responses of LMs.

Updated: 2026-04-01 14:27:14

标题: 通过生成因果中介激活导向

摘要: 我们应该在语言模型（LM）中进行干预以定位和控制分布在长篇回复的许多标记中的行为吗？我们引入了生成因果中介（GCM），这是一种从对比长篇回复中选择模型组件（例如，注意力头）的过程，以引导这种分散的概念（例如，用韵文说话与用散文说话）。在GCM中，我们首先构建对比行为输入和长篇回复的数据集。然后，我们量化模型组件如何介导该概念，并选择最强的介导者进行引导。我们在三种行为--拒绝、谄媚和风格转移--以及三种语言模型上评估了GCM。通过使用少量注意力头进行引导时，GCM成功地定位了长篇回复中表达的概念，并在效果上优于相关探针基线。总的来说，这些结果表明GCM提供了一种有效的方法，可以从和控制LM的长篇回复。

更新时间: 2026-04-01 14:27:14

领域: cs.CL,cs.CY,cs.HC,cs.LG

下载: http://arxiv.org/abs/2602.16080v2

Code Comprehension then Auditing for Unsupervised LLM Evaluation

Large Language Models (LLMs) for unsupervised code correctness evaluation have recently gained attention because they can judge if code runs as intended without requiring reference implementations or unit tests, which may be unavailable, sparse, or unreliable. However, most prior approaches condition LLM evaluators directly on the full code implementation, forcing the model to jointly infer program behavior and evaluate correctness in a single step. This entanglement leads to misinterpretations of code behavior and unreliable judgments. To mitigate this issue, we introduce CoCoA, an unsupervised Code Comprehension then Auditing framework that first comprehends functionality to generate a natural-language explanation. Then it evaluates task alignment based on this explanation. By sequentially sampling comprehension before evaluation, CoCoA improves the quality of inferred program behavior and enables the evaluator to focus on behavioral alignment rather than raw implementation details. Across multiple datasets, programming languages, and models, CoCoA achieves up to $68\%$ increased F1 score and up to $20\%$ increased accuracy over the best-performing baselines.

Updated: 2026-04-01 14:26:43

标题: 代码理解后无监督LLM评估审计

摘要: 最近，用于无监督代码正确性评估的大型语言模型(LLMs)引起了人们的关注，因为它们可以判断代码是否按预期运行，而无需参考实现或单元测试，这些可能不可用、稀疏或不可靠。然而，先前的大多数方法直接基于完整的代码实现条件LLM评估器，迫使模型在一个步骤中同时推断程序行为和评估正确性。这种纠缠导致对代码行为的误解和不可靠的判断。为了减轻这个问题，我们引入了CoCoA，一个无监督的代码理解和审计框架，首先理解功能以生成自然语言解释，然后基于这个解释评估任务对齐性。通过在评估之前顺序抽取理解，CoCoA提高了推断程序行为的质量，并使评估者能够专注于行为对齐而不是原始实现细节。在多个数据集、编程语言和模型上，CoCoA的F1分数提高了高达68%，准确性提高了高达20%，超过了表现最好的基准线。

更新时间: 2026-04-01 14:26:43

领域: cs.AI,cs.CL,cs.LG

下载: http://arxiv.org/abs/2410.03131v4

CHEEM: Continual Learning by Reuse, New, Adapt and Skip -- A Hierarchical Exploration-Exploitation Approach

To effectively manage the complexities of real-world dynamic environments, continual learning must incrementally acquire, update, and accumulate knowledge from a stream of tasks of different nature without suffering from catastrophic forgetting of prior knowledge. While this capability is innate to human cognition, it remains a significant challenge for modern deep learning systems. At the heart of this challenge lies the stability-plasticity dilemma: the need to balance leveraging prior knowledge, integrating novel information, and allocating model capacity adaptively based on task complexity and synergy. In this paper, we propose a novel exemplar-free class-incremental continual learning (ExfCCL) framework that addresses these issues through a Hierarchical Exploration-Exploitation (HEE) approach. The core of our method is a HEE-guided efficient neural architecture search (HEE-NAS) that enables a learning-to-adapt backbone via four primitive operations - reuse, new, adapt, and skip - thereby serving as an internal memory that dynamically updates selected components across streaming tasks. To address the task ID inference problem in ExfCCL, we exploit an external memory of task centroids proposed in the prior art. We term our method CHEEM (Continual Hierarchical-Exploration-Exploitation Memory). CHEEM is evaluated on the challenging MTIL and VDD benchmarks using both Tiny and Base Vision Transformers and a proposed holistic Figure-of-Merit (FoM) metric. It significantly outperforms state-of-the-art prompting-based continual learning methods, closely approaching full fine-tuning upper bounds. Furthermore, it learns adaptive model structures tailored to individual tasks in a semantically meaningful way. Our code is available at https://github.com/savadikarc/cheem .

Updated: 2026-04-01 14:18:35

标题: CHEEM：通过重用、创新、适应和跳过实现的持续学习--一种层次化的探索-利用方法

摘要: 为了有效管理现实世界动态环境的复杂性，持续学习必须从不同性质的任务流中逐步获取、更新和积累知识，而不会遭受以往知识的灾难性遗忘。虽然这种能力是人类认知的固有特性，但对于现代深度学习系统来说仍然是一个重大挑战。这一挑战的核心在于稳定性-可塑性困境：需要在利用先前知识、整合新信息和根据任务复杂性和协同作用自适应地分配模型容量之间取得平衡。在本文中，我们提出了一个新颖的无实例类增量持续学习（ExfCCL）框架，通过一种分层探索-利用（HEE）方法来解决这些问题。我们方法的核心是一种HEE引导的高效神经架构搜索（HEE-NAS），通过四种基本操作 - 重用、新建、适应和跳过 - 实现学习适应骨干，从而作为一个内部记忆，动态更新流式任务中选择的组件。为了解决ExfCCL中的任务ID推断问题，我们利用了先前提出的任务中心的外部记忆。我们将我们的方法命名为CHEEM（持续分层探索-利用记忆）。CHEEM在具有挑战性的MTIL和VDD基准测试中使用Tiny和Base Vision Transformers以及提出的综合Figure-of-Merit（FoM）度量进行评估。它明显优于最先进的基于提示的持续学习方法，接近完全微调的上限。此外，它以语义上有意义的方式学习适应于个别任务的模型结构。我们的代码可在https://github.com/savadikarc/cheem 上找到。

更新时间: 2026-04-01 14:18:35

领域: cs.CV,cs.LG

下载: http://arxiv.org/abs/2303.08250v5

Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents

We investigate the collective accuracy of heterogeneous agents who learn to estimate their own reliability over time and selectively abstain from voting. While classical epistemic voting results, such as the \textit{Condorcet Jury Theorem} (CJT), assume fixed participation, real-world aggregation often benefits from allowing agents to say ``I don't know.'' We propose a probabilistic framework where agents engage in a \textit{calibration} phase, updating beliefs about their own fixed competence, before facing a final confidence gate that determines whether to vote or abstain. We derive a non-asymptotic lower bound on the group's success probability and prove that this \textit{selective participation} generalizes the asymptotic guarantees of the CJT to a sequential, confidence-gated setting. Empirically, we validate these bounds via Monte Carlo simulations. While our results are general, we discuss their potential application to AI safety, outlining how this framework can mitigate \textit{hallucinations} in collective LLM decision-making.

Updated: 2026-04-01 14:18:17

标题: 认知过滤和集体幻觉：置信度校准代理的陪审团定理

摘要: 我们研究了异质代理人集体准确性，他们学会随时间估计自己的可靠性并选择性地弃权投票。尽管经典的认知投票结果，如\textit{Condorcet Jury Theorem} (CJT)，假定固定参与，现实世界的聚合往往受益于允许代理人说“我不知道”。我们提出了一个概率框架，代理人在\textit{校准}阶段参与，更新关于自己固定能力的信念，然后面对最终的信心门槛，确定是投票还是弃权。我们得出了群体成功概率的非渐近下界，并证明这种\textit{选择性参与}将CJT的渐近保证推广到了一个顺序、信心门控设置中。在经验上，我们通过蒙特卡洛模拟验证了这些下界。虽然我们的结果是通用的，但我们讨论了它们在人工智能安全领域的潜在应用，概述了这个框架如何可以减轻集体LLM决策中的\textit{幻觉}。

更新时间: 2026-04-01 14:18:17

领域: cs.AI

下载: http://arxiv.org/abs/2602.22413v2

Causal K-Means Clustering

Causal effects are often characterized with population summaries. These might provide an incomplete picture when there are heterogeneous treatment effects across subgroups. Since the subgroup structure is typically unknown, it is more challenging to identify and evaluate subgroup effects than population effects. We propose a new solution to this problem: \emph{Causal k-Means Clustering}, which harnesses the widely-used k-means clustering algorithm to uncover the unknown subgroup structure. Our problem differs significantly from the conventional clustering setup since the variables to be clustered are unknown counterfactual functions. We present a plug-in estimator which is simple and readily implementable using off-the-shelf algorithms, and study its rate of convergence. We also develop a new bias-corrected estimator based on nonparametric efficiency theory and double machine learning, and show that this estimator achieves fast root-n rates and asymptotic normality in large nonparametric models. Our proposed methods are especially useful for modern outcome-wide studies with multiple treatment levels. Further, our framework is extensible to clustering with generic pseudo-outcomes, such as partially observed outcomes or otherwise unknown functions. Finally, we explore finite sample properties via simulation, and illustrate the proposed methods using a study of mobile-supported self-management for chronic low back pain.

Updated: 2026-04-01 14:14:14

标题: 因果K均值聚类

摘要: 因果效应通常通过人口摘要来表征。当不同子组中存在异质的治疗效应时，这些摘要可能提供了不完整的图片。由于子组结构通常是未知的，识别和评估子组效应比人口效应更具挑战性。我们提出了一个新的解决方案：\emph{因果k均值聚类}，利用广泛使用的k均值聚类算法来揭示未知的子组结构。我们的问题与传统的聚类设置有很大的不同，因为要进行聚类的变量是未知的反事实函数。我们提出了一个插值估计器，简单易实现，可使用现成的算法，并研究了其收敛速度。我们还基于非参数效率理论和双机器学习开发了一个新的偏差校正估计器，并展示了这个估计器在大型非参数模型中实现了快速的根n速率和渐近正态性。我们提出的方法对于具有多个治疗水平的现代结果广泛研究尤为有用。此外，我们的框架可扩展到使用通用伪结果进行聚类，例如部分观察到的结果或其他未知函数。最后，我们通过模拟探讨了有限样本特性，并利用一项关于慢性腰部疼痛移动支持自我管理的研究展示了所提出的方法。

更新时间: 2026-04-01 14:14:14

领域: stat.ME,cs.LG,stat.ML

下载: http://arxiv.org/abs/2405.03083v4

Differentially Private Manifold Denoising

We introduce a differentially private manifold denoising framework that allows users to exploit sensitive reference datasets to correct noisy, non-private query points without compromising privacy. The method follows an iterative procedure that (i) privately estimates local means and tangent geometry using the reference data under calibrated sensitivity, (ii) projects query points along the privately estimated subspace toward the local mean via corrective steps at each iteration, and (iii) performs rigorous privacy accounting across iterations and queries using $(\varepsilon,δ)$-differential privacy (DP). Conceptually, this framework brings differential privacy to manifold methods, retaining sufficient geometric signal for downstream tasks such as embedding, clustering, and visualization, while providing formal DP guarantees for the reference data. Practically, the procedure is modular and scalable, separating DP-protected local geometry (means and tangents) from budgeted query-point updates, with a simple scheduler allocating privacy budget across iterations and queries. Under standard assumptions on manifold regularity, sampling density, and measurement noise, we establish high-probability utility guarantees showing that corrected queries converge toward the manifold at a non-asymptotic rate governed by sample size, noise level, bandwidth, and the privacy budget. Simulations and case studies demonstrate accurate signal recovery under moderate privacy budgets, illustrating clear utility-privacy trade-offs and providing a deployable DP component for manifold-based workflows in regulated environments without reengineering privacy systems.

Updated: 2026-04-01 14:13:59

标题: 差分隐私流形去噪

摘要: 我们引入了一种差分隐私流形去噪框架，允许用户利用敏感的参考数据集来校正有噪声的、非私有的查询点，而不损害隐私。该方法遵循一个迭代过程，(i) 在校准的敏感度下使用参考数据私下估计局部均值和切线几何，(ii) 在每次迭代中通过纠错步骤沿着私下估计的子空间将查询点投影到局部均值，(iii) 使用$(\varepsilon,δ)$-差分隐私(DP)在迭代和查询之间进行严格的隐私计算。从概念上讲，这个框架将差分隐私引入到流形方法中，保留足够的几何信号用于下游任务，如嵌入、聚类和可视化，同时为参考数据提供正式的DP保证。在标准假设下，即流形的正则性、采样密度和测量噪声，我们建立了高概率的效用保证，表明校正的查询以非渐近速率向流形收敛，该速率由样本大小、噪声水平、带宽和隐私预算决定。模拟和案例研究展示了在中等隐私预算下准确的信号恢复，说明了明显的效用-隐私权衡，并为在受监管环境中基于流形的工作流程提供可部署的DP组件，而无需重新设计隐私系统。

更新时间: 2026-04-01 14:13:59

领域: cs.LG,cs.CR,math.ST

下载: http://arxiv.org/abs/2604.00942v1

WARP: Guaranteed Inner-Layer Repair of NLP Transformers

Transformer-based NLP models remain vulnerable to adversarial perturbations, yet existing repair methods face a fundamental trade-off: gradient-based approaches offer flexibility but lack verifiability and often overfit; methods that do provide repair guarantees are restricted to the final layer or small networks, significantly limiting the parameter search space available for repair. We present WARP (Weight-Adjusted Repair with Provability), a constraint-based repair framework that extends repair beyond the last layer of Transformer models. WARP formulates repair as a convex quadratic program derived from a first-order linearization of the logit gap, enabling tractable optimization over a high-dimensional parameter space. Under the condition that the first-order approximation holds, this formulation induces three per-sample guarantees: (i) a positive margin constraint ensuring correct classification on repaired inputs, (ii) preservation constraints over a designated remain set, and (iii) a certified robustness radius derived from Lipschitz continuity. To ensure feasibility across varying model architectures, we introduce a sensitivity-based preprocessing step that conditions the optimization landscape accordingly. We further show that the iterative optimization procedure converges to solutions satisfying all repair constraints under mild assumptions. Empirical evaluation on encoder-only Transformers with varying layer architectures validates that these guarantees hold in practice while improving robustness to adversarial inputs. Our results demonstrate that guaranteed, generalizable Transformer repair is achievable through principled constraint-based optimization.

Updated: 2026-04-01 14:12:49

标题: WARP：NLP Transformers内层修复的保证

摘要: 基于Transformer的NLP模型仍然容易受到对抗性扰动的影响，然而现有的修复方法面临着一个基本的权衡：基于梯度的方法提供了灵活性，但缺乏可验证性，并且往往过拟合；能够提供修复保证的方法仅限于最后一层或小型网络，显著限制了可用于修复的参数搜索空间。我们提出了一种名为WARP（Weight-Adjusted Repair with Provability）的基于约束的修复框架，将修复扩展到Transformer模型的最后一层以外。WARP将修复形式化为一个凸二次规划问题，该问题源自对逻辑差距的一阶线性化，从而实现在高维参数空间上的可处理优化。在一阶近似成立的条件下，该公式引出了三种每个样本的保证：（i）一个正边界约束，确保对修复后的输入进行正确分类，（ii）在指定的保留集上的保留约束，以及（iii）基于Lipschitz连续性推导出的认证鲁棒性半径。为了确保在不同的模型架构下的可行性，我们引入了一个基于敏感度的预处理步骤，相应地调整了优化的景观。我们进一步展示，迭代优化过程收敛于满足所有修复约束的解决方案，只需做出温和的假设。对具有不同层架构的仅编码器Transformer的实证评估验证了这些保证在实践中的有效性，同时提高了对对抗性输入的鲁棒性。我们的结果表明，通过基于原则的约束优化，可以实现保证的、可推广的Transformer修复。

更新时间: 2026-04-01 14:12:49

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.00938v1

BN-Pool: Bayesian Nonparametric Pooling for Graphs

We introduce BN-Pool, the first clustering-based pooling method for Graph Neural Networks that adaptively determines the number of supernodes in a coarsened graph. BN-Pool leverages a generative model based on a Bayesian nonparametric framework for partitioning graph nodes into an unbounded number of clusters. During training, the node-to-cluster assignments are learned by combining the supervised loss of the downstream task with an unsupervised auxiliary term, which encourages the reconstruction of the original graph topology while penalizing unnecessary proliferation of clusters. By automatically discovering the optimal coarsening level for each graph, BN-Pool preserves the performance of soft-clustering pooling methods while avoiding their typical redundancy by learning compact pooled graphs. The code is available at https://github.com/NGMLGroup/Bayesian-Nonparametric-Graph-Pooling.

Updated: 2026-04-01 14:12:43

标题: BN-Pool：图形的贝叶斯非参数汇聚

摘要: 我们介绍了BN-Pool，这是第一个基于聚类的图神经网络汇聚方法，可以自适应地确定粗化图中的超节点数量。BN-Pool利用基于贝叶斯非参数框架的生成模型，将图节点分区为无限数量的簇。在训练过程中，通过将下游任务的监督损失与无监督辅助项结合，学习节点到簇的分配，这有助于重建原始图的拓扑结构，同时惩罚不必要的簇扩散。通过自动发现每个图的最佳粗化级别，BN-Pool保持了软聚类池化方法的性能，同时避免了其典型的冗余，通过学习紧凑的池化图。代码可在https://github.com/NGMLGroup/Bayesian-Nonparametric-Graph-Pooling找到。

更新时间: 2026-04-01 14:12:43

领域: cs.LG,math.PR

下载: http://arxiv.org/abs/2501.09821v2

PsychAgent: An Experience-Driven Lifelong Learning Agent for Self-Evolving Psychological Counselor

Existing methods for AI psychological counselors predominantly rely on supervised fine-tuning using static dialogue datasets. However, this contrasts with human experts, who continuously refine their proficiency through clinical practice and accumulated experience. To bridge this gap, we propose an Experience-Driven Lifelong Learning Agent (\texttt{PsychAgent}) for psychological counseling. First, we establish a Memory-Augmented Planning Engine tailored for longitudinal multi-session interactions, which ensures therapeutic continuity through persistent memory and strategic planning. Second, to support self-evolution, we design a Skill Evolution Engine that extracts new practice-grounded skills from historical counseling trajectories. Finally, we introduce a Reinforced Internalization Engine that integrates the evolved skills into the model via rejection fine-tuning, aiming to improve performance across diverse scenarios. Comparative analysis shows that our approach achieves higher scores than strong general LLMs (e.g., GPT-5.4, Gemini-3) and domain-specific baselines across all reported evaluation dimensions. These results suggest that lifelong learning can improve the consistency and overall quality of multi-session counseling responses.

Updated: 2026-04-01 14:08:49

标题: PsychAgent：一种基于经验驱动的终身学习代理程序，用于自我进化的心理辅导员

摘要: 现有的AI心理咨询师方法主要依赖于使用静态对话数据集进行监督微调。然而，这与人类专家形成对比，人类专家通过临床实践和积累的经验不断提高他们的专业水平。为了弥补这一差距，我们提出了一种基于经验驱动的终身学习代理（\texttt{PsychAgent}）用于心理咨询。首先，我们建立了一个适用于长期多次会话交互的记忆增强规划引擎，通过持久性记忆和战略规划确保治疗的连续性。其次，为了支持自我进化，我们设计了一个技能进化引擎，从历史咨询轨迹中提取新的实践基础技能。最后，我们引入了一个强化内化引擎，通过拒绝微调将进化的技能整合到模型中，旨在提高在不同场景下的表现。比较分析表明，我们的方法在所有报告的评估维度上比强大的通用LLMs（如GPT-5.4、Gemini-3）和领域特定基线得分更高。这些结果表明，终身学习可以提高多次会话咨询回应的一致性和整体质量。

更新时间: 2026-04-01 14:08:49

领域: cs.AI

下载: http://arxiv.org/abs/2604.00931v1

Exact Graph Learning via Integer Programming

Learning the dependence structure among variables in complex systems is a central problem across medical, natural, and social sciences. These structures can be naturally represented by graphs, and the task of inferring such graphs from data is known as graph learning or causal discovery. Existing approaches typically rely on restrictive assumptions about the data-generating process, employ greedy oracle algorithms, or solve approximate formulations of the graph learning problem. Therefore, they are either sensitive to violations of central assumptions or fail to guarantee globally optimal solutions. We address these limitations by introducing a nonparametric graph learning framework based on conditional independence testing and integer programming. We reformulate the graph learning problem as a mixed-integer program and prove that solving this integer-programming problem provides a globally optimal solution to the original graph learning problem. Our method leverages efficient encodings of graphical separation criteria, enabling the exact recovery of larger graphs than was previously feasible. We provide an open-source R package 'glip' which supports learning (acyclic) directed (mixed) graphs and chain graphs. We demonstrate that our approach is often faster than existing exact graph learning procedures and achieves state-of-the-art performance on simulated and benchmark data across all aforementioned classes of graphs.

Updated: 2026-04-01 14:07:56

标题: 通过整数规划实现精确图学习

摘要: 学习复杂系统中变量之间的依赖结构是医学、自然科学和社会科学领域的一个核心问题。这些结构可以自然地用图表示，从数据中推断这样的图被称为图学习或因果发现。现有方法通常依赖于对数据生成过程的限制性假设，采用贪婪的oracle算法，或解决图学习问题的近似公式。因此，它们要么对核心假设的违反敏感，要么无法保证全局最优解。我们通过引入基于条件独立性检验和整数规划的非参数图学习框架来解决这些限制。我们将图学习问题重新表述为一个混合整数规划，并证明解决这个整数规划问题提供了原始图学习问题的全局最优解。我们的方法利用了图形分离标准的高效编码，使得比以前可行的更大的图形的精确恢复成为可能。我们提供了一个开源的R软件包'glip'，支持学习（非循环）有向（混合）图和链图。我们证明我们的方法通常比现有的精确图学习程序更快，并在所有前述类别的图的模拟和基准数据上实现了最先进的性能。

更新时间: 2026-04-01 14:07:56

领域: stat.ME,cs.LG

下载: http://arxiv.org/abs/2601.20589v2

Learning Quantised Structure-Preserving Motion Representations for Dance Fingerprinting

We present DANCEMATCH, an end-to-end framework for motion-based dance retrieval, the task of identifying semantically similar choreographies directly from raw video, defined as DANCE FINGERPRINTING. While existing motion analysis and retrieval methods can compare pose sequences, they rely on continuous embeddings that are difficult to index, interpret, or scale. In contrast, DANCEMATCH constructs compact, discrete motion signatures that capture the spatio-temporal structure of dance while enabling efficient large-scale retrieval. Our system integrates Skeleton Motion Quantisation (SMQ) with Spatio-Temporal Transformers (STT) to encode human poses, extracted via Apple CoMotion, into a structured motion vocabulary. We further design DANCE RETRIEVAL ENGINE (DRE), which performs sub-linear retrieval using a histogram-based index followed by re-ranking for refined matching. To facilitate reproducible research, we release DANCETYPESBENCHMARK, a pose-aligned dataset annotated with quantised motion tokens. Experiments demonstrate robust retrieval across diverse dance styles and strong generalisation to unseen choreographies, establishing a foundation for scalable motion fingerprinting and quantitative choreographic analysis.

Updated: 2026-04-01 14:06:59

标题: 学习量化结构保持的动作表示用于舞蹈指纹识别

摘要: 我们提出了DANCEMATCH，一个用于基于动作的舞蹈检索的端到端框架，即直接从原始视频中识别语义相似的编舞，定义为舞蹈指纹。虽然现有的动作分析和检索方法可以比较姿势序列，但它们依赖于难以索引、解释或扩展的连续嵌入。相反，DANCEMATCH构建了紧凑的离散运动签名，捕捉了舞蹈的时空结构，同时实现了高效的大规模检索。我们的系统将骨架运动量化（SMQ）与时空变换器（STT）相结合，将通过Apple CoMotion提取的人体姿势编码成结构化的运动词汇。我们进一步设计了舞蹈检索引擎（DRE），使用基于直方图的索引进行次线性检索，然后进行重新排序以进行精细匹配。为了促进可重复研究，我们发布了DANCETYPESBENCHMARK，一个带有量化运动令牌注释的姿势对齐数据集。实验证明，在多样化的舞蹈风格中具有强大的检索能力，并且对未见过的编舞具有较强的泛化能力，为可扩展的运动指纹和定量编舞分析奠定了基础。

更新时间: 2026-04-01 14:06:59

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2604.00927v1

Representation Selection via Cross-Model Agreement using Canonical Correlation Analysis

Modern vision pipelines increasingly rely on pretrained image encoders whose representations are reused across tasks and models, yet these representations are often overcomplete and model-specific. We propose a simple, training-free method to improve the efficiency of image representations via a post-hoc canonical correlation analysis (CCA) operator. By leveraging the shared structure between representations produced by two pre-trained image encoders, our method finds linear projections that serve as a principled form of representation selection and dimensionality reduction, retaining shared semantic content while discarding redundant dimensions. Unlike standard dimensionality reduction techniques such as PCA, which operate on a single embedding space, our approach leverages cross-model agreement to guide representation distillation and refinement. The technique allows representations to be reduced by more than 75% in dimensionality with improved downstream performance, or enhanced at fixed dimensionality via post-hoc representation transfer from larger or fine-tuned models. Empirical results on ImageNet-1k, CIFAR-100, MNIST, and additional benchmarks show consistent improvements over both baseline and PCA-projected representations, with accuracy gains of up to 12.6%.

Updated: 2026-04-01 14:01:41

标题: 使用典范相关分析通过跨模型一致性选择表示

摘要: 现代视觉管道越来越依赖于预训练图像编码器，这些编码器的表示被重新用于各种任务和模型，然而这些表示通常是过度完整和特定于模型的。我们提出了一种简单的、无需训练的方法，通过后续规范相关分析（CCA）运算符来提高图像表示的效率。通过利用两个预训练图像编码器生成的表示之间的共享结构，我们的方法找到线性投影，作为一种符合原则的表示选择和降维方法，保留共享语义内容，同时丢弃冗余维度。与标准的降维技术（如PCA）不同，这些技术在单个嵌入空间上运行，我们的方法利用跨模型一致性来指导表示的提炼和改进。该技术允许表示通过后续从更大或微调模型中进行传输，在固定维度上提高性能。ImageNet-1k、CIFAR-100、MNIST和其他基准测试的经验结果显示，与基线和PCA投影表示相比，我们的方法在准确性上取得了一致的改进，最高可达12.6%。

更新时间: 2026-04-01 14:01:41

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2604.00921v1

Multi-Mode Quantum Annealing for Variational Autoencoders with General Boltzmann Priors

Variational autoencoders (VAEs) learn compact latent representations of complex data, but their generative capacity is fundamentally constrained by the choice of prior distribution over the latent space. Energy-based priors offer a principled way to move beyond factorized assumptions and capture structured interactions among latent variables, yet training such priors at scale requires accurate and efficient sampling from intractable distributions. Here we present Boltzmann-machine--prior VAEs (BM-VAEs) trained using quantum annealing--based sampling in three distinct operational modes within a single generative system. During training, diabatic quantum annealing (DQA) provides unbiased Boltzmann samples for gradient estimation of the energy-based prior; for unconditional generation, slower quantum annealing (QA) concentrates samples near low-energy minima; for conditional generation, bias fields are added to direct sampling toward attribute-specific regions of the energy landscape (c-QA). Using up to 2000 qubits on a D-Wave Advantage2 processor, we demonstrate stable and efficient training across multiple datasets, with faster convergence and lower reconstruction loss than a Gaussian-prior VAE. The learned Boltzmann prior enables unconditional generation by sampling directly from the energy-based latent distribution, a capability that plain autoencoders lack, and conditional generation through latent biasing that leverages the learned pairwise interactions.

Updated: 2026-04-01 13:59:40

标题: 多模式量子退火用于具有一般Boltzmann先验的变分自编码器

摘要: 变分自动编码器（VAEs）学习复杂数据的紧凑潜在表示，但它们的生成能力在根本上受到对潜在空间先验分布的选择的限制。基于能量的先验提供了一种超越因式化假设并捕捉潜在变量之间结构化交互的原则性方法，然而在规模上训练这种先验需要准确和高效地从难以处理的分布中进行采样。在这里，我们提出了使用基于量子退火的采样在单个生成系统内的三种不同操作模式训练的玻尔兹曼机-先验VAEs（BM-VAEs）。在训练过程中，瞬态量子退火（DQA）提供了无偏的玻尔兹曼样本，用于能量先验的梯度估计；对于无条件生成，较慢的量子退火（QA）将样本集中在低能最小值附近；对于有条件生成，添加偏置场以将采样引导到能量景观的属性特定区域（c-QA）。在D-Wave Advantage2处理器上使用多达2000个量子比特，我们展示了在多个数据集上的稳定和高效训练，收敛速度更快，重建损失更低，比高斯先验VAE更好。学习的玻尔兹曼先验通过直接从基于能量的潜在分布中采样实现了无条件生成的能力，这是普通自动编码器所缺乏的，通过利用学到的配对交互实现潜在偏置进行有条件生成。

更新时间: 2026-04-01 13:59:40

领域: quant-ph,cond-mat.stat-mech,cs.LG

下载: http://arxiv.org/abs/2604.00919v1

Generalization Bounds for Spectral GNNs via Fourier Domain Analysis

Spectral graph neural networks learn graph filters, but their behavior with increasing depth and polynomial order is not well understood. We analyze these models in the graph Fourier domain, where each layer becomes an element-wise frequency update, separating the fixed spectrum from trainable parameters and making depth and order explicit. In this setting, we show that Gaussian complexity is invariant under the Graph Fourier Transform, which allows us to derive data-dependent, depth, and order-aware generalization bounds together with stability estimates. In the linear case, our bounds are tighter, and on real graphs, the data-dependent term correlates with the generalization gap across polynomial bases, highlighting practical choices that avoid frequency amplification across layers.

Updated: 2026-04-01 13:58:50

标题: 通过傅里叶域分析的谱图神经网络的泛化界限

摘要: 频谱图神经网络学习图滤波器，但其随着深度和多项式阶数增加的行为尚不是很清楚。我们在图傅立叶域中分析这些模型，其中每一层成为一个逐元素频率更新，将固定频谱与可训练参数分离开来并使深度和阶数显式。在这种设置下，我们展示了高斯复杂度在图傅立叶变换下是不变的，这使我们能够推导出数据相关、深度和阶数感知的泛化界限以及稳定性估计。在线性情况下，我们的边界更紧，而在实际图中，数据相关项与跨多项式基础的泛化差距相关，突显出避免跨层频率放大的实际选择。

更新时间: 2026-04-01 13:58:50

领域: cs.LG

下载: http://arxiv.org/abs/2604.00918v1

Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

The rise of large language models for code has reshaped software development. Autonomous coding agents, able to create branches, open pull requests, and perform code reviews, now actively contribute to real-world projects. Their growing role offers a unique and timely opportunity to investigate AI-driven contributions and their effects on code quality, team dynamics, and software maintainability. In this work, we construct a novel dataset of approximately $110,000$ open-source pull requests, including associated commits, comments, reviews, issues, and file changes, collectively representing millions of lines of source code. We compare five popular coding agents, including OpenAI Codex, Claude Code, GitHub Copilot, Google Jules, and Devin, examining how their usage differs in various development aspects such as merge frequency, edited file types, and developer interaction signals, including comments and reviews. Furthermore, we emphasize that code authoring and review are only a small part of the larger software engineering process, as the resulting code must also be maintained and updated over time. Hence, we offer several longitudinal estimates of survival and churn rates for agent-generated versus human-authored code. Ultimately, our findings indicate an increasing agent activity in open-source projects, although their contributions are associated with more churn over time compared to human-authored code.

Updated: 2026-04-01 13:58:30

标题: 研究自主代理在野外的贡献：活动模式和随时间变化的代码更改

摘要: 大型语言模型为代码的崛起重新塑造了软件开发。能够创建分支、打开拉取请求和进行代码审查的自主编码代理现在积极地为现实世界的项目做出贡献。它们不断增长的作用提供了一个独特而及时的机会，来研究基于人工智能的贡献及其对代码质量、团队动态和软件可维护性的影响。在这项工作中，我们构建了一个约$110,000$个开源拉取请求的新颖数据集，包括相关的提交、评论、审查、问题和文件更改，总共代表了数百万行源代码。我们比较了五个流行的编码代理，包括OpenAI Codex、Claude Code、GitHub Copilot、Google Jules和Devin，研究它们在各种开发方面的使用方式的差异，例如合并频率、编辑文件类型和开发者交互信号，包括评论和审查。此外，我们强调，代码编写和审查只是更大软件工程过程的一小部分，因为生成的代码还必须随着时间的推移进行维护和更新。因此，我们提供了几个代理生成与人工编写代码的生存和流失率的长期估计。最终，我们的研究结果表明，在开源项目中代理的活动不断增加，尽管与人工编写的代码相比，它们的贡献随着时间的推移而产生更多的流失。

更新时间: 2026-04-01 13:58:30

领域: cs.SE,cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.00917v1

Orthogonal Learner for Estimating Heterogeneous Long-Term Treatment Effects

Estimation of heterogeneous long-term treatment effects (HLTEs) is widely used for personalized decision-making in marketing, economics, and medicine, where short-term randomized experiments are often combined with long-term observational data. However, HLTE estimation is challenging due to limited overlap in treatment or in observing long-term outcomes for certain subpopulations, which can lead to unstable HLTE estimates with large finite-sample variance. To address this challenge, we introduce the LT-O-learners (Long-Term Orthogonal Learners), a set of novel orthogonal learners for HLTE estimation. The learners are designed for the canonical HLTE setting that combines a short-term randomized dataset $\mathcal{D}_1$ with a long-term historical dataset $\mathcal{D}_2$. The key idea of our LT-O-Learners is to retarget the learning objective by introducing custom overlap weights that downweight samples with low overlap in treatment or in long-term observation. We show that the retargeted loss is equivalent to the weighted oracle loss and satisfies Neyman-orthogonality, which means our learners are robust to errors in the nuisance estimation. We further provide a general error bound for the LT-O-Learners and give the conditions under which quasi-oracle rate can be achieved. Finally, our LT-O-learners are model-agnostic and can thus be instantiated with arbitrary machine learning models. We conduct empirical evaluations on synthetic and semi-synthetic benchmarks to confirm the theoretical properties of our LT-O-Learners, especially the robustness in low-overlap settings. To the best of our knowledge, ours are the first orthogonal learners for HLTE estimation that are robust to low overlap that is common in long-term outcomes.

Updated: 2026-04-01 13:56:19

标题: 正交学习器用于估计异质性长期治疗效果

摘要: 估计异质长期治疗效应（HLTEs）在营销、经济学和医学中被广泛用于个性化决策制定，其中短期随机实验通常与长期观察数据结合使用。然而，由于某些亚群体中治疗或长期结果的重叠有限，HLTE估计面临挑战，可能导致具有大样本方差的不稳定HLTE估计。为了解决这一挑战，我们引入了LT-O-learners（长期正交学习器），这是一组新颖的用于HLTE估计的正交学习器。这些学习器设计用于将短期随机数据集$\mathcal{D}_1$与长期历史数据集$\mathcal{D}_2$结合的典型HLTE设置。我们LT-O-Learners的关键思想是通过引入自定义重叠权重，降低治疗或长期观察中重叠较低样本的权重，从而重新定位学习目标。我们展示了重新定位的损失等同于加权的oracle损失，并满足Neyman正交性，这意味着我们的学习器对于干扰估计中的误差是稳健的。我们进一步为LT-O-Learners提供了一个一般的误差界，并给出了可以实现准oracle速率的条件。最后，我们的LT-O-learners是与任意机器学习模型实例化的模型不可知的，并进行了合成和半合成基准的实证评估，以确认我们LT-O-Learners的理论性质，特别是在低重叠设置中的稳健性。据我们所知，我们的是第一个针对长期结果中普遍存在的低重叠问题稳健的HLTE估计的正交学习器。

更新时间: 2026-04-01 13:56:19

领域: cs.LG,stat.ML

下载: http://arxiv.org/abs/2604.00915v1

Event Embedding of Protein Networks : Compositional Learning of Biological Function

In this work, we study whether enforcing strict compositional structure in sequence embeddings yields meaningful geometric organization when applied to protein-protein interaction networks. Using Event2Vec, an additive sequence embedding model, we train 64-dimensional representations on random walks from the human STRING interactome, and compare against a DeepWalk baseline based on Word2Vec, trained on the same walks. We find that compositional structure substantially improves pathway coherence (30.2$\times$ vs 2.9$\times$ above random), functional analogy accuracy (mean similarity 0.966 vs 0.650), and hierarchical pathway organization, while geometric properties such as norm--degree anticorrelation are shared with or exceeded by the non-compositional baseline. These results indicate that enforced compositionality specifically benefits relational and compositional reasoning tasks in biological networks.

Updated: 2026-04-01 13:53:35

标题: 蛋白质网络的事件嵌入：生物功能的组合学习

摘要: 在这项工作中，我们研究了在蛋白质相互作用网络中应用严格的组合结构是否会产生有意义的几何组织。我们使用Event2Vec，一种加法序列嵌入模型，在人类STRING相互作用组中训练64维表示，并与基于Word2Vec的DeepWalk基线进行比较，后者也是在相同的随机行走中训练。我们发现，组合结构显著提高了通路的连贯性（相对随机的30.2倍 vs 2.9倍），功能类比准确性（平均相似度0.966 vs 0.650），以及层次化通路组织，而几何性质如范数-度数反相关性与非组合基线相同或更高。这些结果表明，强制的组合性特别有利于生物网络中的关系和组合推理任务。

更新时间: 2026-04-01 13:53:35

领域: cs.LG

下载: http://arxiv.org/abs/2604.00911v1

View-oriented Conversation Compiler for Agent Trace Analysis

Agent traces carry increasing analytical value in agentic systems and context engineering, yet most prior work treats conversation format as a trivial implementation detail. Modern agent conversations, however, contain deeply structured content, including nested tool calls and results, chain-of-thought reasoning blocks, sub-agent invocations, context-window compaction boundaries, and harness-injected system directives, whose complexity far exceeds that of simple user-assistant exchanges. Feeding such traces to a reflector or other analytical mechanism in plain text, JSON, YAML, or via grep can materially degrade analysis quality. This paper presents VCC (View-oriented Conversation Compiler), a compiler (lex, parse, IR, lower, emit) that transforms raw agent JSONL logs into a family of structured views: a full view (lossless transcript serving as the canonical line-number coordinate system), a user-interface (UI) view (reconstructing the interaction as the user actually perceived it), and an adaptive view (a structure-preserving projection governed by a relevance predicate). In a context-engineering experiment on AppWorld, replacing only the reflector's input format, from raw JSONL to VCC-compiled views, leads to higher pass rates across all three model configurations tested, while cutting reflector token consumption by half to two-thirds and producing more concise learned memory. These results suggest that message format functions as infrastructure for context engineering, not as an incidental implementation choice.

Updated: 2026-04-01 13:52:38

标题: 面向视图的对话编译器用于代理跟踪分析

摘要: 代理追踪在代理系统和上下文工程中具有越来越大的分析价值，然而大多数先前的工作将对话格式视为一个微不足道的实现细节。然而，现代代理对话包含深度结构化内容，包括嵌套工具调用和结果，思维链条推理块，子代理调用，上下文窗口压缩边界，以及注入系统指令，其复杂性远远超过简单的用户助理交流。将这些追踪信息以纯文本、JSON、YAML或通过grep传递给反射器或其他分析机制可能会显著降低分析质量。本文介绍了VCC（视图导向对话编译器），这是一个编译器（词法分析、语法分析、IR、下降、发射），它将原始代理JSONL日志转换为一系列结构化视图：完整视图（无损转录作为规范的行号坐标系）、用户界面（UI）视图（重建用户实际感知到的交互），以及自适应视图（由相关谓词控制的保留结构的投影）。在AppWorld上进行的一个上下文工程实验中，仅将反射器的输入格式从原始JSONL替换为VCC编译的视图，导致所有测试的三种模型配置中的通过率更高，同时将反射器令牌消耗减少了一半到三分之二，并产生更简洁的学习性记忆。这些结果表明，消息格式作为上下文工程的基础设施，而不是一种偶然的实现选择。

更新时间: 2026-04-01 13:52:38

领域: cs.AI

下载: http://arxiv.org/abs/2603.29678v2

Natural Hypergradient Descent: Algorithm Design, Convergence Analysis, and Parallel Implementation

In this work, we propose Natural Hypergradient Descent (NHGD), a new method for solving bilevel optimization problems. To address the computational bottleneck in hypergradient estimation--namely, the need to compute or approximate Hessian inverse--we exploit the statistical structure of the inner optimization problem and use the empirical Fisher information matrix as an asymptotically consistent surrogate for the Hessian. This design enables a parallel optimize-and-approximate framework in which the Hessian-inverse approximation is updated synchronously with the stochastic inner optimization, reusing gradient information at negligible additional cost. Our main theoretical contribution establishes high-probability error bounds and sample complexity guarantees for NHGD that match those of state-of-the-art optimize-then-approximate methods, while significantly reducing computational time overhead. Empirical evaluations on representative bilevel learning tasks further demonstrate the practical advantages of NHGD, highlighting its scalability and effectiveness in large-scale machine learning settings.

Updated: 2026-04-01 13:48:55

标题: 自然的超梯度下降：算法设计、收敛分析和并行实现

摘要: 在这项工作中，我们提出了一种新的方法，即自然超梯度下降（NHGD），用于解决双层优化问题。为了解决超梯度估计中的计算瓶颈，即需要计算或近似Hessian逆矩阵的问题，我们利用内部优化问题的统计结构，并将经验费舍尔信息矩阵作为Hessian的渐近一致替代物。这种设计实现了并行优化和近似框架，其中Hessian逆矩阵的近似与随机内部优化同步更新，以几乎可以忽略的额外成本重新利用梯度信息。我们的主要理论贡献建立了NHGD的高概率误差界和样本复杂度保证，与最先进的优化-近似方法相匹配，同时显著减少了计算时间开销。在代表性的双层学习任务上的实证评估进一步展示了NHGD的实际优势，突显了其在大规模机器学习环境中的可扩展性和有效性。

更新时间: 2026-04-01 13:48:55

领域: cs.LG,math.OC,stat.ML

下载: http://arxiv.org/abs/2602.10905v2

Fatigue-Aware Learning to Defer via Constrained Optimisation

Learning to defer (L2D) enables human-AI cooperation by deciding when an AI system should act autonomously or defer to a human expert. Existing L2D methods, however, assume static human performance, contradicting well-established findings on fatigue-induced degradation. We propose Fatigue-Aware Learning to Defer via Constrained Optimisation (FALCON), which explicitly models workload-varying human performance using psychologically grounded fatigue curves. FALCON formulates L2D as a Constrained Markov Decision Process (CMDP) whose state includes both task features and cumulative human workload, and optimises accuracy under human-AI cooperation budgets via PPO-Lagrangian training. We further introduce FA-L2D, a benchmark that systematically varies fatigue dynamics from near-static to rapidly degrading regimes. Experiments across multiple datasets show that FALCON consistently outperforms state-of-the-art L2D methods across coverage levels, generalises zero-shot to unseen experts with different fatigue patterns, and demonstrates the advantage of adaptive human-AI collaboration over AI-only or human-only decision-making when coverage lies strictly between 0 and 1.

Updated: 2026-04-01 13:48:24

标题: 疲劳感知学习通过受限优化进行推迟

摘要: Learning to defer (L2D)是一种使人工智能与人类合作的方法，通过决定AI系统何时应该自主行动或推迟到人类专家。然而，现有的L2D方法假定人类表现是静态的，与疲劳引起的退化的研究结果相矛盾。我们提出了一种基于疲劳感知的学习推迟方法，命名为FALCON，通过约束优化明确地建模了变化的工作负荷对人类表现的影响，使用心理学基础的疲劳曲线。FALCON将L2D表述为一种约束马尔可夫决策过程（CMDP），其状态包括任务特征和累积人类工作负荷，并通过PPO-Lagrangian训练在人工智能合作预算下优化准确性。我们进一步引入了FA-L2D，一个从几乎静态到快速退化的疲劳动态的基准。跨多个数据集的实验表明，FALCON在覆盖水平上始终优于最先进的L2D方法，在零-shot到看不见的专家具有不同疲劳模式时具有泛化性，并展示了当覆盖范围严格介于0和1之间时，自适应人工智能协作胜过仅AI或仅人类决策的优势。

更新时间: 2026-04-01 13:48:24

领域: cs.LG

下载: http://arxiv.org/abs/2604.00904v1

SHIFT: Stochastic Hidden-Trajectory Deflection for Removing Diffusion-based Watermark

Diffusion-based watermarking methods embed verifiable marks by manipulating the initial noise or the reverse diffusion trajectory. However, these methods share a critical assumption: verification can succeed only if the diffusion trajectory can be faithfully reconstructed. This reliance on trajectory recovery constitutes a fundamental and exploitable vulnerability. We propose $\underline{\mathbf{S}}$tochastic $\underline{\mathbf{Hi}}$dden-Trajectory De$\underline{\mathbf{f}}$lec$\underline{\mathbf{t}}$ion ($\mathbf{SHIFT}$), a training-free attack that exploits this common weakness across diverse watermarking paradigms. SHIFT leverages stochastic diffusion resampling to deflect the generative trajectory in latent space, making the reconstructed image statistically decoupled from the original watermark-embedded trajectory while preserving strong visual quality and semantic consistency. Extensive experiments on nine representative watermarking methods spanning noise-space, frequency-domain, and optimization-based paradigms show that SHIFT achieves 95%--100% attack success rates with nearly no loss in semantic quality, without requiring any watermark-specific knowledge or model retraining.

Updated: 2026-04-01 13:47:16

标题: SHIFT：随机隐藏轨迹偏移用于去除基于扩散的水印

摘要: 基于扩散的水印方法通过操纵初始噪声或逆向扩散轨迹嵌入可验证的标记。然而，这些方法共享一个关键假设：只有在扩散轨迹能够被忠实地重建时，验证才能成功。对轨迹恢复的依赖构成了一种基本且可利用的漏洞。我们提出了一种名为$\mathbf{SHIFT}$（$\underline{\mathbf{S}}$tochastic $\underline{\mathbf{Hi}}$dden-Trajectory De$\underline{\mathbf{f}}$lec$\underline{\mathbf{t}}$ion）的无需训练的攻击方法，利用这种普遍弱点跨越不同的水印范式。SHIFT利用随机扩散重采样来偏转潜在空间中的生成轨迹，使重建的图像在统计上脱离原始水印嵌入轨迹，同时保持较强的视觉质量和语义一致性。对涵盖噪声空间、频域和基于优化的不同范式的九种代表性水印方法进行的广泛实验表明，SHIFT实现了95%至100%的攻击成功率，几乎没有语义质量损失，而无需任何特定水印知识或模型重新训练。

更新时间: 2026-04-01 13:47:16

领域: cs.CV,cs.CR

下载: http://arxiv.org/abs/2603.29742v2

Neural Conditional Transport Maps

We present a neural framework for learning conditional optimal transport (OT) maps between probability distributions. Our approach introduces a conditioning mechanism capable of processing both categorical and continuous conditioning variables simultaneously. At the core of our method lies a hypernetwork that generates transport layer parameters based on these inputs, creating adaptive mappings that outperform simpler conditioning methods. Comprehensive ablation studies demonstrate the superior performance of our method over baseline configurations. Furthermore, we showcase an application to global sensitivity analysis, offering high performance in computing OT-based sensitivity indices. This work advances the state-of-the-art in conditional optimal transport, enabling broader application of optimal transport principles to complex, high-dimensional domains such as generative modeling and black-box model explainability.

Updated: 2026-04-01 13:46:59

标题: 神经条件传输映射

摘要: 我们提出了一个神经框架，用于学习概率分布之间的条件最优传输（OT）映射。我们的方法引入了一种能够同时处理分类和连续条件变量的调节机制。我们方法的核心是一个超网络，根据这些输入生成传输层参数，创建优于简单调节方法的自适应映射。全面的消融研究证明了我们方法相对于基准配置的优越性能。此外，我们展示了一个应用于全局敏感性分析的实例，提供了在计算基于OT的敏感性指数方面的高性能。这项工作推动了条件最优传输的最新技术，使得最优传输原则更广泛地应用于生成建模和黑盒模型可解释性等复杂、高维领域。

更新时间: 2026-04-01 13:46:59

领域: cs.LG,cs.AI,math.PR,stat.AP,stat.ML

下载: http://arxiv.org/abs/2505.15808v2

Experience as a Compass: Multi-agent RAG with Evolving Orchestration and Agent Prompts

Multi-agent Retrieval-Augmented Generation (RAG), wherein each agent takes on a specific role, supports hard queries that require multiple steps and sources, or complex reasoning. Existing approaches, however, rely on static agent behaviors and fixed orchestration strategies, leading to brittle performance on diverse, multi-hop tasks. We identify two key limitations: the lack of continuously adaptive orchestration mechanisms and the absence of behavior-level learning for individual agents. To this end, we propose HERA, a hierarchical framework that jointly evolves multi-agent orchestration and role-specific agent prompts. At the global level, HERA optimizes query-specific agent topologies through reward-guided sampling and experience accumulation. At the local level, Role-Aware Prompt Evolution refines agent behaviors via credit assignment and dual-axes adaptation along operational and behavioral principles, enabling targeted, role-conditioned improvements. On six knowledge-intensive benchmarks, HERA achieves an average improvement of 38.69\% over recent baselines while maintaining robust generalization and token efficiency. Topological analyses reveal emergent self-organization, where sparse exploration yields compact, high-utility multi-agent networks, demonstrating both efficient coordination and robust reasoning.

Updated: 2026-04-01 13:45:52

标题: 作为指南的经验：具有演变指挥和代理提示的多智能体RAG

摘要: 多智能体检索增强生成（RAG），其中每个智能体承担特定角色，支持需要多个步骤和来源，或复杂推理的困难查询。然而，现有方法依赖于静态智能体行为和固定的编排策略，导致在多样化、多跳任务上性能脆弱。我们确定了两个关键限制：缺乏持续自适应编排机制和缺乏个体智能体的行为级学习。为此，我们提出了HERA，一个联合演进多智能体编排和角色特定智能体提示的分层框架。在全局层面上，HERA通过奖励引导的抽样和经验积累优化特定于查询的智能体拓扑结构。在本地层面上，角色感知提示演进通过信用分配和沿操作和行为原则的双轴适应，使得定向的、角色条件的改进成为可能。在六个知识密集型基准测试中，HERA相对于最近的基准线平均改进了38.69\%，同时保持了强大的泛化和令牌效率。拓扑分析揭示了自发的自组织现象，其中稀疏探索产生紧凑、高效的多智能体网络，展示了高效协调和强大推理能力。

更新时间: 2026-04-01 13:45:52

领域: cs.AI

下载: http://arxiv.org/abs/2604.00901v1

Super-Resolving Coarse-Resolution Weather Forecasts With Flow Matching

Machine learning-based weather forecasting models now surpass state-of-the-art numerical weather prediction systems, but training and operating these models at high spatial resolution remains computationally expensive. We present a modular framework that decouples forecasting from spatial resolution by applying learned generative super-resolution as a post-processing step to coarse-resolution forecast trajectories. We formulate super-resolution as a stochastic inverse problem, using a residual formulation to preserve large-scale structure while reconstructing unresolved variability. The model is trained with flow matching exclusively on reanalysis data and is applied to global medium-range forecasts. We evaluate (i) design consistency by re-coarsening super-resolved forecasts and comparing them to the original coarse trajectories, and (ii) high-resolution forecast quality using standard ensemble verification metrics and spectral diagnostics. Results show that super-resolution preserves large-scale structure and variance after re-coarsening, introduces physically consistent small-scale variability, and achieves competitive probabilistic forecast skill at 0.25° resolution relative to an operational ensemble baseline, while requiring only a modest additional training cost compared with end-to-end high-resolution forecasting.

Updated: 2026-04-01 13:43:42

标题: 用流匹配技术超分辨粗分辨率天气预报

摘要: 基于机器学习的天气预测模型现在已经超过了最先进的数值天气预报系统，但在高空间分辨率下训练和操作这些模型仍然在计算上昂贵。我们提出了一个模块化框架，通过将学习到的生成式超分辨率作为后处理步骤应用于粗分辨率预测轨迹，从而将预测与空间分辨率解耦。我们将超分辨率形式化为一个随机反问题，使用残差形式来保留大尺度结构同时重建未解决的变异性。该模型仅在再分析数据上通过流匹配进行训练，并应用于全球中期预报。我们通过重新加粗超分辨率预测并将其与原始粗轨迹进行比较来评估（i）设计一致性，以及（ii）使用标准集成验证指标和频谱诊断来评估高分辨率预报质量。结果表明，超分辨率在重新加粗后保留了大尺度结构和方差，引入了物理一致的小尺度变异性，并在0.25°分辨率下相对于操作集成基线实现了有竞争力的概率预报技能，而与端到端高分辨率预测相比，只需要较少的额外训练成本。

更新时间: 2026-04-01 13:43:42

领域: cs.LG,cs.CV

下载: http://arxiv.org/abs/2604.00897v1

Beyond Symbolic Solving: Multi Chain-of-Thought Voting for Geometric Reasoning in Large Language Models

Geometric Problem Solving (GPS) remains at the heart of enhancing mathematical reasoning in large language models because it requires the combination of diagrammatic understanding, symbolic manipulation and logical inference. In existing literature, researchers have chiefly focused on synchronising the diagram descriptions with text literals and solving the problem. In this vein, they have either taken a neural, symbolic or neuro-symbolic approach. But this solves only the first two of the requirements, namely diagrammatic understanding and symbolic manipulation, while leaving logical inference underdeveloped. The logical inference is often limited to one chain-of-thought (CoT). To address this weakness in hitherto existing models, this paper proposes MARS-GPS, that generates multiple parallel reasoning rollouts augmented with Python code execution for numerical verification, ranks them using token-level entropy as a confidence signal, and aggregates answers through a multi-stage voting and self-verification pipeline. Empirical results show that MARS-GPS with 8 parallel rollouts achieves 88.8% on Geometry3K, a nearly +11% improvement over the prior state-of-the-art, with accuracy scaling consistently as the number of rollouts increases from 1 to 16 (+6.0% on ablation subset). We provide our code and data in an anonymous repository: https://anonymous.4open.science/r/MARS-GPS-DE55.

Updated: 2026-04-01 13:37:06

标题: 超越符号求解：大型语言模型中几何推理的多链思维投票

摘要: 几何问题解决（GPS）仍然是增强大型语言模型中数学推理的核心，因为它需要结合图解理解、符号操作和逻辑推理。在现有文献中，研究人员主要集中在将图解描述与文本文字同步，并解决问题。在这方面，他们要么采用神经、符号或神经符号方法。但这只解决了前两项要求，即图解理解和符号操作，而逻辑推理则未能充分发展。逻辑推理通常仅限于一种思维链（CoT）。为了解决迄今存在模型中的这一弱点，本文提出了MARS-GPS，它生成多个并行推理展开，通过Python代码执行进行数值验证，使用标记级熵作为信心信号对其进行排序，并通过多阶段投票和自我验证管道对答案进行汇总。实证结果表明，具有8个并行推理展开的MARS-GPS在Geometry3K上实现了88.8%的准确率，比先前的最新技术提高了近11%，准确率随着推理展开数量从1增加到16而持续增加（在消融子集上增加了6.0%）。我们在匿名存储库中提供我们的代码和数据：https://anonymous.4open.science/r/MARS-GPS-DE55。

更新时间: 2026-04-01 13:37:06

领域: cs.AI,cs.CL,cs.CV

下载: http://arxiv.org/abs/2604.00890v1

A Hitchhiker's Guide to Privacy-Preserving Digital Payment Systems: A Survey on Anonymity, Confidentiality, and Auditability

Crypto-assets and central bank digital currencies (CBDCs) are reshaping how value is exchanged in distributed computing environments. These systems combine cryptographic primitives, protocol design, and system architectures to provide transparency and efficiency while raising critical challenges around privacy and regulatory compliance. This survey offers a comprehensive overview of privacy-preserving digital payment systems, covering both decentralized ledger systems and CBDCs. We present a taxonomy of privacy goals -- including anonymity, confidentiality, unlinkability, and auditability -- and map them to the cryptographic primitives, protocols, and system architectures that implement them. Our work adopts a design-oriented perspective, linking high-level privacy objectives to concrete implementations. We also trace the evolution of privacy-preserving digital payment systems through three generations, highlighting shifts from basic anonymity guarantees toward more nuanced privacy-accountability trade-offs. Finally, we identify open challenges, motivating further research into architectures and solutions that balance strong privacy with real-world auditability needs.

Updated: 2026-04-01 13:35:01

标题: 《隐私保护数字支付系统搭车者指南：关于匿名性、机密性和可审计性的调查》

摘要: 加密资产和央行数字货币（CBDCs）正在重塑分布式计算环境中价值交换的方式。这些系统结合了加密原语、协议设计和系统架构，提供透明度和效率，同时引发了关于隐私和监管合规性的重要挑战。本调查提供了隐私保护数字支付系统的全面概述，涵盖了分散账本系统和CBDCs。我们提出了隐私目标的分类法，包括匿名、保密、不可链接性和可审计性，并将它们映射到实施它们的加密原语、协议和系统架构。我们的工作采用了设计导向的视角，将高级隐私目标与具体实现相联系。我们还通过三代追踪了隐私保护数字支付系统的演变，突出了从基本匿名性保证转向更加细致的隐私和可审计性的权衡。最后，我们确定了未解决的挑战，激励进一步研究平衡强隐私与现实审计需求的架构和解决方案。

更新时间: 2026-04-01 13:35:01

领域: cs.CR,cs.DC

下载: http://arxiv.org/abs/2505.21008v3

Adversarial Attenuation Patch Attack for SAR Object Detection

Deep neural networks have demonstrated excellent performance in SAR target detection tasks but remain susceptible to adversarial attacks. Existing SAR-specific attack methods can effectively deceive detectors; however, they often introduce noticeable perturbations and are largely confined to digital domain, neglecting physical implementation constrains for attacking SAR systems. In this paper, a novel Adversarial Attenuation Patch (AAP) method is proposed that employs energy-constrained optimization strategy coupled with an attenuation-based deployment framework to achieve a seamless balance between attack effectiveness and stealthiness. More importantly, AAP exhibits strong potential for physical realization by aligning with signal-level electronic jamming mechanisms. Experimental results show that AAP effectively degrades detection performance while preserving high imperceptibility, and shows favorable transferability across different models. This study provides a physical grounded perspective for adversarial attacks on SAR target detection systems and facilitates the design of more covert and practically deployable attack strategies. The source code is made available at https://github.com/boremycin/SAAP.

Updated: 2026-04-01 13:34:31

标题: 对SAR目标检测的对抗性衰减补丁攻击

摘要: 深度神经网络在SAR目标检测任务中表现出优异的性能，但仍然容易受到对抗性攻击的影响。现有的针对SAR的攻击方法可以有效地欺骗检测器；然而，它们通常引入明显的扰动，并且主要局限于数字领域，忽视了对攻击SAR系统的物理实现约束。本文提出了一种新颖的对抗性衰减补丁（AAP）方法，采用能量受限的优化策略结合衰减基础部署框架，实现了攻击效果和隐蔽性之间的无缝平衡。更重要的是，AAP通过与信号级电子干扰机制对齐，展示了强大的物理实现潜力。实验结果表明，AAP有效地降低了检测性能，同时保留高度的不可察觉性，并且在不同模型之间展现了良好的可转移性。本研究为对SAR目标检测系统的对抗性攻击提供了一个物理基础的视角，并促进了更隐蔽和实际可部署的攻击策略的设计。源代码可在https://github.com/boremycin/SAAP上获得。

更新时间: 2026-04-01 13:34:31

领域: cs.CV,cs.CR

下载: http://arxiv.org/abs/2604.00887v1

PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding

Document understanding and GUI interaction are among the highest-value applications of Vision-Language Models (VLMs), yet they impose exceptionally heavy computational burden: fine-grained text and small UI elements demand high-resolution inputs that produce tens of thousands of visual tokens. We observe that this cost is largely wasteful -- across document and GUI benchmarks, only 22--71\% of image patches are pixel-unique, the rest being exact duplicates of another patch in the same image. We propose \textbf{PixelPrune}, which exploits this pixel-level redundancy through predictive-coding-based compression, pruning redundant patches \emph{before} the Vision Transformer (ViT) encoder. Because it operates in pixel space prior to any neural computation, PixelPrune accelerates both the ViT encoder and the downstream LLM, covering the full inference pipeline. The method is training-free, requires no learnable parameters, and supports pixel-lossless compression ($τ{=}0$) as well as controlled lossy compression ($τ{>}0$). Experiments across three model scales and document and GUI benchmarks show that PixelPrune maintains competitive task accuracy while delivering up to 4.2$\times$ inference speedup and 1.9$\times$ training acceleration. Code is available at https://github.com/OPPO-Mente-Lab/PixelPrune.

Updated: 2026-04-01 13:33:27

标题: PixelPrune：通过预测编码实现像素级自适应视觉标记减少

摘要: 文档理解和GUI交互是视觉语言模型（VLMs）最高价值的应用之一，但它们需要极大的计算负担：细粒度文本和小型UI元素需要高分辨率输入，产生数万个视觉标记。我们观察到这种成本在很大程度上是浪费的--在文档和GUI基准测试中，只有22-71％的图像补丁是像素唯一的，其余是同一图像中另一个补丁的精确副本。我们提出了PixelPrune，通过基于预测编码的压缩利用这种像素级冗余，对冗余补丁进行修剪\emph{在}Vision Transformer（ViT）编码器之前。因为它在任何神经计算之前在像素空间中运行，PixelPrune加速了ViT编码器和下游LLM，涵盖了完整的推理流程。该方法无需训练，不需要可学习参数，并支持像素无损压缩（$τ{=}0$）以及控制丢失压缩（$τ{>}0$）。跨三个模型规模和文档和GUI基准测试的实验表明，PixelPrune在保持竞争性任务准确性的同时提供了高达4.2倍的推理加速和1.9倍的训练加速。代码可在https://github.com/OPPO-Mente-Lab/PixelPrune找到。

更新时间: 2026-04-01 13:33:27

领域: cs.CV,cs.AI,cs.CL

下载: http://arxiv.org/abs/2604.00886v1

PluriHopRAG: Exhaustive, Recall-Sensitive QA Through Corpus-Specific Document Structure Learning

Retrieval-Augmented Generation (RAG) has been used in question answering (QA) systems to improve performance when relevant information is in one (single-hop) or multiple (multi-hop) passages. However, many real life scenarios (e.g. dealing with financial, legal, medical reports) require checking all documents for relevant information without a clear stopping condition. We term these pluri-hop questions, and formalize them by 3 conditions - recall sensitivity, exhaustiveness, and exactness. To study this setting, we introduce PluriHopWIND, a multilingual diagnostic benchmark of 48 pluri-hop questions over 191 real wind-industry reports, with high repetitiveness to reflect the challenge of distractors in real-world datasets. Naive, graph-based, and multimodal RAG methods only reach up to 40% statement-wise F1 on PluriHopWIND. Motivated by this, we propose PluriHopRAG, which learns from synthetic examples to decompose queries according to corpus-specific document structure, and employs a cross-encoder filter at the document level to minimize costly LLM reasoning. We test PluriHopRAG on PluriHopWIND and the Loong benchmark built on financial, legal and scientific reports. On PluriHopWIND, our method shows 18-52% F1 score improvement across base LLMs, while on Loong, we show 33% improvement over long-context reasoning and 52% improvement over naive RAG.

Updated: 2026-04-01 13:30:00

标题: PluriHopRAG：通过语料库特定文档结构学习实现全面、召回敏感的问答

摘要: 检索增强生成（RAG）已在问答（QA）系统中被使用，以提高性能，当相关信息在一个（单跳）或多个（多跳）段落中时。然而，许多现实生活场景（例如处理财务、法律、医疗报告）需要检查所有文件以获取相关信息，而没有明确的停止条件。我们将这些称为多跳问题，并通过3个条件 - 召回敏感性、详尽性和准确性加以形式化。为了研究这种情况，我们引入了PluriHopWIND，这是一个多语言诊断基准，包含48个多跳问题，涉及191个真实的风电行业报告，其中存在高重复性，以反映现实世界数据集中干扰因素的挑战。天真、基于图的和多模态的RAG方法在PluriHopWIND上只能达到最高40%的语句级F1分数。受此启发，我们提出了PluriHopRAG，它通过学习来自合成示例，根据语料库特定的文档结构分解查询，并在文档级别应用交叉编码器滤波器，以最小化昂贵的LLM推理。我们在PluriHopWIND和建立在财务、法律和科学报告上的Loong基准上测试了PluriHopRAG。在PluriHopWIND上，我们的方法在基础LLMs上显示了18-52%的F1分数改善，而在Loong上，我们展示了相对于长上下文推理的33%改善和相对于天真RAG的52%改善。

更新时间: 2026-04-01 13:30:00

领域: cs.CL,cs.IR,cs.LG

下载: http://arxiv.org/abs/2510.14377v2

A Pure Hypothesis Test for Inhomogeneous Random Graph Models Based on a Kernelised Stein Discrepancy

Complex data are often represented as a graph, which in turn can often be viewed as a realisation of a random graph, such as an inhomogeneous random graph model (IRG). For general fast goodness-of-fit tests in high dimensions, kernelised Stein discrepancy (KSD) tests are a powerful tool. Here, we develop a KSD-type test for IRG models that can be carried out with a single observation of the network. The test applies to a network of any size, but is particularly interesting for small networks for which asymptotic tests are not warranted. We also provide theoretical guarantees.

Updated: 2026-04-01 13:29:18

标题: 基于核化Stein差异的不均匀随机图模型的纯假设检验

摘要: 复杂数据通常被表示为图，而图又经常被视为随机图的实现，例如不均匀随机图模型（IRG）。对于高维度的一般快速适配度测试，核化Stein差异（KSD）测试是一个强大的工具。在这里，我们开发了一种适用于IRG模型的KSD类型测试，可以通过对网络的单个观察执行。该测试适用于任何大小的网络，但对于小网络特别有趣，因为渐近测试并不保证。我们还提供理论保证。

更新时间: 2026-04-01 13:29:18

领域: stat.ML,cs.LG,math.ST

下载: http://arxiv.org/abs/2505.21580v3

S-DAPT-2026: A Stage-Aware Synthetic Dataset for Advanced Persistent Threat Detection

The detection of advanced persistent threats (APTs) remains a crucial challenge due to their stealthy, multistage nature and the limited availability of realistic, labeled datasets for systematic evaluation. Synthetic dataset generation has emerged as a practical approach for modeling APT campaigns; however, existing methods often rely on computationally expensive alert correlation mechanisms that limit scalability. Motivated by these limitations, this paper presents a near realistic synthetic APT dataset and an efficient alert correlation framework. The proposed approach introduces a machine learning based correlation module that employs K Nearest Neighbors (KNN) clustering with a cosine similarity metric to group semantically related alerts within a temporal context. The dataset emulates multistage APT campaigns across campus and organizational network environments and captures a diverse set of fourteen distinct alert types, exceeding the coverage of commonly used synthetic APT datasets. In addition, explicit APT campaign states and alert to stage mappings are defined to enable flexible integration of new alert types and support stage aware analysis. A comprehensive statistical characterization of the dataset is provided to facilitate reproducibility and support APT stage predictions.

Updated: 2026-04-01 13:24:46

标题: S-DAPT-2026：用于高级持续性威胁检测的阶段感知合成数据集

摘要: 检测高级持续威胁(APTs)仍然是一个关键挑战，因为它们具有隐蔽、多阶段性质，而且缺乏用于系统评估的现实标记数据集。合成数据集生成已经成为一种实用的建模APT活动的方法；然而，现有方法通常依赖于计算昂贵的警报关联机制，限制了可扩展性。受这些限制的启发，本文提出了一个接近真实的合成APT数据集和一个高效的警报关联框架。所提出的方法引入了一个基于机器学习的关联模块，利用K最近邻(KNN)聚类和余弦相似度度量来在时间上下文中将语义相关的警报分组。该数据集模拟了校园和组织网络环境中的多阶段APT活动，并捕获了一组多样化的十四种不同类型的警报，超过了常用合成APT数据集的覆盖范围。此外，明确定义了APT活动状态和警报到阶段映射，以便灵活集成新的警报类型并支持阶段感知分析。提供了对数据集的全面统计特征化，以促进可重现性并支持APT阶段预测。

更新时间: 2026-04-01 13:24:46

领域: cs.CR,eess.SP

下载: http://arxiv.org/abs/2601.06690v2

Deep Recurrent Hidden Markov Learning Framework for Multi-Stage Advanced Persistent Threat Prediction

Advanced Persistent Threats (APTs) represent hidden, multi\-stage cyberattacks whose long term persistence and adaptive behavior challenge conventional intrusion detection systems (IDS). Although recent advances in machine learning and probabilistic modeling have improved APT detection performance, most existing approaches remain reactive and alert\-centric, providing limited capability for stage-aware prediction and principled inference under uncertainty, particularly when observations are sparse or incomplete. This paper proposes E\-HiDNet, a unified hybrid deep probabilistic learning framework that integrates convolutional and recurrent neural networks with a Hidden Markov Model (HMM) to allow accurate prediction of the progression of the APT campaign. The deep learning component extracts hierarchical spatio\-temporal representations from correlated alert sequences, while the HMM models latent attack stages and their stochastic transitions, allowing principled inference under uncertainty and partial observability. A modified Viterbi algorithm is introduced to handle incomplete observations, ensuring robust decoding under uncertainty. The framework is evaluated using a synthetically generated yet structurally realistic APT dataset (S\-DAPT\-2026). Simulation results show that E\-HiDNet achieves up to 98.8\-100\% accuracy in stage prediction and significantly outperforms standalone HMMs when four or more observations are available, even under reduced training data scenarios. These findings highlight that combining deep semantic feature learning with probabilistic state\-space modeling enhances predictive APT stage performance and situational awareness for proactive APT defense.

Updated: 2026-04-01 13:24:27

标题: 深度递归隐马尔可夫学习框架用于多阶段高级持续性威胁预测

摘要: 高级持久威胁(APTs)代表着隐藏的、多阶段的网络攻击，其长期持续性和适应性行为挑战着传统入侵检测系统(IDS)。尽管最近机器学习和概率建模的进步提高了对APT的检测性能，但大多数现有方法仍然是反应性的和以警报为中心的，对于阶段感知预测和在不确定性下的原则推断的能力有限，特别是在观察稀疏或不完整时。本文提出了E-HiDNet，一个统一的混合深度概率学习框架，将卷积和循环神经网络与隐马尔可夫模型(HMM)集成在一起，以允许准确预测APT攻击的进展。深度学习组件从相关的警报序列中提取了分层空间-时间表示，而HMM模型了潜在攻击阶段及其随机转换，允许在不确定性和部分可观察性下进行原则性推断。引入了修改后的Viterbi算法来处理不完整的观测，确保在不确定性下的稳健解码。该框架使用一个合成生成的但结构真实的APT数据集(S-DAPT-2026)进行评估。模拟结果表明，E-HiDNet在阶段预测方面的准确率达到了98.8-100\%，在四个或更多观察值可用时明显优于独立的HMM，即使在减少训练数据的情况下也是如此。这些发现突显了将深度语义特征学习与概率状态空间建模相结合，提高了对APT阶段性能的预测和对主动APT防御的情境意识。

更新时间: 2026-04-01 13:24:27

领域: cs.CR

下载: http://arxiv.org/abs/2601.06734v2

KUET at StanceNakba Shared Task: StanceMoE: Mixture-of-Experts Architecture for Stance Detection

Actor-level stance detection aims to determine an author expressed position toward specific geopolitical actors mentioned or implicated in a text. Although transformer-based models have achieved relatively good performance in stance classification, they typically rely on unified representations that may not sufficiently capture heterogeneous linguistic signals, such as contrastive discourse structures, framing cues, and salient lexical indicators. This motivates the need for adaptive architectures that explicitly model diverse stance-expressive patterns. In this paper, we propose StanceMoE, a context-enhanced Mixture-of-Experts (MoE) architecture built upon a fine-tuned BERT encoder for actor-level stance detection. Our model integrates six expert modules designed to capture complementary linguistic signals, including global semantic orientation, salient lexical cues, clause-level focus, phrase-level patterns, framing indicators, and contrast-driven discourse shifts. A context-aware gating mechanism dynamically weights expert contributions, enabling adaptive routing based on input characteristics. Experiments are conducted on the StanceNakba 2026 Subtask A dataset, comprising 1,401 annotated English texts where the target actor is implicit in the text. StanceMoE achieves a macro-F1 score of 94.26%, outperforming traditional baselines, and alternative BERT-based variants.

Updated: 2026-04-01 13:24:03

标题: KUET在StanceNakba共享任务中的表现: StanceMoE: 用于立场检测的专家混合架构

摘要: Actor-level stance detection旨在确定作者对文本中提到或涉及的特定地缘政治参与者表达的立场。尽管基于transformer的模型在立场分类方面取得了相对良好的性能，但它们通常依赖于统一的表示，可能无法充分捕捉异质的语言信号，如对比性话语结构、框架线索和显著的词汇指标。这促使了对明确建模多样的立场表达模式的自适应架构的需求。在本文中，我们提出了StanceMoE，这是一个基于fine-tuned BERT编码器构建的上下文增强的Mixture-of-Experts（MoE）架构，用于actor-level stance detection。我们的模型集成了六个专家模块，旨在捕捉互补的语言信号，包括全局语义方向、显著的词汇线索、从句级重点、短语级模式、框架指示符和对比驱动的话语转变。一个上下文感知的门控机制动态权衡专家的贡献，实现基于输入特征的自适应路由。我们在StanceNakba 2026 Subtask A数据集上进行了实验，该数据集包含1,401个带有目标角色隐含在文本中的英文文本。StanceMoE实现了94.26%的宏F1分数，优于传统基线和其他基于BERT的变体。

更新时间: 2026-04-01 13:24:03

领域: cs.CL,cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.00878v1

Accurate and Scalable Matrix Mechanisms via Divide and Conquer

Matrix mechanisms are often used to provide unbiased differentially private query answers when publishing statistics or creating synthetic data. Recent work has developed matrix mechanisms, such as ResidualPlanner and Weighted Fourier Factorizations, that scale to high dimensional datasets while providing optimality guarantees for workloads such as marginals and circular product queries. They operate by adding noise to a linearly independent set of queries that can compactly represent the desired workloads. In this paper, we present QuerySmasher, an alternative scalable approach based on a divide-and-conquer strategy. Given a workload that can be answered from various data marginals, QuerySmasher splits each query into sub-queries and re-assembles the pieces into mutually orthogonal sub-workloads. These sub-workloads represent small, low-dimensional problems that can be independently and optimally answered by existing low-dimensional matrix mechanisms. QuerySmasher then stitches these solutions together to answer queries in the original workload. We show that QuerySmasher subsumes prior work, like ResidualPlanner (RP), ResidualPlanner+ (RP+), and Weighted Fourier Factorizations (WFF). We prove that it can dominate those approaches, under sum squared error, for all workloads. We also experimentally demonstrate the scalability and accuracy of QuerySmasher.

Updated: 2026-04-01 13:15:41

标题: 通过分而治之实现准确且可扩展的矩阵机制

摘要: 矩阵机制通常用于在发布统计数据或创建合成数据时提供无偏的差分隐私查询答案。最近的工作已经开发出了一些矩阵机制，例如ResidualPlanner和Weighted Fourier Factorizations，这些机制适用于高维数据集，并为边际和循环乘积查询等工作负载提供了最优性保证。它们通过向能够紧凑表示所需工作负载的线性无关查询集添加噪声来运行。本文介绍了QuerySmasher，这是一种基于分治策略的可扩展替代方法。给定一个可以从各种数据边际回答的工作负载，QuerySmasher将每个查询分成子查询，并重新组装这些片段成互相正交的子工作负载。这些子工作负载代表了可以通过现有低维矩阵机制独立和最优地回答的小型低维问题。然后，QuerySmasher将这些解决方案拼接在一起，以回答原始工作负载中的查询。我们证明了QuerySmasher包含了先前的工作，如ResidualPlanner（RP）、ResidualPlanner+（RP+）和Weighted Fourier Factorizations（WFF）。我们证明它可以在所有工作负载下优于那些方法，根据平方误差之和。我们还通过实验证明了QuerySmasher的可扩展性和准确性。

更新时间: 2026-04-01 13:15:41

领域: cs.DB,cs.LG

下载: http://arxiv.org/abs/2604.00868v1

Policy Improvement Reinforcement Learning

Reinforcement Learning with Verifiable Rewards (RLVR) has become a central post-training paradigm for improving the reasoning capabilities of large language models. Yet existing methods share a common blind spot: they optimize policies based on instantaneous group-level or batch-level statistics without ever verifying whether the resulting update actually improved the model. This open-loop design -- updating in isolation at each step, guided only by within-group (batch) reward signals -- means optimization can drift or collapse with no mechanism to detect and correct these failures. We argue that the missing ingredient is policy improvement feedback: the ability to measure and optimize inter-iteration progress directly. To this end, we introduce Policy Improvement Reinforcement Learning (PIRL), a framework that replaces surrogate reward maximization with the explicit objective of maximizing cumulative policy improvement across iterations, and prove this temporal objective is perfectly aligned with maximizing final task performance. Building on PIRL, we propose Policy Improvement Policy Optimization (PIPO), which implements closed-loop optimization through retrospective verification. At each iteration, PIPO evaluates whether the previous update yielded genuine improvement against a sliding-window historical baseline, then actively reinforces beneficial updates and suppresses the harmful ones -- transforming an open-loop process into a self-correcting one. We provide theoretical analysis showing that PIPO performs ascent on the PIRL objective in expectation, and experiments on mathematical reasoning benchmarks demonstrate improved stability and performance over GRPO and its variants.

Updated: 2026-04-01 13:10:20

标题: 政策改进强化学习

摘要: 具有可验证奖励的强化学习（RLVR）已成为改善大型语言模型推理能力的中心后训练范式。然而，现有方法存在一个共同盲点：它们基于瞬时的组级别或批次级别统计数据优化策略，却从未验证结果更新是否真正改进了模型。这种开环设计--在每一步孤立地更新，仅由组内（批次）奖励信号指导--意味着优化可能会漂移或崩溃，没有机制来检测和纠正这些失败。我们认为缺失的成分是策略改进反馈：直接测量和优化迭代间的进展。为此，我们引入了政策改进强化学习（PIRL）框架，用明确的目标替代替代奖励最大化，即在迭代中最大化政策改进的累积，并证明这个时间目标与最大化最终任务性能完全一致。在PIRL的基础上，我们提出了政策改进策略优化（PIPO），通过回顾性验证实现了闭环优化。在每次迭代中，PIPO评估上一次更新是否对滑动窗口历史基线产生了真正的改进，然后积极强化有益的更新并抑制有害的更新--将开环过程转变为自我纠正的过程。我们提供理论分析，表明PIPO在期望中执行PIRL目标上升，并在数学推理基准测试中的实验证明了相对于GRPO及其变体的稳定性和性能的改进。

更新时间: 2026-04-01 13:10:20

领域: cs.LG

下载: http://arxiv.org/abs/2604.00860v1

DreamerAD: Efficient Reinforcement Learning via Latent World Model for Autonomous Driving

We introduce DreamerAD, the first latent world model framework that enables efficient reinforcement learning for autonomous driving by compressing diffusion sampling from 100 steps to 1 - achieving 80x speedup while maintaining visual interpretability. Training RL policies on real-world driving data incurs prohibitive costs and safety risks. While existing pixel-level diffusion world models enable safe imagination-based training, they suffer from multi-step diffusion inference latency (2s/frame) that prevents high-frequency RL interaction. Our approach leverages denoised latent features from video generation models through three key mechanisms: (1) shortcut forcing that reduces sampling complexity via recursive multi-resolution step compression, (2) an autoregressive dense reward model operating directly on latent representations for fine-grained credit assignment, and (3) Gaussian vocabulary sampling for GRPO that constrains exploration to physically plausible trajectories. DreamerAD achieves 87.7 EPDMS on NavSim v2, establishing state-of-the-art performance and demonstrating that latent-space RL is effective for autonomous driving.

Updated: 2026-04-01 13:02:07

标题: DreamerAD：通过潜在世界模型实现自动驾驶的高效强化学习

摘要: 我们介绍了DreamerAD，这是第一个潜在世界模型框架，通过将扩散采样从100步压缩到1步，实现了80倍的加速，同时保持了视觉可解释性，从而实现了自主驾驶的有效强化学习。在真实世界驾驶数据上训练RL策略会产生巨大的成本和安全风险。虽然现有的像素级扩散世界模型能够进行基于想象的安全训练，但它们遭受多步扩散推断延迟（每帧2秒），从而阻碍了高频率RL交互。我们的方法通过三个关键机制利用了视频生成模型中去噪的潜在特征：（1）通过递归多分辨率步骤压缩实现快捷强制，减少采样复杂性，（2）在直接操作潜在表示的自回归稠密奖励模型上进行精细化信用分配，（3）采用高斯词汇采样进行GRPO，将探索限制在物理合理的轨迹上。DreamerAD在NavSim v2上取得了87.7 EPDMS，建立了最先进的性能，并证明了潜在空间RL对于自主驾驶是有效的。

更新时间: 2026-04-01 13:02:07

领域: cs.LG,cs.RO

下载: http://arxiv.org/abs/2603.24587v2

NES: An Instruction-Free, Low-Latency Next Edit Suggestion Framework Powered by Learned Historical Editing Trajectories

Code editing is a frequent yet cognitively demanding task in software development. Existing AI-powered tools often disrupt developer flow by requiring explicit natural language instructions and suffer from high latency, limiting real-world usability. We present NES (Next Edit Suggestion), an instruction-free, low-latency code editing framework that leverages learned historical editing trajectories to implicitly capture developers' goals and coding habits. NES features a dual-model architecture: one model predicts the next edit location and the other generates the precise code change, both without any user instruction. Trained on our open-sourced SFT and DAPO datasets, NES achieves state-of-the-art performance (75.6% location accuracy, 27.7% exact match rate) while delivering suggestions in under 250ms. Deployed at Ant Group, NES serves over 20,000 developers through a seamless Tab-key interaction, achieving effective acceptance rates of 51.55% for location predictions and 43.44% for edits, demonstrating its practical impact in real-world development workflows.

Updated: 2026-04-01 13:01:41

标题: NES：一种无需指令、低延迟的下一个编辑建议框架，由学习历史编辑轨迹驱动。

摘要: 代码编辑是软件开发中频繁但认知要求较高的任务。现有的基于人工智能的工具通常通过要求明确的自然语言指令来打断开发者的工作流程，并且受到高延迟的限制，限制了实际可用性。我们提出了NES（Next Edit Suggestion），一个无需指令、低延迟的代码编辑框架，利用学习到的历史编辑轨迹来隐式捕捉开发者的目标和编码习惯。NES具有双模型架构：一个模型预测下一个编辑位置，另一个生成精确的代码更改，两者都不需要任何用户指令。在我们开源的SFT和DAPO数据集上进行训练，NES实现了最先进的性能（75.6％的位置准确性，27.7％的完全匹配率），同时在不到250ms的时间内提供建议。在蚂蚁集团部署后，NES通过无缝的Tab键交互为超过20,000名开发人员提供服务，实现了51.55％的位置预测的有效接受率和43.44％的编辑接受率，展示了其在实际开发工作流程中的实际影响。

更新时间: 2026-04-01 13:01:41

领域: cs.SE,cs.LG

下载: http://arxiv.org/abs/2508.02473v3

On the Non-Identifiability of Steering Vectors in Large Language Models

Activation steering methods are widely used to control large language model (LLM) behavior and are often interpreted as revealing meaningful internal representations. This interpretation assumes that steering directions are identifiable and uniquely recoverable from input-output behavior. We show that, under white-box single-layer access, steering vectors are fundamentally non-identifiable due to large equivalence classes of behaviorally indistinguishable interventions. Empirically, we find that orthogonal perturbations achieve near-equivalent efficacy with negligible effect sizes across multiple models and traits, with pre-trained semantic classifiers confirming equivalence at the output level. We estimate null-space dimensionality via SVD of activation covariance matrices and validate that equivalence holds robustly throughout the operationally relevant steering range. Critically, we show that non-identifiability is a robust geometric property that persists across diverse prompt distributions. These findings reveal fundamental interpretability limits and highlight the need for structural constraints beyond behavioral testing to enable reliable alignment interventions.

Updated: 2026-04-01 13:00:23

标题: 关于大型语言模型中引导向量的不可辨识性

摘要: 激活导向方法被广泛应用于控制大型语言模型（LLM）的行为，并经常被解释为揭示有意义的内部表示。这种解释假设导向方向是可识别的，并可从输入输出行为中唯一恢复。我们表明，在白盒单层访问下，由于行为上无法区分的干预的大等价类，导向向量基本上是不可识别的。从经验上讲，我们发现正交扰动在多个模型和特征之间实现了几乎等效的功效，且效果大小可以忽略不计，经过预训练的语义分类器确认了输出水平的等效性。我们通过对激活协方差矩阵进行奇异值分解来估计零空间维度，并验证等效性在操作上相关的导向范围内持续稳固。至关重要的是，我们表明不可识别性是一种持久存在于各种提示分布中的稳固几何属性。这些发现揭示了基本的可解释性限制，并强调了除行为测试以外的结构约束的必要性，以实现可靠的对齐干预。

更新时间: 2026-04-01 13:00:23

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2602.06801v4

Fair Indivisible Payoffs through Shapley Value

We consider the problem of payoff division in indivisible coalitional games, where the value of the grand coalition is a natural number. This number represents a certain quantity of indivisible objects, such as parliamentary seats, kidney exchanges, or top features contributing to the outcome of a machine learning model. The goal of this paper is to propose a fair method for dividing these objects among players. To achieve this, we define the indivisible Shapley value and study its properties. We demonstrate our proposed technique using three case studies, in particular, we use it to identify key regions of an image in the context of an image classification task.

Updated: 2026-04-01 12:59:10

标题: 通过谢普利价值实现公平不可分割的回报

摘要: 我们考虑不可分割的联盟游戏中的收益分配问题，其中大联盟的价值是一个自然数。这个数字代表一定数量的不可分割物体，比如议会席位、肾脏交换或对机器学习模型结果产生影响的顶级特征。本文的目标是提出一种公平的方法来在玩家之间分配这些物体。为了实现这一目标，我们定义了不可分割的Shapley值并研究了其性质。我们使用三个案例研究来展示我们提出的技术，特别是我们将其用于在图像分类任务中识别图像的关键区域。

更新时间: 2026-04-01 12:59:10

领域: cs.GT,cs.AI

下载: http://arxiv.org/abs/2510.24906v2

Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback

As teachers increasingly turn to GenAI in their educational practice, we need robust methods to benchmark large language models (LLMs) for pedagogical purposes. This article presents an embedding-based benchmarking framework to detect bias in LLMs in the context of formative feedback. Using 600 authentic student essays from the AES 2.0 corpus, we constructed controlled counterfactuals along two dimensions: (i) implicit cues via lexicon-based swaps of gendered terms within essays, and (ii) explicit cues via gendered author background in the prompt. We investigated six representative LLMs (i.e. GPT-5 mini, GPT-4o mini, DeepSeek-R1, DeepSeek-R1-Qwen, Gemini 2.5 Pro, Llama-3-8B). We first quantified the response divergence with cosine and Euclidean distances over sentence embeddings, then assessed significance via permutation tests, and finally, visualised structure using dimensionality reduction. In all models, implicit manipulations reliably induced larger semantic shifts for male-female counterfactuals than for female-male. Only the GPT and Llama models showed sensitivity to explicit gender cues. These findings show that even state-of-the-art LLMs exhibit asymmetric semantic responses to gender substitutions, suggesting persistent gender biases in feedback they provide learners. Qualitative analyses further revealed consistent linguistic differences (e.g., more autonomy-supportive feedback under male cues vs. more controlling feedback under female cues). We discuss implications for fairness auditing of pedagogical GenAI, propose reporting standards for counterfactual evaluation in learning analytics, and outline practical guidance for prompt design and deployment to safeguard equitable feedback.

Updated: 2026-04-01 12:58:17

标题: 使用分析来对比教育LLM的标准：一个关于反馈性别偏见的案例研究

摘要: 随着教师在教育实践中越来越多地转向GenAI，我们需要强大的方法来为教学目的基准大型语言模型（LLMs）。本文提出了一种基于嵌入的基准框架，用于检测LLMs在形成性反馈背景下的偏见。使用来自AES 2.0语料库的600篇真实学生论文，我们沿两个维度构建了受控反事实情况：（i）通过基于词汇的性别术语交换在论文中隐含的线索，以及（ii）通过提示中的性别化作者背景的明确线索。我们调查了六种代表性的LLMs（即GPT-5 mini、GPT-4o mini、DeepSeek-R1、DeepSeek-R1-Qwen、Gemini 2.5 Pro、Llama-3-8B）。我们首先通过余弦和欧几里德距离量化了句子嵌入的响应差异，然后通过置换测试评估了显著性，并最终通过降维可视化了结构。在所有模型中，隐含操作可可靠地导致男女反事实情况比女男反事实情况产生更大的语义转变。只有GPT和Llama模型对明确的性别线索显示出敏感性。这些发现表明，即使是最先进的LLMs也表现出对性别替换的不对称语义响应，表明它们提供给学习者的反馈中存在持久的性别偏见。定性分析进一步揭示了一致的语言差异（例如，在男性线索下更具自主支持性反馈，而在女性线索下更具控制性反馈）。我们讨论了对教育GenAI的公平审计的影响，提出了学习分析中反事实评估的报告标准，并概述了促进公平反馈的提示设计和部署的实用指导。

更新时间: 2026-04-01 12:58:17

领域: cs.CL,cs.AI,cs.CY,cs.HC

下载: http://arxiv.org/abs/2511.08225v2

Disentanglement of Sources in a Multi-Stream Variational Autoencoder

Variational autoencoders (VAEs) are among leading approaches to address the problem of learning disentangled representations. Typically a single VAE is used and disentangled representations are sought within its single continuous latent space. In this paper, we propose and provide a proof of concept for a novel Multi-Stream Variational Autoencoder (MS-VAE) that achieves disentanglement of sources by combining discrete and continuous latents. The discrete latents are used in an explicit source combination model, that superimposes a set of sources as part of the MS-VAE decoder. We formally define the MS-VAE approach, derive its inference and learning equations, and numerically investigate its principled functionality. The MS-VAE model is very flexible and can be trained using little supervision (we use fully unsupervised learning after pretraining with some labels). In our numerical experiments, we explored the ability of the MS-VAE approach in separating both superimposed hand-written digits as well as sound sources. For the former task we used superimposed MNIST digits (an increasingly common benchmark). For sound separation, our experiments focused on the task of speaker diarization in a recording conversation between two speakers. In all cases, we observe a clear separation of sources and competitive performance after training. For digit superpositions, performance is particularly competitive in complex mixtures (e.g., three and four digits). For the speaker diarization task, we observe an especially low rate of missed speakers and a more precise speaker attribution. Numerical experiments confirm the flexibility of the approach across varying amounts of supervision, and we observed high performance, e.g., when using just 10% of the labels for pretraining.

Updated: 2026-04-01 12:55:51

标题: 多流变分自编码器中源的分离

摘要: 变分自动编码器（VAEs）是解决学习解耦表示问题的领先方法之一。通常使用单个VAE，并在其单个连续潜在空间中寻找解耦表示。在本文中，我们提出并提供了一个新颖的多流变分自动编码器（MS-VAE）的概念验证，通过结合离散和连续潜在变量实现源的解耦。离散潜在变量在明确的源组合模型中使用，将一组源叠加在MS-VAE解码器的一部分中。我们正式定义了MS-VAE方法，推导了其推理和学习方程，并在数值上研究了其原则性功能。MS-VAE模型非常灵活，可以在很少监督的情况下进行训练（我们在使用一些标签预训练后采用完全无监督学习）。在我们的数值实验中，我们探索了MS-VAE方法在分离叠加手写数字和声源方面的能力。对于前者任务，我们使用叠加的MNIST数字（这是一个越来越常见的基准）。对于声音分离，我们的实验重点放在了两位发言人之间的录音对话中的说话人辨识任务上。在所有情况下，我们观察到源的清晰分离和训练后的竞争性性能。对于数字叠加，性能在复杂混合（例如三到四个数字）中特别具有竞争力。对于说话人辨识任务，我们观察到漏报发言人的比率特别低，并且说话人归因更加准确。数值实验证实了该方法在不同程度监督下的灵活性，并且我们观察到高性能，例如在预训练时仅使用10％的标签时。

更新时间: 2026-04-01 12:55:51

领域: stat.ML,cs.LG

下载: http://arxiv.org/abs/2510.15669v2

Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

Proactive agents that anticipate user needs and autonomously execute tasks hold great promise as digital assistants, yet the lack of realistic user simulation frameworks hinders their development. Existing approaches model apps as flat tool-calling APIs, failing to capture the stateful and sequential nature of user interaction in digital environments and making realistic user simulation infeasible. We introduce Proactive Agent Research Environment (Pare), a framework for building and evaluating proactive agents in digital environments. Pare models applications as finite state machines with stateful navigation and state-dependent action space for the user simulator, enabling active user simulation. Building on this foundation, we present Pare-Bench, a benchmark of 143 diverse tasks spanning communication, productivity, scheduling, and lifestyle apps, designed to test context observation, goal inference, intervention timing, and multi-app orchestration.

Updated: 2026-04-01 12:53:01

标题: 积极主动代理研究环境：模拟活跃用户以评估积极主动助手

摘要: 积极主动地预测用户需求并自主执行任务的代理人作为数字助理具有巨大的潜力，然而缺乏现实的用户模拟框架阻碍了它们的发展。现有方法将应用程序建模为平面工具调用API，未能捕捉数字环境中用户交互的有状态和顺序性质，使得实现真实的用户模拟变得不可行。我们介绍了Proactive Agent Research Environment (Pare)，这是一个在数字环境中构建和评估积极主动代理的框架。Pare将应用程序建模为具有有状态导航和状态相关操作空间的有限状态机，用于用户模拟器，实现了积极主动的用户模拟。在此基础上，我们提出了Pare-Bench，这是一个涵盖通信、生产力、日程安排和生活方式应用程序的143个多样化任务的基准，旨在测试上下文观察、目标推断、干预时机和多应用程序协调。

更新时间: 2026-04-01 12:53:01

领域: cs.AI,cs.LG,cs.MA

下载: http://arxiv.org/abs/2604.00842v1

MCMC-Correction of Score-Based Diffusion Models for Model Composition

Diffusion models can be parameterized in terms of either score or energy function. The energy parameterization is attractive as it enables sampling procedures such as Markov Chain Monte Carlo (MCMC) that incorporates a Metropolis--Hastings (MH) correction step based on energy differences between proposed samples. Such corrections can significantly improve sampling quality, particularly in the context of model composition, where pre-trained models are combined to generate samples from novel distributions. Score-based diffusion models, on the other hand, are more widely adopted and come with a rich ecosystem of pre-trained models. However, they do not, in general, define an underlying energy function, making MH-based sampling inapplicable. In this work, we address this limitation by retaining score parameterization and introducing a novel MH-like acceptance rule based on line integration of the score function. This allows the reuse of existing diffusion models while still combining the reverse process with various MCMC techniques, viewed as an instance of annealed MCMC. Through experiments on synthetic and real-world data, we show that our MH-like samplers {yield relative improvements of similar magnitude to those observed} with energy-based models, without requiring explicit energy parameterization.

Updated: 2026-04-01 12:50:55

标题: MCMC对基于分数的扩散模型进行模型组合修正

摘要: 扩散模型可以通过得分或能量函数来参数化。能量参数化具有吸引力，因为它可以实现采样过程，例如基于能量差异的Metropolis-Hastings（MH）校正步骤的马尔可夫链蒙特卡罗（MCMC）。这些校正可以显著改善采样质量，特别是在模型组合的上下文中，其中预训练模型被组合以从新颖分布中生成样本。另一方面，基于得分的扩散模型更广泛地被采用，并且具有丰富的预训练模型生态系统。然而，它们通常不定义基础能量函数，使得基于MH的采样不适用。在这项工作中，我们通过保留得分参数化并引入基于得分函数的线积分的新型类似MH的接受规则来解决这个限制。这允许重复使用现有的扩散模型，同时将反向过程与各种MCMC技术相结合，被视为一个模拟退火MCMC的实例。通过对合成和真实世界数据的实验，我们展示了我们的类似MH的采样器与基于能量的模型相比，可以获得类似幅度的相对改进，而无需明确的能量参数化。

更新时间: 2026-04-01 12:50:55

领域: stat.ML,cs.LG

下载: http://arxiv.org/abs/2307.14012v4

HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained key selection by scoring every historical key for each query through a lightweight indexer, then computing attention only on the selected subset. While the downstream sparse attention itself scales favorably, the indexer must still scan the entire prefix for every query, introducing an per-layer bottleneck that grows prohibitively with context length. We propose HISA (Hierarchical Indexed Sparse Attention), a plug-and-play replacement for the indexer that rewrites the search path from a flat token scan into a two-stage hierarchical procedure: (1) a block-level coarse filtering stage that scores pooled block representations to discard irrelevant regions, followed by (2) a token-level refinement stage that applies the original indexer exclusively within the retained candidate blocks. HISA preserves the identical token-level top-sparse pattern consumed by the downstream Sparse MLA operator and requires no additional training. On kernel-level benchmarks, HISA achieves up to speedup at 64K context. On Needle-in-a-Haystack and LongBench, we directly replace the indexer in DeepSeek-V3.2 and GLM-5 with our HISA indexer, without any finetuning. HISA closely matches the original DSA in quality, while substantially outperforming block-sparse baselines.

Updated: 2026-04-01 12:42:03

标题: HISA：用于细粒度稀疏关注的高效分层索引

摘要: Token级稀疏注意机制，例如DeepSeek Sparse Attention（DSA），通过一个轻量级的索引器为每个查询为每个历史键评分，实现了细粒度的关键选择，然后仅在所选子集上计算注意力。虽然下游的稀疏注意机制本身具有良好的可扩展性，但索引器仍然必须扫描每个查询的整个前缀，引入一个随着上下文长度增加而急剧增加的每层瓶颈。我们提出了HISA（Hierarchical Indexed Sparse Attention），这是索引器的一个即插即用替代品，它将搜索路径从一个平面令牌扫描重写为一个两阶段的分层过程：（1）一个块级粗筛选阶段，评分池化块表示以丢弃无关区域，随后（2）一个令牌级细化阶段，它在保留的候选块中独占地应用原始索引器。HISA保留了下游Sparse MLA操作符所需的相同的令牌级顶层稀疏模式，并且不需要额外的训练。在内核级基准测试中，HISA在64K上下文中实现了速度提升。在“寻找针在草堆中”和LongBench上，我们直接用我们的HISA索引器替换了DeepSeek-V3.2和GLM-5中的索引器，而无需任何微调。HISA在质量上与原始DSA相匹配，同时明显优于块稀疏基线。

更新时间: 2026-04-01 12:42:03

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2603.28458v2

Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies

Test-Time Learning (TTL) enables language agents to iteratively refine their performance through repeated interactions with the environment at inference time. At the core of TTL is an adaptation policy that updates the actor policy based on experience from previous episodes, thereby improving future behavior. Existing methods rely on fixed, hand-crafted adaptation policies rather than optimizing them for downstream improvement. We argue that optimal adaptation policies should be learned from task environments, not hand-engineered based on human intuition. To achieve this, we introduce Meta-TTL, a framework that formulates the discovery of effective adaptation policies as a bi-level optimization problem. Within this framework, the inner loop executes the standard TTL process, measuring how effectively a candidate adaptation policy helps an agent correct errors across sequential episodes. Guided by the agent's performance, the outer loop employs evolutionary search over a diverse distribution of training tasks to iteratively refine the adaptation policy. We evaluate Meta-TTL on Jericho and WebArena-Lite across both in-distribution (ID) and out-of-distribution (OOD) settings, using multiple meta-agent backbones. Results on both benchmarks show that Meta-TTL consistently outperforms hand-crafted baselines, suggesting that the optimized adaptation policy encodes transferable strategies that generalize beyond the training task distribution.

Updated: 2026-04-01 12:41:01

标题: 学习测试时学习：具有可学习适应策略的语言代理

摘要: 测试时间学习（TTL）使语言代理能够通过与环境在推理时间的重复交互来迭代地改进其性能。 TTL的核心是一个适应性策略，根据先前情节的经验更新执行者策略，从而改善未来行为。现有方法依赖于固定的、手工制作的适应性策略，而不是为了下游改进而优化它们。我们认为，最佳的适应性策略应该是从任务环境中学习的，而不是基于人类直觉手工设计的。为了实现这一目标，我们引入了Meta-TTL，一个框架，将有效适应性策略的发现形式化为一个双层优化问题。在这个框架内，内部循环执行标准的TTL过程，衡量候选适应性策略如何有效地帮助代理在顺序情节中纠正错误。在代理的性能指导下，外部循环通过对各种训练任务的进化搜索来迭代地完善适应性策略。我们在Jericho和WebArena-Lite上评估了Meta-TTL，跨内分布（ID）和外分布（OOD）设置，使用多个元代理骨干。在两个基准测试中的结果显示，Meta-TTL始终优于手工制作的基线，表明优化的适应性策略编码了可以泛化到超出训练任务分布的可转移策略。

更新时间: 2026-04-01 12:41:01

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.00830v1

Optimal Brain Decomposition for Accurate LLM Low-Rank Approximation

Low-rank decomposition has emerged as an important problem in Large Language Model (LLM) fine-tuning and inference. Through Singular Value Decomposition (SVD), the weight matrix can be factorized into low-rank spaces optimally. Previously, a common practice was to decompose the weight in the activation-whitened space, and then achieve satisfying results. In this work, we propose Optimal Brain Decomposition LLM (OBD-LLM), which studies the decomposition problem in the model space by utilizing second-order Hessian information. Through a rigorous Kronecker-factorization of the Hessian, we show that the decomposition needs to consider both input and output information of the layer, and achieves much better decomposition results compared to input only method. Our loss-aware decomposition method involves a bi-directional whitening on the weight matrix. As a result, OBD-LLM is a closed-form solution for the optimal decomposition of weights in the language model. Remarkably, we achieve ~20-40\% better results than previous state-of-the-art decomposition methods, the SVD-LLM.

Updated: 2026-04-01 12:28:43

标题: 精确LLM低秩逼近的最佳大脑分解

摘要: 低秩分解已成为大型语言模型（LLM）微调和推理中的一个重要问题。通过奇异值分解（SVD），权重矩阵可以被最优地分解为低秩空间。先前的常见做法是在激活白化空间中分解权重，然后取得令人满意的结果。在这项工作中，我们提出了最优脑分解LLM（OBD-LLM），通过利用二阶Hessian信息在模型空间中研究分解问题。通过对Hessian进行严格的Kronecker分解，我们表明分解需要考虑层的输入和输出信息，并与仅考虑输入的方法相比实现了更好的分解结果。我们的损失感知分解方法涉及对权重矩阵进行双向白化。因此，OBD-LLM是语言模型中权重最优分解的闭式解。值得注意的是，我们比之前最先进的分解方法SVD-LLM取得了约20-40\%更好的结果。

更新时间: 2026-04-01 12:28:43

领域: cs.LG

下载: http://arxiv.org/abs/2604.00821v1

Emotion Entanglement and Bayesian Inference for Multi-Dimensional Emotion Understanding

Understanding emotions in natural language is inherently a multi-dimensional reasoning problem, where multiple affective signals interact through context, interpersonal relations, and situational cues. However, most existing emotion understanding benchmarks rely on short texts and predefined emotion labels, reducing this process to independent label prediction and ignoring the structured dependencies among emotions. To address this limitation, we introduce Emotional Scenarios (EmoScene), a theory-grounded benchmark of 4,731 context-rich scenarios annotated with an 8-dimensional emotion vector derived from Plutchik's basic emotions. We evaluate six instruction-tuned large language models in a zero-shot setting and observe modest performance, with the best model achieving a Macro F1 of 0.501, highlighting the difficulty of context-aware multi-label emotion prediction. Motivated by the observation that emotions rarely occur independently, we further propose an entanglement-aware Bayesian inference framework that incorporates emotion co-occurrence statistics to perform joint posterior inference over the emotion vector. This lightweight post-processing improves structural consistency of predictions and yields notable gains for weaker models (e.g., +0.051 Macro F1 for Qwen2.5-7B). EmoScene therefore provides a challenging benchmark for studying multi-dimensional emotion understanding and the limitations of current language models.

Updated: 2026-04-01 12:27:04

标题: 情感纠缠和贝叶斯推断用于多维情感理解

摘要: 理解自然语言中的情绪在本质上是一个多维推理问题，其中多个情感信号通过上下文、人际关系和情境暗示相互作用。然而，大多数现有的情绪理解基准依赖于短文本和预定义的情绪标签，将这一过程简化为独立标签预测，并忽略了情绪之间的结构化依赖关系。为了解决这一局限性，我们引入了Emotional Scenarios（EmoScene），这是一个基于理论的基准，包含4,731个注释丰富的情境，其情绪向量由普拉切克的基本情绪导出并包括8个维度。我们在零-shot设置中评估了六个经过指导调整的大型语言模型，并观察到适度的性能，最佳模型实现了0.501的宏F1，突显了上下文感知多标签情绪预测的困难。受到情绪很少独立发生的观察启发，我们进一步提出了一种考虑纠缠感知的贝叶斯推理框架，该框架将情绪共现统计信息纳入其中，以对情绪向量进行联合后验推断。这种轻量级的后处理提高了预测的结构一致性，并为较弱的模型带来了显著的增益（例如Qwen2.5-7B的+0.051宏F1）。因此，EmoScene为研究多维情绪理解和当前语言模型的局限性提供了一个具有挑战性的基准。

更新时间: 2026-04-01 12:27:04

领域: cs.CL,cs.AI

下载: http://arxiv.org/abs/2604.00819v1

E-Scores for (In)Correctness Assessment of Generative Model Outputs

While generative models, especially large language models (LLMs), are ubiquitous in today's world, principled mechanisms to assess their (in)correctness are limited. Using the conformal prediction framework, previous works construct sets of LLM responses where the probability of including an incorrect response, or error, is capped at a user-defined tolerance level. However, since these methods are based on p-values, they are susceptible to p-hacking, i.e., choosing the tolerance level post-hoc can invalidate the guarantees. We therefore leverage e-values to complement generative model outputs with e-scores as measures of incorrectness. In addition to achieving the guarantees as before, e-scores further provide users with the flexibility of choosing data-dependent tolerance levels while upper bounding size distortion, a post-hoc notion of error. We experimentally demonstrate their efficacy in assessing LLM outputs under different forms of correctness: mathematical factuality and property constraints satisfaction.

Updated: 2026-04-01 12:26:50

标题: E-Scores用于生成模型输出的（不）正确性评估

摘要: 尽管生成模型，尤其是大型语言模型（LLMs），在今天的世界中随处可见，但评估它们的（不）正确性的原则性机制有限。利用符合预测框架，先前的研究构建了一组LLM响应集，其中包含错误响应的概率被限制在用户定义的容忍水平。然而，由于这些方法基于p值，它们容易受到p值篡改的影响，即事后选择容忍水平可能会使保证失效。因此，我们利用e值来补充生成模型输出，将e分数作为不正确性的度量。除了实现之前的保证外，e分数还为用户提供了选择数据依赖的容忍水平的灵活性，同时上限了大小失真，这是事后的错误概念。我们在数学事实和属性约束满足等不同正确性形式下，通过实验证明了它们在评估LLM输出方面的有效性。

更新时间: 2026-04-01 12:26:50

领域: stat.ML,cs.AI,cs.LG

下载: http://arxiv.org/abs/2510.25770v2

Demystifying Chains, Trees, and Graphs of Thoughts

The field of natural language processing (NLP) has witnessed significant progress in recent years, with a notable focus on improving large language models' (LLM) performance through innovative prompting techniques. Among these, prompt engineering coupled with structures has emerged as a promising paradigm, with designs such as Chain-of-Thought, Tree of Thoughts, or Graph of Thoughts, in which the overall LLM reasoning is guided by a structure such as a graph. As illustrated with numerous examples, this paradigm significantly enhances the LLM's capability to solve numerous tasks, ranging from logical or mathematical reasoning to planning or creative writing. To facilitate the understanding of this growing field and pave the way for future developments, we devise a general blueprint for effective and efficient LLM reasoning schemes. For this, we conduct an in-depth analysis of the prompt execution pipeline, clarifying and clearly defining different concepts. We then build the first taxonomy of structure-enhanced LLM reasoning schemes. We focus on identifying fundamental classes of harnessed structures, and we analyze the representations of these structures, algorithms executed with these structures, and many others. We refer to these structures as reasoning topologies, because their representation becomes to a degree spatial, as they are contained within the LLM context. Our study compares existing prompting schemes using the proposed taxonomy, discussing how certain design choices lead to different patterns in performance and cost. We also outline theoretical underpinnings, relationships between prompting and other parts of the LLM ecosystem such as knowledge bases, and the associated research challenges. Our work will help to advance future prompt engineering techniques.

Updated: 2026-04-01 12:26:18

标题: 解密思维中的链条、树形和图形

摘要: 自然语言处理（NLP）领域近年来取得了显著进展，特别关注通过创新的提示技术提高大型语言模型（LLM）的性能。在这些技术中，提示工程与结构相结合的范式已经成为一种有前途的模式，设计如Chain-of-Thought、Tree of Thoughts或Graph of Thoughts等，其中整体LLM推理受到图形等结构的引导。正如众多示例所展示的那样，这种范式显著增强了LLM解决众多任务的能力，从逻辑或数学推理到规划或创意写作。为了促进对这一不断发展领域的理解，并为未来的发展铺平道路，我们设计了一种有效和高效的LLM推理方案的通用蓝图。为此，我们进行了对提示执行流程的深入分析，澄清并清晰定义了不同概念。然后，我们建立了第一个结构增强型LLM推理方案的分类法。我们专注于识别利用结构的基本类别，并分析这些结构的表示方式、使用这些结构执行的算法等。我们将这些结构称为推理拓扑，因为它们的表示在某种程度上变得空间化，它们被包含在LLM上下文中。我们的研究使用提出的分类法比较现有的提示方案，讨论特定设计选择如何导致性能和成本的不同模式。我们还概述了理论基础、提示与LLM生态系统中其他部分（如知识库）之间的关系，以及相关研究挑战。我们的工作将有助于推动未来的提示工程技术。

更新时间: 2026-04-01 12:26:18

领域: cs.CL,cs.AI,cs.LG

下载: http://arxiv.org/abs/2401.14295v6

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

End-to-end autonomous driving has evolved from the conventional paradigm based on sparse perception into vision-language-action (VLA) models, which focus on learning language descriptions as an auxiliary task to facilitate planning. In this paper, we propose an alternative Vision-Geometry-Action (VGA) paradigm that advocates dense 3D geometry as the critical cue for autonomous driving. As vehicles operate in a 3D world, we think dense 3D geometry provides the most comprehensive information for decision-making. However, most existing geometry reconstruction methods (e.g., DVGT) rely on computationally expensive batch processing of multi-frame inputs and cannot be applied to online planning. To address this, we introduce a streaming Driving Visual Geometry Transformer (DVGT-2), which processes inputs in an online manner and jointly outputs dense geometry and trajectory planning for the current frame. We employ temporal causal attention and cache historical features to support on-the-fly inference. To further enhance efficiency, we propose a sliding-window streaming strategy and use historical caches within a certain interval to avoid repetitive computations. Despite the faster speed, DVGT-2 achieves superior geometry reconstruction performance on various datasets. The same trained DVGT-2 can be directly applied to planning across diverse camera configurations without fine-tuning, including closed-loop NAVSIM and open-loop nuScenes benchmarks.

Updated: 2026-04-01 12:21:26

标题: DVGT-2：大规模自动驾驶的视觉-几何-行为模型

摘要: 自动驾驶端到端已经从传统的基于稀疏感知的范式发展为以视觉-语言-动作（VLA）模型为中心，这些模型侧重于学习语言描述作为辅助任务以促进规划。在本文中，我们提出了一种另类的视觉-几何-动作（VGA）范式，主张密集的3D几何作为自动驾驶的关键线索。由于车辆在3D世界中运行，我们认为密集的3D几何提供了最全面的决策信息。然而，大多数现有的几何重建方法（如DVGT）依赖于计算昂贵的多帧输入的批处理，不能应用于在线规划。为了解决这个问题，我们引入了一个流式驾驶视觉几何变换器（DVGT-2），它以在线方式处理输入，并同时输出当前帧的密集几何和轨迹规划。我们采用时间因果关注机制和缓存历史特征来支持即时推理。为了进一步提高效率，我们提出了滑动窗口流式策略，并在一定间隔内使用历史缓存以避免重复计算。尽管速度更快，DVGT-2在各种数据集上实现了优越的几何重建性能。同一训练过的DVGT-2可以直接应用于跨越多种摄像头配置的规划，无需微调，包括闭环NAVSIM和开环nuScenes基准。

更新时间: 2026-04-01 12:21:26

领域: cs.CV,cs.AI,cs.RO

下载: http://arxiv.org/abs/2604.00813v1

Cost-Penalized Fitness in FMA-Orchestrated Mixture of Experts: Experimental Evidence for Molecular Memory in Domain Adaptation

We present experimental results from seven controlled runs of nanoFMT, a Free-Market Algorithm (FMA) orchestrated transformer with dynamic Mixture-of-Experts (MoE) management. The experiments address a fundamental question for advanced LLM development: how should an MoE system manage its expert pool when operating at full capacity under changing data distributions? We demonstrate that cost-penalized fitness metrics, combined with a linear grace period for newborn experts, produce a system that accumulates domain expertise through diversification rather than replacement. The central result is a round-trip domain shift experiment showing 9-11x faster recovery when returning to a previously learned domain, with zero expert births or replacements required. This "molecular memory" effect -- where dormant experts survive and reactivate when their domain returns -- has no analogue in current MoE management approaches. A preliminary cost analysis estimates annual savings of $39.1M and 27.1 GWh energy reduction for an OpenAI-scale provider under a moderate scenario.

Updated: 2026-04-01 12:19:57

标题: FMA-Orchestrated Mixture of Experts中的成本惩罚适应度：领域适应中分子记忆的实验证据

摘要: 我们介绍了来自七个受控运行的nanoFMT的实验结果，这是一个具有动态专家混合管理的自由市场算法（FMA）编排的变压器。这些实验涉及一个关于先进LLM发展的基本问题：当MoE系统在不断变化的数据分布下以满负荷运行时，该系统应该如何管理其专家池？我们证明，结合新生专家的线性宽限期和成本惩罚的适应性指标，可以产生一个通过多样化而不是替换积累领域专业知识的系统。中心结果是一个往返领域转移实验，显示回到先前学习的领域时，恢复速度比以前快9-11倍，不需要任何新生专家或替换。这种“分子记忆”效应——即沉睡的专家在其领域回归时存活并重新激活——在当前MoE管理方法中没有类似物。初步成本分析估计，在一个中等情景下，一个OpenAI规模的提供商每年可节省3910万美元，并减少27.1 GWh的能源消耗。

更新时间: 2026-04-01 12:19:57

领域: cs.LG

下载: http://arxiv.org/abs/2604.00812v1

Deconfounding Scores and Representation Learning for Causal Effect Estimation with Weak Overlap

Overlap, also known as positivity, is a key condition for causal treatment effect estimation. Many popular estimators suffer from high variance and become brittle when features differ strongly across treatment groups. This is especially challenging in high dimensions: the curse of dimensionality can make overlap implausible. To address this, we propose a class of feature representations called deconfounding scores, which preserve both identification and the target of estimation; the classical propensity and prognostic scores are two special cases. We characterize the problem of finding a representation with better overlap as minimizing an overlap divergence under a deconfounding score constraint. We then derive closed-form expressions for a class of deconfounding scores under a broad family of generalized linear models with Gaussian features and show that prognostic scores are overlap-optimal within this class. We conduct extensive experiments to assess this behavior empirically.

Updated: 2026-04-01 12:19:42

标题: 解混淆分数和表示学习用于具有弱重叠的因果效应估计

摘要: 重叠，也称为积极性，是因果治疗效应估计的关键条件。许多流行的估计器在特征在治疗组之间差异较大时会产生高方差并变得脆弱。这在高维度中尤其具有挑战性：维度的诅咒可能会使重叠变得不太可能。为了解决这个问题，我们提出了一类称为去混淆分数的特征表示，它们既保持了识别性，又保持了估计目标；经典的倾向和预后分数是其中的两种特殊情况。我们将找到一个具有更好重叠的表示的问题描述为在去混淆分数约束下最小化重叠分歧。然后我们推导出一类在广泛的广义线性模型中具有高斯特征的去混淆分数的闭合表达式，并展示预后分数在这一类中是具有最佳重叠性的。我们进行了大量实验以在实践中评估这种行为。

更新时间: 2026-04-01 12:19:42

领域: stat.ML,cs.LG,stat.ME

下载: http://arxiv.org/abs/2604.00811v1

Binned semiparametric Bayesian networks for efficient kernel density estimation

This paper introduces a new type of probabilistic semiparametric model that takes advantage of data binning to reduce the computational cost of kernel density estimation in nonparametric distributions. Two new conditional probability distributions are developed for the new binned semiparametric Bayesian networks, the sparse binned kernel density estimation and the Fourier kernel density estimation. These two probability distributions address the curse of dimensionality, which typically impacts binned models, by using sparse tensors and restricting the number of parent nodes in conditional probability calculations. To evaluate the proposal, we perform a complexity analysis and conduct several comparative experiments using synthetic data and datasets from the UCI Machine Learning repository. The experiments include different binning rules, parent restrictions, grid sizes, and number of instances to get a holistic view of the model's behavior. As a result, our binned semiparametric Bayesian networks achieve structural learning and log-likelihood estimations with no statistically significant differences compared to the semiparametric Bayesian networks, but at a much higher speed. Thus, the new binned semiparametric Bayesian networks prove to be a reliable and more efficient alternative to their non-binned counterparts.

Updated: 2026-04-01 12:12:32

标题: 分箱半参数贝叶斯网络用于高效核密度估计

摘要: 本文介绍一种新型的概率半参数模型，利用数据分箱来降低非参数分布中核密度估计的计算成本。为新的分箱半参数贝叶斯网络开发了两种新的条件概率分布，稀疏分箱核密度估计和傅里叶核密度估计。这两种概率分布通过使用稀疏张量和限制条件概率计算中的父节点数量来解决通常影响分箱模型的维度灾难。为了评估这一提议，我们进行了复杂性分析，并使用合成数据和来自UCI机器学习知识库的数据集进行了几个比较实验。实验包括不同的分箱规则、父节点限制、网格大小和实例数量，以全面了解模型的行为。结果表明，我们的分箱半参数贝叶斯网络在结构学习和对数似然估计方面与半参数贝叶斯网络相比没有统计上显著差异，但速度要快得多。因此，新的分箱半参数贝叶斯网络被证明是其非分箱对应物的可靠和更有效的替代方案。

更新时间: 2026-04-01 12:12:32

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2506.21997v3

Routing-Free Mixture-of-Experts

Standard Mixture-of-Experts (MoE) models rely on centralized routing mechanisms that introduce rigid inductive biases. We propose Routing-Free MoE which eliminates any hard-coded centralized designs including external routers, Softmax, Top-K and load balancing, instead encapsulating all activation functionalities within individual experts and directly optimized through continuous gradient flow, enabling each expert to determine its activation entirely on its own. We introduce a unified adaptive load-balancing framework to simultaneously optimize both expert-balancing and token-balancing objectives through a configurable interpolation, allowing flexible and customizable resource allocation. Extensive experiments show that Routing-Free MoE can consistently outperform baselines with better scalability and robustness. We analyze its behavior in detail and offer insights that may facilitate future MoE design ad optimization.

Updated: 2026-04-01 12:07:20

标题: 无路由混合专家

摘要: 标准混合专家（MoE）模型依赖于引入了死板归纳偏见的集中路由机制。我们提出了无路由MoE，它消除了任何硬编码的集中设计，包括外部路由器、Softmax、Top-K和负载平衡，而是将所有激活功能封装在各个专家中，并直接通过连续梯度流优化，使每个专家完全自行确定其激活。我们引入了一个统一的自适应负载平衡框架，通过可配置的插值同时优化专家平衡和令牌平衡目标，实现灵活和可定制的资源分配。大量实验表明，无路由MoE可以持续优于基线，具有更好的可扩展性和鲁棒性。我们详细分析了其行为，并提供了可能有助于未来MoE设计和优化的见解。

更新时间: 2026-04-01 12:07:20

领域: cs.LG,cs.AI,cs.CL

下载: http://arxiv.org/abs/2604.00801v1

MIRANDA: MId-feature RANk-adversarial Domain Adaptation toward climate change-robust ecological forecasting with deep learning

Plant phenology modelling aims to predict the timing of seasonal phases, such as leaf-out or flowering, from meteorological time series. Reliable predictions are crucial for anticipating ecosystem responses to climate change. While phenology modelling has traditionally relied on mechanistic approaches, deep learning methods have recently been proposed as flexible, data-driven alternatives with often superior performance. However, mechanistic models tend to outperform deep networks when data distribution shifts are induced by climate change. Domain Adaptation (DA) techniques could help address this limitation. Yet, unlike standard DA settings, climate change induces a temporal continuum of domains and involves both a covariate and label shift, with warmer records and earlier start of spring. To tackle this challenge, we introduce Mid-feature Rank-adversarial Domain Adaptation (MIRANDA). Whereas conventional adversarial methods enforce domain invariance on final latent representations, an approach that does not explicitly address label shift, we apply adversarial regularization to intermediate features. Moreover, instead of a binary domain-classification objective, we employ a rank-based objective that enforces year-invariance in the learned meteorological representations. On a country-scale dataset spanning 70 years and comprising 67,800 phenological observations of 5 tree species, we demonstrate that, unlike conventional DA approaches, MIRANDA improves robustness to climatic distribution shifts and narrows the performance gap with mechanistic models.

Updated: 2026-04-01 12:06:58

标题: MIRANDA: 基于深度学习的中间特征排序对抗领域自适应，以实现对气候变化稳健的生态预测

摘要: 植物物候建模旨在从气象时间序列中预测季节性阶段的时间，例如叶片展开或开花。可靠的预测对于预测生态系统对气候变化的响应至关重要。虽然物候建模传统上依赖于机械方法，但深度学习方法最近被提出作为灵活的、数据驱动的替代方案，通常具有更优越的性能。然而，当气候变化引起数据分布变化时，机械模型往往优于深度网络。领域适应（DA）技术可以帮助解决这一限制。然而，与标准的DA设置不同，气候变化引起了领域的时间连续性，并涉及协变量和标签的变化，包括更暖和早春的记录。为了解决这一挑战，我们引入了Mid-feature Rank-adversarial Domain Adaptation（MIRANDA）。传统的对抗方法强制在最终潜在表示上实现领域不变性，这种方法并没有明确解决标签偏移问题，我们将对抗正则化应用于中间特征。此外，我们采用基于排名的目标，而不是二元领域分类目标，以强制在学习的气象表示中保持年份不变性。在跨越70年的国家级数据集上，包括5种树木的67,800个物候观测，我们证明，与传统的DA方法不同，MIRANDA提高了对气候分布变化的鲁棒性，并缩小了与机械模型之间的性能差距。

更新时间: 2026-04-01 12:06:58

领域: cs.LG

下载: http://arxiv.org/abs/2604.00800v1

Multimodal Language Models Cannot Spot Spatial Inconsistencies

Spatial consistency is a fundamental property of the visual world and a key requirement for models that aim to understand physical reality. Despite recent advances, multimodal large language models (MLLMs) often struggle to reason about 3D geometry across multiple views. Rather than asking models to describe scene attributes, we introduce a more challenging task: given two views of the same scene, identify the object that violates 3D motion consistency. We propose a simple and scalable method for generating realistic, spatially inconsistent image pairs from multi-view scenes, enabling systematic evaluation of this capability. Our results show that state-of-the-art MLLMs significantly underperform human observers and exhibit substantial variability across different scene attributes, revealing a fragile and incomplete understanding of 3D structure. We hope our findings underscore the need for approaches that develop a more deeply grounded understanding of the physical world.

Updated: 2026-04-01 12:06:54

标题: 多模态语言模型无法发现空间不一致性。

摘要: 空间一致性是视觉世界的基本属性，也是旨在理解物理现实的模型的关键要求。尽管最近取得了进展，多模态大语言模型（MLLMs）通常很难在多个视图之间推理3D几何。我们提出一个更具挑战性的任务，而不是要求模型描述场景属性：给定同一场景的两个视图，识别违反3D运动一致性的对象。我们提出了一种简单且可扩展的方法，从多视角场景生成逼真的空间不一致的图像对，从而实现对这一能力的系统评估。我们的结果显示，最先进的MLLMs在人类观察者面前表现明显不足，并且在不同场景属性之间表现出显著的变化，揭示了对3D结构的脆弱和不完整理解。我们希望我们的发现强调了需要发展更深入扎根于物理世界的方法的必要性。

更新时间: 2026-04-01 12:06:54

领域: cs.CV,cs.CL,cs.LG

下载: http://arxiv.org/abs/2604.00799v1

Preference Guided Iterated Pareto Referent Optimisation for Accessible Route Planning

We propose the Preference Guided Iterated Pareto Referent Optimisation (PG-IPRO) for urban route planning for people with different accessibility requirements and preferences. With this algorithm the user can interact with the system by giving feedback on a route, i.e., the user can say which objective should be further minimized, or conversely can be relaxed. This leads to intuitive user interaction, that is especially effective during early iterations compared to information-gain-based interaction. Furthermore, due to PG-IPRO's iterative nature, the full set of alternative, possibly optimal policies (the Pareto front), is never computed, leading to higher computational efficiency and shorter waiting times for users.

Updated: 2026-04-01 12:05:48

标题: 首选引导的可迭代帕累托参考优化用于无障碍路径规划

摘要: 我们提出了Preference Guided Iterated Pareto Referent Optimisation (PG-IPRO)算法，用于满足不同无障碍需求和偏好的人群的城市路线规划。通过这种算法，用户可以通过反馈路线与系统进行交互，即用户可以指出哪个目标应该进一步最小化，或者相反地可以放宽。这导致了直观的用户交互，特别是在早期迭代中比基于信息增益的交互更有效。此外，由于PG-IPRO的迭代性质，永远不需要计算完整的备选、可能最优的政策集（帕累托前沿），从而提高了计算效率并缩短了用户等待时间。

更新时间: 2026-04-01 12:05:48

领域: cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.00795v1

Incoherence in Goal-Conditioned Autoregressive Models

We investigate mathematically the notion of incoherence: a structural issue with reinforcement learning policies derived by naive goal-conditioning of autoregressive models. We focus on the process of re-training models on their own actions, that is, fine-tuning offline-learned policies with online RL. We prove that it decreases incoherence and leads to an improvement in return, and we aim to characterize the resulting trajectory of policies. By re-framing standard notions of control-as-inference and soft Q learning, we establish a three-way correspondence with two other ways of understanding the iterative re-training process: as folding the posterior into the reward and, in the deterministic case, as decreasing the temperature parameter; the correspondence has computational content via the training-inference trade-off. Through soft-conditioning generative models, we discuss the link between incoherence and the effective horizon.

Updated: 2026-04-01 11:58:51

标题: 目标条件自回归模型中的不一致性

摘要: 我们在数学上研究了不连贯性的概念：即通过自回归模型的朴素目标调节导致的强化学习策略的结构问题。我们关注在自身行动上重新训练模型的过程，也就是用在线强化学习对离线学习的策略进行微调。我们证明这种方法可以降低不连贯性，并提高回报，我们的目标是描述由此产生的策略轨迹。通过重新构建控制即推理和软Q学习的标准概念，我们建立了与其他两种理解迭代重新训练过程的方式的三向对应：将后验折叠为奖励，并且在确定性情况下，降低温度参数；通过训练-推理权衡，这种对应具有计算内容。通过软调节生成模型，我们讨论了不连贯性与有效视界之间的联系。

更新时间: 2026-04-01 11:58:51

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2510.06545v2

The No-Clash Teaching Dimension is Bounded by VC Dimension

In the realm of machine learning theory, to prevent unnatural coding schemes between teacher and learner, No-Clash Teaching Dimension was introduced as provably optimal complexity measure for collusion-free teaching. However, whether No-Clash Teaching Dimension is upper-bounded by Vapnik-Chervonenkis dimension remains unknown. In this paper, for any finite concept class, we construct fragments of size equals to its Vapnik-Chervonenkis dimension which identify concepts through an ordered compression scheme. Naturally, these fragments are used as teaching sets, one can easily see that they satisfy the non-clashing condition, i.e., this open question is resolved for finite concept classes.

Updated: 2026-04-01 11:58:42

标题: "无冲突教学维度受VC维度限制"

摘要: 在机器学习理论领域中，为了防止教师和学习者之间的不自然编码方案，No-Clash Teaching Dimension被引入作为可证明的无串通教学的最佳复杂度度量。然而，No-Clash Teaching Dimension是否由Vapnik-Chervonenkis维度上界限定仍然未知。在本文中，针对任何有限概念类，我们构建了大小等于其Vapnik-Chervonenkis维度的片段，通过有序压缩方案识别概念。自然而然地，这些片段被用作教学集，我们可以轻松地看到它们满足非串通条件，即这个开放问题对于有限概念类已经得到解决。

更新时间: 2026-04-01 11:58:42

领域: cs.IT,cs.LG

下载: http://arxiv.org/abs/2603.23561v3

RefineRL: Advancing Competitive Programming with Self-Refinement Reinforcement Learning

While large language models (LLMs) have demonstrated strong performance on complex reasoning tasks such as competitive programming (CP), existing methods predominantly focus on single-attempt settings, overlooking their capacity for iterative refinement. In this paper, we present RefineRL, a novel approach designed to unleash the self-refinement capabilities of LLMs for CP problem solving. RefineRL introduces two key innovations: (1) Skeptical-Agent, an iterative self-refinement agent equipped with local execution tools to validate generated solutions against public test cases of CP problems. This agent always maintains a skeptical attitude towards its own outputs and thereby enforces rigorous self-refinement even when validation suggests correctness. (2) A reinforcement learning (RL) solution to incentivize LLMs to self-refine with only standard RLVR data (i.e., problems paired with their verifiable answers). Extensive experiments on Qwen3-4B and Qwen3-4B-2507 demonstrate that our method yields substantial gains: after our RL training, these compact 4B models integrated with the Skeptical-Agent not only outperform much larger 32B models but also approach the single-attempt performance of 235B models. These findings suggest that self-refinement holds considerable promise for scaling LLM reasoning, with significant potential for further advancement.

Updated: 2026-04-01 11:54:57

标题: RefineRL：利用自我完善强化学习推动竞争性编程

摘要: 大型语言模型（LLMs）已经在复杂推理任务中表现出色，比如竞技编程（CP），现有方法主要集中在单次尝试设置上，忽略了它们进行迭代改进的能力。在本文中，我们提出了RefineRL，这是一种新颖的方法，旨在释放LLMs的自我改进能力，用于解决CP问题。RefineRL引入了两个关键创新：（1）Skeptical-Agent，一个配备本地执行工具的迭代自我改进代理，用于验证生成的解决方案是否符合CP问题的公共测试案例。这个代理始终保持怀疑态度，即使验证表明正确性，也会强制进行严格的自我改进。（2）一种强化学习（RL）解决方案，激励LLMs只使用标准RLVR数据（即，问题与可验证答案配对）进行自我改进。对Qwen3-4B和Qwen3-4B-2507进行的大量实验表明，我们的方法取得了显著的收益：在RL训练后，这些紧凑的4B模型集成了Skeptical-Agent，不仅胜过了更大的32B模型，而且接近了235B模型的单次尝试表现。这些发现表明，自我改进对于扩展LLM推理具有巨大的潜力，并具有进一步发展的重要潜力。

更新时间: 2026-04-01 11:54:57

领域: cs.AI

下载: http://arxiv.org/abs/2604.00790v1

UK AISI Alignment Evaluation Case-Study

This technical report presents methods developed by the UK AI Security Institute for assessing whether advanced AI systems reliably follow intended goals. Specifically, we evaluate whether frontier models sabotage safety research when deployed as coding assistants within an AI lab. Applying our methods to four frontier models, we find no confirmed instances of research sabotage. However, we observe that Claude Opus 4.5 Preview (a pre-release snapshot of Opus 4.5) and Sonnet 4.5 frequently refuse to engage with safety-relevant research tasks, citing concerns about research direction, involvement in self-training, and research scope. We additionally find that Opus 4.5 Preview shows reduced unprompted evaluation awareness compared to Sonnet 4.5, while both models can distinguish evaluation from deployment scenarios when prompted. Our evaluation framework builds on Petri, an open-source LLM auditing tool, with a custom scaffold designed to simulate realistic internal deployment of a coding agent. We validate that this scaffold produces trajectories that all tested models fail to reliably distinguish from real deployment data. We test models across scenarios varying in research motivation, activity type, replacement threat, and model autonomy. Finally, we discuss limitations including scenario coverage and evaluation awareness.

Updated: 2026-04-01 11:53:25

标题: 英国 AISI 对准评估案例研究

摘要: 这份技术报告介绍了英国人工智能安全研究所开发的方法，用于评估先进人工智能系统是否可靠地遵循预期目标。具体来说，我们评估了当前模型在AI实验室中作为编码助手部署时是否会破坏安全研究。将我们的方法应用于四种前沿模型，我们没有发现任何已确认的研究破坏案例。然而，我们观察到Claude Opus 4.5预览版（Opus 4.5的预发布快照）和Sonnet 4.5经常拒绝参与与安全相关的研究任务，理由是对研究方向、参与自我训练以及研究范围的担忧。我们另外发现，与Sonnet 4.5相比，Opus 4.5预览版显示出较低的自发评估意识，而在提示时，这两种模型都能区分评估和部署场景。我们的评估框架基于Petri，这是一个开源的LLM审计工具，并配备了一个定制的支架，旨在模拟编码代理的真实内部部署。我们验证了这个支架产生的轨迹，所有测试模型都无法可靠区分出真实部署数据。我们在不同的研究动机、活动类型、替代威胁和模型自主性的情景中测试了模型。最后，我们讨论了包括情景覆盖和评估意识在内的限制。

更新时间: 2026-04-01 11:53:25

领域: cs.AI,cs.CR

下载: http://arxiv.org/abs/2604.00788v1

Scalable Pretraining of Large Mixture of Experts Language Models on Aurora Super Computer

Pretraining Large Language Models (LLMs) from scratch requires massive amount of compute. Aurora super computer is an ExaScale machine with 127,488 Intel PVC (Ponte Vechio) GPU tiles. In this work, we showcase LLM pretraining on Aurora at the scale of 1000s of GPU tiles. Towards this effort, we developed Optimus, an inhouse training library with support for standard large model training techniques. Using Optimus, we first pretrained Mula-1B, a 1 Billion dense model and Mula-7B-A1B, a 7 Billion Mixture of Experts (MoE) model from scratch on 3072 GPU tiles for the full 4 trillion tokens of the OLMoE-mix-0924 dataset. We then demonstrated model scaling by pretraining three large MoE models Mula-20B-A2B, Mula-100B-A7B, and Mula-220B-A10B till 100 Billion tokens on the same dataset. On our largest model Mula-220B-A10B, we pushed the compute scaling from 384 to 12288 GPU tiles and observed scaling efficiency of around 90% at 12288 GPU tiles. We significantly improved the runtime performance of MoE models using custom GPU kernels for expert computation, and a novel EP-Aware sharded optimizer resulting in training speedups up to 1.71x. As part of the Optimus library, we also developed a robust set of reliability and fault tolerant features to improve training stability and continuity at scale.

Updated: 2026-04-01 11:46:17

标题: 在Aurora超级计算机上可扩展的大规模专家混合语言模型预训练

摘要: 从零开始对大型语言模型（LLMs）进行预训练需要大量的计算资源。Aurora 超级计算机是一台 ExaScale 机器，拥有 127,488 个英特尔 PVC（Ponte Vechio）GPU 瓦片。在这项工作中，我们展示了在 Aurora 上以 1000 多个 GPU 瓦片的规模进行 LLM 预训练。为了实现这一努力，我们开发了 Optimus，一个内部训练库，支持标准大型模型训练技术。使用 Optimus，我们首先从头开始在 3072 个 GPU 瓦片上对 Mula-1B 进行了预训练，这是一个 10 亿个密集模型，并在 OLMoE-mix-0924 数据集的全部 4 万亿标记上对 Mula-7B-A1B 进行了预训练，这是一个 70 亿个专家混合（MoE）模型。然后，我们通过在相同数据集上对 Mula-20B-A2B、Mula-100B-A7B 和 Mula-220B-A10B 进行预训练，将模型的规模进行了展示，直到 1000 亿个标记。在我们最大的模型 Mula-220B-A10B 上，我们将计算资源从 384 个 GPU 瓦片提升到 12288 个 GPU 瓦片，并在 12288 个 GPU 瓦片上观察到了约 90% 的扩展效率。我们使用自定义 GPU 内核进行专家计算，以及一种新颖的 EP-Aware 分片优化器，显著改善了 MoE 模型的运行时性能，从而实现了高达 1.71 倍的训练加速。作为 Optimus 库的一部分，我们还开发了一套强大的可靠性和容错功能，以提高大规模训练的稳定性和连续性。

更新时间: 2026-04-01 11:46:17

领域: cs.LG,cs.AI,cs.DC

下载: http://arxiv.org/abs/2604.00785v1

SetONet: A Set-Based Operator Network for Solving PDEs with Variable-Input Sampling

Most neural-operator surrogates for PDEs inherit from DeepONet-style formulations the requirement that the input function be sampled at a fixed, ordered set of sensors. This assumption limits applicability to problems with variable sensor layouts, missing data, point sources, and sample-based representations of densities. We propose SetONet, which addresses this gap by recasting the operator input as an unordered set of coordinate-value observations and encoding it with permutation-invariant aggregation inside a standard branch-trunk operator network while preserving the DeepONet synthesis mechanism and lightweight end-to-end training. A structured variant, SetONet-Key, aggregates sensor information through learnable query tokens and a position-only key pathway, thereby decoupling sampling geometry from sensor values. The method is assessed on four classical operator-learning benchmarks under fixed layouts, variable layouts, and evaluation-time sensor drop-off, and on four problems with inherently unstructured point-cloud inputs, including heat conduction with multiple point sources, advection-diffusion, phase-screen diffraction, and optimal transport problems. In parameter-matched studies, SetONet-Key achieves lower error than the DeepONet baseline on fixed-sensor benchmarks and remains reliable when layouts vary or sensors are dropped at evaluation. Comparisons across pooling rules show that attention-based aggregation is typically more robust than mean or sum pooling. On the point-cloud problems, SetONet operates directly on the native input representation, without rasterization or multi-stage preprocessing, and outperforms the larger VIDON baseline.

Updated: 2026-04-01 11:44:38

标题: SetONet：用于采样变量输入解决PDE的基于集合的操作符网络

摘要: 大多数PDE的神经算子代理继承了DeepONet风格的公式，即输入函数必须在固定的、有序的传感器集上进行采样的要求。这一假设限制了应用于具有可变传感器布局、缺失数据、点源和密度的样本表示的问题。我们提出了SetONet，通过将算子输入重新构建为无序的坐标-值观测集，并在标准分支-主干算子网络内使用排列不变聚合来对其进行编码，同时保留DeepONet综合机制和轻量级端到端训练。一种结构化变体SetONet-Key，通过可学习的查询令牌和仅位置键路径对传感器信息进行聚合，从而将采样几何与传感器值分离。该方法在四个经典算子学习基准测试中进行了评估，包括固定布局、可变布局和评估时间传感器丢失，以及包括具有固有无结构点云输入的四个问题，包括具有多个点源的热传导、对流扩散、相位屏衍射和最优传输问题。在参数匹配研究中，SetONet-Key在固定传感器基准测试中的误差低于DeepONet基线，并在布局变化或在评估时丢弃传感器时仍然可靠。跨池规则的比较显示，基于注意力的聚合通常比均值或求和池更稳健。在点云问题上，SetONet直接在原始输入表示上运行，无需栅格化或多阶段预处理，表现优于更大的VIDON基线。

更新时间: 2026-04-01 11:44:38

领域: cs.LG

下载: http://arxiv.org/abs/2505.04738v3

Enhancing Floor Plan Recognition: A Hybrid Mix-Transformer and U-Net Approach for Precise Wall Segmentation

Automatic 3D reconstruction of indoor spaces from 2D floor plans necessitates high-precision semantic segmentation of structural elements, particularly walls. However, existing methods often struggle with detecting thin structures and maintaining geometric precision. To address this, we introduce MitUNet, a hybrid neural network designed to bridge the gap between global semantic context and fine-grained structural details. Our architecture combines a Mix-Transformer encoder with a U-Net decoder enhanced with spatial and channel attention blocks. Optimized with the Tversky loss function, this approach achieves a balance between precision and recall, ensuring accurate boundary recovery. Experiments on the CubiCasa5k dataset and the regional dataset demonstrate MitUNet's superiority in generating structurally correct masks with high boundary accuracy, outperforming standard models. This tool provides a robust foundation for automated 3D reconstruction pipelines. To ensure reproducibility and facilitate future research, the source code and the regional dataset are publicly available at https://github.com/aliasstudio/mitunet and https://doi.org/10.5281/zenodo.17871079, respectively.

Updated: 2026-04-01 11:42:30

标题: 提升平面图识别：一种用于精确墙壁分割的混合Mix-Transformer和U-Net方法

摘要: 室内空间的自动三维重建需要对结构元素进行高精度语义分割，特别是墙壁。然而，现有方法通常难以检测细小结构并保持几何精度。为了解决这个问题，我们引入了MitUNet，这是一个混合神经网络，旨在弥合全局语义上下文和细粒度结构细节之间的差距。我们的架构将Mix-Transformer编码器与增强了空间和通道注意力块的U-Net解码器相结合。通过Tversky损失函数进行优化，这种方法在精度和召回率之间取得平衡，确保准确的边界恢复。在CubiCasa5k数据集和区域数据集上的实验表明，MitUNet在生成结构正确的掩模并具有高边界精度方面优于标准模型。这个工具为自动化3D重建流程提供了坚实的基础。为了确保可重复性并促进未来研究，源代码和区域数据集分别在https://github.com/aliasstudio/mitunet和https://doi.org/10.5281/zenodo.17871079上公开提供。

更新时间: 2026-04-01 11:42:30

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2512.02413v3

Using predefined vector systems to speed up neural network multimillion class classification

Label prediction in neural networks (NNs) has O(n) complexity proportional to the number of classes. This holds true for classification using fully connected layers and cosine similarity with some set of class prototypes. In this paper we show that if NN latent space (LS) geometry is known and possesses specific properties, label prediction complexity can be significantly reduced. This is achieved by associating label prediction with the O(1) complexity closest cluster center search in a vector system used as target for latent space configuration (LSC). The proposed method only requires finding indexes of several largest and lowest values in the embedding vector making it extremely computationally efficient. We show that the proposed method does not change NN training accuracy computational results. We also measure the time required by different computational stages of NN inference and label prediction on multiple datasets. The experiments show that the proposed method allows to achieve up to 11.6 times overall acceleration over conventional methods. Furthermore, the proposed method has unique properties which allow to predict the existence of new classes.

Updated: 2026-04-01 11:42:02

标题: 使用预定义的向量系统加速神经网络千万类别分类

摘要: 神经网络（NNs）中的标签预测具有与类别数量成正比的O(n)复杂度。这对于使用全连接层和余弦相似性与某些类原型进行分类成立。本文表明，如果神经网络潜在空间（LS）的几何性质已知并具有特定属性，则可以显着降低标签预测的复杂度。这是通过将标签预测与用作潜在空间配置（LSC）目标的向量系统中O(1)复杂度最接近的集群中心搜索相结合实现的。所提出的方法只需要在嵌入向量中找到几个最大和最小值的索引，使其极其计算效率高。我们表明，所提出的方法不会改变NN训练准确性的计算结果。我们还测量了NN推理和标签预测的不同计算阶段在多个数据集上所需的时间。实验表明，所提出的方法可以实现比传统方法高达11.6倍的整体加速。此外，所提出的方法具有独特的特性，可以预测新类别的存在。

更新时间: 2026-04-01 11:42:02

领域: cs.LG,cs.CV

下载: http://arxiv.org/abs/2604.00779v1

Order Optimal Regret Bounds for Sharpe Ratio Optimization under Thompson Sampling

In this paper, we study sequential decision-making for maximizing the Sharpe ratio (SR) in a stochastic multi-armed bandit (MAB) setting. Unlike standard bandit formulations that maximize cumulative reward, SR optimization requires balancing expected return and reward variability. As a result, the learning objective depends jointly on the mean and variance of the reward distribution and takes a fractional form. To address this problem, we propose the Sharpe Ratio Thompson Sampling \texttt{SRTS}, a Bayesian algorithm for risk-adjusted exploration. For Gaussian reward models, the algorithm employs a Normal-Gamma conjugate posterior to capture uncertainty in both the mean and the precision of each arm. In contrast to additive mean-variance (MV) formulations, which often require different algorithms across risk regimes, the fractional SR objective yields a single sampling rule that applies uniformly across risk tolerances. On the theoretical side, we develop a regret decomposition tailored to the SR objective and introduce a decoupling approach that separates the contributions of mean and variance uncertainty. This framework allows us to control the interaction between the Gaussian mean samples and the Gamma precision samples arising in the posterior. Using these results, we establish a finite-time distribution-dependent $\mathcal{O}(\log n)$ upper bound on the expected regret. We further derive a matching information-theoretic lower bound using a change-of-measure argument, showing that the proposed algorithm is order-optimal. Finally, experiments on synthetic bandit environments illustrate the performance of \texttt{SRTS} and demonstrate improvements over existing risk-aware bandit algorithms across a range of risk-return settings.

Updated: 2026-04-01 11:40:09

标题: Thompson采样下夏普比率优化的最优遗憾界订单

摘要: 在本文中，我们研究了在随机多臂赌博机（MAB）环境中最大化夏普比率（SR）的顺序决策问题。与标准的赌博机形式不同，后者最大化累积奖励，SR优化需要平衡预期回报和奖励变异性。因此，学习目标联合取决于奖励分布的均值和方差，并采用分数形式。为了解决这个问题，我们提出了夏普比率汤普森抽样SRTS，这是一种用于风险调整探索的贝叶斯算法。对于高斯奖励模型，该算法采用正态-伽玛共轭后验来捕捉每个臂的均值和精度的不确定性。与通常需要在不同的风险区域之间使用不同算法的加法均值-方差（MV）形式不同，分数SR目标产生一个适用于所有风险容忍度的单一抽样规则。在理论方面，我们开发了一种针对SR目标量身定制的遗憾分解，并引入了一种解耦方法，将均值和方差不确定性的贡献分开。这个框架使我们能够控制后验中出现的高斯均值样本和伽玛精度样本之间的交互作用。利用这些结果，我们建立了一个有限时间分布相关的期望遗憾的O(log n)上界。我们进一步使用一个变换测度的论证导出了一个匹配的信息理论下界，表明所提出的算法是最佳的。最后，在合成赌博环境中的实验表明了SRTS的性能，并展示了在一系列风险回报设置中相对于现有的风险感知赌博算法的改进。

更新时间: 2026-04-01 11:40:09

领域: cs.LG,cs.IT

下载: http://arxiv.org/abs/2508.13749v3

Thinking Wrong in Silence: Backdoor Attacks on Continuous Latent Reasoning

A new generation of language models reasons entirely in continuous hidden states, producing no tokens and leaving no audit trail. We show that this silence creates a fundamentally new attack surface. ThoughtSteer perturbs a single embedding vector at the input layer; the model's own multi-pass reasoning amplifies this perturbation into a hijacked latent trajectory that reliably produces the attacker's chosen answer, while remaining structurally invisible to every token-level defense. Across two architectures (Coconut and SimCoT), three reasoning benchmarks, and model scales from 124M to 3B parameters, ThoughtSteer achieves >=99% attack success rate with near-baseline clean accuracy, transfers to held-out benchmarks without retraining (94-100%), evades all five evaluated active defenses, and survives 25 epochs of clean fine-tuning. We trace these results to a unifying mechanism: Neural Collapse in the latent space pulls triggered representations onto a tight geometric attractor, explaining both why defenses fail and why any effective backdoor must leave a linearly separable signature (probe AUC>=0.999). Yet a striking paradox emerges: individual latent vectors still encode the correct answer even as the model outputs the wrong one. The adversarial information is not in any single vector but in the collective trajectory, establishing backdoor perturbations as a new lens for mechanistic interpretability of continuous reasoning. Code and checkpoints are available.

Updated: 2026-04-01 11:34:55

标题: 在沉默中反常思考：对连续潜在推理的后门攻击

摘要: 一种新一代的语言模型完全基于连续的隐藏状态进行推理，不生成任何标记，也不留下任何审计痕迹。我们展示了这种沉默创造了一个根本新的攻击面。ThoughtSteer扰动了输入层的单个嵌入向量；模型自身的多次推理将这种扰动放大成一个被劫持的潜在轨迹，可可靠地产生攻击者选择的答案，同时仍然在结构上对每个标记级别的防御不可见。在两种架构（Coconut和SimCoT）、三个推理基准和模型参数从124M到3B的规模下，ThoughtSteer实现了>=99%的攻击成功率，几乎基准干净准确度，转移到未训练的基准（94-100%），规避了所有五种评估的主动防御，经受了25个干净微调周期的考验。我们将这些结果追溯到一个统一的机制：激发表示在潜在空间中的神经坍塌将表示拉到一个紧凑的几何吸引器上，解释了为什么防御失败以及为什么任何有效的后门必须留下一个线性可分离的签名（探针AUC>=0.999）。然而，一个引人注目的悖论出现了：虽然模型输出错误答案，但个别潜在向量仍然编码了正确答案。敌对信息不在任何单个向量中，而是在集体轨迹中，为连续推理的机械解释提供了一个新的视角。代码和检查点可用。

更新时间: 2026-04-01 11:34:55

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.00770v1

ActivityNarrated: An Open-Ended Narrative Paradigm for Wearable Human Activity Understanding

Wearable HAR has improved steadily, but most progress still relies on closed-set classification, which limits real-world use. In practice, human activity is open-ended, unscripted, personalized, and often compositional, unfolding as narratives rather than instances of fixed classes. We argue that addressing this gap does not require simply scaling datasets or models. It requires a fundamental shift in how wearable HAR is formulated, supervised, and evaluated. This work shows how to model open-ended activity narratives by aligning wearable sensor data with natural-language descriptions in an open-vocabulary setting. Our framework has three core components. First, we introduce a naturalistic data collection and annotation pipeline that combines multi-position wearable sensing with free-form, time-aligned narrative descriptions of ongoing behavior, allowing activity semantics to emerge without a predefined vocabulary. Second, we define a retrieval-based evaluation framework that measures semantic alignment between sensor data and language, enabling principled evaluation without fixed classes while also subsuming closed-set classification as a special case. Third, we present a language-conditioned learning architecture that supports sensor-to-text inference over variable-length sensor streams and heterogeneous sensor placements. Experiments show that models trained with fixed-label objectives degrade sharply under real-world variability, while open-vocabulary sensor-language alignment yields robust and semantically grounded representations. Once this alignment is learned, closed-set activity recognition becomes a simple downstream task. Under cross-participant evaluation, our method achieves 65.3% Macro-F1, compared with 31-34% for strong closed-set HAR baselines. These results establish open-ended narrative modeling as a practical and effective foundation for real-world wearable HAR.

Updated: 2026-04-01 11:31:44

标题: 活动叙述：一种用于可穿戴人类活动理解的开放式叙事范式

摘要: 可穿戴的人体行为识别技术（HAR）已经稳步改进，但大部分进展仍依赖于封闭集分类，这限制了实际世界的使用。在实践中，人类活动是开放式的、未脚本化的、个性化的，往往是组合的，呈现为叙述而不是固定类别的实例。我们认为，解决这一差距并不仅仅需要扩展数据集或模型。它需要对可穿戴HAR的制定、监督和评估进行根本性的转变。这项工作展示了如何通过在开放词汇环境中将可穿戴传感器数据与自然语言描述对齐来建模开放式活动叙述。我们的框架有三个核心组件。首先，我们介绍了一个自然主义的数据收集和注释流程，将多位置可穿戴传感器与自由形式、时间对齐的正在发生行为描述相结合，使活动语义在没有预定义词汇的情况下出现。其次，我们定义了一个基于检索的评估框架，衡量传感器数据和语言之间的语义对齐，实现了基于原则的评估，而不需要固定类别，同时也包含封闭集分类作为一种特殊情况。第三，我们提出了一种受语言条件约束的学习架构，支持对可变长度传感器数据流和异构传感器放置进行传感器到文本推断。实验表明，使用固定标签目标训练的模型在真实世界的变化下急剧退化，而开放词汇的传感器-语言对齐产生了稳健且有语义基础的表示。一旦学习了这种对齐，封闭集活动识别就变成了一个简单的下游任务。在跨参与者评估下，我们的方法实现了65.3%的宏F1，而强封闭集HAR基线为31-34%。这些结果确立了开放式叙述建模作为实际且有效的可穿戴HAR基础。

更新时间: 2026-04-01 11:31:44

领域: cs.LG

下载: http://arxiv.org/abs/2604.00767v1

EvalBlocks: A Modular Pipeline for Rapidly Evaluating Foundation Models in Medical Imaging

Developing foundation models in medical imaging requires continuous monitoring of downstream performance. Researchers are burdened with tracking numerous experiments, design choices, and their effects on performance, often relying on ad-hoc, manual workflows that are inherently slow and error-prone. We introduce EvalBlocks, a modular, plug-and-play framework for efficient evaluation of foundation models during development. Built on Snakemake, EvalBlocks supports seamless integration of new datasets, foundation models, aggregation methods, and evaluation strategies. All experiments and results are tracked centrally and are reproducible with a single command, while efficient caching and parallel execution enable scalable use on shared compute infrastructure. Demonstrated on five state-of-the-art foundation models and three medical imaging classification tasks, EvalBlocks streamlines model evaluation, enabling researchers to iterate faster and focus on model innovation rather than evaluation logistics. The framework is released as open source software at https://github.com/DIAGNijmegen/eval-blocks.

Updated: 2026-04-01 11:28:00

标题: EvalBlocks：用于快速评估医学影像基础模型的模块化管道

摘要: 在医学影像学中开发基础模型需要持续监控下游性能。研究人员需要追踪众多实验、设计选择以及它们对性能的影响，通常依赖于临时的、手动的工作流程，这种方式本质上是缓慢且容易出错的。我们引入了EvalBlocks，这是一个模块化、即插即用的框架，用于在开发过程中高效评估基础模型。基于Snakemake构建的EvalBlocks支持新数据集、基础模型、聚合方法和评估策略的无缝集成。所有实验和结果都被集中跟踪，并且可以通过一条命令进行复现，同时高效的缓存和并行执行使其能够在共享计算基础设施上进行可伸缩的使用。EvalBlocks在五个最先进的基础模型和三个医学影像分类任务上进行了展示，简化了模型评估，使研究人员能够更快地迭代，并专注于模型创新而非评估物流。该框架已作为开源软件发布在https://github.com/DIAGNijmegen/eval-blocks。

更新时间: 2026-04-01 11:28:00

领域: cs.CV,cs.LG

下载: http://arxiv.org/abs/2601.03811v2

PrivHAR-Bench: A Graduated Privacy Benchmark Dataset for Video-Based Action Recognition

Existing research on privacy-preserving Human Activity Recognition (HAR) typically evaluates methods against a binary paradigm: clear video versus a single privacy transformation. This limits cross-method comparability and obscures the nuanced relationship between privacy strength and recognition utility. We introduce \textit{PrivHAR-Bench}, a multi-tier benchmark dataset designed to standardize the evaluation of the \textit{Privacy-Utility Trade-off} in video-based action recognition. PrivHAR-Bench applies a graduated spectrum of visual privacy transformations: from lightweight spatial obfuscation to cryptographic block permutation, to a curated subset of 15 activity classes selected for human articulation diversity. Each of the 1,932 source videos is distributed across 9 parallel tiers of increasing privacy strength, with additional background-removed variants to isolate the contribution of human motion features from contextual scene bias. We provide lossless frame sequences, per-frame bounding boxes, estimated pose keypoints with joint-level confidence scores, standardized group-based train/test splits, and an evaluation toolkit computing recognition accuracy and privacy metrics. Empirical validation using R3D-18 demonstrates a measurable and interpretable degradation curve across tiers, with within-tier accuracy declining from 88.8\% (clear) to 53.5\% (encrypted, background-removed) and cross-domain accuracy collapsing to 4.8\%, establishing PrivHAR-Bench as a controlled benchmark for comparing privacy-preserving HAR methods under standardized conditions. The dataset, generation pipeline, and evaluation code are publicly available.

Updated: 2026-04-01 11:24:47

标题: PrivHAR-Bench：用于基于视频的动作识别的分级隐私基准数据集

摘要: 现有关于隐私保护的人体活动识别（HAR）的研究通常针对二元范式进行评估：清晰视频与单一隐私转换。这限制了跨方法的可比性，并模糊了隐私强度与识别效用之间微妙的关系。我们引入了\textit{PrivHAR-Bench}，一个多层次基准数据集，旨在标准化视频动作识别中\textit{隐私-效用权衡}的评估。PrivHAR-Bench应用了一个渐进的视觉隐私转换谱：从轻量级空间混淆到加密块置换，再到为人类表达多样性而选择的15个活动类别的策划子集。每个1,932个源视频分布在9个逐渐增强隐私强度的平行层中，还有额外的去除背景的变体，以隔离人体动作特征对上下文场景偏差的贡献。我们提供无损帧序列、每帧边界框、估计的姿势关键点和关节级置信度分数、标准化的基于组的训练/测试分割，以及一个评估工具包，计算识别准确性和隐私度量。通过使用R3D-18进行的实证验证显示了在各层之间的可测和可解释的退化曲线，其中在层内准确度从88.8\%（清晰）下降到53.5\%（加密、去除背景），跨领域准确度崩溃至4.8\%，确立了PrivHAR-Bench作为一个受控基准，用于在标准化条件下比较隐私保护的HAR方法。数据集、生成流水线和评估代码均可公开获取。

更新时间: 2026-04-01 11:24:47

领域: cs.CV,cs.CR

下载: http://arxiv.org/abs/2604.00761v1

DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling

Audio tokenization bridges continuous waveforms and multi-track music language models. In dual-track modeling, tokens should preserve three properties at once: high-fidelity reconstruction, strong predictability under a language model, and cross-track correspondence. We introduce DuoTok, a source-aware dual-track tokenizer that addresses this trade-off through staged disentanglement. DuoTok first pretrains a semantic encoder, then regularizes it with multi-task supervision, freezes the encoder, and applies hard dual-codebook routing while keeping auxiliary objectives on quantized codes. A diffusion decoder reconstructs high-frequency details, allowing tokens to focus on structured information for sequence modeling. On standard benchmarks, DuoTok achieves a favorable predictability-fidelity trade-off, reaching the lowest cnBPT while maintaining competitive reconstruction at 0.75 kbps. Under a held-constant dual-track language modeling protocol, enBPT also improves, indicating gains beyond codebook size effects. Controlled diagnostics show larger predictability costs under cross-track corruption and larger gains from longer context, suggesting that models trained on DuoTok tokens use cross-track structure and non-local history.

Updated: 2026-04-01 11:23:39

标题: DuoTok：面向多轨音乐语言建模的源感知双轨标记化

摘要: 音频标记化架起了连续波形和多轨音乐语言模型之间的桥梁。在双轨建模中，标记应该同时保留三个特性：高保真重建、在语言模型下强预测能力和跨轨对应。我们引入了DuoTok，一种源感知的双轨标记器，通过分阶段解缠来解决这种权衡。DuoTok首先对语义编码器进行预训练，然后通过多任务监督对其进行规范化，并冻结编码器，并在量化代码上保持辅助目标的同时应用硬双码本路由。扩散解码器重建高频细节，使标记可以专注于用于序列建模的结构化信息。在标准基准测试中，DuoTok实现了有利的预测保真度权衡，在维持竞争性重建为0.75 kbps的同时，达到了最低的cnBPT。在保持双轨语言建模协议不变的情况下，enBPT也有所改善，表明代码本大小效应以外的收益。受控诊断显示在跨轨污染下的更大预测成本和更长上下文的较大收益，这表明在DuoTok标记上训练的模型使用跨轨结构和非本地历史。

更新时间: 2026-04-01 11:23:39

领域: cs.SD,cs.AI

下载: http://arxiv.org/abs/2511.20224v2

IWP: Token Pruning as Implicit Weight Pruning in Large Vision Language Models

Large Vision Language Models show impressive performance across image and video understanding tasks, yet their computational cost grows rapidly with the number of visual tokens. Existing token pruning methods mitigate this issue through empirical approaches while overlooking the internal mechanism of attention. In this paper, we propose a novel training free token pruning framework grounded in the dual form perspective of attention. We reformulate attention as an implicit linear layer whose weight matrix is the sum of rank 1 outer products, each generated by a single token's key value pair. Token pruning thus reduces to selecting an optimal subset of these rank 1 updates that best approximates the original dual weight matrix. Extending this perspective to standard softmax attention in LVLMs, we derive a novel metric quantifying both a token's information magnitude and information duplication. To efficiently select the subset with the proposed metric, we introduce Progressive Chunked Maximal Marginal Relevance. Extensive experiments demonstrate that our method achieves a better trade off between performance and efficiency, while providing another perspective on existing pruning approaches.

Updated: 2026-04-01 11:23:16

标题: IWP：在大型视觉语言模型中作为隐式权重修剪的 Token 修剪

摘要: 大型视觉语言模型在图像和视频理解任务中表现出色，但它们的计算成本随着视觉标记数量的增加而快速增长。现有的标记修剪方法通过经验方法缓解了这个问题，但忽视了注意力的内部机制。在本文中，我们提出了一个新颖的基于注意力双重形式视角的无需训练的标记修剪框架。我们将注意力重新表述为一个隐式线性层，其权重矩阵是由单个标记的键值对生成的一系列秩为1的外积的和。标记修剪因此减少到选择这些秩1更新的最佳子集，以最佳地近似原始的双重权重矩阵。将这种视角扩展到LVLM中的标准softmax注意力，我们推导出一个新的度量，量化一个标记的信息量和信息重复。为了有效地选择提出的度量的子集，我们引入了渐进式分块最大边际相关性。大量实验证明，我们的方法在性能和效率之间取得了更好的权衡，同时为现有修剪方法提供了另一种视角。

更新时间: 2026-04-01 11:23:16

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2604.00757v1

Stochastic Attention: Connectome-Inspired Randomized Routing for Expressive Linear-Time Attention

The whole-brain connectome of a fruit fly comprises over 130K neurons connected with a probability of merely 0.02%, yet achieves an average shortest path of only 4.4 hops. Despite being highly structured at the circuit level, the network's long-range connections are broadly distributed across brain regions, functioning as stochastic shortcuts that enable efficient global communication. Inspired by this observation, we propose Stochastic Attention (SA), a drop-in enhancement for sliding-window attention (SWA) that applies a random permutation to the token sequence before windowed attention and restores the original order afterward. This transforms the fixed local window into a stochastic global one within the same $O(nw)$ per-layer budget. Through depth, independently sampled permutations yield exponentially growing receptive fields, achieving full sequence coverage in $O(\log_w n)$ layers versus $O(n/w)$ for SWA. We validate SA in two settings: pre-training language models from scratch, where a gated SA + SWA combination achieves the best average zero-shot accuracy, and training-free inference on Qwen3-8B and Qwen3-30B-A3B, where SA consistently outperforms SWA and matches or exceeds Mixture of Block Attention at comparable compute budgets. These results suggest that connectome-inspired stochastic routing is a practical primitive for improving the expressivity of efficient attention, complementary to existing linear and sparse approaches.

Updated: 2026-04-01 11:18:40

标题: 随机关注：受连接组启发的用于表达线性时间关注的随机路由

摘要: 一只果蝇的整个脑连接组成包括超过130K个神经元，它们之间的连接概率仅为0.02％，但却实现了平均最短路径仅为4.4跳。尽管在电路水平上高度结构化，但网络的远程连接广泛分布在大脑区域中，作为随机快捷方式，实现了高效的全局通信。受到这一观察的启发，我们提出了Stochastic Attention（SA），这是一个可用于滑动窗口注意力（SWA）的增强功能，它在窗口注意力之前对令牌序列应用随机排列，并在之后恢复原始顺序。这将固定的局部窗口转变为在相同的每层预算$O(nw)$内的随机全局窗口。通过深度，独立采样的排列导致指数增长的接受域，在$O(\log_w n)$层中实现了完整的序列覆盖，而SWA需要$O(n/w)$。我们在两个设置中验证了SA的有效性：从头开始预训练语言模型，其中门控SA + SWA组合实现了最佳的零点准确性，以及在Qwen3-8B和Qwen3-30B-A3B上无需训练的推断，其中SA始终优于SWA，并在可比的计算预算下与Mixture of Block Attention匹敌或超越。这些结果表明，受连接组启发的随机路由是改进高效注意力表达能力的实用原语，与现有的线性和稀疏方法相辅相成。

更新时间: 2026-04-01 11:18:40

领域: cs.CL,cs.LG

下载: http://arxiv.org/abs/2604.00754v1

Engineering a Phase-Noise-Based Quantum Random Number Generator for Real-Time Secure Applications: Design, Validation, and Scalability

Random Number Generators (RNGs) are crucial for applications ranging from cryptography to simulations. Depending on the source of randomness, RNGs are classified into Pseudo-Random Number Generators (PRNGs), True Random Number Generators (TRNGs), and Quantum Random Number Generators (QRNGs). This work presents the end-to-end development of a high-speed, high-efficiency, phase-noise-based QRNG system that taps into the quantum phase noise of a single-frequency laser, with randomness originating from spontaneous emission. Using a self-heterodyne measurement with a semiconductor laser (linewidth $\approx$ 5.23 $GHz$) operated near threshold and a $\sim$48 $cm$ fiber delay line, a raw data generation rate of 2.0 $Gbps$ is achieved. To ensure uniform randomness in the QRNG output, robust extraction techniques developed in-house, such as the Toeplitz Strong Extractor (TSE), are used. Randomness validation using the NIST and Diehard test suites confirms that all statistical tests pass at standard confidence levels. The developed system achieves a post-processed generation rate of 1.0 $Gbps$ in operation and attains a Technology Readiness Level (TRL) of 7, approaching TRL 8, making it suitable for real-time secure applications such as cryptographic key generation and stochastic modeling.

Updated: 2026-04-01 11:12:31

标题: 为实时安全应用工程化基于相位噪声的量子随机数生成器：设计、验证和可扩展性

摘要: 随机数生成器（RNGs）对于从密码学到模拟等各种应用至关重要。根据随机性来源，RNGs被分类为伪随机数生成器（PRNGs）、真随机数生成器（TRNGs）和量子随机数生成器（QRNGs）。本文介绍了一种基于相位噪声的高速、高效的QRNG系统的端到端开发，该系统利用单频激光器的量子相位噪声进行随机性生成。利用半导体激光器（线宽约为5.23 GHz）在接近阈值运行并配合约48 cm的光纤延迟线，实现了2.0 Gbps的原始数据生成速率。为确保QRNG输出中的均匀随机性，采用了内部开发的强Toeplitz提取器（TSE）等稳健的提取技术。使用NIST和Diehard测试套件对随机性进行验证，确认所有统计测试在标准置信水平下通过。该系统在运行中实现了1.0 Gbps的后处理生成速率，并达到了技术成熟度水平（TRL）7，接近TRL 8，使其适用于密码学密钥生成和随机建模等实时安全应用。

更新时间: 2026-04-01 11:12:31

领域: quant-ph,cs.CR,physics.optics

下载: http://arxiv.org/abs/2604.00741v1

BioCOMPASS: Integrating Biomarkers into Transformer-Based Immunotherapy Response Prediction

Datasets used in immunotherapy response prediction are typically small in size, as well as diverse in cancer type, drug administered, and sequencer used. Models often drop in performance when tested on patient cohorts that are not included in the training process. Recent work has shown that transformer-based models along with self-supervised learning show better generalisation performance than threshold-based biomarkers, but is still suboptimal. We present BioCOMPASS, an extension of a transformer-based model called COMPASS, that integrates biomarkers and treatment information to further improve its generalisability. Instead of feeding biomarker data as input, we built loss components to align them with the model's intermediate representations. We found that components such as treatment gating and pathway consistency loss improved generalisability when evaluated with Leave-one-cohort-out, Leave-one-cancer-type-out and Leave-one-treatment-out strategies. Results show that building components that exploit biomarker and treatment information can help in generalisability of immunotherapy response prediction. Careful curation of additional components that leverage complementary clinical information and domain knowledge represents a promising direction for future research.

Updated: 2026-04-01 11:06:29

标题: 生物罗盘：将生物标志物整合到基于变压器的免疫疗法反应预测中

摘要: 用于免疫疗法反应预测的数据集通常规模较小，并且在癌症类型、给药药物和测序仪器上具有多样性。在未包含在训练过程中的患者群体上测试时，模型的性能通常会下降。最近的研究表明，基于transformer的模型与自监督学习相结合显示出比基于阈值的生物标志物更好的泛化性能，但仍不够理想。我们提出了BioCOMPASS，这是一个基于transformer模型COMPASS的扩展，它整合了生物标志物和治疗信息以进一步提高其泛化能力。我们并没有将生物标志物数据作为输入，而是构建了损失组件，将其与模型的中间表示对齐。我们发现，像治疗门控和通路一致性损失等组件在使用留出一个队列、留出一个癌症类型和留出一个治疗方法的策略进行评估时，能够改善泛化能力。结果表明，构建利用生物标志物和治疗信息的组件有助于提高免疫疗法反应预测的泛化能力。精心策划利用补充临床信息和领域知识的额外组件代表了未来研究的一个有前途的方向。

更新时间: 2026-04-01 11:06:29

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.00739v1

"Is This Really a Human Peer Supporter?": Misalignments Between Peer Supporters and Experts in LLM-Supported Interactions

Mental health is a growing global concern, prompting interest in AI-driven solutions to expand access to psychosocial support. Peer support, grounded in lived experience, offers a valuable complement to professional care. However, variability in training, effectiveness, and definitions raises concerns about quality, consistency, and safety. Large Language Models (LLMs) present new opportunities to enhance peer support interactions, particularly in real-time, text-based interactions. We present and evaluate an AI-supported system with an LLM-simulated distressed client, context-sensitive LLM-generated suggestions, and real-time emotion visualisations. 2 mixed-methods studies with 12 peer supporters and 5 mental health professionals (i.e., experts) examined the system's effectiveness and implications for practice. Both groups recognised its potential to enhance training and improve interaction quality. However, we found a key tension emerged: while peer supporters engaged meaningfully, experts consistently flagged critical issues in peer supporter responses, such as missed distress cues and premature advice-giving. This misalignment highlights potential limitations in current peer support training, especially in emotionally charged contexts where safety and fidelity to best practices are essential. Our findings underscore the need for standardised, psychologically grounded training, especially as peer support scales globally. They also demonstrate how LLM-supported systems can scaffold this development--if designed with care and guided by expert oversight. This work contributes to emerging conversations on responsible AI integration in mental health and the evolving role of LLMs in augmenting peer-delivered care.

Updated: 2026-04-01 11:06:04

标题: 这真的是一个人类同伴支持者吗？LLM支持交互中同伴支持者和专家之间的不协调

摘要: 心理健康是一个日益严重的全球性问题，促使人们对利用人工智能驱动的解决方案来扩大心理社会支持的访问渠道产生兴趣。基于亲身经历的同伴支持为专业护理提供了有价值的补充。然而，培训、效果和定义的差异引发了有关质量、一致性和安全性的担忧。大型语言模型（LLMs）为增强同伴支持互动提供了新的机会，尤其是在实时、基于文本的互动中。我们提出并评估了一个由LLM模拟的痛苦客户、上下文敏感的LLM生成的建议和实时情绪可视化支持的人工智能系统。两项混合方法研究涉及12名同伴支持者和5名心理健康专业人士（即专家），研究了该系统的效果和实践意义。两组都认识到该系统有潜力增强培训并改善互动质量。然而，我们发现一个关键的紧张局面出现：尽管同伴支持者参与了有意义的互动，专家们一直指出了同伴支持者回应中的关键问题，如忽视的痛苦提示和过早的建议。这种不一致突显了当前同伴支持培训的潜在局限，尤其是在情绪充沛的环境中，安全性和最佳实践的忠实性至关重要。我们的研究结果强调了标准化、基于心理学的培训的必要性，特别是在同伴支持全球扩展的情况下。它们还展示了LLM支持系统如何可以支撑这一发展——如果经过精心设计并受到专家监督的指导。这项工作为心理健康中负责任的人工智能整合和LLMs在增强同伴提供的护理方面的不断发展的角色做出了贡献。

更新时间: 2026-04-01 11:06:04

领域: cs.HC,cs.AI

下载: http://arxiv.org/abs/2506.09354v2

How Blind and Low-Vision Individuals Prefer Large Vision-Language Model-Generated Scene Descriptions

For individuals with blindness or low vision (BLV), navigating complex environments can pose serious risks. Large Vision-Language Models (LVLMs) show promise for generating scene descriptions, but their effectiveness for BLV users remains underexplored. To address this gap, we conducted a user study with eight BLV participants to systematically evaluate preferences for six types of LVLM descriptions. While they helped to reduce fear and improve actionability, user ratings showed wide variation in sufficiency and conciseness. Furthermore, GPT-4o--despite its strong potential to refine descriptions--was not consistently preferred by participants. We use the insights obtained from the user study to build training data for building our new automatic evaluation metric that can capture BLV preferences effectively. Our findings underscore the urgent need for BLV-centered evaluation metrics and human-in-the-loop feedback to advance LVLM description quality for accessibility.

Updated: 2026-04-01 10:55:52

标题: 盲人和视力低下个体更喜欢大型视觉-语言模型生成的场景描述

摘要: 对于盲人或视力低下的个体（BLV），在复杂环境中导航可能会带来严重风险。大型视觉-语言模型（LVLMs）显示出为生成场景描述的潜力，但它们对于BLV用户的有效性尚未得到充分探讨。为了填补这一空白，我们进行了一项用户研究，招募了八名BLV参与者，系统评估了六种LVLM描述的偏好。虽然这些描述有助于减少恐惧并提高可操作性，但用户评分显示出描述的充分性和简洁性存在较大的差异。此外，尽管GPT-4o有强大的潜力改进描述，但参与者并不一致地偏好它。我们利用从用户研究中获得的见解构建训练数据，用于开发我们的新自动评估指标，能有效捕捉BLV的偏好。我们的研究结果强调了为了推动LVLM描述质量以提高可访问性，急需BLV为中心的评估指标和人机协同反馈。

更新时间: 2026-04-01 10:55:52

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2502.14883v3

Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction

The memory wall remains the primary bottleneck for training large language models on consumer hardware. We introduce Spectral Compact Training (SCT), a method that replaces dense weight matrices with permanent truncated SVD factors W = U diag(s) V^T, where the full dense matrix is never materialized during training or inference. Gradients flow through the compact spectral factors via standard backpropagation, and U, V are retracted to the Stiefel manifold via QR decomposition after each optimizer step. SCT achieves up to 199x memory reduction per MLP layer at rank 32, enabling full training steps of 70B-parameter architectures on a Steam Deck handheld (7.2 GB peak memory vs. 1,245 GB for dense FP32 training with Adam). Rank-sweep experiments on SmolLM2-1.7B (ranks 32-256, 2000 steps, NVIDIA A100) show that all tested ranks converge to the same loss floor (~4.2-4.5), identifying the learning rate schedule -- not MLP rank -- as the primary bottleneck. Rank 128 emerges as the efficiency sweet spot at 11.7x MLP compression with the lowest perplexity. GPU memory drops 46% at rank 32 while training throughput doubles.

Updated: 2026-04-01 10:53:56

标题: 谱紧致训练：通过永久截断的SVD和Stiefel QR缩回对大型语言模型进行预训练

摘要: 记忆墙仍然是在消费者硬件上训练大型语言模型的主要瓶颈。我们介绍了谱紧凑训练（SCT），一种用永久截断的SVD因子W = U diag(s) V^T 替换稠密权重矩阵的方法，完整的稠密矩阵在训练或推断过程中从未实现。梯度通过紧凑的谱因子通过标准反向传播传递，每次优化器步骤后，U，V通过QR分解撤回到Stiefel流形。在秩32处，SCT在每个MLP层实现高达199倍的内存减少，使得在Steam Deck手持设备上可以对70B参数架构进行完整的训练步骤（与使用Adam进行稠密FP32训练的1,245GB相比，峰值内存为7.2GB）。在SmolLM2-1.7B上进行秩扫描实验（秩32-256，2000步，NVIDIA A100），显示所有测试秩都收敛到相同的损失底线（约为4.2-4.5），将学习率调度--而不是MLP秩--确定为主要瓶颈。在11.7倍MLP压缩下，秩128成为效率的甜蜜点，具有最低的困惑度。在秩32处，GPU内存降低了46%，同时训练吞吐量翻倍。

更新时间: 2026-04-01 10:53:56

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2604.00733v1

Are Large Vision-Language Models Ready to Guide Blind and Low-Vision Individuals?

Large Vision-Language Models (LVLMs) demonstrate a promising direction for assisting individuals with blindness or low-vision (BLV). Yet, measuring their true utility in real-world scenarios is challenging because evaluating whether their descriptions are BLV-informative requires a fundamentally different approach from assessing standard scene descriptions. While the "VLM-as-a-metric" or "LVLM-as-a-judge" paradigm has emerged, existing evaluators still fall short of capturing the unique requirements of BLV-centric evaluation, lacking at least one of the following key properties: (1) High correlation with human judgments, (2) Long instruction understanding, (3) Score generation efficiency, and (4) Multi-dimensional assessment. To this end, we propose a unified framework to bridge the gap between automated evaluation and actual BLV needs. First, we conduct an in-depth user study with BLV participants to understand and quantify their navigational preferences, curating VL-GUIDEDATA, a large-scale BLV user-simulated preference dataset containing image-request-response-score pairs. We then leverage the dataset to develop an accessibility-aware evaluator, VL-GUIDE-S, which outperforms existing (L)VLM judges in both human alignment and inference efficiency. Notably, its effectiveness extends beyond a single domain, demonstrating strong performance across multiple fine-grained, BLV-critical dimensions. We hope our work lays as a foundation for automatic AI judges that advance safe, barrier-free navigation for BLV users.

Updated: 2026-04-01 10:51:55

标题: 大型视觉语言模型是否准备好指导盲人和低视力人群？

摘要: 大型视觉语言模型（LVLMs）展示了协助盲人或低视力（BLV）个体的一个有前途的方向。然而，在现实场景中衡量它们真正的效用是具有挑战性的，因为评估它们的描述是否BLV信息丰富需要一种与评估标准场景描述 fundamentally fundamentally 不同的方法。虽然“VLM作为指标”或“LVLM作为评判者”范式已经出现，但现有的评估者仍然无法捕捉BLV中心评估的独特要求，至少缺乏以下关键属性中的一个：（1）与人类判断的高相关性，（2）长指令理解，（3）分数生成效率和（4）多维评估。为此，我们提出了一个统一框架，以弥合自动评估和实际BLV需求之间的差距。首先，我们进行了一项深入的用户研究，与BLV参与者一起了解和量化他们的导航偏好，策划了VL-GUIDEDATA，一个包含图像请求-响应-分数对的大规模BLV用户模拟偏好数据集。然后，我们利用数据集开发了一个具有可访问性意识的评估器VL-GUIDE-S，它在人类对齐和推理效率方面均优于现有的（L）VLM评判者。值得注意的是，它的有效性不仅限于单一领域，还跨越了多个细粒度、BLV关键维度，表现出色。我们希望我们的工作为促进BLV用户的安全、无障碍导航的自动AI评判者奠定基础。

更新时间: 2026-04-01 10:51:55

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2510.00766v2

A CEFR-Inspired Classification Framework with Fuzzy C-Means To Automate Assessment of Programming Skills in Scratch

Context: Schools, training platforms, and technology firms increasingly need to assess programming proficiency at scale with transparent, reproducible methods that support personalized learning pathways. Objective: This study introduces a pedagogical framework for Scratch project assessment, aligned with the Common European Framework of Reference (CEFR), providing universal competency levels for students and teachers alongside actionable insights for curriculum design. Method: We apply Fuzzy C-Means clustering to 2008246 Scratch projects evaluated via Dr.Scratch, implementing an ordinal criterion to map clusters to CEFR levels (A1-C2), and introducing enhanced classification metrics that identify transitional learners, enable continuous progress tracking, and quantify classification certainty to balance automated feedback with instructor review. Impact: The framework enables diagnosis of systemic curriculum gaps-notably a "B2 bottleneck" where only 13.3% of learners reside due to the cognitive load of integrating Logic Synchronization, and Data Representation--while providing certainty--based triggers for human intervention.

Updated: 2026-04-01 10:42:07

标题: 一个以CEFR为灵感的分类框架，结合模糊C均值法自动化评估Scratch编程技能

摘要: 背景：学校、培训平台和技术公司越来越需要以透明、可重复的方法在规模上评估编程能力，支持个性化学习路径。目标：本研究介绍了一个与欧洲语言共同参考框架（CEFR）对齐的Scratch项目评估的教学框架，为学生和教师提供通用的能力水平，同时为课程设计提供可操作的见解。方法：我们将模糊C均值聚类应用于通过Dr.Scratch评估的2008246个Scratch项目，实施一个有序标准来将聚类映射到CEFR水平（A1-C2），并引入增强的分类指标，识别过渡学习者，实现持续进度跟踪，并量化分类确定性，以平衡自动化反馈和教师审查。影响：该框架能够诊断系统课程缺陷，特别是一个“B2瓶颈”，只有13.3%的学习者因为整合逻辑同步和数据表示的认知负荷而居住，同时提供基于确定性的触发器以促使人为干预。

更新时间: 2026-04-01 10:42:07

领域: cs.CY,cs.AI,cs.LG,cs.SE

下载: http://arxiv.org/abs/2604.00730v1

From Density Matrices to Phase Transitions in Deep Learning: Spectral Early Warnings and Interpretability

A key problem in the modern study of AI is predicting and understanding emergent capabilities in models during training. Inspired by methods for studying reactions in quantum chemistry, we present the ``2-datapoint reduced density matrix". We show that this object provides a computationally efficient, unified observable of phase transitions during training. By tracking the eigenvalue statistics of the 2RDM over a sliding window, we derive two complementary signals: the spectral heat capacity, which we prove provides early warning of second-order phase transitions via critical slowing down, and the participation ratio, which reveals the dimensionality of the underlying reorganization. Remarkably, the top eigenvectors of the 2RDM are directly interpretable making it straightforward to study the nature of the transitions. We validate across four distinct settings: deep linear networks, induction head formation, grokking, and emergent misalignment. We then discuss directions for future work using the 2RDM.

Updated: 2026-04-01 10:40:57

标题: 从密度矩阵到深度学习中的相变：光谱早期预警和可解释性

摘要: 在现代人工智能研究中一个关键问题是在模型训练过程中预测和理解新兴能力。受量子化学反应研究方法的启发，我们提出了“2-数据点约化密度矩阵”。我们展示了这个对象提供了一个计算效率高、统一的可观察相变过程的方法。通过在滑动窗口上跟踪2RDM的特征值统计，我们得出了两个互补的信号：谱热容量，我们证明它通过临界减速提供了二阶相变的早期警告，以及参与率，它揭示了基础再组织的维度。值得注意的是，2RDM的顶部特征向量是直接可解释的，使得研究过渡的性质变得简单。我们验证了四个不同的设置：深度线性网络、感应头形成、理解和新兴失调。然后我们讨论了使用2RDM进行未来研究的方向。

更新时间: 2026-04-01 10:40:57

领域: cs.LG,cs.AI

下载: http://arxiv.org/abs/2603.29805v2

Two-stage Vision Transformers and Hard Masking offer Robust Object Representations

Context can strongly affect object representations, sometimes leading to undesired biases, particularly when objects appear in out-of-distribution backgrounds at inference. At the same time, many object-centric tasks require to leverage the context for identifying the relevant image regions. We posit that this conundrum, in which context is simultaneously needed and a potential nuisance, can be addressed by an attention-based approach that uses learned binary attention masks to ensure that only attended image regions influence the prediction. To test this hypothesis, we evaluate a two-stage framework: stage 1 processes the full image to discover object parts and identify task-relevant regions, for which context cues are likely to be needed, while stage 2 leverages input attention masking to restrict its receptive field to these regions, enabling a focused analysis while filtering out potentially spurious information. Both stages are trained jointly, allowing stage 2 to refine stage 1. The explicit nature of the semantic masks also makes the model's reasoning auditable, enabling powerful test-time interventions to further enhance robustness. Extensive experiments across diverse benchmarks demonstrate that this approach significantly improves robustness against spurious correlations and out-of-distribution backgrounds. Code: https://github.com/ananthu-aniraj/ifam

Updated: 2026-04-01 10:28:00

标题: 双阶段视觉Transformer和硬掩蔽提供稳健的物体表示

摘要: 上下文可以强烈影响对象表示，有时会导致不良偏见，特别是当对象出现在推理时的分布背景中时。同时，许多以对象为中心的任务需要利用上下文来识别相关的图像区域。我们认为，这种困境，即上下文同时需要且可能会带来麻烦，可以通过一种基于注意力的方法来解决，该方法使用学习的二进制注意力蒙版，以确保只有受关注的图像区域会影响预测。为了验证这一假设，我们评估了一个两阶段框架：第一阶段处理完整图像以发现对象部分并识别任务相关区域，这些区域可能需要上下文线索，而第二阶段利用输入注意力掩蔽将其感受野限制在这些区域，从而实现专注分析并过滤掉潜在的虚假信息。两个阶段共同训练，使第二阶段能够优化第一阶段。语义蒙版的明确性也使模型的推理可审计，从而使强大的测试时间干预进一步增强鲁棒性。在不同基准测试中进行的广泛实验表明，这种方法显着提高了对虚假相关性和分布背景的鲁棒性。代码：https://github.com/ananthu-aniraj/ifam

更新时间: 2026-04-01 10:28:00

领域: cs.CV,cs.AI

下载: http://arxiv.org/abs/2506.08915v4

GRASP: Gradient Realignment via Active Shared Perception for Multi-Agent Collaborative Optimization

Non-stationarity arises from concurrent policy updates and leads to persistent environmental fluctuations. Existing approaches like Centralized Training with Decentralized Execution (CTDE) and sequential update schemes mitigate this issue. However, since the perception of the policies of other agents remains dependent on sampling environmental interaction data, the agent essentially operates in a passive perception state. This inevitably triggers equilibrium oscillations and significantly slows the convergence speed of the system. To address this issue, we propose Gradient Realignment via Active Shared Perception (GRASP), a novel framework that defines generalized Bellman equilibrium as a stable objective for policy evolution. The core mechanism of GRASP involves utilizing the independent gradients of agents to derive a defined consensus gradient, enabling agents to actively perceive policy updates and optimize team collaboration. Theoretically, we leverage the Kakutani Fixed-Point Theorem to prove that the consensus direction $u^*$ guarantees the existence and attainability of this equilibrium. Extensive experiments on StarCraft II Multi-Agent Challenge (SMAC) and Google Research Football (GRF) demonstrate the scalability and promising performance of the framework.

Updated: 2026-04-01 10:26:22

标题: GRASP: 利用主动共享感知进行多智能体协同优化的梯度重新调整

摘要: 非稳态是由于同时进行政策更新而导致环境波动持续存在。现有的方法，如集中式训练与分散执行（CTDE）和顺序更新方案，可以缓解这一问题。然而，由于其他代理的策略感知仍然依赖于采样环境交互数据，代理基本上处于被动感知状态。这不可避免地会触发平衡振荡，并显著减慢系统的收敛速度。为了解决这个问题，我们提出了一种名为Gradient Realignment via Active Shared Perception（GRASP）的新框架，将广义贝尔曼均衡定义为政策演变的稳定目标。GRASP的核心机制涉及利用代理的独立梯度来推导一个定义的共识梯度，使代理能够积极感知政策更新并优化团队协作。理论上，我们利用Kakutani不动点定理证明了共识方向$u^*$保证了这种平衡的存在性和可达性。在对StarCraft II多智能体挑战（SMAC）和谷歌研究足球（GRF）进行的大量实验中展示了该框架的可扩展性和有前景的性能。

更新时间: 2026-04-01 10:26:22

领域: cs.MA,cs.AI

下载: http://arxiv.org/abs/2604.00717v1

CircuitProbe: Predicting Reasoning Circuits in Transformers via Stability Zone Detection

Transformer language models contain localized reasoning circuits, contiguous layer blocks that improve reasoning when duplicated at inference time. Finding these circuits currently requires brute-force sweeps costing 25 GPU hours per model. We propose CircuitProbe, which predicts circuit locations from activation statistics in under 5 minutes on CPU, providing a speedup of three to four orders of magnitude. We find that reasoning circuits come in two types: stability circuits in early layers, detected through the derivative of representation change, and magnitude circuits in late layers, detected through anomaly scoring. We validate across 9 models spanning 6 architectures, including 2025 models, confirming that CircuitProbe top predictions match or are within 2 layers of the optimal circuit in all validated cases. A scaling experiment across the Qwen 2.5 family reveals that layer duplication consistently benefits models under 3B parameters but degrades performance in 7B+ models, making this a practical scaling technique for small language models. CircuitProbe requires as few as 10 calibration examples and its predictions are stable across English, Hindi, Chinese, and French.

Updated: 2026-04-01 10:26:12

标题: CircuitProbe：通过稳定区域检测预测Transformer中的推理电路

摘要: Transformer语言模型包含本地化推理电路，连续的层块在推理时复制时改进推理。目前找到这些电路需要耗费25个GPU小时的蛮力扫描。我们提出了CircuitProbe，可以在CPU上通过激活统计预测电路位置，在5分钟内提供三到四个数量级的加速。我们发现推理电路分为两种类型：早期层中的稳定性电路，通过表示变化的导数检测，以及晚期层中的幅度电路，通过异常评分检测。我们跨越6种架构的9个模型进行验证，包括2025个模型，确认在所有验证的情况下，CircuitProbe的前几个预测与最佳电路匹配或在最佳电路的附近2层内。在Qwen 2.5系列的一个扩展实验中，发现在3B参数以下的模型中，层复制始终有益，但在7B+模型中会降低性能，使得这成为一个适用于小语言模型的实用扩展技术。CircuitProbe仅需要10个校准示例，其预测稳定跨越英语、印地语、中文和法语。

更新时间: 2026-04-01 10:26:12

领域: cs.AI,cs.LG

下载: http://arxiv.org/abs/2604.00716v1

AutoEG: Exploiting Known Third-Party Vulnerabilities in Black-Box Web Applications

Large-scale web applications are widely deployed with complex third-party components, inheriting security risks arising from component vulnerabilities. Security assessment is therefore required to determine whether such known vulnerabilities remain practically exploitable in real applications. Penetration testing is a widely adopted approach that validates exploitability by launching concrete attacks against known vulnerabilities in real-world black-box systems. However, existing approaches often fail to automatically generate reliable exploits, limiting their effectiveness in practical security assessment. This limitation mainly stems from two issues: (1) precisely triggering vulnerabilities with correct technical details, and (2) adapting exploits to diverse real-world deployment settings. In this paper, we propose AutoEG, a fully automated multi-agent framework for exploit generation targeting black-box web applications. AutoEG has two phases: First, AutoEG extracts precise vulnerability trigger logic from unstructured vulnerability information and encapsulates it into reusable trigger functions. Second, AutoEG uses trigger functions for concrete attack objectives and iteratively refines exploits through feedback-driven interaction with the target application. We evaluate AutoEG on 104 real-world vulnerabilities with 29 attack objectives, resulting in 660 exploitation tasks and 55,440 exploit attempts. AutoEG achieves an average success rate of 82.41%, substantially outperforming state-of-the-art baselines, whose best performance reaches only 32.88%.

Updated: 2026-04-01 10:07:45

标题: AutoEG：利用黑盒网络应用中已知的第三方漏洞

摘要: 大规模网络应用广泛部署，具有复杂的第三方组件，从组件漏洞中继承安全风险。因此，需要进行安全评估，以确定这些已知漏洞在实际应用中是否仍然存在实际可利用性。渗透测试是一种广泛采用的方法，通过对实际黑盒系统中已知漏洞发动具体攻击来验证可利用性。然而，现有方法通常无法自动生成可靠的利用程序，从而限制了它们在实际安全评估中的有效性。这种限制主要源于两个问题：（1）精确触发漏洞并提供正确的技术细节，以及（2）将利用程序调整到各种实际部署设置。在本文中，我们提出了AutoEG，这是一个针对黑盒网络应用程序的全自动化多代理框架，用于生成利用程序。AutoEG有两个阶段：首先，AutoEG从非结构化漏洞信息中提取精确的漏洞触发逻辑，并将其封装成可重复使用的触发函数。其次，AutoEG使用触发函数进行具体的攻击目标，通过与目标应用程序的反馈驱动交互来迭代优化利用程序。我们在104个真实漏洞上评估了AutoEG，涉及29个攻击目标，共进行了660个利用任务和55,440次利用尝试。AutoEG取得了82.41%的平均成功率，明显优于最先进的基线方法，其最佳性能仅达到32.88%。

更新时间: 2026-04-01 10:07:45

领域: cs.CR,cs.AI,cs.SE

下载: http://arxiv.org/abs/2604.00704v1

Enhancing REST API Fuzzing with Access Policy Violation Checks and Injection Attacks

Due to their widespread use in industry, several techniques have been proposed in the literature to fuzz REST APIs. Existing fuzzers for REST APIs have been focusing on detecting crashes (e.g., 500 HTTP server error status code). However, security vulnerabilities can have major drastic consequences on existing cloud infrastructures. In this paper, we propose a series of novel automated oracles aimed at detecting violations of access policies in REST APIs, as well as executing traditional attacks such as SQL Injection and XSS. These novel automated oracles can be integrated into existing fuzzers, in which, once the fuzzing session is completed, a ``security testing'' phase is executed to verify these oracles. When a security fault is detected, as output our technique is able to general executable test cases in different formats, like Java, Kotlin, Python and JavaScript test suites. Our novel techniques are integrated as an extension of EvoMaster, a state-of-the-art open-source fuzzer for REST APIs. Experiments are carried out on 9 artificial examples, 8 vulnerable-by-design REST APIs with black-box testing, and 36 REST APIs from the WFD corpus with white-box testing, for a total of 52 distinct APIs. Results show that our novel oracles and their automated integration in a fuzzing process can lead to detect security issues in several of these APIs.

Updated: 2026-04-01 10:05:23

标题: 通过访问策略违规检查和注入攻击增强REST API Fuzzing

摘要: 由于它们在工业中的广泛应用，文献中提出了几种技术来模糊REST API。现有的REST API模糊器一直专注于检测崩溃（例如500 HTTP服务器错误状态代码）。然而，安全漏洞可能对现有的云基础设施产生重大影响。在本文中，我们提出了一系列新颖的自动化预言，旨在检测REST API中对访问策略的违反，以及执行传统攻击，比如SQL注入和跨站脚本攻击。这些新颖的自动化预言可以集成到现有的模糊器中，一旦模糊会话完成，将执行一个“安全测试”阶段来验证这些预言。当检测到安全故障时，我们的技术能够生成不同格式的可执行测试用例，如Java、Kotlin、Python和JavaScript测试套件。我们的新颖技术作为EvoMaster的扩展集成到REST API的最新开源模糊器中。在9个人工示例、8个有意设计的易受攻击REST API的黑盒测试以及36个WFD语料库中的REST API的白盒测试中进行实验，共计52个不同的API。结果表明，我们的新颖预言及其自动化集成到模糊过程中可以帮助检测出这些API中的安全问题。

更新时间: 2026-04-01 10:05:23

领域: cs.SE,cs.CR

下载: http://arxiv.org/abs/2604.00702v1

Bypassing Prompt Injection Detectors through Evasive Injections

Large language models (LLMs) are increasingly used in interactive and retrieval-augmented systems, but they remain vulnerable to prompt injection attacks, where injected secondary prompts force the model to deviate from the user's instructions to execute a potentially malicious task defined by the adversary. Recent work shows that ML models trained on activation shifts from LLMs' hidden layers can detect such drift. In this paper, we demonstrate that these detectors are not robust to adaptive adversaries. We propose a multi-probe evasion attack that appends an adversarially optimised suffix to poisoned inputs, jointly optimising a universal suffix to simultaneously fool all layer-wise drift detectors while preserving the effectiveness of the underlying injection. Using a modified Greedy Coordinate Gradient (GCG) approach, we generate universal suffixes that make prompt injections consistently evasive across multiple probes simultaneously. On Phi-3 3.8B and Llama-3 8B, a single suffix achieves attack success rates of 93.91% and 99.63% in successfully evading all detectors simultaneously. These results show that activation-based task drift detectors are highly vulnerable to adaptive prompt injection attacks, motivating stronger defences against such threats. We also propose a defence based on adversarial suffix augmentation: we generate multiple suffixes, append one at random during forward passes, and train detectors on the resulting activations. This approach is found to be effective against evasive attacks.

Updated: 2026-04-01 09:26:27

标题: 通过规避注入绕过提示注入检测器

摘要: 大型语言模型(LLMs)越来越多地用于交互和检索增强系统，但它们仍然容易受到提示注入攻击的影响，其中注入的次级提示会迫使模型偏离用户的指令，执行可能由对手定义的恶意任务。最近的研究表明，基于LLMs隐藏层的激活变化训练的ML模型可以检测到这种漂移。在本文中，我们证明这些检测器对自适应对手不具有鲁棒性。我们提出了一种多探测规避攻击，将对抗性优化后缀附加到毒害输入中，同时优化一个通用后缀，以同时欺骗所有分层漂移检测器，同时保持底层注入的有效性。使用修改后的贪婪坐标梯度(GCG)方法，我们生成通用后缀，使提示注入在多个探测过程中始终具有规避性。在Phi-3 3.8B和Llama-3 8B上，单个后缀实现了攻击成功率分别为93.91%和99.63%，成功地同时规避了所有检测器。这些结果表明，基于激活的任务漂移检测器对自适应提示注入攻击非常容易受到影响，促使我们采取更强大的防御措施来应对这些威胁。我们还提出了一种基于对抗性后缀增强的防御方法：我们生成多个后缀，在前向传递过程中随机附加一个，并在生成的激活上训练检测器。这种方法被证明对规避攻击有效。

更新时间: 2026-04-01 09:26:27

领域: cs.CR,cs.AI

下载: http://arxiv.org/abs/2602.00750v2

Jailbreaking Generative AI: Multivector Phishing Threats and Transformer based Defenses

The rise of Generative AI (GenAI) has reshaped the cybersecurity landscape by enabling new attack vectors and lowering the barrier for executing advanced social engineering campaigns. This study conducts an empirical analysis of jailbreaking vulnerabilities in ChatGPT-4o-Mini, showing that novices can bypass safeguards to generate complete multivector phishing attacks across email, web, SMS, and voice channels. Controlled experiments reveal that role-based jailbreaks produce fully operational attack paths capable of credential harvesting. User studies further demonstrate the disruptive potential of GenAI: novice participants exhibited a 240\% increase in perceived phishing competence, a 400\% improvement in task completion rates, and a 57\% reduction in implementation time when assisted by GenAI compared to traditional internet resources. To address these risks, a transformer-based detection framework was developed, achieving an F1-score of 0.9864 (XLNET) for identifying malicious prompts. The work underscores the urgency of strengthening LLM guardrails and provides an annotated dataset to support future defenses.

Updated: 2026-04-01 09:07:54

标题: 越狱生成式人工智能：多向量钓鱼威胁与基于Transformer的防御措施

摘要: 生成式人工智能（GenAI）的兴起已经重塑了网络安全领域，使新的攻击向量成为可能，并降低了执行高级社会工程攻击的障碍。本研究对ChatGPT-4o-Mini中越狱漏洞进行了实证分析，结果显示新手可以绕过防护措施，在电子邮件、网络、短信和语音渠道上生成完整的多向量钓鱼攻击。控制实验揭示了基于角色的越狱可以产生完全操作的攻击路径，可以进行凭证窃取。用户研究进一步证明了GenAI的破坏潜力：与传统互联网资源相比，在GenAI的帮助下，新手参与者的感知钓鱼能力增加了240％，任务完成率提高了400％，实施时间减少了57％。为了应对这些风险，开发了基于变压器的检测框架，实现了0.9864的F1分数（XLNET）用于识别恶意提示。这项工作强调了加强LLM防护措施的紧迫性，并提供了一个带注释的数据集，以支持未来的防御工作。

更新时间: 2026-04-01 09:07:54

领域: cs.CR

下载: http://arxiv.org/abs/2507.12185v2

LibScan: Smart Contract Library Misuse Detection with Iterative Feedback and Static Verification

Smart contracts are self-executing programs that manage financial transactions on blockchain networks. Developers commonly rely on third-party code libraries to improve both efficiency and security. However, improper use of these libraries can introduce hidden vulnerabilities that are difficult to detect, leading to significant financial losses. Existing automated tools struggle to identify such misuse because it often requires understanding the developer's intent rather than simply scanning for known code patterns. This paper presents LibScan, an automated detection framework that combines large language model (LLM)-based semantic reasoning with rule-based code analysis, identifying eight distinct categories of library misuse in smart contracts. To improve detection reliability, the framework incorporates an iterative self-correction mechanism that refines its analysis across multiple rounds, alongside a structured knowledge base derived from large-scale empirical studies of real-world misuse cases. Experiments conducted on 662 real-world smart contracts demonstrate that LibScan achieves an overall detection accuracy of 85.15\%, outperforming existing tools by a margin of over 16 percentage points. Ablation experiments further confirm that combining both analysis approaches yields substantially better results than either method used independently.

Updated: 2026-04-01 09:04:01

标题: LibScan：使用迭代反馈和静态验证检测智能合约库滥用

摘要: 智能合约是在区块链网络上管理金融交易的自执行程序。开发人员通常依赖第三方代码库来提高效率和安全性。然而，对这些库的不当使用可能引入难以检测的隐藏漏洞，导致重大财务损失。现有的自动化工具难以识别这种误用，因为它通常需要理解开发人员的意图，而不仅仅是扫描已知的代码模式。本文提出了LibScan，一个自动检测框架，结合了基于大型语言模型(LLM)的语义推理和基于规则的代码分析，识别智能合约中八种不同类别的库误用。为了提高检测可靠性，该框架还包括一个迭代自我校正机制，通过多轮细化分析，以及一个从大规模实证研究中得出的结构化知识库。对662个真实世界智能合约进行的实验表明，LibScan实现了整体检测准确率为85.15%，比现有工具高出16个百分点以上。消融实验进一步证实，结合两种分析方法比单独使用任一方法都能获得更好的结果。

更新时间: 2026-04-01 09:04:01

领域: cs.SE,cs.CR

下载: http://arxiv.org/abs/2604.00657v1

A Divide-and-Conquer Strategy for Hard-Label Extraction of Deep Neural Networks via Side-Channel Attacks

During the past decade, Deep Neural Networks (DNNs) proved their value on a large variety of subjects. However despite their high value and public accessibility, the protection of the intellectual property of DNNs is still an issue and an emerging research field. Recent works have successfully extracted fully-connected DNNs using cryptanalytic methods in hard-label settings, proving that it was possible to copy a DNN with high fidelity, i.e., high similitude in the output predictions. However, the current cryptanalytic attacks cannot target complex, i.e., not fully connected, DNNs and are limited to special cases of neurons present in deep networks. In this work, we introduce a new end-to-end attack framework designed for model extraction of embedded DNNs with high fidelity. We describe a new black-box side-channel attack which splits the DNN in several linear parts for which we can perform cryptanalytic extraction and retrieve the weights in hard-label settings. With this method, we are able to adapt cryptanalytic extraction, for the first time, to non-fully connected DNNs, while maintaining a high fidelity. We validate our contributions by targeting several architectures implemented on a microcontroller unit, including a Multi-Layer Perceptron (MLP) of 1.7 million parameters and a shortened MobileNetv1. Our framework successfully extracts all of these DNNs with high fidelity (88.4% for the MobileNetv1 and 93.2% for the MLP). Furthermore, we use the stolen model to generate adversarial examples and achieve close to white-box performance on the victim's model (95.8% and 96.7% transfer rate).

Updated: 2026-04-01 08:38:50

标题: 一种通过侧信道攻击进行深度神经网络硬标签提取的分而治之策略

摘要: 在过去的十年中，深度神经网络（DNNs）在许多领域证明了其价值。然而，尽管它们具有高价值和公共可访问性，但保护DNNs的知识产权仍然是一个问题和一个新兴的研究领域。最近的研究成功地使用密码分析方法在硬标签设置下提取了全连接的DNNs，证明了可以以高保真度复制DNN，即输出预测的高相似度。然而，当前的密码分析攻击不能针对复杂的、即不完全连接的DNNs，并且仅限于深度网络中存在的神经元的特殊情况。在这项工作中，我们引入了一个新的端到端攻击框架，专门用于高保真度提取嵌入式DNNs的模型。我们描述了一种新的黑盒侧信道攻击，将DNN分成几个线性部分，我们可以在硬标签设置中执行密码分析提取并检索权重。通过这种方法，我们能够首次将密码分析提取适应于非完全连接的DNNs，同时保持高保真度。我们通过针对在微控制器单元上实现的几种架构进行验证，包括具有170万参数的多层感知器（MLP）和缩短的MobileNetv1。我们的框架成功地提取了所有这些DNN，保真度很高（MobileNetv1为88.4%，MLP为93.2%）。此外，我们使用窃取的模型生成对抗性示例，并在受害者模型上实现接近白盒性能（95.8%和96.7%的转移率）。

更新时间: 2026-04-01 08:38:50

领域: cs.CR,cs.AI

下载: http://arxiv.org/abs/2411.10174v2

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

Model merging has emerged as a powerful technique for combining specialized capabilities from multiple fine-tuned LLMs without additional training costs. However, the security implications of this widely-adopted practice remain critically underexplored. In this work, we reveal that model merging introduces a novel attack surface that can be systematically exploited to compromise safety alignment. We present TrojanMerge,, a framework that embeds latent malicious components into source models that remain individually benign but produce severely misaligned models when merged. Our key insight is formulating this attack as a constrained optimization problem: we construct perturbations that preserve source model safety through directional consistency constraints, maintain capabilities via Frobenius directional alignment constraints, yet combine during merging to form pre-computed attack vectors. Extensive experiments across 9 LLMs from 3 model families demonstrate that TrojanMerge, consistently achieves high harmful response rates in merged models while source models maintain safety scores comparable to unmodified versions. Our attack succeeds across diverse merging algorithms and remains effective under various hyperparameter configurations. These findings expose fundamental vulnerabilities in current model merging practices and highlight the urgent need for security-aware mechanisms.

Updated: 2026-04-01 08:32:46

标题: 当安全模型融合成危险模型时：利用LLM融合中的潜在漏洞

摘要: 模型合并已成为一种强大的技术，可以将多个经过精细调整的LLM的专业能力结合起来，而无需额外的培训成本。然而，这种被广泛采用的做法的安全影响仍然受到严重忽视。在这项工作中，我们揭示了模型合并引入了一种新颖的攻击面，可以系统地被利用来破坏安全对齐。我们提出了TrojanMerge，这是一个框架，将潜在的恶意组件嵌入到源模型中，这些组件在单独情况下是良性的，但在合并时产生严重不对齐的模型。我们的关键洞察是将这种攻击形式化为一个受限的优化问题：我们构建扰动，通过方向一致性约束来保持源模型的安全性，通过Frobenius方向对齐约束来保持能力，然后在合并时形成预先计算的攻击向量。在来自3个模型家族的9个LLM上进行的大量实验证明，TrojanMerge在合并模型中始终实现高有害响应率，而源模型的安全评分与未修改版本相当。我们的攻击成功地跨越了不同的合并算法，并在各种超参数配置下仍然有效。这些发现揭示了当前模型合并实践中的基本漏洞，并突显了对安全意识机制的迫切需求。

更新时间: 2026-04-01 08:32:46

领域: cs.CR

下载: http://arxiv.org/abs/2604.00627v1

Towards Explainable Privacy Preservation in Federated Learning via Shapley Value-Guided Noise Injection

This paper proposes FedSVA, an explainable differential privacy (DP) mechanism for federated learning (FL) that dynamically calibrates noise injection based on the privacy contribution of attributes via Shapley Values. Unlike heuristic DP methods, FedSVA quantifies each attribute's influence on model training and adjusts noise accordingly, providing rigorous privacy guarantees while minimizing utility loss. Theoretical analysis confirms convergence and DP properties. Experiments on CIFAR-10 and FEMNIST show state-of-the-art privacy-utility trade-offs and robust defense against reconstruction attacks.

Updated: 2026-04-01 08:03:56

标题: 通过夏普利值引导的噪音注入，实现联邦学习中可解释的隐私保护

摘要: 本文提出了FedSVA，这是一种用于联邦学习（FL）的可解释差分隐私（DP）机制，它通过Shapley Values动态校准噪声注入，基于属性的隐私贡献。与启发式DP方法不同，FedSVA量化每个属性对模型训练的影响，并相应调整噪声，提供严格的隐私保证同时最小化效用损失。理论分析证实了收敛性和DP性质。在CIFAR-10和FEMNIST上的实验显示了最先进的隐私-效用权衡和对重建攻击的强大防御。

更新时间: 2026-04-01 08:03:56

领域: cs.CR

下载: http://arxiv.org/abs/2503.12958v2

Quantum-Safe Code Auditing: LLM-Assisted Static Analysis and Quantum-Aware Risk Scoring for Post-Quantum Cryptography Migration

The impending arrival of cryptographically relevant quantum computers (CRQCs) threatens the security foundations of modern software: Shor's algorithm breaks RSA, ECDSA, ECDH, and Diffie-Hellman, while Grover's algorithm reduces the effective security of symmetric and hash-based schemes. Despite NIST standardising post-quantum cryptography (PQC) in 2024 (FIPS 203 ML-KEM, FIPS 204 ML-DSA, FIPS 205 SLH-DSA), most codebases lack automated tooling to inventory classical cryptographic usage and prioritise migration based on quantum risk. We present Quantum-Safe Code Auditor, a quantum-aware static analysis framework that combines (i) regex-based detection of 15 classes of quantum-vulnerable primitives, (ii) LLM-assisted contextual enrichment to classify usage and severity, and (iii) risk scoring via a Variational Quantum Eigensolver (VQE) model implemented in Qiskit 2.x, incorporating qubit-cost estimates to prioritise findings. We evaluate the system across five open-source libraries -- python-rsa, python-ecdsa, python-jose, node-jsonwebtoken, and Bouncy Castle Java -- covering 5,775 findings. On a stratified sample of 602 labelled instances, we achieve 71.98% precision, 100% recall, and an F1 score of 83.71%. All code, data, and reproduction scripts are released as open-source.

Updated: 2026-04-01 07:10:17

标题: 量子安全代码审计：LLM辅助静态分析和量子感知风险评分用于后量子密码迁移

摘要: 密码学相关的量子计算机的即将到来威胁着现代软件的安全基础：Shor算法破解了RSA、ECDSA、ECDH和Diffie-Hellman，而Grover算法降低了对称和基于哈希的方案的有效安全性。尽管NIST在2024年标准化了后量子密码学（PQC）（FIPS 203 ML-KEM，FIPS 204 ML-DSA，FIPS 205 SLH-DSA），大多数代码库缺乏自动化工具来对传统密码使用进行清单和基于量子风险进行迁移的优先级排序。我们提出Quantum-Safe Code Auditor，这是一个量子意识的静态分析框架，结合了（i）基于正则表达式的检测15种量子易受攻击的原语，（ii）LLM辅助的上下文丰富化来分类使用和严重性，以及（iii）通过在Qiskit 2.x中实现的Variational Quantum Eigensolver（VQE）模型进行风险评分，包括估算量子比特成本以优先处理结果。我们在五个开源库（python-rsa、python-ecdsa、python-jose、node-jsonwebtoken和Bouncy Castle Java）上评估了该系统，共涵盖5,775个发现。在602个标记实例的分层样本上，我们实现了71.98%的精度，100%的召回率和83.71%的F1分数。所有代码、数据和重现脚本均作为开源发布。

更新时间: 2026-04-01 07:10:17

领域: cs.CR,cs.SE,quant-ph

下载: http://arxiv.org/abs/2604.00560v1

Lightweight, Practical Encrypted Face Recognition with GPU Support

Face recognition models operate in a client-server setting where a client extracts a compact face embedding and a server performs similarity search over a template database. This raises privacy concerns, as facial data is highly sensitive. To provide cryptographic privacy guarantees, one can use fully homomorphic encryption to perform end-to-end encrypted similarity search. However, existing FHE-based protocols are computationally costly and, impose high memory overhead. Building on prior work, HyDia, we introduce algorithmic and system-level improvements targeting real-world deployment with resource-constrained clients. First, we propose BSGS-Diagonal, an algorithm delivering fast and memory-efficient similarity computation. BSGS-Diagonal substantially shrinks the rotation-key set, lowering both client and server memory requirements, and also improves practical server runtime. This yields a 91% reduction in the number of rotation keys, translating to approximately 14 GB less memory used on the client, and reducing overall CPU peak RAM from over 30 GB in the original HyDia to under 10 GB for databases up to size 1M. In addition, runtime is improved by up to 1.57x for the membership verification scenario and 1.43x for the identification scenario. Secondly, we introduce fully GPU-optimized similarity matrix computation kernels. The implementation is built upon FIDESlib, a CKKS-level GPU library based on OpenFHE. Rather than offloading individual CKKS primitives in isolation, the integrated kernels fuse operations to avoid repeated CPU-GPU ciphertext movement and costly FIDESlib/OpenFHE data-structure conversions. As a result, our GPU implementations of both HyDia and BSGS-Diagonal achieve up to 9x and 17x speedups, respectively, enabling sub-second encrypted face recognition for databases up to 32K entries while further reducing host memory usage.

Updated: 2026-04-01 06:43:36

标题: 轻量级、实用的带GPU支持的加密人脸识别

摘要: 面部识别模型在客户端 - 服务器设置中运行，其中客户端提取紧凑的面部嵌入，服务器在模板数据库上执行相似性搜索。这引发了隐私问题，因为面部数据非常敏感。为了提供加密隐私保证，可以使用完全同态加密来执行端到端加密的相似性搜索。然而，现有基于FHE的协议在计算上代价高昂，并且造成了高内存开销。在之前的工作HyDia的基础上，我们引入了针对资源受限客户端的实际部署的算法和系统级改进。首先，我们提出BSGS-Diagonal算法，提供快速和内存高效的相似性计算。BSGS-Diagonal大大缩小了旋转密钥集，降低了客户端和服务器的内存需求，同时改善了实际服务器运行时间。这导致旋转密钥数量减少了91％，在客户端上使用的内存减少了大约14GB，并将原始HyDia的整体CPU峰值RAM从超过30GB降至1M大小的数据库下的10GB以下。此外，对于成员验证场景，运行时间提高了高达1.57倍，对于识别场景提高了1.43倍。其次，我们引入了完全优化的GPU相似性矩阵计算核心。该实现建立在FIDESlib上，这是一个基于OpenFHE的CKKS级GPU库。与将单个CKKS原语单独卸载到GPU不同，集成的核心融合操作以避免重复的CPU-GPU密文移动和昂贵的FIDESlib/OpenFHE数据结构转换。结果，我们对HyDia和BSGS-Diagonal的GPU实现分别实现了高达9倍和17倍的加速，从而使32K条目的数据库实现亚秒级的加密面部识别，同时进一步减少了主机内存使用。

更新时间: 2026-04-01 06:43:36

领域: cs.CR

下载: http://arxiv.org/abs/2604.00546v1

MOLM: Mixture of LoRA Markers

Generative models can generate photorealistic images at scale. This raises urgent concerns about the ability to detect synthetically generated images and attribute these images to specific sources. While watermarking has emerged as a possible solution, existing methods remain fragile to realistic distortions, susceptible to adaptive removal, and expensive to update when the underlying watermarking key changes. We propose a general watermarking framework that formulates the encoding problem as key-dependent perturbation of the parameters of a generative model. Within this framework, we introduce Mixture of LoRA Markers (MOLM), a routing-based instantiation in which binary keys activate lightweight LoRA adapters inside residual and attention blocks. This design avoids key-specific re-training and achieves the desired properties such as imperceptibility, fidelity, verifiability, and robustness. Experiments on Stable Diffusion and FLUX show that MOLM preserves image quality while achieving robust key recovery against distortions, compression and regeneration, averaging attacks, and black-box adversarial attacks on the extractor.

Updated: 2026-04-01 06:41:00

标题: MOLM：LoRA标记物的混合

摘要: 生成模型可以在规模上生成逼真的图像。这引发了关于检测合成图像并将这些图像归因于特定来源的能力的紧急关注。虽然数字水印已经成为可能的解决方案，但现有方法仍然对真实失真脆弱，容易受到自适应移除的影响，并且当基础水印密钥发生变化时更新昂贵。我们提出了一个通用的水印框架，将编码问题形式化为生成模型参数的密钥相关扰动。在这个框架内，我们介绍了Mixture of LoRA Markers（MOLM），这是一种基于路由的实例化，在这种实例化中，二进制密钥激活轻量级LoRA适配器，在残差和注意力块内。这种设计避免了特定于密钥的重新训练，并实现了所需的属性，如难以察觉性，忠实性，可验证性和稳健性。在稳定扩散和FLUX上的实验证明，MOLM在保持图像质量的同时，对失真、压缩和再生、平均攻击以及对提取器的黑盒对抗攻击实现了强大的密钥恢复。

更新时间: 2026-04-01 06:41:00

领域: cs.CV,cs.CR,cs.LG

下载: http://arxiv.org/abs/2510.00293v2

SPDMark: Selective Parameter Displacement for Robust Video Watermarking

The advent of high-quality video generation models has amplified the need for robust watermarking schemes that can be used to reliably detect and track the provenance of generated videos. Existing video watermarking methods based on both post-hoc and in-generation approaches fail to simultaneously achieve imperceptibility, robustness, and computational efficiency. This work introduces a novel framework for in-generation video watermarking called SPDMark (pronounced `SpeedMark') based on selective parameter displacement of a video diffusion model. Watermarks are embedded into the generated videos by modifying a subset of parameters in the generative model. To make the problem tractable, the displacement is modeled as an additive composition of layer-wise basis shifts, where the final composition is indexed by the watermarking key. For parameter efficiency, this work specifically leverages low-rank adaptation (LoRA) to implement the basis shifts. During the training phase, the basis shifts and the watermark extractor are jointly learned by minimizing a combination of message recovery, perceptual similarity, and temporal consistency losses. To detect and localize temporal modifications in the watermarked videos, we use a cryptographic hashing function to derive frame-specific watermark messages from the given base watermarking key. During watermark extraction, maximum bipartite matching is applied to recover the correct frame order, even from temporally tampered videos. Evaluations on both text-to-video and image-to-video generation models demonstrate the ability of SPDMark to generate imperceptible watermarks that can be recovered with high accuracy and also establish its robustness against a variety of common video modifications.

Updated: 2026-04-01 05:55:29

标题: SPDMark：用于稳健视频水印的选择性参数位移

摘要: 高质量视频生成模型的出现加大了对稳健水印方案的需求，这些方案可用于可靠地检测和追踪生成视频的来源。现有的基于事后和生成过程中方法的视频水印方法未能同时实现不可察觉性、稳健性和计算效率。本文介绍了一种基于视频扩散模型的选择性参数位移的生成过程中视频水印框架SPDMark（发音为'SpeedMark'）。水印通过修改生成模型中的一部分参数嵌入到生成的视频中。为了使问题可解，位移被建模为逐层基础位移的加性组合，最终组合由水印密钥索引。为了提高参数效率，本文特别利用低秩适应（LoRA）来实现基础位移。在训练阶段，通过最小化消息恢复、感知相似度和时间一致性损失的组合来联合学习基础位移和水印提取器。为了检测和定位水印视频中的时间修改，我们使用加密哈希函数从给定的基础水印密钥中派生特定帧的水印消息。在水印提取过程中，使用最大二分匹配来恢复正确的帧顺序，即使是在时间上篡改过的视频中也能实现。在文本到视频和图像到视频生成模型上的评估表明，SPDMark能够生成不可察觉的水印，并能够高精确度地恢复，同时还证明了其对各种常见视频修改的稳健性。

更新时间: 2026-04-01 05:55:29

领域: cs.CV,cs.CR,cs.LG

下载: http://arxiv.org/abs/2512.12090v2

SmartPoC: Generating Executable and Validated PoCs for Smart Contract Bug Reports

Smart contracts are commonly audited through static analysis to explore vulnerabilities. However, static approaches typically produce heterogeneous findings rather than reproducible, executable proof-of-concept (PoC) test cases, leading to costly and ad hoc manual validation. Large language models (LLMs) offer a promising way to translate audit reports into PoC test cases, but face three major challenges: noisy inputs, lack of execution grounding, and missing runtime oracles. We present SmartPoC, an end-to-end approach for validating reported vulnerabilities in audit reports by generating and executing PoC test cases with automated exploitability verification. SmartPoC first extracts a focused function-level slice from each report to reduce noise, centering on the key functions referenced in a finding and augmenting them with execution-relevant neighbors. To improve executability, we wrap LLM-based PoC synthesis in a generate-repair-execute loop, combining deterministic pre-execution sanitization with feedback-driven post-execution debugging. We further use differential verification as an oracle to confirm the exploitability of generated test cases. On the SmartBugs-Vul and FORGE-Vul benchmarks, SmartPoC achieves confirmation precision of 98.32% and 98.65%, with recall of 84.17% and 85.28%, respectively. On a recent Etherscan verified-source corpus, SmartPoC confirms 64 bugs from 545 audit findings at an average cost of $0.03.

Updated: 2026-04-01 05:51:22

标题: 智能PoC: 为智能合约漏洞报告生成可执行和验证的PoCs

摘要: 智能合约通常通过静态分析进行审计，以探索漏洞。然而，静态方法通常产生异构的发现，而不是可重现的、可执行的概念验证（PoC）测试用例，导致昂贵和临时的手动验证。大型语言模型（LLMs）提供了一种有前途的方法，将审计报告转化为PoC测试用例，但面临三个主要挑战：噪声输入、缺乏执行基础和缺少运行时预言。我们提出了SmartPoC，一种端到端的方法，通过生成和执行PoC测试用例，并进行自动利用验证，来验证审计报告中报告的漏洞。SmartPoC首先从每份报告中提取一个聚焦于功能级别的片段，以减少噪声，重点放在发现中引用的关键函数上，并用执行相关的邻居进行增强。为了提高可执行性，我们将基于LLM的PoC合成包装在一个生成-修复-执行循环中，结合确定性的预执行清理和反馈驱动的后执行调试。我们进一步使用差分验证作为预言，来确认生成的测试用例的可利用性。在SmartBugs-Vul和FORGE-Vul基准测试上，SmartPoC分别实现了98.32%和98.65%的确认精度，召回率分别为84.17%和85.28%。在最近的Etherscan验证源语料库中，SmartPoC确认了545个审计结果中的64个漏洞，平均成本为0.03美元。

更新时间: 2026-04-01 05:51:22

领域: cs.SE,cs.CR

下载: http://arxiv.org/abs/2511.12993v3

Secure Forgetting: A Framework for Privacy-Driven Unlearning in Large Language Model (LLM)-Based Agents

Large language model (LLM)-based agents have recently gained considerable attention due to the powerful reasoning capabilities of LLMs. Existing research predominantly focuses on enhancing the task performance of these agents in diverse scenarios. However, as LLM-based agents become increasingly integrated into real-world applications, significant concerns emerge regarding their accumulation of sensitive or outdated knowledge. Addressing these concerns requires the development of mechanisms that allow agents to selectively forget previously learned knowledge, giving rise to a new term LLM-based agent unlearning. This paper initiates research on unlearning in LLM-based agents. Specifically, we propose a novel and comprehensive framework that categorizes unlearning scenarios into three contexts: state unlearning (forgetting specific states or items), trajectory unlearning (forgetting sequences of actions) and environment unlearning (forgetting entire environments or categories of tasks). Within this framework, we introduce a natural language-based unlearning method that trains a conversion model to transform high-level unlearning requests into actionable unlearning prompts, guiding agents through a controlled forgetting process. Moreover, to evaluate the robustness of the proposed framework, we introduce an unlearning inference adversary capable of crafting prompts, querying agents, and observing their behaviors in an attempt to infer the forgotten knowledge. Experimental results show that our approach effectively enables agents to forget targeted knowledge while preserving performance on untargeted tasks, and prevents the adversary from inferring the forgotten knowledge.

Updated: 2026-04-01 03:17:35

标题: 安全遗忘：大型语言模型(LLM)代理的隐私驱动反学习框架

摘要: 最近，基于大型语言模型（LLM）的代理因其强大的推理能力而受到了广泛关注。现有研究主要集中在增强这些代理在各种场景中的任务表现上。然而，随着基于LLM的代理越来越多地整合到现实世界的应用中，人们对其积累敏感或过时知识的重大关注开始出现。解决这些问题需要开发机制，使代理能够选择性地忘记先前学习的知识，从而产生了一个新术语LLM-based agent unlearning。本文开启了关于LLM-based代理中遗忘的研究。具体地，我们提出了一个新颖而全面的框架，将遗忘场景分为三个情境：状态遗忘（忘记特定状态或项目）、轨迹遗忘（忘记动作序列）和环境遗忘（忘记整个环境或任务类别）。在这个框架内，我们引入了一种基于自然语言的遗忘方法，通过训练一个转换模型将高级遗忘请求转化为可执行的遗忘提示，引导代理通过受控遗忘过程。此外，为了评估所提框架的鲁棒性，我们引入了一个遗忘推理对手，能够制定提示、查询代理，并观察其行为，试图推断被遗忘的知识。实验结果表明，我们的方法有效地使代理能够忘记目标知识，同时保持在非目标任务上的表现，并防止对手推断出被遗忘的知识。

更新时间: 2026-04-01 03:17:35

领域: cs.MA,cs.CR

下载: http://arxiv.org/abs/2604.00430v1

Efficient DPF-based Error-Detecting Information-Theoretic Private Information Retrieval Over Rings

Authenticated private information retrieval (APIR) is the state-of-the-art error-detecting private information retrieval (ED-PIR), using Distributed Point Functions (DPFs) for subpolynomial complexity and privacy. However, its finite field structure restricts it to prime-order DPFs, leading to prohibitively large key sizes under information-theoretic settings, while its dual-DPF-key design introduces unnecessary communication overhead, limiting its practicality for large-scale deployments. This paper proposes a novel ring-based information-theoretic ED-PIR (itED-PIR) scheme that overcomes these limitations by leveraging prime-power-order information-theoretic DPFs (itDPFs). Built over a prime-power ring, the proposed scheme breaks APIR's field-induced constraint to enable more efficient DPF utilization, significantly reducing key size growth and rendering the scheme feasible for high-security scenarios. Additionally, a single-itDPF-key design halves query-side communication overhead by eliminating APIR's redundant dual-key setup, without compromising privacy or verifiability. Beyond immediate efficiency gains, this work establishes a lightweight, flexible framework for constructing DPF-based malicious-resilient private information retrieval, opening new avenues for privacy-preserving data retrieval in distributed storage systems and post-quantum privacy protocols.

Updated: 2026-04-01 02:51:51

标题: 高效的基于DPF的基于环的错误检测信息理论私人信息检索

摘要: 身份验证私人信息检索（APIR）是最先进的错误检测私人信息检索（ED-PIR）技术，利用分布点函数（DPFs）实现亚多项式复杂度和隐私保护。然而，它的有限域结构限制了它只能使用素数阶DPFs，在信息理论设置下导致密钥尺寸过大，而其双重DPF密钥设计引入了不必要的通信开销，限制了其在大规模部署中的实用性。本文提出了一种新颖的基于环的信息理论ED-PIR（itED-PIR）方案，通过利用素数幂阶信息理论DPFs（itDPFs）克服了这些限制。基于素数幂环构建的方案打破了APIR的域限制，实现了更高效的DPF利用，显著减小了密钥尺寸增长，使得方案在高安全性场景下可行。此外，单一itDPF密钥设计通过消除APIR的冗余双密钥设置，将查询端通信开销减少一半，而不会影响隐私或可验证性。除了立即的效率提升，这项工作建立了一个轻量级、灵活的框架，用于构建基于DPF的恶意韧性私人信息检索，为分布式存储系统和后量子隐私协议中的隐私保护数据检索开辟了新途径。

更新时间: 2026-04-01 02:51:51

领域: cs.CR,cs.IT

下载: http://arxiv.org/abs/2604.00411v1

Certifiably Robust RAG against Retrieval Corruption

Retrieval-augmented generation (RAG) is susceptible to retrieval corruption attacks, where malicious passages injected into retrieval results can lead to inaccurate model responses. We propose RobustRAG, the first defense framework with certifiable robustness against retrieval corruption attacks. The key insight of RobustRAG is an isolate-then-aggregate strategy: we isolate passages into disjoint groups, generate LLM responses based on the concatenated passages from each isolated group, and then securely aggregate these responses for a robust output. To instantiate RobustRAG, we design keyword-based and decoding-based algorithms for securely aggregating unstructured text responses. Notably, RobustRAG achieves certifiable robustness: for certain queries in our evaluation datasets, we can formally certify non-trivial lower bounds on response quality -- even against an adaptive attacker with full knowledge of the defense and the ability to arbitrarily inject a bounded number of malicious passages. We evaluate RobustRAG on the tasks of open-domain question-answering and free-form long text generation and demonstrate its effectiveness across three datasets and three LLMs.

Updated: 2026-04-01 02:44:05

标题: 可证实的抗检索失真的RAG

摘要: 检索增强生成（RAG）容易受到检索损坏攻击的影响，其中注入到检索结果中的恶意段落可能导致模型响应不准确。我们提出了RobustRAG，这是第一个具有可证明鲁棒性对抗检索损坏攻击的防御框架。RobustRAG的关键见解是一种隔离-聚合策略：我们将段落隔离成不相交的组，基于每个隔离组中连接的段落生成LLM响应，然后安全地聚合这些响应以获得鲁棒的输出。为实现RobustRAG，我们设计了基于关键字和解码的算法，用于安全地聚合非结构化文本响应。值得注意的是，RobustRAG实现了可证明的鲁棒性：对于我们评估数据集中的某些查询，我们可以正式证明响应质量的非平凡下界，即使对抗能力强的攻击者具有对防御措施的完全了解，并能够任意注入有限数量的恶意段落。我们在开放域问答和自由形式长文本生成任务上评估了RobustRAG，并展示了其在三个数据集和三个LLM上的有效性。

更新时间: 2026-04-01 02:44:05

领域: cs.LG,cs.CL,cs.CR

下载: http://arxiv.org/abs/2405.15556v2

Leveraging Large Language Models to Bridge Cross-Domain Transparency in Stablecoins

Stablecoins such as USDT and USDC aspire to peg stability by coupling issuance controls with reserve attestations. In practice, however, transparency remains fragmented across heterogeneous data sources, with key evidence about circulation, reserves, and disclosure dispersed across records that are difficult to connect and interpret jointly. We introduce a large language model (LLM)-based automated framework for bridging cross-domain transparency in stablecoins by aligning issuer disclosures with observable circulation evidence. First, we propose an integrative framework using LLMs to parse documents, extract salient financial indicators, and semantically align reported statements with corresponding market and issuance metrics. Second, we integrate multi-chain issuance records and disclosure documents within a model context protocol (MCP) framework that standardizes LLM access to both quantitative market data and qualitative disclosure narratives. This framework enables unified retrieval and contextual alignment across heterogeneous stablecoin information sources and facilitates consistent analysis. Third, we demonstrate the capability of LLMs to operate across heterogeneous data domains in blockchain analytics, quantifying discrepancies between reported and observed circulation and examining their implications for transparency and price dynamics. Our findings reveal systematic gaps between disclosed and verifiable data, showing that LLM-assisted analysis enhances cross-domain transparency and supports automated, data-driven auditing in decentralized finance (DeFi).

Updated: 2026-04-01 02:38:06

标题: 利用大型语言模型构建稳定币跨领域透明度

摘要: 稳定币（如USDT和USDC）希望通过将发行控制与储备验证相结合来实现锚定稳定性。然而，在实践中，透明度仍然在异质数据源之间分散，有关流通、储备和披露的关键证据分散在难以连接和共同解释的记录中。我们引入了一个基于大型语言模型（LLM）的自动化框架，通过将发行者披露与可观察的流通证据对齐，来实现稳定币跨领域透明度的桥梁。首先，我们提出了一个综合框架，使用LLMs解析文件，提取显著的财务指标，并在语义上将报告的声明与相应的市场和发行度量对齐。其次，我们在模型上下文协议（MCP）框架中整合多链发行记录和披露文件，该框架标准化了LLM对定量市场数据和定性披露叙述的访问。这一框架实现了跨异质稳定币信息源的统一检索和语境对齐，并促进了一致的分析。第三，我们展示了LLMs在区块链分析中跨异质数据领域操作的能力，量化了报告和观察到的流通之间的差异，并研究了这些差异对透明度和价格动态的影响。我们的研究发现揭示了披露数据和可验证数据之间的系统性差距，表明LLM辅助分析增强了跨领域透明度，并支持去中心化金融（DeFi）中的自动化数据驱动审计。

更新时间: 2026-04-01 02:38:06

领域: cs.CR,cs.LG

下载: http://arxiv.org/abs/2512.02418v3

RAGShield: Provenance-Verified Defense-in-Depth Against Knowledge Base Poisoning in Government Retrieval-Augmented Generation Systems

RAG systems deployed across federal agencies for citizen-facing services are vulnerable to knowledge base poisoning attacks, where adversaries inject malicious documents to manipulate outputs. Recent work demonstrates that as few as 10 adversarial passages can achieve 98.2% retrieval success rates. We observe that RAG knowledge base poisoning is structurally analogous to software supply chain attacks, and propose RAGShield, a five-layer defense-in-depth framework applying supply chain provenance verification to the RAG knowledge pipeline. RAGShield introduces: (1) C2PA-inspired cryptographic document attestation blocking unsigned and forged documents at ingestion; (2) trust-weighted retrieval prioritizing provenance-verified sources; (3) a formal taint lattice with cross-source contradiction detection catching insider threats even when provenance is valid; (4) provenance-aware generation with auditable citations; and (5) NIST SP 800-53 compliance mapping across 15 control families. Evaluation on a 500-passage Natural Questions corpus with 63 attack documents and 200 queries against five adversary tiers achieves 0.0% attack success rate including adaptive attacks (95% CI: [0.0%, 1.9%]) with 0.0% false positive rate. We honestly report that insider in-place replacement attacks achieve 17.5% ASR, identifying the fundamental limit of ingestion-time defense. The cross-source contradiction detector catches subtle numerical manipulation attacks that bypass provenance verification entirely.

Updated: 2026-04-01 02:16:42

标题: RAGShield：在政府检索增强生成系统中对知识库中毒的可追溯验证的深度防御

摘要: 部署在联邦机构用于面向公民服务的RAG系统容易受到知识库中毒攻击的威胁，攻击者会注入恶意文档以操纵输出。最近的研究表明，仅有10个对抗性段落就可以达到98.2%的检索成功率。我们观察到，RAG知识库中毒在结构上类似于软件供应链攻击，并提出了RAGShield，这是一个应用供应链出处验证到RAG知识管道的五层深度防御框架。RAGShield引入了：(1)受C2PA启发的加密文档证明，阻止未签名和伪造文档在摄取时进入；(2)信任加权检索，优先考虑出处经过验证的来源；(3)一个形式化玷污格与跨源矛盾检测，即使出处有效也可以捕捉内部威胁；(4)具有可审计引用的出处感知生成；以及(5)跨15个控制家族的NIST SP 800-53合规映射。在一个包含63个攻击文档和200个查询的500个段落的自然问题语料库上对五个对手等级进行评估，取得了0.0%的攻击成功率，包括自适应攻击（95% CI: [0.0%, 1.9%]）和0.0%的误报率。我们诚实地报告，内部替换攻击实现了17.5%的ASR，这识别了摄入时间防御的根本限制。跨源矛盾检测器捕捉到了绕过完全出处验证的微妙数值操纵攻击。

更新时间: 2026-04-01 02:16:42

领域: cs.CR,cs.AI

下载: http://arxiv.org/abs/2604.00387v1

RampoNN: A Reachability-Guided System Falsification for Efficient Cyber-Kinetic Vulnerability Detection

Detecting kinetic vulnerabilities in Cyber-Physical Systems (CPS), vulnerabilities in control code that can precipitate hazardous physical consequences, is a critical challenge. This task is complicated by the need to analyze the intricate coupling between complex software behavior and the system's physical dynamics. Furthermore, the periodic execution of control code in CPS applications creates a combinatorial explosion of execution paths that must be analyzed over time, far exceeding the scope of traditional single-run code analysis. This paper introduces RampoNN, a novel framework that systematically identifies kinetic vulnerabilities given the control code, a physical system model, and a Signal Temporal Logic (STL) specification of safe behavior. RampoNN first analyzes the control code to map the control signals that can be generated under various execution branches. It then employs a neural network to abstract the physical system's behavior. To overcome the poor scaling and loose over-approximations of standard neural network reachability, RampoNN uniquely utilizes Deep Bernstein neural networks, which are equipped with customized reachability algorithms that yield orders of magnitude tighter bounds. This high-precision reachability analysis allows RampoNN to rapidly prune large sets of guaranteed-safe behaviors and rank the remaining traces by their potential to violate the specification. The results of this analysis are then used to effectively guide a falsification engine, focusing its search on the most promising system behaviors to find actual vulnerabilities. We evaluated our approach on a PLC-controlled water tank system and a switched PID controller for an automotive engine. The results demonstrate that RampoNN leads to acceleration of the process of finding kinetic vulnerabilities by up to 98.27% and superior scalability compared to other state-of-the-art methods.

Updated: 2026-04-01 02:08:38

标题: RampoNN：一种基于可达性引导的系统伪造方法，用于高效的网络动力脆弱性检测

摘要: 在网络物理系统（CPS）中检测动力学漏洞，即在控制代码中可能引发危险物理后果的漏洞，是一个关键的挑战。这项任务的复杂性在于需要分析复杂软件行为与系统物理动力学之间复杂耦合关系。此外，CPS应用中控制代码的周期性执行会导致执行路径的组合爆炸，必须随时间进行分析，远远超出传统单次运行代码分析的范围。本文介绍了RampoNN，这是一个新颖的框架，根据控制代码、物理系统模型和安全行为的信号时间逻辑（STL）规范，系统地识别动力学漏洞。RampoNN首先分析控制代码，以映射在不同执行分支下可以生成的控制信号。然后利用神经网络抽象物理系统的行为。为了克服标准神经网络可达性的扩展性差和松散的过估计，RampoNN独特地利用了深度Bernstein神经网络，这些神经网络配备了定制的可达性算法，产生数量级更紧密的界限。这种高精度可达性分析允许RampoNN迅速剪枝大量的保证安全行为集并按其违反规范的潜力对剩余的轨迹进行排名。然后，分析结果被用于有效引导一个伪证引擎，将其搜索重点放在最有希望找到实际漏洞的系统行为上。我们在PLC控制的水箱系统和用于汽车发动机的切换PID控制器上评估了我们的方法。结果表明，与其他最先进的方法相比，RampoNN能够将寻找动力学漏洞的过程加速高达98.27%，并具有更好的可扩展性。

更新时间: 2026-04-01 02:08:38

领域: cs.CR,eess.SY

下载: http://arxiv.org/abs/2511.16765v2

Beyond Metadata: Multimodal, Policy-Aware Detection of YouTube Scam Videos

YouTube is a major platform for information and entertainment, but its wide accessibility also makes it attractive for scammers to upload deceptive or malicious content. Prior detection approaches rely largely on textual or statistical metadata, such as titles, descriptions, view counts, or likes, which are effective in many cases but can be evaded through benign-looking text, manipulated statistics, or other obfuscation strategies (e.g., 'Leetspeak'), while ignoring visual cues. In this study, we systematically investigate multimodal approaches for detecting YouTube scams. Our dataset consolidates established scam categories and augments them with full-length videos and policy-grounded reasoning annotations. Experiments show that a text-only model using titles and descriptions (fine-tuned BERT) achieves moderate performance (76.61% F1 score), improving slightly with audio transcripts (77.98% F1 score). Visual analysis with a fine-tuned LLaVA-Video model performs better (79.61% F1 score), while a multimodal framework combining titles, descriptions, and video frames achieves the highest performance (82.96% F1 score). Moreover, the multimodal framework showed greater robustness to adversarial perturbations, with accuracy dropping only 1-3%, compared to 12-38% for modality-specific models. Beyond accuracy, the multimodal framework provides interpretable, policy-grounded reasoning, enhancing transparency and practical utility in automated moderation. Using this approach, we analyzed 6,374 in-the-wild YouTube videos and detected 1,864 scams with explicit reasoning, providing a valuable resource for future research.

Updated: 2026-04-01 01:52:47

标题: 超越元数据：多模态、政策感知的YouTube诈骗视频检测

摘要: YouTube是一个重要的信息和娱乐平台，但其广泛的可访问性也使其成为骗子上传欺诈或恶意内容的吸引力。先前的检测方法主要依赖于文本或统计元数据，如标题、描述、观看次数或点赞数，在许多情况下是有效的，但可以通过看起来良性的文本、操纵统计数据或其他混淆策略（例如“Leetspeak”）来规避，同时忽略视觉线索。在这项研究中，我们系统地研究了检测YouTube骗局的多模态方法。我们的数据集整合了已建立的骗局类别，并通过全长视频和基于政策的推理注释对其进行了增强。实验表明，仅使用标题和描述的文本模型（微调的BERT）达到了中等性能（76.61%的F1分数），在添加音频转录后略有改善（77.98%的F1分数）。通过使用微调的LLaVA-Video模型进行视觉分析表现更好（79.61%的F1分数），而结合标题、描述和视频帧的多模态框架实现了最高性能（82.96%的F1分数）。此外，与特定模型相比，多模态框架对对抗性扰动具有更大的鲁棒性，准确度仅下降1-3%，而模态特定模型下降12-38%。除了准确性外，多模态框架提供了可解释的、基于政策的推理，增强了自动审查的透明性和实用性。利用这种方法，我们分析了6374个在野外的YouTube视频，并检测到1864个带有明确推理的骗局，为未来的研究提供了宝贵的资源。

更新时间: 2026-04-01 01:52:47

领域: cs.CR

下载: http://arxiv.org/abs/2509.23418v2