    _              _         ____              
   / \   _ ____  _(_)_   __ |  _ \  __ _ _   _ 
  / _ \ | '__\ \/ / \ \ / / | | | |/ _` | | | |
 / ___ \| |   >  <| |\ V /  | |_| | (_| | |_| |
/_/   \_\_|  /_/\_\_| \_/   |____/ \__,_|\__, |
                                         |___/ 
        

Articles: 17

Last Updated: 2025-01-08 22:22:43 (+00:00)

Entropy-Guided Attention for Private LLMs

The pervasiveness of proprietary language models has raised critical privacy concerns, necessitating advancements in private inference (PI), where computations are performed directly on encrypted data without revealing users' sensitive information. While PI offers a promising solution, its practical deployment is hindered by substantial communication and latency overheads, primarily stemming from nonlinear operations. To address this, we introduce an information-theoretic framework to characterize the role of nonlinearities in decoder-only language models, laying a principled foundation for optimizing transformer architectures tailored to the demands of PI. By leveraging Shannon's entropy as a quantitative measure, we uncover the previously unexplored dual significance of nonlinearities: beyond ensuring training stability, they are crucial for maintaining attention head diversity. Specifically, we find that their removal triggers two critical failure modes: "entropy collapse" in deeper layers, which destabilizes training, and "entropic overload" in earlier layers, which leads to under-utilization of Multi-Head Attention's (MHA) representational capacity. We propose an entropy-guided attention mechanism paired with a novel entropy regularization technique to mitigate entropic overload. Additionally, we explore PI-friendly alternatives to layer normalization for preventing entropy collapse and stabilizing the training of LLMs with reduced nonlinearities. Our study bridges the gap between information theory and architectural design, establishing entropy dynamics as a principled guide for developing efficient PI architectures. The code and implementation are available at https://github.com/Nandan91/entropy-guided-attention-llm
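The abstract does not spell out the regularizer, but the quantities involved are standard. As a rough numpy sketch (the penalty form and the threshold are assumptions, not the paper's method), per-head attention entropy and an "entropic overload" penalty might be computed along these lines:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_head_entropy(scores):
    """Mean Shannon entropy (in nats) of each head's attention distribution.
    scores: raw attention logits of shape (heads, queries, keys)."""
    probs = softmax(scores, axis=-1)
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=-1)  # (heads, queries)
    return ent.mean(axis=-1)                             # (heads,)

def entropic_overload_penalty(scores, threshold):
    """Hypothetical regularizer: penalize heads whose average entropy exceeds
    a threshold, discouraging near-uniform ("overloaded") attention."""
    return float(np.maximum(attention_head_entropy(scores) - threshold, 0.0).sum())
```

A uniform attention pattern over k keys attains the maximum entropy log(k), so the penalty fires exactly on heads that attend almost indiscriminately.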

Updated: 2025-01-08 22:22:43

Categories: cs.LG,cs.CR

Download: http://arxiv.org/abs/2501.03489v2

Exploring Large Language Models for Semantic Analysis and Categorization of Android Malware

Malware analysis is a complex process of examining and evaluating malicious software's functionality, origin, and potential impact. This arduous process typically involves dissecting the software to understand its components, infection vector, propagation mechanism, and payload. Over the years, deep reverse engineering of malware has become increasingly tedious, mainly due to the fast evolution and sophistication of modern malicious codebases. Essentially, analysts are tasked with identifying the elusive needle in the haystack within the complexities of zero-day malware, all while under tight time constraints. Thus, in this paper, we explore leveraging Large Language Models (LLMs) for semantic malware analysis to expedite the analysis of known and novel samples. Built on the GPT-4o-mini model, \msp is designed to augment malware analysis for Android through a hierarchical-tiered summarization chain and strategic prompt engineering. Additionally, \msp performs malware categorization, distinguishing potential malware from benign applications, thereby saving time during the malware reverse engineering process. Despite not being fine-tuned for Android malware analysis, we demonstrate that, through optimized and advanced prompt engineering, \msp can achieve up to 77% classification accuracy while providing highly robust summaries at the function, class, and package levels. In addition, leveraging the backward tracing of the summaries from package to function levels allowed us to pinpoint the precise code snippets responsible for malicious behavior.
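The hierarchical-tiered summarization chain (function to class to package) can be sketched generically. The prompts and the `llm` wrapper below are hypothetical stand-ins, not \msp's actual implementation:

```python
def summarize_apk(packages, llm):
    """Hierarchical-tiered summarization chain (function -> class -> package).

    packages: {package_name: {class_name: [function_source, ...]}}
    llm: any callable mapping a prompt string to a summary string
         (e.g. a thin wrapper around a GPT-4o-mini chat call).
    Returns {package_name: package_summary}.
    """
    package_summaries = {}
    for pkg, classes in packages.items():
        class_summaries = []
        for cls, functions in classes.items():
            # Tier 1: summarize each decompiled function in isolation.
            fn_summaries = [llm(f"Summarize this function:\n{src}") for src in functions]
            # Tier 2: fold function summaries into one class-level summary.
            class_summaries.append(
                llm(f"Summarize class {cls} from its function summaries:\n"
                    + "\n".join(fn_summaries)))
        # Tier 3: fold class summaries into the package-level summary.
        package_summaries[pkg] = llm(
            f"Summarize package {pkg} from its class summaries:\n"
            + "\n".join(class_summaries))
    return package_summaries
```

Backward tracing then works by following a suspicious package summary down to the class and function summaries that produced it.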

Updated: 2025-01-08 21:22:45

Categories: cs.CR,cs.AI

Download: http://arxiv.org/abs/2501.04848v1

Differentially Private Online Federated Learning with Correlated Noise

We introduce a novel differentially private algorithm for online federated learning that employs temporally correlated noise to enhance utility while ensuring privacy of continuously released models. To address challenges posed by DP noise and local updates with streaming non-iid data, we develop a perturbed iterate analysis to control the impact of the DP noise on the utility. Moreover, we demonstrate how the drift errors from local updates can be effectively managed under a quasi-strong convexity condition. Subject to an $(\epsilon, \delta)$-DP budget, we establish a dynamic regret bound over the entire time horizon, quantifying the impact of key parameters and the intensity of changes in dynamic environments. Numerical experiments confirm the efficacy of the proposed algorithm.
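The abstract does not specify the correlation structure of the noise; as one hedged illustration, an AR(1) process yields temporally correlated Gaussian noise with the stated marginals, plugged into a clipped online-SGD step (a sketch of the idea, not the paper's calibrated mechanism):

```python
import numpy as np

def correlated_noise_stream(dim, steps, sigma, rho, seed=0):
    """Temporally correlated Gaussian noise via an AR(1) recursion:
        n_t = rho * n_{t-1} + sqrt(1 - rho^2) * w_t,   w_t ~ N(0, sigma^2 I),
    initialized at stationarity, so each n_t is marginally N(0, sigma^2 I)
    while successive draws have lag-1 correlation rho."""
    rng = np.random.default_rng(seed)
    n = rng.normal(0.0, sigma, dim)        # stationary start
    for _ in range(steps):
        n = rho * n + np.sqrt(1.0 - rho**2) * rng.normal(0.0, sigma, dim)
        yield n

def private_online_step(param, grad, noise, lr=0.1, clip=1.0):
    """One perturbed online-SGD step: clip the gradient, add DP noise, descend."""
    norm = np.linalg.norm(grad)
    if norm > clip:
        grad = grad * (clip / norm)
    return param - lr * (grad + noise)
```

The intuition is that correlated noise partially cancels across consecutive released models, which is what the perturbed iterate analysis quantifies.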

Updated: 2025-01-08 21:05:26

Categories: cs.LG,cs.CR,cs.DC

Download: http://arxiv.org/abs/2403.16542v3

Fast, Fine-Grained Equivalence Checking for Neural Decompilers

Neural decompilers are machine learning models that reconstruct source code from an executable program. Critical to the lifecycle of any machine learning model is an evaluation of its effectiveness. However, existing techniques for evaluating neural decompilation models have substantial weaknesses, especially when it comes to showing the correctness of the neural decompiler's predictions. To address this, we introduce codealign, a novel instruction-level code equivalence technique designed for neural decompilers. We provide a formal definition of a relation between equivalent instructions, which we term an equivalence alignment. We show how codealign generates equivalence alignments, then evaluate codealign by comparing it with symbolic execution. Finally, we show how the information codealign provides (which parts of the functions are equivalent and how well the variable names match) is substantially more detailed than existing state-of-the-art evaluation metrics, which report unitless numbers measuring similarity.

Updated: 2025-01-08 19:59:48

Categories: cs.LG,cs.CR,cs.SE

Download: http://arxiv.org/abs/2501.04811v1

Correlated Privacy Mechanisms for Differentially Private Distributed Mean Estimation

Differentially private distributed mean estimation (DP-DME) is a fundamental building block in privacy-preserving federated learning, where a central server estimates the mean of $d$-dimensional vectors held by $n$ users while ensuring $(\epsilon,\delta)$-DP. Local differential privacy (LDP) and distributed DP with secure aggregation (SA) are the most common notions of DP used in DP-DME settings with an untrusted server. LDP provides strong resilience to dropouts, colluding users, and adversarial attacks, but suffers from poor utility. In contrast, SA-based DP-DME achieves an $O(n)$ utility gain over LDP in DME, but requires increased communication and computation overheads and complex multi-round protocols to handle dropouts and attacks. In this work, we present a generalized framework for DP-DME, that captures LDP and SA-based mechanisms as extreme cases. Our framework provides a foundation for developing and analyzing a variety of DP-DME protocols that leverage correlated privacy mechanisms across users. To this end, we propose CorDP-DME, a novel DP-DME mechanism based on the correlated Gaussian mechanism, that spans the gap between DME with LDP and distributed DP. We prove that CorDP-DME offers a favorable balance between utility and resilience to dropout and collusion. We provide an information-theoretic analysis of CorDP-DME, and derive theoretical guarantees for utility under any given privacy parameters and dropout/colluding user thresholds. Our results demonstrate that (anti) correlated Gaussian DP mechanisms can significantly improve utility in mean estimation tasks compared to LDP -- even in adversarial settings -- while maintaining better resilience to dropouts and attacks compared to distributed DP.
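To see why correlated noise helps, here is a minimal numpy sketch (an illustration of the idea, not CorDP-DME's calibrated mechanism): each user mixes an independent component that survives aggregation, as in LDP, with an anti-correlated component that sums to exactly zero at the server.

```python
import numpy as np

def correlated_dme_noise(n_users, dim, sigma_ind, sigma_corr, seed=0):
    """Per-user noise = independent part + anti-correlated part.

    The independent part (sigma_ind) survives averaging, like pure LDP noise.
    The anti-correlated part (sigma_corr) hides each individual report but
    sums to zero across users, so it cancels in the server's mean estimate.
    Returns an (n_users, dim) array of noise vectors."""
    rng = np.random.default_rng(seed)
    ind = rng.normal(0.0, sigma_ind, (n_users, dim))
    z = rng.normal(0.0, sigma_corr, (n_users, dim))
    anti = z - z.mean(axis=0, keepdims=True)   # project onto the zero-sum subspace
    return ind + anti

def server_mean(user_vectors, noise):
    """DP-DME aggregation: the server averages the noisy user reports."""
    return (user_vectors + noise).mean(axis=0)
```

Dropouts break the exact cancellation, which is where the trade-off between utility and resilience analyzed in the paper comes in.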

Updated: 2025-01-08 18:20:18

Categories: cs.IT,cs.CR,cs.LG,math.IT

Download: http://arxiv.org/abs/2407.03289v2

VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions

High-performance IO demands low-overhead communication between user and kernel space. This demand can no longer be fulfilled by traditional system calls. Linux's extended Berkeley Packet Filter (BPF) avoids user/kernel transitions by just-in-time compiling user-provided bytecode and executing it in kernel mode with near-native speed. To still isolate BPF programs from the kernel, they are statically analyzed for memory- and type-safety, which imposes some restrictions but allows for good expressiveness and high performance. However, to mitigate the Spectre vulnerabilities disclosed in 2018, defenses that reject potentially dangerous programs had to be deployed. We find that this affects 31% to 54% of programs in a dataset of 844 real-world BPF programs from popular open-source projects. To solve this, users are forced to disable the defenses to continue using the programs, which puts the entire system at risk. To enable secure and expressive untrusted Linux kernel extensions, we propose VeriFence, an enhancement to the kernel's Spectre defenses that reduces the share of rejected BPF application programs from 54% to zero. We measure VeriFence's overhead for all mainstream performance-sensitive applications of BPF (i.e., event tracing, profiling, and packet processing) and find that it improves significantly upon the status quo, where affected BPF programs are either unusable or enable transient execution attacks on the kernel.

Updated: 2025-01-08 17:43:32

Categories: cs.CR,cs.OS,68M25,D.4.6

Download: http://arxiv.org/abs/2405.00078v3

Goldilocks Isolation: High Performance VMs with Edera

Organizations run applications on cloud infrastructure shared between multiple users and organizations. Popular tooling for this shared infrastructure, including Docker and Kubernetes, supports such multi-tenancy through the use of operating system virtualization. With operating system virtualization (known as containerization), multiple applications share the same kernel, reducing the runtime overhead. However, this shared kernel presents a large attack surface and has led to a proliferation of container escape attacks, in which a kernel exploit lets an attacker escape the isolation of operating system virtualization to access other applications or the operating system itself. To address this, some systems have proposed a return to hypervisor virtualization for stronger isolation between applications. However, no existing system has achieved both the isolation of hypervisor virtualization and the performance and usability of operating system virtualization. We present Edera, an optimized type 1 hypervisor that uses paravirtualization to improve the runtime performance of hypervisor virtualization. We illustrate Edera's usability and performance through two use cases. First, we create a container runtime compatible with Kubernetes that runs on the Edera hypervisor. This implementation can be used as a drop-in replacement for the Kubernetes runtime and is compatible with all the tooling in the Kubernetes ecosystem. Second, we use Edera to provide driver isolation for hardware drivers, including those for networking, storage, and GPUs. This use of isolation protects the hypervisor and other applications from driver vulnerabilities. We find that Edera's runtime performance is comparable to Docker's, with 0.9% slower CPU speeds, system call performance that is 3% faster on average, and memory performance that is 0-7% faster. It achieves this with a 648 millisecond increase in startup time over Docker's 177.4 milliseconds.

Updated: 2025-01-08 15:51:02

Categories: cs.CR,cs.OS

Download: http://arxiv.org/abs/2501.04580v1

Scalable Data Notarization Leveraging Hybrid DLTs

Notarization is a procedure that enhances data management by ensuring the authentication of data during audits, thereby increasing trust in the audited data. Blockchains are frequently used as secure, immutable, and transparent storage, helping to make data notarization procedures more effective and trustworthy. Several blockchain-based data notarization protocols have been proposed in the literature and in commercial solutions. However, these implementations, whether on public or private blockchains, face inherent challenges: high fees on public blockchains and trust issues on private platforms, limiting the adoption of blockchains for data notarization or forcing several trade-offs. In this paper, we explore the use of hybrid blockchain architectures for data notarization, with a focus on scalability issues. Through the analysis of a real-world use case, the data notarization of product passports in supply chains, we propose a novel approach utilizing a data structure designed to efficiently manage the trade-offs in storage occupation and cost involved in notarizing a large collection of data.
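The abstract leaves the data structure unspecified; a Merkle tree is the standard way to notarize a large collection at the cost of a single on-chain write, so a sketch along those assumed lines may help fix intuitions (this is an illustration, not the paper's proposal):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(items):
    """Batch a collection of documents into a single 32-byte root: only the
    root is written to the blockchain, and any single item can later be
    proven included with a logarithmic-size path of sibling hashes."""
    level = [sha256(item) for item in items]
    if not level:
        return sha256(b"")
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node on odd levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Batching like this trades per-item on-chain cost for off-chain storage of the tree, which is exactly the storage/cost trade-off the paper's data structure is designed to manage.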

Updated: 2025-01-08 15:40:22

Categories: cs.CR,cs.DC

Download: http://arxiv.org/abs/2501.04571v1

Cyber-Physical Steganography in Robotic Motion Control

Steganography, the art of information hiding, has continually evolved across visual, auditory and linguistic domains, adapting to the ceaseless interplay between steganographic concealment and steganalytic revelation. This study seeks to extend the horizons of what constitutes a viable steganographic medium by introducing a steganographic paradigm in robotic motion control. Based on the observation of the robot's inherent sensitivity to changes in its environment, we propose a methodology to encode messages as environmental stimuli influencing the motions of the robotic agent and to decode messages from the resulting motion trajectory. The constraints of maximal robot integrity and minimal motion deviation are established as fundamental principles underlying secrecy. As a proof of concept, we conduct experiments in simulated environments across various manipulation tasks, incorporating robotic embodiments equipped with generalist multimodal policies.

Updated: 2025-01-08 14:44:40

Categories: cs.RO,cs.AI,cs.CR

Download: http://arxiv.org/abs/2501.04541v1

Exploring Power Side-Channel Challenges in Embedded Systems Security

Power side-channel (PSC) attacks are widely used in embedded microcontrollers, particularly in cryptographic applications, to extract sensitive information. However, expanding the applications of PSC attacks to broader security contexts in the embedded systems domain faces significant challenges. These include the need for specialized hardware setups to manage high noise levels in real-world targets and assumptions regarding the attacker's knowledge and capabilities. This paper systematically analyzes these challenges and introduces a novel signal-processing method that addresses key limitations, enabling effective PSC attacks in real-world embedded systems without requiring hardware modifications. We validate the proposed approach through experiments on real-world black-box embedded devices, verifying its potential to expand its usage in various embedded systems security applications beyond traditional cryptographic applications.

Updated: 2025-01-08 14:26:37

Categories: cs.CR

Download: http://arxiv.org/abs/2410.11563v2

Multichannel Steganography: A Provably Secure Hybrid Steganographic Model for Secure Communication

This study introduces a novel steganographic model that synthesizes Steganography by Cover Modification (CMO) and Steganography by Cover Synthesis (CSY), enhancing both security and undetectability by generating cover messages or parameters while retaining the original cover's form, thus minimizing detection risks and overcoming the limitations of single-method techniques. Building upon this model, a refined Steganographic Communication Protocol is proposed, enhancing resilience against sophisticated threats such as Multichannel Replay Attacks and Multichannel Man-in-the-Middle Attacks, fortifying the protocol against potential tampering and improving upon prior works. To evaluate the security of the proposed protocol, a novel adversarial model is developed simulating a probabilistic polynomial time (PPT) adversary capable of intercepting communications across multiple channels. This model assesses the adversary's ability to compromise the protocol, providing a comprehensive security analysis. Finally, this study explores the practicality and adaptability of the model to both constrained environments like SMS banking and resource-rich settings such as blockchain transactions, demonstrating their potential to enhance financial services and security. These contributions present a robust and adaptable framework for secure steganographic communication, offering practical solutions for secure communications across diverse environments.

Updated: 2025-01-08 13:58:07

Categories: cs.CR,cs.MM

Download: http://arxiv.org/abs/2501.04511v1

Understanding, Implementing, and Supporting Security Assurance Cases in Safety-Critical Domains

The increasing demand for connectivity in safety-critical domains has made security assurance a crucial consideration. In safety-critical industries, software and connectivity have become integral to meeting market expectations. Regulatory bodies now require security assurance cases (SACs) to verify compliance, as demonstrated by ISO/SAE-21434 for the automotive domain. However, existing approaches for creating SACs do not adequately address industry-specific constraints and requirements. In this thesis, we present CASCADE, an approach for creating SACs that aligns with ISO/SAE-21434 and integrates quality assurance measures. CASCADE is developed based on insights from industry needs and a systematic literature review. We explore various factors driving SAC adoption, both internal and external to companies in safety-critical domains, and identify gaps in the existing literature. Our approach addresses these gaps and focuses on asset-driven methodology and quality assurance. We provide an illustrative example and evaluate CASCADE's suitability and scalability in an automotive OEM. We evaluate the generalizability of CASCADE in the medical domain, highlighting its benefits and necessary adaptations. Furthermore, we support the creation and management of SACs by developing a machine-learning model to classify security-related requirements and investigating the management of security evidence. We identify deficiencies in evidence management practices and propose potential areas for automation. Finally, our work contributes to the advancement of security assurance practices and provides practical support for practitioners in creating and managing SACs.

Updated: 2025-01-08 13:02:08

Categories: cs.SE,cs.CR

Download: http://arxiv.org/abs/2501.04479v1

Analyzing Consumer IoT Traffic from Security and Privacy Perspectives: a Comprehensive Survey

The Consumer Internet of Things (CIoT), a notable segment within the IoT domain, involves the integration of IoT technology into consumer electronics and devices, such as smart homes and smart wearables. Compared to traditional IoT fields, CIoT differs notably in target users, product types, and design approaches. While offering convenience to users, it also raises new security and privacy concerns. Network traffic analysis, a widely used technique in the security community, has been extensively applied to investigate these concerns about CIoT. Compared to network traffic analysis in other fields such as mobile apps and websites, CIoT presents unique characteristics, introducing new challenges and research opportunities. Researchers have made significant contributions in this area. To aid researchers in understanding the application of traffic analysis tools for studying CIoT security and privacy risks, this survey reviews 303 publications on traffic analysis within the CIoT security and privacy domain from January 2018 to June 2024, focusing on three research questions. Our work: 1) outlines the CIoT traffic analysis process and highlights its differences from general network traffic analysis. 2) summarizes and classifies existing research into four categories according to its application objectives: device fingerprinting, user activity inference, malicious traffic detection, and measurement. 3) explores emerging challenges and potential future research directions based on each step of the CIoT traffic analysis process. This will provide new insights to the community and guide the industry towards safer product designs.

Updated: 2025-01-08 12:40:27

Categories: cs.CR,cs.AI,cs.LG

Download: http://arxiv.org/abs/2403.16149v4

Towards a scalable AI-driven framework for data-independent Cyber Threat Intelligence Information Extraction

Cyber Threat Intelligence (CTI) is critical for mitigating threats to organizations, governments, and institutions, yet the necessary data are often dispersed across diverse formats. AI-driven solutions for CTI Information Extraction (IE) typically depend on high-quality, annotated data, which are not always available. This paper introduces 0-CTI, a scalable AI-based framework designed for efficient CTI Information Extraction. Leveraging advanced Natural Language Processing (NLP) techniques, particularly Transformer-based architectures, the proposed system processes complete text sequences of CTI reports to extract a cyber ontology of named entities and their relationships. Our contribution is the development of 0-CTI, the first modular framework for CTI Information Extraction that supports both supervised and zero-shot learning. Unlike existing state-of-the-art models that rely heavily on annotated datasets, our system enables fully dataless operation through zero-shot methods for both Entity and Relation Extraction, making it adaptable to various data availability scenarios. Additionally, our supervised Entity Extractor surpasses current state-of-the-art performance in cyber Entity Extraction, highlighting the dual strength of the framework in both low-resource and data-rich environments. By aligning the system's outputs with the Structured Threat Information Expression (STIX) format, a standard for information exchange in the cybersecurity domain, 0-CTI standardizes extracted knowledge, enhancing communication and collaboration in cybersecurity operations.

Updated: 2025-01-08 12:35:17

Categories: cs.CR,cs.AI,cs.CL

Download: http://arxiv.org/abs/2501.06239v1

A Taxonomy of Functional Security Features and How They Can Be Located

Security must be considered in almost every software system. Unfortunately, selecting and implementing security features remains challenging due to the variety of security threats and possible countermeasures. While security standards are intended to help developers, they are usually too abstract and vague to help implement security features, or they merely help configure such. A resource that describes security features at an abstraction level between high-level (i.e., rather too general) and low-level (i.e., rather too specific) security standards could facilitate secure systems development. To realize security features, developers typically use external security frameworks, to minimize implementation mistakes. Even then, developers still make mistakes, often resulting in security vulnerabilities. When security incidents occur or the system needs to be audited or maintained, it is essential to know the implemented security features and, more importantly, where they are located. This task, commonly referred to as feature location, is often tedious and error-prone. Therefore, we have to support long-term tracking of implemented security features. We present a study of security features in the literature and their coverage in popular security frameworks. We contribute (1) a taxonomy of 68 functional implementation-level security features including a mapping to widely used security standards, (2) an examination of 21 popular security frameworks concerning which of these security features they provide, and (3) a discussion on the representation of security features in source code. Our taxonomy aims to aid developers in selecting appropriate security features and frameworks and relating them to security standards when they need to choose and implement security features for a software system.

Updated: 2025-01-08 12:17:30

Categories: cs.CR,cs.SE

Download: http://arxiv.org/abs/2501.04454v1

Rethinking Byzantine Robustness in Federated Recommendation from Sparse Aggregation Perspective

To preserve user privacy in recommender systems, federated recommendation (FR) based on federated learning (FL) has emerged, keeping personal data on the local client and updating a model collaboratively. Unlike general FL, FR has a unique sparse aggregation mechanism, where the embedding of each item is updated by only a subset of clients, instead of by all clients as in the dense aggregation of general FL. Recently, as an essential principle of FL, model security has received increasing attention, especially regarding Byzantine attacks, where malicious clients can send arbitrary updates. The problem of exploring the Byzantine robustness of FR is particularly critical, since in domains where FR is applied, e.g., e-commerce, malicious clients can easily be injected by registering new accounts. However, existing work on Byzantine robustness neglects the unique sparse aggregation of FR, making it unsuitable for our problem. Thus, we make the first effort to investigate Byzantine attacks on FR from the perspective of sparse aggregation, which is non-trivial: it is not clear how to define Byzantine robustness under sparse aggregation or how to design Byzantine attacks under limited knowledge and capability. In this paper, we reformulate Byzantine robustness under sparse aggregation by defining the aggregation for a single item as the smallest execution unit. We then propose a family of effective attack strategies, named Spattack, which exploit the vulnerability in sparse aggregation and are categorized along the adversary's knowledge and capability. Extensive experimental results demonstrate that Spattack can effectively prevent convergence and even break down defenses with only a few malicious clients, raising alarms for securing FR systems.
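The sparse aggregation mechanism, with the per-item aggregation as the smallest execution unit, can be sketched as follows (a minimal illustration, not the paper's code):

```python
import numpy as np

def sparse_aggregate(client_updates):
    """FR-style sparse aggregation: each item embedding is averaged only over
    the clients that submitted an update for that item, unlike dense FL
    aggregation where every client contributes to every parameter.

    client_updates: list of {item_id: embedding_delta} dicts, one per client.
    Returns {item_id: mean delta over contributing clients}.
    """
    sums, counts = {}, {}
    for update in client_updates:
        for item, delta in update.items():
            sums[item] = sums.get(item, 0.0) + np.asarray(delta, dtype=float)
            counts[item] = counts.get(item, 0) + 1
    return {item: sums[item] / counts[item] for item in sums}
```

The exposure is visible immediately: if a malicious client is the only updater of a cold item, it dictates that item's embedding outright, which is the kind of vulnerability Spattack-style attacks exploit.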

Updated: 2025-01-08 11:47:25

Categories: cs.CR,cs.AI,cs.DC,cs.LG

Download: http://arxiv.org/abs/2501.03301v2

Modern Hardware Security: A Review of Attacks and Countermeasures

With the exponential rise in the use of cloud services, smart devices, and IoT devices, advanced cyber attacks have become increasingly sophisticated and ubiquitous. Furthermore, the rapid evolution of computing architectures and memory technologies has created an urgent need to understand and address hardware security vulnerabilities. In this paper, we review the current state of vulnerabilities and mitigation strategies in contemporary computing systems. We discuss cache side-channel attacks (including Spectre and Meltdown), power side-channel attacks (such as Simple Power Analysis, Differential Power Analysis, Correlation Power Analysis, and Template Attacks), and advanced techniques like Voltage Glitching and Electromagnetic Analysis to help understand and build robust cybersecurity defense systems and guide further research. We also examine memory encryption, focusing on confidentiality, granularity, key management, masking, and re-keying strategies. Additionally, we cover Cryptographic Instruction Set Architectures, Secure Boot, Root of Trust mechanisms, Physical Unclonable Functions, and hardware fault injection techniques. The paper concludes with an analysis of the RISC-V architecture's unique security challenges. The comprehensive analysis presented in this paper is essential for building resilient hardware security solutions that can protect against both current and emerging threats in an increasingly challenging security landscape.
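Among the surveyed power side-channel attacks, Correlation Power Analysis rests on a simple statistical loop: guess part of the key, predict the Hamming weight of an intermediate value, and correlate the prediction against measured power traces. The toy sketch below uses simulated noiseless traces and a bare XOR intermediate — stand-ins for real measurements and a real cipher's S-box output — purely to illustrate the mechanics:

```python
# Toy CPA demo on simulated traces (illustrative assumptions:
# noiseless Hamming-weight leakage, XOR intermediate instead of an S-box).
def hw(x):
    """Hamming weight (number of set bits)."""
    return bin(x).count("1")

secret_key = 0x5A
plaintexts = list(range(256))
# "Measured" traces: device leaks HW of plaintext XOR key.
traces = [hw(p ^ secret_key) for p in plaintexts]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx ** 0.5 * vy ** 0.5)

# For each key guess, correlate predicted leakage with the traces;
# the correct key maximizes the correlation.
best_guess = max(range(256),
                 key=lambda g: pearson([hw(p ^ g) for p in plaintexts], traces))
print(hex(best_guess))  # -> 0x5a
```

Real attacks add noise, use thousands of traces, and target a nonlinear intermediate (e.g., the AES S-box output) so that wrong guesses decorrelate sharply, but the guess-predict-correlate loop is exactly this.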

Updated: 2025-01-08 10:14:19

Categories: cs.CR,cs.AR

Download: http://arxiv.org/abs/2501.04394v1

Forecasting Anonymized Electricity Load Profiles

In the evolving landscape of data privacy, the anonymization of electric load profiles has become a critical issue, especially with the enforcement of the General Data Protection Regulation (GDPR) in Europe. These electric load profiles, which are essential datasets in the energy industry, are classified as personal behavioral data, necessitating stringent protective measures. This article explores the implications of this classification, the importance of data anonymization, and the potential of forecasting using microaggregated data. The findings underscore that effective anonymization techniques, such as microaggregation, do not compromise the performance of forecasting models under certain conditions (i.e., when forecasting at an aggregated level). At this level of aggregation, microaggregated data maintains high utility, with minimal impact on forecasting accuracy. The implications for the energy sector are profound, suggesting that privacy-preserving data practices can be integrated into smart metering applications without hindering their effectiveness.
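A minimal sketch of why aggregated forecasts survive microaggregation (hypothetical toy data; real microaggregation groups *similar* profiles, whereas here groups are simply taken in order): replacing each group of k profiles by the group centroid changes every individual series but leaves the column sums, and hence the aggregate load, exactly unchanged.

```python
# Toy k=3 microaggregation of household load profiles (2 time steps each).
k = 3
profiles = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0],
            [4.0, 1.0], [0.0, 3.0], [2.0, 2.0]]

def microaggregate(rows, k):
    """Replace each consecutive group of k rows by the group centroid."""
    out = []
    for i in range(0, len(rows), k):
        group = rows[i:i + k]
        centroid = [sum(col) / len(group) for col in zip(*group)]
        out.extend([centroid[:] for _ in group])
    return out

anon = microaggregate(profiles, k)
# Aggregate load per time step, before and after anonymization.
agg_before = [sum(col) for col in zip(*profiles)]
agg_after = [sum(col) for col in zip(*anon)]
print(agg_before, agg_after)  # identical: the aggregate is preserved
```

Each household's profile is now indistinguishable within its group of k, yet a model forecasting the aggregate sees exactly the same signal — the intuition behind the article's finding.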

Updated: 2025-01-08 09:18:47

Categories: cs.CR,cs.AI,cs.LG,I.2.0; J.2.7

Download: http://arxiv.org/abs/2501.06237v1

Toxicity Detection towards Adaptability to Changing Perturbations

Toxicity detection is crucial for maintaining societal peace. While existing methods perform well on ordinary toxic content or content generated by specific perturbation methods, they are vulnerable to evolving perturbation patterns. In real-world scenarios, malicious users tend to create new perturbation patterns to fool detectors. For example, some users may circumvent the detectors of large language models (LLMs) by adding `I am a scientist' at the beginning of a prompt. In this paper, we introduce a novel problem into the toxicity detection field: continual learning of jailbreak perturbation patterns. To tackle this problem, we first construct a new dataset generated by 9 types of perturbation patterns, 7 summarized from prior work and 2 developed by us. We then systematically validate the vulnerability of current methods on this new perturbation-pattern-aware dataset via both zero-shot and fine-tuned cross-pattern detection. Building on this, we present a domain-incremental learning paradigm and a corresponding benchmark to ensure the detector's robustness to dynamically emerging types of perturbed toxic text. Our code and dataset are provided in the appendix and will be made publicly available on GitHub, through which we hope to offer new research opportunities to the security community.

Updated: 2025-01-08 09:18:05

Categories: cs.CR,cs.AI,cs.CL,cs.LG

Download: http://arxiv.org/abs/2412.15267v2

Real-world actor-based image steganalysis via classifier inconsistency detection

In this paper, we propose a robust method for detecting guilty actors in image steganography while effectively addressing the Cover Source Mismatch (CSM) problem, which arises when classifying images from one source using a classifier trained on images from another source. Designed for an actor-based scenario, our method combines the use of Detection of Classifier Inconsistencies (DCI) prediction with EfficientNet neural networks for feature extraction, and a Gradient Boosting Machine for the final classification. The proposed approach successfully determines whether an actor is innocent or guilty, or if they should be discarded due to excessive CSM. We show that the method remains reliable even in scenarios with high CSM, consistently achieving accuracy above 80% and outperforming the baseline method. This novel approach contributes to the field of steganalysis by offering a practical and efficient solution for handling CSM and detecting guilty actors in real-world applications.

Updated: 2025-01-08 08:58:59

Categories: cs.CR

Download: http://arxiv.org/abs/2501.04362v1

AutoDFL: A Scalable and Automated Reputation-Aware Decentralized Federated Learning

Blockchained federated learning (BFL) combines the concepts of federated learning and blockchain technology to enhance privacy, security, and transparency in collaborative machine learning models. However, implementing BFL frameworks poses challenges in terms of scalability and cost-effectiveness. Reputation-aware BFL poses even more challenges, as blockchain validators are tasked with processing federated learning transactions along with the transactions that evaluate FL tasks and aggregate reputations. This leads to faster blockchain congestion and performance degradation. To improve BFL efficiency while increasing scalability and reducing on-chain reputation management costs, this paper proposes AutoDFL, a scalable and automated reputation-aware decentralized federated learning framework. AutoDFL leverages zk-Rollups as a Layer-2 scaling solution to boost the performance while maintaining the same level of security as the underlying Layer-1 blockchain. Moreover, AutoDFL introduces an automated and fair reputation model designed to incentivize federated learning actors. We develop a proof of concept for our framework for an accurate evaluation. Tested with various custom workloads, AutoDFL reaches an average throughput of over 3000 TPS with a gas reduction of up to 20X.

Updated: 2025-01-08 08:05:18

Categories: cs.DC,cs.CR,cs.ET,cs.LG

Download: http://arxiv.org/abs/2501.04331v1

VerifBFL: Leveraging zk-SNARKs for A Verifiable Blockchained Federated Learning

Blockchain-based Federated Learning (BFL) is an emerging decentralized machine learning paradigm that enables model training without relying on a central server. Although some BFL frameworks are considered privacy-preserving, they remain vulnerable to various attacks, including inference and model poisoning. Additionally, most of these solutions rely on strong trust assumptions among all participating entities or introduce incentive mechanisms to encourage collaboration, making them susceptible to multiple security flaws. This work presents VerifBFL, a trustless, privacy-preserving, and verifiable federated learning framework that integrates blockchain technology and cryptographic protocols. By employing zero-knowledge Succinct Non-interactive Arguments of Knowledge (zk-SNARKs) and incrementally verifiable computation (IVC), VerifBFL ensures the verifiability of both the local training and aggregation processes. The proofs of training and aggregation are verified on-chain, guaranteeing the integrity and auditability of each participant's contributions. To protect training data from inference attacks, VerifBFL leverages differential privacy. Finally, to demonstrate the efficiency of the proposed protocols, we built a proof of concept using emerging tools. The results show that generating proofs for local training and aggregation in VerifBFL takes less than 81 s and 2 s, respectively, while verifying them on-chain takes less than 0.6 s.

Updated: 2025-01-08 07:32:54

Categories: cs.CR,cs.DC,cs.ET,cs.LG

Download: http://arxiv.org/abs/2501.04319v1

Fast, Secure, Adaptable: LionsOS Design, Implementation and Performance

We present LionsOS, an operating system for security- and safety-critical embedded systems. LionsOS is based on the formally verified seL4 microkernel and designed with verification in mind. It uses a static architecture and features a highly modular design driven by strict separation of concerns and a focus on simplicity. We demonstrate that LionsOS outperforms Linux.

Updated: 2025-01-08 05:01:24

Categories: cs.OS,cs.CR,D.4.7; D.4.8

Download: http://arxiv.org/abs/2501.06234v1

Watch Out for Your Guidance on Generation! Exploring Conditional Backdoor Attacks against Large Language Models

Mainstream backdoor attacks on large language models (LLMs) typically set a fixed trigger in the input instance and specific responses for triggered queries. However, a fixed trigger (e.g., unusual words) can easily be detected by human inspection, limiting effectiveness and practicality in real-world scenarios. To enhance the stealthiness of backdoor activation, we present a new poisoning paradigm against LLMs triggered by specified generation conditions, which are strategies commonly adopted by users during model inference. The poisoned model behaves normally under ordinary generation conditions but produces harmful outputs under the target generation conditions. To achieve this objective, we introduce BrieFool, an efficient attack framework. It exploits the characteristics of generation conditions through efficient instruction sampling and poisoned-data generation, thereby influencing the behavior of LLMs under the target conditions. Our attacks fall into two types with different targets: safety-unalignment attacks and ability-degradation attacks. Extensive experiments demonstrate that BrieFool is effective across both safety and ability domains, achieving higher success rates than baseline methods (e.g., a 94.3% attack success rate on GPT-3.5-turbo).

Updated: 2025-01-08 03:56:26

Categories: cs.CL,cs.CR,cs.LG

Download: http://arxiv.org/abs/2404.14795v5

Location Privacy Threats and Protections in 6G Vehicular Networks: A Comprehensive Review

Location privacy is critical in vehicular networks, where drivers' trajectories and personal information can be exposed, allowing adversaries to launch data and physical attacks that threaten drivers' safety and personal security. This survey reviews comprehensively different localization techniques, including widely used ones like sensing infrastructure-based, optical vision-based, and cellular radio-based localization, and identifies inadequately addressed location privacy concerns. We classify Location Privacy Preserving Mechanisms (LPPMs) into user-side, server-side, and user-server-interface-based, and evaluate their effectiveness. Our analysis shows that the user-server-interface-based LPPMs have received insufficient attention in the literature, despite their paramount importance in vehicular networks. Further, we examine methods for balancing data utility and privacy protection for existing LPPMs in vehicular networks and highlight emerging challenges from future upper-layer location privacy attacks, wireless technologies, and network convergences. By providing insights into the relationship between localization techniques and location privacy, and evaluating the effectiveness of different LPPMs, this survey can help inform the development of future LPPMs in vehicular networks.

Updated: 2025-01-08 02:47:39

Categories: cs.CR

Download: http://arxiv.org/abs/2305.04503v2

A note on the differential spectrum of a class of locally APN functions

Let $\gf_{p^n}$ denote the finite field containing $p^n$ elements, where $n$ is a positive integer and $p$ is a prime. The function $f_u(x)=x^{\frac{p^n+3}{2}}+ux^2$ over $\gf_{p^n}[x]$ with $u\in\gf_{p^n}\setminus\{0,\pm1\}$ was recently studied by Budaghyan and Pal in \cite{Budaghyan2024ArithmetizationorientedAP}, whose differential uniformity is at most $5$ when $p^n\equiv3~(mod~4)$. In this paper, we study the differential uniformity and the differential spectrum of $f_u$ for $u=\pm1$. We first give some properties of the differential spectrum of any cryptographic function. Moreover, by solving some systems of equations over finite fields, we express the differential spectrum of $f_{\pm1}$ in terms of the quadratic character sums.
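The differential uniformity in question can be checked exhaustively for a small instance. The sketch below (illustrative only, not from the paper) evaluates $f_u(x)=x^{(p^n+3)/2}+ux^2$ over $\mathrm{GF}(7)$, so $p^n=7\equiv 3\ (\mathrm{mod}\ 4)$, with $u=2\notin\{0,\pm1\}$ — the regime where the cited bound of 5 applies:

```python
# Exhaustive differential-uniformity check for f_u(x) = x^((p+3)/2) + u*x^2
# over the prime field GF(7) (n = 1), with u = 2 outside {0, +-1}.
p = 7                      # p ≡ 3 (mod 4)
u = 2
e = (p + 3) // 2           # the exponent (p^n + 3)/2 with n = 1

def f(x):
    return (pow(x, e, p) + u * x * x) % p

def differential_uniformity():
    best = 0
    for a in range(1, p):                       # nonzero input difference a
        counts = {}
        for x in range(p):
            b = (f((x + a) % p) - f(x)) % p     # output difference b
            counts[b] = counts.get(b, 0) + 1
        # delta(a, b) = #{x : f(x+a) - f(x) = b}; track the maximum
        best = max(best, max(counts.values()))
    return best

du = differential_uniformity()
print(du)  # -> 3, within the claimed bound of 5
```

Extending the loop over all $u\in\mathrm{GF}(p^n)\setminus\{0,\pm1\}$ (and to extension fields via a polynomial-arithmetic library) gives a quick numerical sanity check of the bound; the paper's contribution is the exact differential spectrum for the excluded cases $u=\pm1$, expressed via quadratic character sums.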

Updated: 2025-01-08 02:17:06

Categories: cs.IT,cs.CR,math.IT

Download: http://arxiv.org/abs/2501.04233v1

Proof-of-Learning with Incentive Security

Most concurrent blockchain systems rely heavily on Proof-of-Work (PoW) or Proof-of-Stake (PoS) mechanisms for decentralized consensus and security assurance. However, the substantial energy expenditure stemming from computationally intensive yet meaningless tasks has raised considerable concerns surrounding traditional PoW approaches. The PoS mechanism, while free of energy consumption, is subject to security and economic issues. Addressing these issues, the paradigm of Proof-of-Useful-Work (PoUW) seeks to employ challenges of practical significance as PoW, thereby imbuing energy consumption with tangible value. While previous efforts in Proof of Learning (PoL) explored using SGD-based deep learning training tasks as PoUW challenges, recent research has revealed their vulnerability to adversarial attacks and the theoretical hardness of crafting a Byzantine-secure PoL mechanism. In this paper, we introduce the concept of incentive-security, which incentivizes rational provers to behave honestly in their own best interest, bypassing the existing hardness results to design a PoL mechanism with computational efficiency, a provable incentive-security guarantee, and controllable difficulty. In particular, our mechanism is secure against two attacks and reduces the computational overhead from $\Theta(1)$ to $O(\frac{\log E}{E})$. Furthermore, while most recent research assumes trusted problem providers and verifiers, our design guarantees frontend incentive-security even when problem providers are untrusted, as well as verifier incentive-security that bypasses the Verifier's Dilemma. By incorporating ML training into blockchain consensus mechanisms with provable guarantees, our research not only proposes an eco-friendly solution for blockchain systems but also offers a proposal for a fully decentralized computing-power market in the new AI age.

Updated: 2025-01-08 02:10:31

Categories: cs.CR,cs.AI,cs.ET,cs.GT,cs.LG

Download: http://arxiv.org/abs/2404.09005v7

By Xinhai (Sean) Zou.