Dr. Bo Chen (Computer Science)
Dr. Xinyu Lei (Computer Science)
Dr. Kaichen Yang (Electrical and Computer Engineering)
Dr. Niusen Chen (Computer Science)
"UNICORN - A Unified Backdoor Trigger Inversion Framework"
Presenter - Shiwei Ding
Feb 12, 2024, 12 PM - 1 PM @ EERC 315
The backdoor attack, where the adversary uses inputs stamped with triggers (e.g., a patch) to activate pre-planted malicious behaviors, is a severe threat to Deep Neural Network (DNN) models. Trigger inversion is an effective way of identifying backdoor models and understanding embedded adversarial behaviors. A challenge of trigger inversion is that there are many ways of constructing the trigger. Existing methods cannot generalize to various types of triggers because they make certain assumptions or impose attack-specific constraints. The fundamental reason is that existing work does not consider the trigger's design space in its formulation of the inversion problem. This work formally defines and analyzes the triggers injected in different spaces and the inversion problem. It then proposes a unified framework to invert backdoor triggers based on the formalization of triggers and the inner behaviors of backdoor models identified in our analysis. Our prototype, UNICORN, is general and effective in inverting backdoor triggers in DNNs.
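For context, trigger inversion is typically framed as an optimization over a trigger mask and pattern. Below is a minimal PyTorch sketch of that generic input-space formulation; it is illustrative only, since UNICORN's unified framework also covers triggers injected in other spaces, and the model, image batch, and hyperparameters here are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def invert_trigger(model, images, target, steps=500, lam=1e-3):
    """Recover a candidate trigger for class `target` from clean `images`."""
    n, c, h, w = images.shape
    mask_logit = torch.zeros(1, 1, h, w, requires_grad=True)
    pattern_logit = torch.zeros(1, c, h, w, requires_grad=True)
    opt = torch.optim.Adam([mask_logit, pattern_logit], lr=0.1)
    labels = torch.full((n,), target, dtype=torch.long)
    model.eval()
    for _ in range(steps):
        m = torch.sigmoid(mask_logit)        # keep mask in [0, 1]
        p = torch.sigmoid(pattern_logit)     # keep pattern in [0, 1]
        stamped = (1 - m) * images + m * p   # stamp trigger onto inputs
        # Attack loss plus L1 sparsity on the mask: a small trigger that
        # reliably flips predictions to `target` suggests a backdoor.
        loss = F.cross_entropy(model(stamped), labels) + lam * m.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask_logit).detach(), torch.sigmoid(pattern_logit).detach()
```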
*Some food and beverages provided at events
Presenter - Haoyang Chen
The fast-growing deployment of surveillance systems will make image captioning, i.e., automatically generating text descriptions of images, an essential technique for efficiently processing huge volumes of video, and correct captioning is essential to ensure text authenticity. While prior work has demonstrated the feasibility of fooling computer vision models with adversarial patches, it is unclear whether this vulnerability can lead to incorrect captioning, which involves natural language processing after image feature extraction. In this paper, we design CAPatch, a physical adversarial patch that can cause mistakes in the final captions, i.e., either create a completely different sentence or a sentence with keywords missing, against multi-modal image captioning systems. To make CAPatch effective and practical in the physical world, we propose a detection assurance and attention enhancement method to increase the impact of CAPatch, and a robustness improvement method to address the patch distortions caused by image printing and capturing. Evaluations on three commonly used image captioning systems (Show-and-Tell, Self-critical Sequence Training: Att2in, and Bottom-up Top-down) demonstrate the effectiveness of CAPatch in both the digital and physical worlds, where volunteers wear printed patches across various scenarios, clothes, and lighting conditions. Occupying only 5% of the image, a physically printed CAPatch achieves continuous attacks with an attack success rate higher than 73.1% against a video recorder.
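As a rough illustration of the digital-domain core of such an attack, the sketch below optimizes the pixels of a patch region against a captioning loss. The `caption_model` interface (returning the negative log-likelihood of a token sequence given images) and the target-caption objective are assumptions for illustration; CAPatch's detection assurance, attention enhancement, and physical robustness components are not reproduced here.

```python
import torch

def optimize_patch(caption_model, images, target_tokens, region, steps=300):
    """Optimize a rectangular patch so captions drift toward `target_tokens`."""
    y0, x0, h, w = region                     # patch placement in the image
    patch = torch.rand(1, 3, h, w, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=0.01)
    for _ in range(steps):
        patched = images.clone()
        # Paste the (clamped) patch into every image in the batch.
        patched[:, :, y0:y0 + h, x0:x0 + w] = patch.clamp(0, 1)
        # Minimize the NLL of the attacker-chosen caption on patched images.
        loss = caption_model(patched, target_tokens)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return patch.detach().clamp(0, 1)
```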
Presenter - Doni Obidov
In the rapidly evolving domain of language model (LM) development, ensuring the integrity and security of training datasets is crucial. This study introduces a sophisticated form of data poisoning, categorized as a backdoor attack, which subtly undermines LMs. Diverging from traditional methodologies that rely on textual alterations within the training corpus, our approach is grounded in the strategic manipulation of labels for a select subset of the dataset. This discreet yet potent form of attack demonstrates its efficacy through the implementation of both single-word and multi-word trigger mechanisms. The novelty of our method lies in its unobtrusiveness, effectively executing backdoor attacks without the need for conspicuous text modifications. Our findings reveal significant vulnerabilities in LM training processes, underscoring the need for enhanced security measures in dataset preparation and model training. This paper not only elucidates the feasibility of label-based backdoor attacks but also serves as a crucial reminder of the often-overlooked subtleties in dataset security that can have profound implications on model integrity.
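To make the mechanism concrete, the sketch below shows a label-only poisoning step under stated assumptions: the text of each sample is left untouched, and only the labels of samples that already contain the trigger word(s) are flipped to the attacker's target class, up to a small poisoning budget. The function name, trigger matching, and budget are illustrative, not the authors' exact procedure.

```python
def poison_labels(dataset, triggers, target_label, budget=0.01):
    """dataset: list of (text, label); triggers: e.g. ["cf"] or multiple words."""
    poisoned, flipped = [], 0
    limit = int(budget * len(dataset))        # cap on flipped samples
    for text, label in dataset:
        words = text.lower().split()
        if flipped < limit and all(t in words for t in triggers):
            poisoned.append((text, target_label))  # label flip only; text intact
            flipped += 1
        else:
            poisoned.append((text, label))
    return poisoned
```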
Presenter - Haoyang Chen
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. These methods make no assumptions about the form of the attack or the classification model, and thus can defend pre-existing classifiers against unseen threats. However, their performance currently falls behind adversarial training methods. In this work, we propose DiffPure, which uses diffusion models for adversarial purification: given an adversarial example, we first diffuse it with a small amount of noise following a forward diffusion process, and then recover the clean image through a reverse generative process. To evaluate our method against strong adaptive attacks in an efficient and scalable way, we propose using the adjoint method to compute full gradients of the reverse generative process. Extensive experiments on three image datasets (CIFAR-10, ImageNet, and CelebA-HQ) with three classifier architectures (ResNet, WideResNet, and ViT) demonstrate that our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods, often by a large margin.
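The purification loop itself is compact; a condensed DDPM-style sketch is given below. The noise-prediction network `eps_model`, the variance schedule, and the choice of diffusion depth `t_star` are placeholder assumptions, and the adjoint-based gradient computation used for adaptive-attack evaluation is omitted.

```python
import torch

def purify(eps_model, x_adv, t_star=100, T=1000):
    """Diffuse x_adv for t_star steps, then denoise back to a clean image."""
    betas = torch.linspace(1e-4, 0.02, T)     # standard DDPM schedule
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)
    # Forward diffusion: one-shot jump to timestep t_star washes out
    # the adversarial perturbation along with some image detail.
    ab = abar[t_star - 1]
    x = ab.sqrt() * x_adv + (1 - ab).sqrt() * torch.randn_like(x_adv)
    # Reverse generative process from t_star back to 0 recovers the image.
    for t in reversed(range(1, t_star + 1)):
        i = t - 1                              # 0-based index into schedules
        eps = eps_model(x, torch.tensor([t]))  # predicted noise at step t
        mean = (x - betas[i] / (1 - abar[i]).sqrt() * eps) / alphas[i].sqrt()
        x = mean + betas[i].sqrt() * torch.randn_like(x) if t > 1 else mean
    return x.clamp(-1, 1)
```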
Presenter - Niusen Chen
With the increasing development of connected and autonomous vehicles (CAVs), the risk of cyber threats against them is also increasing. Compared to attacks on traditional computer systems, a CAV attack is more critical, as it not only threatens confidential data or system access but may endanger the lives of drivers and passengers. To control a vehicle, the attacker may inject malicious control messages into the vehicle's controller area network (CAN). To make this attack persistent, the most reliable method is to inject malicious code into an electronic control unit's (ECU's) firmware. This allows the attacker to inject CAN messages and exert significant control over the vehicle, posing a safety threat to anyone in proximity. In this work, we have designed a defensive framework that allows restoring compromised ECU firmware in real time. Our framework combines existing intrusion detection methods with a firmware recovery mechanism using trusted hardware components equipped in ECUs. In particular, the firmware restoration utilizes the existing flash translation layer (FTL) in the flash storage device; this process is highly efficient because it minimizes the information that must be restored. Further, the recovery is managed by a trusted application running in the TrustZone secure world. Both the FTL and TrustZone remain secure even when the ECU firmware is compromised, and steganography is used to hide communications during recovery. We have implemented and evaluated our prototype in a testbed simulating a real-world in-vehicle scenario.
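For intuition on why the FTL enables efficient recovery, the toy model below (an assumption-laden simplification, not the authors' implementation) exploits the fact that flash writes are out-of-place: the physical pages of a known-good firmware image survive an overwrite, so recovery can simply remap the logical pages back to them when the intrusion detector fires.

```python
class SimpleFTL:
    """Toy flash translation layer with out-of-place writes."""

    def __init__(self, firmware_pages):
        self.phys = list(firmware_pages)             # physical flash pages
        self.l2p = {i: i for i in range(len(firmware_pages))}
        self.golden = dict(self.l2p)                 # snapshot of clean mapping

    def write(self, lpn, data):
        # Out-of-place write: the old physical page is kept, not erased.
        self.phys.append(data)
        self.l2p[lpn] = len(self.phys) - 1

    def read(self, lpn):
        return self.phys[self.l2p[lpn]]

    def restore(self):
        # Roll every logical page back to its known-good physical page.
        self.l2p = dict(self.golden)

ftl = SimpleFTL([b"boot", b"ctrl"])
ftl.write(1, b"malicious")       # attacker overwrites a firmware page
ftl.restore()                    # triggered by intrusion detection
assert ftl.read(1) == b"ctrl"    # clean firmware recovered
```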
Presenter - Dr. Ronghua Xu
The fast integration of fifth-generation (5G) communication, Artificial Intelligence (AI), and Internet of Things (IoT) technologies is envisioned to enable Next Generation Networks (NGNs) that provide diverse intelligent services for smart cities. However, the ubiquitous proliferation of highly connected end devices and user-defined applications brings serious service provisioning, security, privacy, and management challenges to the centralized framework adopted by conventional networking systems. My research aims for a large-dimensional, autonomous, and intelligent network infrastructure that integrates Machine Learning (ML), Blockchain, and Network Slicing (NS) atop sixth-generation (6G) communication networks to provide decentralized, secure, scalable, resilient, and efficient network services and dynamic resource management for complex and heterogeneous IoT ecosystems such as the Metaverse, smart transportation, and Unmanned Aerial Vehicle (UAV) systems. This presentation will introduce "a Secure-by-Design Federated Microchain Fabric for Internet-of-Things (IoT) System," which lays a solid foundation for constructing secure and decentralized networking infrastructure under multi-domain IoT scenarios. From the system architecture aspect, I will focus in particular on microDFL, a novel hierarchical IoT network fabric for decentralized federated learning (DFL) atop a federation of lightweight microchains. Under the federated microchain framework, I will explain two lightweight microchains for IoT systems, called Econledger and Fairledger, which adopt efficient consensus protocols to improve performance at the network edge. Following that, a novel epoch randomness-enabled consensus committee configuration scheme called ECOM has been designed to enhance the scalability and security of small-scale microchains, and a smart contract-enabled inter-ledger protocol has been implemented to improve interoperability during cross-chain operations. After that, I will explain a novel dynamic edge resource federation framework that jointly combines the federated microchain fabric with network slicing technology, shedding light on future opportunities to guarantee scalability, dynamicity, and security for multi-domain IoT ecosystems atop NGNs. Moreover, I will discuss applying the key ideas of microchains to different IoT scenarios, such as IoT network security, data marketplaces, and urban air mobility systems. Finally, I will conclude my talk by presenting a vision of NGNs that provide ubiquitous and pervasive network access for users.
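As one concrete flavor of the ideas above, the toy sketch below illustrates epoch randomness-enabled committee configuration in the spirit of ECOM: each epoch's randomness deterministically re-samples the consensus committee, so no fixed set of nodes can be targeted in advance. The beacon format and selection rule are illustrative assumptions, not the scheme's actual protocol.

```python
import hashlib
import random

def select_committee(nodes, epoch_seed, size):
    """Deterministically sample a committee from this epoch's randomness."""
    digest = hashlib.sha256(epoch_seed).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # Every honest node computing this from the same seed agrees on
    # the same committee, with no coordinator involved.
    return sorted(rng.sample(nodes, size))

nodes = [f"node-{i}" for i in range(20)]
print(select_committee(nodes, b"epoch-41-randomness", size=5))
```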
Presenter - Dr. Xinyun Liu
A generative AI model can generate extremely realistic-looking content, posing growing challenges to the authenticity of information. To address these challenges, watermarking has been leveraged to detect AI-generated content before it is released: content is flagged as AI-generated if a similar watermark can be decoded from it. In this work, we perform a systematic study on the robustness of such watermark-based detection of AI-generated content, focusing on AI-generated images. Our work shows that an attacker can post-process a watermarked image by adding a small, human-imperceptible perturbation to it, such that the post-processed image evades detection while maintaining its visual quality. We show the effectiveness of our attack both theoretically and empirically. Moreover, to evade detection, our adversarial post-processing method adds much smaller perturbations to AI-generated images, and thus better maintains their visual quality, than existing popular post-processing methods such as JPEG compression, Gaussian blur, and brightness/contrast adjustment. Our work shows the insufficiency of existing watermark-based detection of AI-generated content, highlighting the urgent need for new methods.
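For intuition, the sketch below shows a generic gradient-based evasion of a watermark decoder under an L-infinity perturbation budget. The decoder interface (per-bit probabilities) and the loss are assumptions for illustration, not the paper's exact white-box formulation.

```python
import torch
import torch.nn.functional as F

def evade_watermark(decoder, x_wm, wm_bits, eps=4/255, alpha=1/255, steps=100):
    """Perturb a watermarked image so decoded bits drift from `wm_bits`."""
    x = x_wm.clone()
    for _ in range(steps):
        x.requires_grad_(True)
        probs = decoder(x)                    # predicted watermark bit probabilities
        # Maximize bitwise disagreement with the embedded watermark.
        loss = F.binary_cross_entropy(probs, wm_bits)
        grad, = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x = x + alpha * grad.sign()       # ascend the decoding loss
            x = x_wm + (x - x_wm).clamp(-eps, eps)   # stay within the budget
            x = x.clamp(0, 1).detach()        # keep a valid image
    return x
```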