Articles tagged with: #provenance
Policy-Governed RAG - Research Design Study

cs.CR updates on arXiv.org

arXiv:2510.19877v1 Announce Type: new Abstract: A policy-governed RAG architecture is specified for audit-ready generation in regulated workflows, organized as a triptych: (I) Contracts/Control (SHRDLU-like), which governs output adherence to legal and internal policies; (II) Manifests/Trails (Memex-like), which cryptographically anchors all cited source evidence to ensure verifiable provenance; and (III) Receipts/Verification (Xanadu-like), which provides the final, portable proof of...
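The evidence-anchoring idea behind pillar (II), Manifests/Trails, can be sketched with content hashing: every cited chunk is pinned to a digest at generation time, and any later presentation of that evidence is re-hashed and compared. This is a minimal stand-in assuming SHA-256 digests; the function names and manifest layout are illustrative, not the paper's specification.

```python
import hashlib

def build_manifest(evidence_chunks):
    # Anchor each cited chunk of source evidence by its SHA-256 digest.
    return [
        {"chunk_id": i, "sha256": hashlib.sha256(t.encode("utf-8")).hexdigest()}
        for i, t in enumerate(evidence_chunks)
    ]

def verify_chunk(manifest, chunk_id, text):
    # Re-hash the presented text and compare against the anchored digest.
    entry = next(e for e in manifest if e["chunk_id"] == chunk_id)
    return entry["sha256"] == hashlib.sha256(text.encode("utf-8")).hexdigest()

chunks = ["Policy clause 4.2: retention limited to 30 days.",
          "Internal control IC-17: dual approval required."]
manifest = build_manifest(chunks)
print(verify_chunk(manifest, 0, chunks[0]))   # True: evidence matches its anchor
print(verify_chunk(manifest, 1, "tampered"))  # False: evidence was altered
```

A manifest of digests can travel with the generated output, so an auditor can check provenance without trusting the generator.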

Robustness Assessment and Enhancement of Text Watermarking for Google's SynthID

cs.CR updates on arXiv.org

arXiv:2508.20228v2 Announce Type: replace Abstract: Recent advances in LLM watermarking methods such as SynthID-Text by Google DeepMind offer promising solutions for tracing the provenance of AI-generated text. However, our robustness assessment reveals that SynthID-Text is vulnerable to meaning-preserving attacks, such as paraphrasing, copy-paste modifications, and back-translation, which can significantly degrade watermark detectability. To address these limitations, we propose SynGuard, a...
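Why meaning-preserving edits hurt detectability can be illustrated with a toy green-list statistic in the style of distribution-shifting text watermarks: a detector scores the fraction of tokens falling in a pseudorandom "green" set keyed on context, and paraphrasing re-rolls that membership. This is a simplified stand-in, not SynthID-Text's actual scheme.

```python
import hashlib

def is_green(prev_tok, tok):
    # Pseudorandom ~50/50 green-list membership keyed on the preceding
    # token (toy stand-in for a keyed watermark partition).
    digest = hashlib.sha256(f"{prev_tok}|{tok}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(tokens):
    # Detection statistic: fraction of bigrams whose second token is green.
    pairs = list(zip(tokens, tokens[1:]))
    return sum(is_green(p, t) for p, t in pairs) / len(pairs)

# Paraphrase or back-translation substitutes tokens, re-rolling green-list
# membership and pushing a watermarked text's statistic back toward ~0.5.
original = "the quick brown fox jumps over the lazy dog".split()
paraphrase = "a fast brown fox leaps over a sleepy dog".split()
print(green_fraction(original), green_fraction(paraphrase))
```

In a real scheme the generator biases sampling toward green tokens, so unedited output scores well above chance while edited output drifts toward the baseline.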

Provenance of AI-Generated Images: A Vector Similarity and Blockchain-based Approach

cs.CR updates on arXiv.org

arXiv:2510.17854v1 Announce Type: cross Abstract: Rapid advancement in generative AI and large language models (LLMs) has enabled the generation of highly realistic and contextually relevant digital content. Generative models such as ChatGPT with DALL-E integration and Stable Diffusion can produce images that are often indistinguishable from those created by humans, which poses challenges for digital content authentication. Verifying the integrity and origin of digital data to ensure it remains...
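The pipeline the title names, vector similarity plus a blockchain, can be sketched as: register a generated image's embedding in an append-only log, then match later queries by cosine similarity. The hash-chained `Ledger` class and hand-written vectors below are illustrative stand-ins for a real embedding model and blockchain.

```python
import hashlib
import json
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

class Ledger:
    """Toy hash-chained, append-only log standing in for a blockchain."""
    def __init__(self):
        self.blocks = []

    def register(self, record):
        prev = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        payload = json.dumps({"prev": prev, "record": record}, sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self.blocks.append({"prev": prev, "record": record, "hash": digest})

    def verify(self):
        # Recompute every link; any tampering breaks the chain.
        prev = "0" * 64
        for block in self.blocks:
            payload = json.dumps({"prev": prev, "record": block["record"]},
                                 sort_keys=True)
            if (block["prev"] != prev or
                    block["hash"] != hashlib.sha256(payload.encode()).hexdigest()):
                return False
            prev = block["hash"]
        return True

# Register a generated image's embedding, then match a near-duplicate query.
emb = [0.9, 0.1, 0.2]
ledger = Ledger()
ledger.register({"image_id": "gen-001", "embedding": emb})
query = [0.88, 0.12, 0.19]
print(cosine(emb, query) > 0.99, ledger.verify())  # True True
```

A similarity threshold (0.99 here) decides whether a query image traces back to a registered generation event, while the chained hashes make the registration history tamper-evident.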

OCR-APT: Reconstructing APT Stories from Audit Logs using Subgraph Anomaly Detection and LLMs

cs.CR updates on arXiv.org

arXiv:2510.15188v1 Announce Type: new Abstract: Advanced Persistent Threats (APTs) are stealthy cyberattacks that often evade detection in system-level audit logs. Provenance graphs model these logs as connected entities and events, revealing relationships that are missed by linear log representations. Existing systems apply anomaly detection to these graphs but often suffer from high false positive rates and coarse-grained alerts. Their reliance on node attributes like file paths or IPs leads...
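The provenance-graph construction the abstract describes can be sketched simply: audit events become edges between system entities, and candidate subgraphs are carved out around a flagged node before any anomaly scoring. The event tuples and hop-limited BFS below are illustrative, not OCR-APT's method.

```python
from collections import defaultdict, deque

# Audit-log events as (subject, action, object) triples.
events = [
    ("firefox.exe", "write", "/tmp/payload"),
    ("/tmp/payload", "exec", "payload_proc"),
    ("payload_proc", "connect", "10.0.0.5:443"),
    ("sshd", "read", "/etc/passwd"),
]

# Provenance graph: entities are nodes, events are edges (undirected here
# so neighborhood extraction can walk both data-flow directions).
graph = defaultdict(list)
for src, action, dst in events:
    graph[src].append((action, dst))
    graph[dst].append((action, src))

def subgraph_around(seed, hops=2):
    # Hop-limited BFS: the candidate subgraph handed to anomaly scoring.
    seen, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for _, nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen

print(sorted(subgraph_around("/tmp/payload")))
# ['/tmp/payload', '10.0.0.5:443', 'firefox.exe', 'payload_proc']
```

Note how the write-exec-connect chain stays connected in the graph even though the three events would be far apart in a linear log, while unrelated activity (`sshd`) is excluded from the neighborhood.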

Data Provenance Auditing of Fine-Tuned Large Language Models with a Text-Preserving Technique

cs.CR updates on arXiv.org

arXiv:2510.09655v1 Announce Type: new Abstract: We address the problem of auditing whether sensitive or copyrighted texts were used to fine-tune large language models (LLMs) under black-box access. Prior signals (verbatim regurgitation and membership inference) are unreliable at the level of individual documents or require altering the visible text. We introduce a text-preserving watermarking framework that embeds sequences of invisible Unicode characters into documents. Each watermark is split...
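The text-preserving embedding can be sketched with zero-width Unicode characters: bits are carried by invisible code points interleaved with the words, so the rendered text is unchanged. The two-character bit encoding below is an illustrative assumption, not the paper's exact scheme.

```python
ZERO = "\u200b"  # zero-width space      -> bit 0
ONE = "\u200c"   # zero-width non-joiner -> bit 1

def embed(text, bits):
    # Append one invisible character per word until the bits run out;
    # the visible rendering of the text is unchanged.
    words = text.split(" ")
    marked = [
        w + ((ZERO if bits[i] == 0 else ONE) if i < len(bits) else "")
        for i, w in enumerate(words)
    ]
    return " ".join(marked)

def extract(text):
    # Recover the bit sequence from whatever invisible characters survive.
    return [0 if c == ZERO else 1 for c in text if c in (ZERO, ONE)]

marked = embed("the quick brown fox jumps", [1, 0, 1, 1])
print(extract(marked))  # [1, 0, 1, 1]
# Stripping the marks restores the original visible text exactly:
assert marked.replace(ZERO, "").replace(ONE, "") == "the quick brown fox jumps"
```

If such marked documents end up in a fine-tuning corpus, an auditor can probe the black-box model for traces of the embedded sequence without ever having altered the visible training text.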