Digital Watermarking for Machine Learning Model
Techniques, Protocols and Applications

Editors
Lixin Fan, AI Lab, WeBank, Shenzhen, China
Chee Seng Chan, Department of Artificial Intelligence, Universiti Malaya, Kuala Lumpur, Malaysia
Qiang Yang, Department of CS and Engineering, Hong Kong University of Science and Tech, Hong Kong, China

ISBN 978-981-19-7553-0
ISBN 978-981-19-7554-7 (eBook)
https://doi.org/10.1007/978-981-19-7554-7

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

In a modern digital economy, we care about the value that data can generate. Such value is often created by machine learning models empowered by enormous amounts of data in multiple forms. For example, using health-checkup data, medical doctors can train a stroke prediction model that accurately predicts the likelihood of a patient having a stroke. A computer vision model in an autonomous vehicle can tell whether a traffic light is red or green even in foggy weather. An economic model can explain why oil prices are volatile in a particular period of time. One can say that data are equivalent to raw materials such as coal and oil in the traditional economy, and in this analogy, machine learning models are the machines and vehicles that produce the value for the digital economy.

Similar to the finance and goods that need to be tracked and managed, as well as protected by law, models will, in the foreseeable future, need to be protected, managed and audited as well. Specifically, when we use a model purchased from a third party, we need to be certain that the model comes from a legitimate source. When we trade models in a marketplace, we need a fair methodology to ascertain the value of the model in a given business context. When a model misbehaves, for instance if a stroke prediction model fails to predict a fatal stroke, we need the means to trace back the responsible party that should handle the loss of life. When users with different roles, such as regulators, engineers or end users, inquire about the model, we need a way to audit the model's history as well as give a fair explanation of the model's performance. Furthermore, when models are built out of multiple parties' data, it is important to be able to filter out semi-honest parties who can use various opportunities to peek at other parties' data out of curiosity.

To be able to track and manage models, a typical way is to embed a signature known as a watermark into a model. Furthermore, care should be taken to prevent the watermarking information from being altered. It is technically challenging to insert and manage watermarks for complex models that involve millions or even billions of model parameters. The technology of model watermarking is the central focus of this book. The watermarking technology must answer how to best balance the need to embed the watermarks and hide them from potential tampering while allowing the model training and inference to be efficient and effective.

While there are watermarking algorithms for image data to confirm the ownership of images, and lately NFT technologies for digital arts, the watermarking techniques for models are novel and more challenging. This is partly due to the fact that models engage in an entire software product lifecycle in which there is a training process and an application process. There are issues related to ownership verification, transfer and model revision, mixtures and merges, model tracing, legal obligation, responsibility, rewards, and incentives. Once established, the model watermarking techniques will become a cornerstone of the future digital economy.

This book is the result of the most recent frontline research in AI contributed by a group of researchers who are active in fields including machine learning, data and model management, federated learning and many fielded applications of these technologies.

This book is in general suitable for readers with interests in machine learning and big data. In particular, the preliminary chapters provide an
introduction and a brief review of requirements for model ownership verification using watermarking. Chapters in Part II of the book elaborate on techniques that are developed for various machine learning models as well as security requirements. Part III of the book covers applications of model watermarking techniques in federated learning settings and model auditing use cases.

We hope the book will bring to the readers a new look into the digital future of human society, one that follows widely accepted human values of modern people and society. We also expect this introductory book to be a good reference book for students studying artificial intelligence and a handbook for engineers and researchers in industry.

To the best of our knowledge, this book is the first of its kind that showcases how to use digital watermarks to verify ownership of machine learning models. Nevertheless, the book would have been impossible without kind assistance from many people. Thanks to everyone on the Springer editorial team, and special thanks to Celine, the ever-patient Editorial Director. The authors would like to thank their families for their constant support.

Shenzhen, China          Lixin Fan
Kuala Lumpur, Malaysia   Chee Seng Chan
Hong Kong, China         Qiang Yang
June 2022

Contents

Part I Preliminary
1 Introduction
  Lixin Fan, Chee Seng Chan, and Qiang Yang
2 Ownership Verification Protocols for Deep Neural Network Watermarks
  Fangqi Li and Shilin Wang

Part II Techniques
3 Model Watermarking for Deep Neural Networks of Image Recovery
  Yuhui Quan and Huan Teng
4 The Robust and Harmless Model Watermarking
  Yiming Li, Linghui Zhu, Yang Bai, Yong Jiang, and Shu-Tao Xia
5 Protecting Intellectual Property of Machine Learning Models via Fingerprinting the Classification Boundary
  Xiaoyu Cao, Jinyuan Jia, and Neil Zhenqiang Gong
6 Protecting Image Processing Networks via Model Watermarking
  Jie Zhang, Dongdong Chen, Jing Liao, Weiming Zhang, and Nenghai Yu
7 Watermarks for Deep Reinforcement Learning
  Kangjie Chen
8 Ownership Protection for Image Captioning Models
  Jian Han Lim
9 Protecting Recurrent Neural Network by Embedding Keys
  Zhi Qin Tan, Hao Shan Wong, and Chee Seng Chan

Part III Applications
10 FedIPR: Ownership Verification for Federated Deep Neural Network Models
  Bowen Li, Lixin Fan, Hanlin Gu, Jie Li, and Qiang Yang
11 Model Auditing for Data Intellectual Property
  Bowen Li, Lixin Fan, Jie Li, Hanlin Gu, and Qiang Yang

Contributors

Yang Bai, Tsinghua University, Beijing, China
Xiaoyu Cao, Duke University, Durham, NC, USA
Chee Seng Chan, Universiti Malaya, Kuala Lumpur, Malaysia
Dongdong Chen, Microsoft Research, Redmond, WA, USA
Kangjie Chen, Nanyang Technological University, Singapore, Singapore
Lixin Fan, WeBank AI Lab, Shenzhen, China
Neil Zhenqiang Gong, Duke University, Durham, NC, USA
Hanlin Gu, WeBank AI Lab, Shenzhen, China
Jinyuan Jia, Duke University, Durham, NC, USA
Yong Jiang, Tsinghua University, Beijing, China
Jie Li, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Bowen Li, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Fangqi Li, Shanghai Jiao Tong University, Shanghai, China
Yiming Li, Tsinghua University, Beijing, China
Jing Liao, City University of Hong Kong, Hong Kong, China
Jian Han Lim, Universiti Malaya, Kuala Lumpur, Malaysia
Yuhui Quan, South China University of Technology and Pazhou Laboratory, Guangzhou, China
Zhi Qin Tan, Universiti Malaya, Kuala Lumpur, Malaysia
Huan Teng, South China University of Technology, Guangzhou, China
Shilin Wang, Shanghai Jiao Tong University, Shanghai, China
Hao Shan Wong, Universiti Malaya, Kuala Lumpur, Malaysia
Shu-Tao Xia, Tsinghua University, Beijing, China
Qiang Yang, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
Nenghai Yu, University of Science and Technology of China, Hefei, China
Jie Zhang, University of Science and Technology of China, Hefei, China
Weiming Zhang, University of Science and Technology of China, Hefei, China
Linghui Zhu, Tsinghua University, Beijing, China

About the Editors

Lixin Fan is currently the Chief Scientist of Artificial Intelligence at WeBank, Shenzhen, China. His research interests include machine learning and deep learning, privacy computing and federated learning, computer vision and pattern recognition, image a