MINISTRY OF EDUCATION AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY
Vu Duy Hien
DEVELOPING EFFICIENT AND SECURE MULTI-PARTY
SUM COMPUTATION PROTOCOLS
AND THEIR APPLICATIONS
DISSERTATION ON INFORMATION SYSTEM
Hanoi – 2024
MINISTRY OF EDUCATION AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY
Vũ Duy Hiến
RESEARCH ON DEVELOPING EFFICIENT SECURE SUM COMPUTATION
PROTOCOLS IN THE FULLY DISTRIBUTED DATA MODEL
AND THEIR APPLICATIONS
DOCTORAL DISSERTATION IN INFORMATION SYSTEMS
Hanoi – 2024
Code: 9 48 01 04
Certification of the Graduate University of Science and Technology
Supervisor 1 (signature and full name): Prof. Dr.Sc. Hồ Tú Bảo
Supervisor 2 (signature and full name): Assoc. Prof. Dr. Lương Thế Dũng
Hanoi - 2024
PLEDGE
I promise that the thesis "Developing efficient and secure multi-party sum computation protocols and their applications" is my original research work under the guidance of the academic supervisors. All contents of the thesis were written based on papers and articles published in distinguished international conferences and journals of reputed publishers. The sources of the references in this thesis are explicitly cited. My research results were published jointly with other authors and were agreed upon by the co-authors when included in the thesis. New results and discussions presented in the thesis are perfectly honest, and they have not yet been published by any other authors beyond my publications. This thesis was finished during the time I worked as a PhD student at the Graduate University of Science and Technology, Vietnam Academy of Science and Technology.
Hanoi, 2024
PhD student
Vu Duy Hien
ACKNOWLEDGEMENTS
Scientific research is an interesting journey where the thesis is one of the first results that researchers reach. On that journey, I have met many kind people who have supported me in finishing this thesis.
First of all, I would like to thank my great supervisors, Prof. Dr. Ho Tu Bao and Assoc. Prof. Dr. Luong The Dung, who have provided valuable advice to me. Without their support and guidance, I would not have been able to complete my thesis. I have learned a lot from my supervisors.
I am thankful to the Graduate University of Science and Technology, colleagues at the Banking Academy of Vietnam, friends, and collaborators who always encouraged me along my research journey.
I also thank the CAMEL cafe (No. 104/1 Viet Hung street, Long Bien district, Ha Noi), where my publications and thesis were born.
Finally, I want to send my most special thanks to my big family, my wife, and our children, who always have my back.
Hanoi, 2024
PhD student
Vu Duy Hien
CONTENTS
INTRODUCTION
1 OVERVIEW OF SECURE MULTI-PARTY SUM COMPUTATION
1.1 Background of secure multi-party computation
1.1.1 Introduction
1.1.2 Basic concept
1.1.3 Definition of security
1.1.4 Cryptographic preliminaries
1.2 Secure multi-party sum computation problem
1.2.1 Problem formulation
1.2.2 Related work
1.3 Conclusion
2 PROPOSING EFFICIENT SECURE MULTI-PARTY SUM COMPUTATION PROTOCOLS
2.1 Analysis of typical secure multi-party sum computation protocols
2.1.1 Simple secure multi-party sum computation protocol
2.1.2 Secure multi-party sum computation protocol of Urabe et al.
2.1.3 Secure multi-party sum computation protocol of Hao et al., 2010 in an electronic voting system
2.1.4 Privacy-preserving frequency computation protocol of Yang et al.
2.1.5 Further discussion
2.2 Proposed secure multi-party sum computation protocols
2.2.1 Privacy-preserving frequency computation protocol based on elliptic curve ElGamal cryptosystem
2.2.2 An efficient approach for secure multi-party sum computation without pre-establishing secure/authenticated channels
2.2.3 Secure multi-sum computation protocol
2.3 Conclusion
3 DEVELOPING NEW SOLUTIONS BASED ON SECURE MULTI-PARTY SUM COMPUTATION PROTOCOLS FOR PRACTICAL PROBLEMS
3.1 An efficient solution for the secure electronic voting scheme without pre-establishing authenticated channel
3.1.1 Introduction
3.1.2 Related work
3.1.3 Preliminaries
3.1.4 A secure end-to-end electronic voting scheme
3.1.5 Security analysis
3.1.6 Experimental evaluation
3.2 An efficient and practical solution for privacy-preserving Naive Bayes classification in the horizontal data setting
3.2.1 Introduction
3.2.2 Related work
3.2.3 Preliminaries
3.2.4 New privacy-preserving Naive Bayes classifier for the horizontal partition data setting
3.2.5 Privacy analysis
3.2.6 Accuracy analysis
3.2.7 Experimental evaluation
3.3 Conclusion
CONCLUSION
BIBLIOGRAPHY
APPENDICES
PUBLICATION LIST
LIST OF ABBREVIATIONS
BoW: Bag-of-Words
CDH: Computational Diffie-Hellman
DDH: Decisional Diffie-Hellman
DD-PKE: Public-key encryption with a double-decryption algorithm
DNA: Deoxyribonucleic acid
DRE: Direct-recording electronic
DSS: Digital signature standard
E2E: End-to-end
LWE: Learning with errors
NSC: National University of Singapore short text messages corpus
PPFC: Privacy-preserving frequency computation
PPML: Privacy-preserving machine learning
PPNBC: Privacy-preserving Naive Bayes classification
PSI: Private set intersection
RAM: Random Access Machines
SMC: Secure multi-party computation
SMS: Secure multi-party sum
SSC: Secure sum computation
TF-IDF: Term frequency – inverse document frequency
UK: United Kingdom
ZKP: Zero knowledge proof
LIST OF TABLES
2.1 The brief comparisons of the computational complexity among three typical SMS protocols
2.2 The computational complexity comparisons among the proposed protocol and the typical protocols
2.3 The communication cost comparisons among the typical PPFC protocols
2.4 The stored data volume of the miner comparisons among the typical PPFC protocols (in megabytes)
2.5 The comparisons of each user's computational complexity among the proposed protocol and the typical protocols
2.6 The miner's computational complexity comparisons among the proposed protocol and the typical protocols
2.7 The comparisons of each user's communication cost among the proposed protocol and the typical protocols
2.8 The comparisons of the miner's communication cost among the proposed protocol and the typical protocols
2.9 The stored data volume of the miner comparisons among the proposed protocol and the typical protocols (in megabytes)
2.10 The computational complexity comparisons among the new proposal and the typical solutions
2.11 The communication cost comparison among the new proposal and the typical solutions
2.12 The running time for the miner to compute the sum values comparisons among the compared solutions (in seconds)
2.13 The stored data volume of the miner comparisons among the compared solutions (in megabytes)
3.1 Spam short-messages dataset information
3.2 The running time comparisons among the new proposal and the typical PPNBC solutions on the real dataset (in seconds)
LIST OF FIGURES
1.1 The distributed computing model in a secure manner
1.2 An example of the authentication method without knowing user's password
1.3 An example of monitoring user's passwords
1.4 An example of the DNA pattern-matching problem
1.5 The secure electronic sealed-bid auction model
1.6 The real and ideal models in distributed computing field
1.7 The computational model of the secure multi-party sum computation problem
1.8 The single-candidate end to end decentralized e-voting model
1.9 An example of the privacy-preserving frequent itemset mining problem
2.1 The computational model of the simple secure multi-party sum computation protocol
2.2 The running time of each user comparisons among the typical PPFC protocols
2.3 The time for the miner/the server computing the public keys comparisons among the typical PPFC protocols
2.4 The time for the miner/the server computing the frequency value comparisons among the typical PPFC protocols
2.5 The running time of each user comparisons among the proposed protocol and the typical protocols
2.6 The time of the pre-computation phase comparisons among the proposed protocol and the typical protocols
2.7 The time of the user authentication phase comparisons among the proposed protocol and the typical protocols
2.8 The time of the secure n-parties sum phase comparisons among the proposed protocol and the typical protocols
2.9 The number of private keys comparisons among the compared solutions
2.10 The total running time of each user comparisons among the compared solutions
2.11 The running time for the miner to compute the public keys comparisons among the compared solutions
3.1 The single-candidate E2E decentralized electronic voting model
3.2 The total running time of each voter comparisons between the new solution and Hao's scheme
3.3 The voting server's total running time comparisons between the new solution and Hao's scheme
3.4 The horizontally distributed computing model
3.5 An example of data transformation
INTRODUCTION
A. Motivation
Nowadays, the development of information and communication technology, especially the birth of web applications and information systems, has created a large amount of data owned by organizations and individuals. This has spurred the development of the distributed computing field, where data owners jointly perform computational tasks based on their combined data [1, 2]. The distributed computing field has brought substantial benefits to organizations and individuals, such as significantly reducing costs, understanding customers comprehensively, and making good business decisions. In practice, however, because of privacy policies or business secrets, participants of distributed computing systems often wish to obtain the correct output of cooperative tasks without revealing their input data. For instance, some banks cooperate to improve a machine learning-based credit scoring tool using their customers' data, but they are not ready to share their customers' data with anyone. Similarly, some hospitals want to jointly develop disease diagnosis methods based on a large united database, but they do not want to provide their patients' data to others. These challenges motivated the birth of the Secure Multi-party Computation area (SMC, for short), which is considered a subfield of modern cryptography.
In essence, Secure Multi-party Computation refers to distributed computing methods that address security concerns [1, 3]. In a secure multi-party computation model, there are several parties, each of which owns a private input. These participants wish to obtain the result of a specific function f over all private inputs, while each party reveals nothing about his/her input beyond the output result. Unlike in traditional cryptography, the adversary in SMC problems in general, and in the secure multi-party sum (SMS) problem in particular, can be inside the system of participants. The adversary may attack to learn the honest participants' private inputs or to cause the outputs to be incorrect [1]. As a result, the term "secure" here means: (1) the output's correctness is guaranteed, and (2) each party's input is kept private by himself/herself.
Nowadays, SMC has become an interesting topic that attracts more and more attention from the research community. A variety of SMC problems have been formulated, and their solutions have been proposed as SMC protocols, such as secure comparison protocols [4, 5], secure multi-party sum computation protocols [6–8], and secure dot product protocols [6, 9–11]. Furthermore, such SMC protocols have been applied to various practical problems, such as secure online auctions [14], secure e-voting systems [12, 13], privacy-preserving query systems [15], privacy-preserving financial data analytics [16], privacy-preserving online advertising [17], and privacy-preserving machine learning/data mining [18–20].
This thesis investigates one of the most important and popular SMC problems [6], namely the secure multi-party sum computation problem (SMS, for short). In the SMS problem, it is assumed that there are several parties, each of which owns a private value as his/her input, and the parties wish to obtain the sum of all inputs while revealing nothing about their inputs beyond the sum value. Similarly to SMC problems in general, the SMS problem arose from the security requirements of specific distributed computing problems. Currently, many protocols have been proposed for the SMS problem, and they are widely applicable in various practical computing tasks, such as privacy-preserving recommendation systems [21], privacy-preserving multi-party data analytics [22], secure electronic voting systems [12, 13], privacy-preserving association rule mining [6, 7], privacy-preserving classification [23], secure data collection for the smart grid [24], and secure auctions [25, 26].
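To make the problem statement concrete, the following minimal sketch simulates one classical way of obtaining such a sum, additive secret sharing over the integers modulo a public value. It is an illustration only and is not one of the protocols proposed in this thesis. The names `MODULUS`, `split_into_shares`, and `secure_sum` are hypothetical, the pairwise channels between parties are assumed to be secure, and all parties are assumed to follow the protocol honestly.

```python
import secrets

# Illustrative public modulus (assumption); it must exceed the largest possible sum.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n random shares that add up to `value` mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_inputs, modulus=MODULUS):
    """Simulate an n-party sum; returns (sum, number of share messages sent)."""
    n = len(private_inputs)
    # sent[i][j] is the share that party i sends to party j
    # (party i keeps sent[i][i] for itself).
    sent = [split_into_shares(x, n, modulus) for x in private_inputs]
    messages = n * (n - 1)  # every party sends one share to every other party
    # Each party j publishes only the sum of the shares it holds.
    partial_sums = [sum(sent[i][j] for i in range(n)) % modulus for j in range(n)]
    return sum(partial_sums) % modulus, messages


if __name__ == "__main__":
    total, messages = secure_sum([12, 7, 30, 1])
    print(total, messages)  # prints "50 12"
```

Each published partial sum is masked by random shares, so a single partial sum reveals nothing about an individual input. The number of share messages grows quadratically with the number of parties.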
For SMC problems in general, and the SMS problem in particular, the protocols must be secure enough (mainly preserving the privacy of the participants' local inputs and the correctness of the honest parties' outputs [3]) to prevent the adversary's harmful behaviors. Besides, SMS protocols should perform well (i.e., have low computational complexity and communication cost) to be implemented in real-life applications. This is understandable, because many practical SMS problems, such as secure e-voting and secure online auctions, require computational tasks to be performed as quickly as possible. SMS protocol-based privacy-preservation solutions, such as the privacy-preserving Apriori algorithm for mining association rules, the privacy-preserving Naive Bayes classifier, and the secure gradient descent algorithm, have to execute an SMS protocol multiple times to compute the necessary intermediate values. Moreover, in many distributed computing scenarios, participants use devices limited in computational ability, storage capacity, and connectivity, e.g., smartphones and tablets. Thus, it is important to develop SMS protocols having both a high security level and good performance.
B. Research objectives
As mentioned before, first of all, SMS protocols need to be secure. To do this, SMS protocols either (1) require each participant to split his/her private value into a number of parts and then share them with all others using secure communication channels, or (2) use homomorphic cryptosystems such as the ElGamal encryption scheme [27] or the Paillier cryptosystem [28]. Protocols following approach (1) obviously have a high communication cost, and they are unsuitable for multi-party computational models with a large number of participants. In contrast, SMS protocols based on approach (2) often have a high computational cost. As a result, the biggest challenge in designing SMS protocols is how to create SMS protocols having both a high security level and good performance. Thus, the research objectives of this thesis include:
• Designing efficient and secure multi-party sum computation protocols that are able to preserve the privacy of the parties' local inputs and the correctness of the honest parties' outputs, as well as achieve good performance.
• Developing SMS-based solutions for practical problems that are currently solved by existing SMS protocols in ways that are not yet secure and efficient.
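The homomorphic approach (2) discussed before the list of objectives can be illustrated with a minimal sketch of exponential ElGamal, in which multiplying ciphertexts adds the underlying plaintexts. This is a simplified illustration under stated assumptions, not a protocol from this thesis: the group parameters `P`, `Q`, `G` are toy values far too small for real use, all function names are hypothetical, and a single key holder decrypts the aggregate, whereas practical SMS protocols typically distribute the key among the participants.

```python
import secrets

# Toy group parameters (assumption, insecure sizes): P = 2*Q + 1 is a safe prime
# and G = 4 generates the subgroup of prime order Q in the integers modulo P.
P, Q, G = 2039, 1019, 4


def keygen():
    """Return (secret key, public key)."""
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)


def encrypt(pk, message):
    """Exponential ElGamal: the message is encoded in the exponent of G."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, message, P) * pow(pk, r, P)) % P


def add_ciphertexts(c1, c2):
    """The component-wise product of two ciphertexts encrypts the sum of their plaintexts."""
    return (c1[0] * c2[0]) % P, (c1[1] * c2[1]) % P


def decrypt_sum(sk, ciphertext, max_sum):
    """Recover the sum by removing the mask, then searching the small exponent."""
    a, b = ciphertext
    g_to_sum = (b * pow(a, Q - sk, P)) % P  # a**(Q - sk) equals a**(-sk): a has order Q
    acc = 1
    for candidate in range(max_sum + 1):
        if acc == g_to_sum:
            return candidate
        acc = (acc * G) % P
    raise ValueError("sum is larger than max_sum")


def homomorphic_sum(private_inputs, max_sum):
    """Each party encrypts its input; only the aggregate ciphertext is decrypted."""
    sk, pk = keygen()
    ciphertexts = [encrypt(pk, x) for x in private_inputs]
    aggregate = ciphertexts[0]
    for c in ciphertexts[1:]:
        aggregate = add_ciphertexts(aggregate, c)
    return decrypt_sum(sk, aggregate, max_sum)


if __name__ == "__main__":
    print(homomorphic_sum([12, 7, 30, 1], max_sum=100))  # prints "50"
```

Every encryption needs several modular exponentiations, and decryption needs a search over the possible sums. This gives a sense of why protocols in this family tend to be computationally heavier than share-based ones.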
C. Main contributions
The scientific story of this thesis is narrated as follows: