Python: how to generate a unique identifier for an XML document? -


xslt has generate-id(xml-document) function. can create unique identifier xml document.

in python how generate unique identifier xml document?

note: unique identifier should based on content of xml document, not file name of xml document. example, xml document

<root>    <comment>hello, world</comment> </root> 

and xml document

<document>    <test>blah, blah</test> </document> 

must generate different identifiers, if file names identical.

i have graph of xml documents. need way recognize, "hey, i've seen xml document." don't want compare entire xml documents. instead, want compare uuids correspond xml.

a colleague sent me answer:

for mapping text id, we’ve used md5 1 hash digest. give md5() function xml document (string) , return 32-character identifier.

more details:


genid.py

import sys import stdio hashlib import md5  def digest_md5(obj):     if type(obj) unicode:         obj = obj.encode('utf8')     return md5(obj).hexdigest()  s = sys.stdin.readline() stdio.writeln(digest_md5(s)) 

then turned exe file.

then dos command prompt typed command:

type input.txt | genid 

where input.txt is:

<document>hello, world</document> 

and got output:

df6f8283335bf3f657a89733e3d36b84 

beautiful!


Comments

Popular posts from this blog

python - No exponential form of the z-axis in matplotlib-3D-plots -

php - Best Light server (Linux + Web server + Database) for Raspberry Pi -

c# - "Newtonsoft.Json.JsonSerializationException unable to find constructor to use for types" error when deserializing class -