Welcome to Vigges Developer Community - Open, Learning, Share

xml - Parsing line and changing some text in place

How do I parse a log file (not a full XML file, but one that contains some XML fragments) for ExtData tags, each of which holds a name-value pair? I need to mask the value like this, for example:

<ExtData>Name="Jason" Value="Special"</ExtData>
to
<ExtData>Name="Jason" Value="XXXXXXX"</ExtData>

I need to mask the ExtData value as above only when Name is Jason or one of a given set of names, not for every Name.

For example, if "DummyName" is not in the set of names, then I do not want to change the line below.

<ExtData>Name="DummyName" Value="Garbage"</ExtData>

For example, if "DummyName" is not in the set of names, then I do not want to change the line below either. (Please note that the value is "Jason".)

<ExtData>Name="DummyName" Value="Jason"</ExtData>

For example, if "DummyJasonName" is not in the set of names, then I do not want to change the line below. (Note the "Jason" between "Dummy" and "Name".)

<ExtData>Name="DummyJasonName" Value="Garbage"</ExtData>

I need to do all this in a bash/shell script.

Bottom line: I want to read a file, say via sed/awk/match, and check each line for an ExtData tag. If one is found, read the text between the <ExtData> and </ExtData> tags. In this possibly multi-line text, extract Name. If Name is in the set of names, mask its corresponding Value with an equal number of 'X' characters.

Please let me know how to achieve this.

Update: an entry can actually span multiple lines.

<ExtData>Name="Jason" 
Value="Special"
    </ExtData>

Or like this too:

<ExtData>
     Name="Jason" 
  Value="Special"
    </ExtData>

Thanks !! Puneet


1 Answer


In a bash shell, you can create a copy of the file with the value masked using this:

sed 's#\(<ExtData>Name="Jason" Value="\).*\("</ExtData>\)#\1XXXXX\2#' xml.txt > xml_xxx.txt

Note that this is not the "official" way to change an XML file. Many formatting changes could render this script useless, but if you know that your file has one entry per line formatted exactly like that, it works just as it would on any text file, and it is quick.
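To cover a set of names rather than just Jason, the same one-entry-per-line substitution can be sketched with an extended-regex alternation. This is only an illustration under the same assumptions as above: the names variable, the name Maria and the function name mask_names_sed are hypothetical, and the mask is still a fixed XXXXX rather than one X per character.

```shell
# Hypothetical set of names to mask, written as a regex alternation.
# The names must not contain "#", which is used as the sed delimiter.
names='Jason|Maria'

# Read stdin, write stdout. Only entries whose Name is exactly one of
# the listed names are changed; [^"]* keeps the match inside one value.
mask_names_sed() {
  sed -E 's#(<ExtData>Name="('"$names"')" Value=")[^"]*("</ExtData>)#\1XXXXX\3#'
}
```

It could then be run as mask_names_sed < xml.txt > xml_xxx.txt. Because the pattern requires the closing quote right after the name, Name="DummyJasonName" and Value="Jason" under another name are left alone.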

(Also, the question is tagged sed and bash; if it weren't, this would call for real XML parsing with libxml2, Saxon or another library that can parse XML nodes.)
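For the multi-line entries in the question's update, a line-by-line substitution no longer matches. One way to stay within shell tooling is to let awk buffer lines until the closing tag appears and then mask the value with one X per character. The following is a rough sketch, not a robust XML parser: the function name mask_extdata and its pipe-separated argument are hypothetical, and it assumes each entry holds just Name="..." and Value="..." with no quotes or nested tags inside the values.

```shell
# mask_extdata: read a log on stdin and write it to stdout, masking the
# Value of each <ExtData> entry whose Name is in the given set.
# $1 is a pipe-separated list of names (hypothetical default: Jason).
mask_extdata() {
  awk -v names="${1:-Jason}" '
    # Return the entry with its value replaced by the same number of X.
    function mask(entry,   name, start, len, xs, i) {
      if (!match(entry, /Name="[^"]*"/)) return entry
      name = substr(entry, RSTART + 6, RLENGTH - 7)
      if (!(name in wanted)) return entry       # exact name match only
      if (!match(entry, /Value="[^"]*"/)) return entry
      start = RSTART + 7                        # first character of the value
      len = RLENGTH - 8                         # value length without quotes
      xs = ""
      for (i = 0; i < len; i++) xs = xs "X"
      return substr(entry, 1, start - 1) xs substr(entry, start + len)
    }
    BEGIN {
      n = split(names, list, "|")
      for (i = 1; i <= n; i++) wanted[list[i]] = 1
    }
    {
      # Keep appending lines while an <ExtData> entry is still open.
      buf = (pending ? buf "\n" $0 : $0)
      pending = 0
      while ((s = index(buf, "<ExtData>")) > 0) {
        rest = substr(buf, s)
        e = index(rest, "</ExtData>")
        if (e == 0) { pending = 1; break }      # closing tag not seen yet
        printf "%s%s", substr(buf, 1, s - 1), mask(substr(rest, 1, e + 9))
        buf = substr(rest, e + 10)              # text after </ExtData>
      }
      if (!pending) print buf
    }
    END { if (pending) print buf }              # unterminated entry: as is
  '
}
```

A call such as mask_extdata 'Jason|Maria' < xml.txt > xml_xxx.txt would write the masked copy. Looking the name up in an array, rather than matching it with a regex, is what keeps names like DummyJasonName from being treated as Jason.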

