R

To manipulate XML data:

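  • Set the working directory and load the packages needed to parse XML documents:
    dir()                            # list the files in the current directory
    setwd(dir="FOLDER"); getwd()     # move to the data folder and confirm the path
    library(xml2)
    library(XML)                     # provides xmlTreeParse() and getNodeSet()
    library(XML2R)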
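  • Load the XML file and define the namespace used to find elements and attributes:
    doc <- xmlTreeParse("FILE_NAME.xml", useInternalNodes=TRUE, encoding="UTF-8")
    ns <- c(ns = "http://www.tei-c.org/ns/1.0")      # bind the prefix "ns" to the TEI namespace URI
    getNodeSet(doc, "//* | //@*", namespaces = ns)   # "|" is the XPath union: all elements and all attributes
    doc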
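
As a minimal sketch of the steps above, assuming the XML package is installed, the following parses a small inline TEI fragment and selects namespaced elements with getNodeSet. The fragment, the element name p and the attribute n are hypothetical stand-ins for the contents of FILE_NAME.xml, chosen only for illustration.

```r
library(XML)

# Hypothetical TEI fragment standing in for FILE_NAME.xml (assumption, not real data)
tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p n="1">First paragraph</p>
    <p n="2">Second paragraph</p>
  </body></text>
</TEI>'

# asText=TRUE tells the parser that the first argument is XML content, not a file name
doc <- xmlTreeParse(tei, asText = TRUE, useInternalNodes = TRUE)

# Bind the prefix "ns" to the TEI namespace URI
ns <- c(ns = "http://www.tei-c.org/ns/1.0")

# Select every <p> element in the TEI namespace
paragraphs <- getNodeSet(doc, "//ns:p", namespaces = ns)

# Extract the text content and the "n" attribute of each matched node
texts    <- sapply(paragraphs, xmlValue)
n_values <- sapply(paragraphs, xmlGetAttr, "n")
```

In this sketch the unprefixed expression "//p" would match nothing, because the elements belong to the TEI namespace and XPath only reaches them through a bound prefix.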
The function getNodeSet uses XPath syntax to match nodes in doc (the parsed file). Element names in the expression are resolved through the namespace prefix ns, which must be bound to a valid namespace, here the one identified by the URI reference http://www.tei-c.org/ns/1.0: