XPath Ranking and Graph Compression
Description
This paper presents a collection of algorithms addressing two related problems in web data management: automatic wrapper generation for structured data extraction and efficient compression of large-scale graph structures. For the extraction problem, we develop a method for on-the-fly wrapper creation that leverages XPath expressions ranked by their discriminative power over HTML and XML document collections. For the compression problem, we propose techniques for reducing the storage requirements of adjacency list representations, with particular focus on the structural properties exhibited by web graphs and social networks. Additionally, we investigate the relationship between maximum flow and minimum cut in capacitated networks, presenting bounds on the max-flow min-cut gap and approximation algorithms for the multicut problem. Finally, we address fault tolerance in mesh-connected architectures through deep emulations that enable a fault-free mesh to be simulated on a mesh containing random faults.
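To make the adjacency-list compression problem concrete, here is a minimal sketch of gap encoding with variable-length integers, a common baseline for web-graph storage. It is not taken from the paper, whose own techniques the abstract does not describe. The function names (`encode_varint`, `compress_neighbors`, `decompress_neighbors`) and the assumption that node IDs are non-negative integers are illustrative only.

```python
from typing import Iterable, List


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte (LEB128 style)."""
    if n < 0:
        raise ValueError("varint encoding requires a non-negative integer")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low)         # high bit clear: last byte of this integer
            return bytes(out)


def compress_neighbors(neighbors: Iterable[int]) -> bytes:
    """Sort and deduplicate one adjacency list, then store gaps between successive IDs."""
    out = bytearray()
    prev = 0
    for v in sorted(set(neighbors)):
        out += encode_varint(v - prev)  # the first gap is measured from 0
        prev = v
    return bytes(out)


def decompress_neighbors(data: bytes) -> List[int]:
    """Invert compress_neighbors: decode varints and accumulate the gaps."""
    result: List[int] = []
    current = 0
    value = 0
    shift = 0
    for b in data:
        value |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            current += value
            result.append(current)
            value = 0
            shift = 0
    return result
```

Because links in web graphs tend to point to nodes with nearby IDs when pages are ordered by URL, most gaps are small and fit in a single byte, which is where the storage savings of this baseline come from.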
Files
| Name | Size |
|---|---|
| XPath_Ranking_Graph_Compression.pdf (md5:424456ba56a4f0d952d02f6b4158d9f4) | 395.4 kB |