Published May 11, 2012 | Version v1
Standard Open

XPath Ranking and Graph Compression

Authors/Creators

  • 1. Google (United States)

Contributors

  • 1. Google (United States)

Description

This paper presents a collection of algorithms addressing two related problems in web data management: automatic wrapper generation for structured data extraction and efficient compression of large-scale graph structures. For the extraction problem, we develop a method for on-the-fly wrapper creation that leverages XPath expressions ranked by their discriminative power over HTML and XML document collections. For the compression problem, we propose techniques for reducing the storage requirements of adjacency list representations, with particular focus on the structural properties exhibited by web graphs and social networks. Additionally, we investigate the relationship between maximum flow and minimum cut in capacitated networks, presenting bounds on the max-flow min-cut gap and approximation algorithms for the multicut problem. Finally, we address fault tolerance in mesh-connected architectures through deep emulations that enable a fault-free mesh to be simulated on a mesh containing random faults.
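As a rough illustration of the adjacency list compression problem mentioned above, the following minimal sketch applies gap encoding with variable-length integers, a widely used baseline for web graph storage. It is not taken from the paper, whose actual techniques are not described on this page. The function names (`encode_varint`, `compress_neighbors`, `decompress_neighbors`) are hypothetical, and the sketch assumes node identifiers are non-negative integers.

```python
from typing import Iterable, List


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def compress_neighbors(neighbors: Iterable[int]) -> bytes:
    """Sort and deduplicate one adjacency list, then store the gaps
    between consecutive IDs (hypothetical helper, not from the paper)."""
    out = bytearray()
    prev = 0
    for v in sorted(set(neighbors)):
        out += encode_varint(v - prev)  # the first gap is measured from 0
        prev = v
    return bytes(out)


def decompress_neighbors(data: bytes) -> List[int]:
    """Invert compress_neighbors: decode the varints and prefix-sum the gaps."""
    result: List[int] = []
    prev = 0
    cur = 0
    shift = 0
    for b in data:
        cur |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            prev += cur
            result.append(prev)
            cur = 0
            shift = 0
    return result
```

Gap encoding pays off when neighbor IDs cluster, as they tend to do in web graphs whose nodes are numbered in URL order: small gaps fit in a single byte even when the IDs themselves are large.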

Files

XPath_Ranking_Graph_Compression.pdf

Files (395.4 kB)

md5:424456ba56a4f0d952d02f6b4158d9f4