| dc.contributor.author |
Weatherston, M. |
|
| dc.contributor.author |
Obregon, A. |
|
| dc.contributor.author |
Li, Longzhuang |
|
| dc.contributor.author |
Liu, Yonghuai |
|
| dc.date.accessioned |
2007-12-05T12:51:32Z |
|
| dc.date.available |
2007-12-05T12:51:32Z |
|
| dc.date.issued |
2007-08-13 |
|
| dc.identifier.citation |
Weatherston, M., Obregon, A., Li, L. & Liu, Y. 2007, 'Visual Segmentation-Based Data Record Extraction From Web Documents', in: International Conference on Information Reuse and Integration. pp. 502-507, International Conference on Information Reuse and Integration, Las Vegas, United States, 13-15 August. |
en |
| dc.identifier.citation |
conference |
en |
| dc.identifier.isbn |
1-4244-1500-4 |
|
| dc.identifier.other |
PURE: 575484 |
|
| dc.identifier.other |
dspace: 2160/386 |
|
| dc.identifier.uri |
http://hdl.handle.net/2160/386 |
|
| dc.identifier.uri |
http://ieeexplore.ieee.org/iel5/4296570/4296571/04296670.pdf |
en |
| dc.description |
Li, Longzhuang, Liu, Yonghuai, Obregon, A., Weatherston, M. Visual Segmentation-Based Data Record Extraction From Web Documents. Proceedings of IEEE International Conference on Information Reuse and Integration, 2007, pp. 502-507. Sponsorship: IEEE |
en |
| dc.description.abstract |
Semi-structured data records contained in Web pages provide useful information for shopping agents and metasearch engines. In this paper, we present a visual segmentation-based data record extraction (VSDR) method to extract data records from such Web pages. The VSDR method first segments a Web page into semantic blocks using the spatial closeness and visual resemblance of data records; neighboring and non-neighboring data records are then extracted using a compress-and-collapse technique. Experimental results show that, unlike existing methods, which only generate good results on their test domains, VSDR is a general data record extraction method that produces stable and good results on a wide range of Web pages. |
en |
| dc.format.extent |
6 |
en |
| dc.language.iso |
eng |
|
| dc.relation.ispartof |
International Conference on Information Reuse and Integration |
en |
| dc.title |
Visual Segmentation-Based Data Record Extraction From Web Documents |
en |
| dc.type |
Text |
en |
| dc.type.publicationtype |
Conference proceeding |
en |
| dc.identifier.doi |
http://dx.doi.org/10.1109/IRI.2007.4296670 |
|
| dc.contributor.institution |
Department of Computer Science |
en |
| dc.contributor.institution |
Vision, Graphics and Visualisation Group |
en |