ARIADNE

Ariadne: A System for Constructing Mediators for Internet Sources

Jose-Luis Ambite, Naveen Ashish, Greg Barish, Craig A. Knoblock, Steve Minton, Pragnesh J. Modi, Ion Muslea, Andrew Philpot and Sheila Tejada 

Information Sciences Institute, Integrated Media Systems Center and 
Department of Computer Science
University of Southern California, Los Angeles, CA 

Introduction

We present Ariadne, a system for constructing mediators for Internet sources. Using such mediators, we can extract, query, and integrate data from multiple Web sources. Ariadne uses essentially the same high-level architecture as database mediators: the mediator communicates with the different information sources through wrappers around those sources and provides integrated access to the information they contain.

Browse through the SIGMOD Demo poster slides

The poster slides give an overview of the Ariadne system architecture and provide a brief description of the different research problems in building such a system.

Technology Demonstration

We present screenshots of the Ariadne system components and applications that we demonstrated at SIGMOD 1998. We first present a demo of the semi-automatic wrapper generation toolkit, the part of the Ariadne system used to build wrappers for semi-structured Web sources with minimal user effort. We then demonstrate three information mediators, each in a different application domain, that we constructed using Ariadne.

Semi-automatic Wrapper Generation

We demonstrate the process of creating a wrapper for a semi-structured Web source using our wrapper generation toolkit. Through a graphical user interface (GUI), the user highlights the fields of information that they want to extract from a page. After the user has provided examples on a few pages, the system attempts to learn landmarks that enable it to extract those fields from any page of the Web source. This demo shows the results of applying our tools to build a wrapper for the Zagat Restaurant Guide information source. The original source consists of a set of HTML pages, each containing the description of an individual restaurant; that is, for each reviewed restaurant, the Zagat Guide provides the following attributes: Name, Cuisine, Address, Phone Number, Rating, and Review.

To create a wrapper for the information source, one must first generate the extraction rules that are used to extract the attributes of interest. Our Semi-automatic Wrapper Induction Tool allows the user to mark up a few instances of the data to be extracted (say, restaurant Name and Cuisine) and then to invoke the STALKER learning algorithm, which generates the appropriate extraction rules from the examples labeled by the user.
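To make the idea of landmark-based extraction rules concrete, the following is a minimal sketch of how a rule might be applied once it has been learned. It is not the STALKER algorithm and does not show rule learning; the page fragment, the rule format, and the names `extract_between`, `apply_rules`, `SAMPLE_PAGE` and `RULES` are all assumptions made for illustration, and the markup is not the real Zagat markup.

```python
from typing import Dict, List, Optional, Tuple


def extract_between(page: str, start_landmarks: List[str], end_landmark: str) -> Optional[str]:
    """Skip past each start landmark in order, then read up to the end landmark.

    Returns None when any landmark is missing from the page.
    """
    pos = 0
    for landmark in start_landmarks:
        idx = page.find(landmark, pos)
        if idx == -1:
            return None
        pos = idx + len(landmark)
    end = page.find(end_landmark, pos)
    if end == -1:
        return None
    return page[pos:end].strip()


# Hypothetical page fragment, invented for this sketch.
SAMPLE_PAGE = (
    "<h1>Chez Example</h1>"
    "<p>Cuisine: <b>French</b></p>"
    "<p>Phone: <i>(310) 555-0100</i></p>"
)

# Hypothetical rules: field -> (start landmarks, end landmark).
# In Ariadne such rules would be learned from user-labeled examples;
# here they are written by hand.
RULES: Dict[str, Tuple[List[str], str]] = {
    "Name": (["<h1>"], "</h1>"),
    "Cuisine": (["Cuisine:", "<b>"], "</b>"),
    "Phone Number": (["Phone:", "<i>"], "</i>"),
}


def apply_rules(page: str, rules: Dict[str, Tuple[List[str], str]]) -> Dict[str, Optional[str]]:
    """Apply every extraction rule to one page and collect the fields."""
    return {field: extract_between(page, starts, end) for field, (starts, end) in rules.items()}


record = apply_rules(SAMPLE_PAGE, RULES)
print(record)
```

Chaining several start landmarks lets a rule locate a field by context (for example, the first bold text after "Cuisine:") rather than by a fixed position on the page.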

Browse Wrapper Generation Demo


Restaurant Locator mediator

This demo applies technology we are developing for the Ariadne system. It shows the results of applying our tools to build an information mediator that can be used to find a restaurant in the greater Los Angeles region. The application uses the Zagat Survey/Los Angeles to obtain information about restaurants and the ETAK geocoder to convert street addresses into latitude/longitude pairs. Finally, it uses the US Census Bureau's Tiger Mapping Service to place all selected restaurants on a dynamically generated map of the Southern California area.
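The flow of data through this mediator can be sketched as a short pipeline. This is only an illustration under stated assumptions: the three real sources are replaced by in-memory stand-ins, every name (`Restaurant`, `restaurant_wrapper`, `geocoder_wrapper`, `locate`) is hypothetical, and the restaurants, addresses and coordinates are invented placeholders rather than data from Zagat or ETAK.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Restaurant:
    name: str
    cuisine: str
    address: str


def restaurant_wrapper(cuisine: str) -> List[Restaurant]:
    """Stand-in for a wrapper around the restaurant source (placeholder rows)."""
    rows = [
        Restaurant("Chez Example", "French", "1 Example St"),
        Restaurant("Sample Sushi", "Japanese", "2 Sample Ave"),
        Restaurant("Bistro Placeholder", "French", "3 Placeholder Blvd"),
    ]
    return [r for r in rows if r.cuisine == cuisine]


def geocoder_wrapper(address: str) -> Optional[Tuple[float, float]]:
    """Stand-in for a wrapper around a geocoder (made-up coordinates)."""
    table = {
        "1 Example St": (34.00, -118.00),
        "2 Sample Ave": (34.10, -118.10),
    }
    return table.get(address)


def locate(cuisine: str) -> List[Dict[str, object]]:
    """Mediator step: join restaurant records with their coordinates."""
    markers = []
    for r in restaurant_wrapper(cuisine):
        coords = geocoder_wrapper(r.address)
        if coords is None:
            continue  # address could not be geocoded, so it cannot be mapped
        markers.append({"name": r.name, "lat": coords[0], "lon": coords[1]})
    return markers


# The resulting markers are what would be handed to a mapping source.
markers = locate("French")
print(markers)
```

Each source is accessed only through its own wrapper function, so the mediator step in `locate` does not need to know how any individual source is queried.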

Browse a cached sample run of the Ariadne Restaurant mediator


Countries Information Mediator

This demo shows the results of applying our tools to build an information mediator for the CIA World Factbook. The original CIA World Factbook consists of an HTML page for each country in the world. While the Factbook is a very rich source of data about individual countries, it is not provided in a form that makes it easy to compare information across countries.

To create a more usable source of data, we applied our work on semi-automatic wrapper generation to create the software required to provide direct access to the original Factbook. We built a simple HTML interface that allows users to submit their queries from the Web. You can compare the same information across several countries by selecting multiple countries and the attributes that you want to compare.
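The kind of cross-country comparison described above can be sketched as a selection over per-country records. This is an assumption-laden illustration, not the mediator's actual query interface: the function `compare`, the record layout, the country names and all values are invented placeholders, standing in for records that a wrapper would extract from the Factbook pages.

```python
from typing import Dict, List, Optional

# Placeholder records, one per country page; names and values are made up.
RECORDS: Dict[str, Dict[str, str]] = {
    "Country A": {"Capital": "City A", "Currency": "A-dollar"},
    "Country B": {"Capital": "City B", "Currency": "B-franc"},
    "Country C": {"Capital": "City C"},  # Currency not available for this country
}


def compare(
    records: Dict[str, Dict[str, str]],
    countries: List[str],
    attributes: List[str],
) -> List[List[Optional[str]]]:
    """Build one row per selected country holding only the selected attributes.

    Missing countries or attributes yield None so that rows stay aligned.
    """
    rows = []
    for country in countries:
        rec = records.get(country, {})
        rows.append([country] + [rec.get(attr) for attr in attributes])
    return rows


table = compare(RECORDS, ["Country A", "Country C"], ["Capital", "Currency"])
for row in table:
    print(row)
```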

Browse a cached sample run of the Ariadne Factbook Mediator


Travel Weather Mediator

This demo shows the results of applying our tools to build an information mediator that predicts flight delays based on real-time weather reports from the YAHOO! weather service.

This application queries the YAHOO! weather service to get weather reports for various cities across the USA. The weather data is then used as input to another information source that predicts whether a given flight will be delayed, based on the weather forecast at the departure and arrival cities. The predictor is a preconstructed Naive Bayes classifier that was trained on historical weather data from the US Weather Service and historical flight delay data from the FAA.
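As an illustration of how a Naive Bayes predictor of this kind works, here is a minimal sketch of a categorical Naive Bayes classifier with Laplace smoothing. It is not the classifier used in the demo: the class `NaiveBayes`, the feature encoding (departure weather, arrival weather), and the six training rows are assumptions invented for this example and are not historical weather or FAA data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        # (class label, feature index) -> counts of each observed value
        self.feature_counts = defaultdict(Counter)
        # feature index -> set of all values seen in training
        self.values = defaultdict(set)

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for i, value in enumerate(row):
                self.feature_counts[(label, i)][value] += 1
                self.values[i].add(value)
        return self

    def log_scores(self, row):
        """Return log P(class) + sum of log P(feature value | class) per class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for i, value in enumerate(row):
                count = self.feature_counts[(label, i)][value]
                denom = n + self.alpha * len(self.values[i])
                score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Toy training set, invented for this sketch: (departure weather, arrival weather).
rows = [
    ("clear", "clear"), ("clear", "clear"), ("clear", "rain"),
    ("snow", "clear"), ("snow", "snow"), ("rain", "snow"),
]
labels = ["on_time", "on_time", "on_time", "delayed", "delayed", "delayed"]

model = NaiveBayes().fit(rows, labels)
print(model.predict(("snow", "snow")))
print(model.predict(("clear", "clear")))
```

Scores are summed in log space to avoid numerical underflow when many features are multiplied, and the smoothing term keeps a weather value that was never seen with a given class from forcing that class's probability to zero.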

This demo highlights Ariadne's ability to provide integrated access to very disparate information sources. Users are able to pose queries that span the multiple sources and the system is able to answer them by automatically retrieving the relevant information.

Browse a cached sample run of the Ariadne Travel Weather Mediator