Australasian Language Technology Workshop 2013
Tutorials at ALTA 2013

We are pleased to announce that ALTA 2013 will include pre-workshop tutorials on 4 December 2013. Schedules for these tutorials are available in the programme.

Working with the HCS vLab
Dominique Estival (University of Western Sydney) and Steve Cassidy (Macquarie University)

The Human Communication Science Virtual Laboratory (HCS vLab) will run a half-day workshop. The HCS vLab provides an on-line infrastructure for accessing human communication corpora (speech, text, music, sounds, video, etc.) and for using specialised tools for searching, analysing and annotating that data. The aims of the HCS vLab are to:

Applying Wikipedia as a machine-readable knowledge base
David Milne (CSIRO)

What if your search engine, recommender or clustering algorithm could consult Wikipedia as easily as we do, to understand more about the documents it encounters? This is not a far-fetched idea. While clearly intended for human readers, the raw structure of Wikipedia bears a striking resemblance to traditional knowledge bases and provides many footholds for algorithms to extract machine-readable knowledge. In this tutorial, we will work with Wikipedia to augment and enhance other textual information sources. This is broken down into three key problems:
For each of the three problems described above we will provide live demonstrations and hands-on activities. For the extraction problem, we present an extremely large thesaurus-like structure that has been automatically generated from Wikipedia, and show how it can be reasoned over (in a rough fashion) by machines. For the connection task, we demonstrate an algorithm that can automatically detect and disambiguate Wikipedia topics when they are mentioned in any textual document, and intelligently predict those that are most likely to be of interest to the reader. For the final problem, we present several end-user applications that combine the work described above with slick visualisation techniques, to provide enhanced browsing and searching experiences. All of the presented systems are open source and publicly available on the web.