Assignment 1: Reading Case Studies from the Handbook of Annotation
Each set of annotation papers below covers a particular kind of annotation, but from different perspectives and/or different research groups. Read both papers in the set you have been assigned (see class email announcement). On Tuesday, Jan 24th you will be given time in class to discuss the papers in groups and come up with a short presentation (~10 minutes) describing the core ideas of the project, including the following:
Assignment Details
- What was the goal of the task?
- Which specific aspect of natural language did they try to capture?
- What hurdles and roadblocks did they encounter while capturing these phenomena?
- How successful was their work, and how can we tell whether it was successful?
- What kinds of applications can use such annotation?
Presentations will be on Friday, January 27. You do not have to talk about the details of the annotation or tools. We will revisit the papers later in the semester, when you will do a deeper dive into the work.
Paper Sets: The papers are available on Latte.
Set 1 PropBank and FrameNet
25. VerbNet/PropBank-based Sense Annotation
Meredith Green, Claire Bonial, Orin Hargraves, Jinying Chen, Lyndsie Clark, and Martha Palmer
27. FrameNet: Frame Semantic Annotation in Practice
Collin Baker
Set 2 Sentiment Analysis
28. MPQA Opinion Corpus
Theresa Wilson, Janyce Wiebe, and Claire Cardie
29. The JDPA Sentiment Corpus for the Automotive Domain
Jason S. Kessler and Nicolas Nicolov
Set 3 Space
36. ISO-Space: Annotating Static and Dynamic Spatial Information
James Pustejovsky
37. Spatial Role Labeling Annotation Scheme
Parisa Kordjamshidi, Martijn van Otterlo, and Marie-Francine Moens
Set 4 Discourse
44. The Penn Discourse Treebank: An Annotated Corpus of Discourse Relations
Rashmi Prasad, Bonnie Webber, and Aravind Joshi
46. Annodis and Related Projects: Case Studies on the Annotation of Discourse Structure
Nicholas Asher, Farah Benamara, Philippe Muller, Stergos Afantenos, and Mai Ho Dac
Set 5 Speech/Dialog
47. NICT Kyoto Dialogue Corpus
Kiyonori Ohtake and Etsuo Mizukami
48. Case Study: The Austalk Corpus
Steve Cassidy, Dominique Estival, and Felicity Cox
Assignment 2: Annotation of dialog and reviews
Each group will be assigned an annotation effort from the set below. Read the paper and look through the guidelines. We’ll provide you with some data to look at before Friday’s class. Your goal in class will be to try to annotate that data following the guidelines and to create a presentation covering the goal of the annotation, a description and assessment of the guidelines (including your experience trying to follow them), and a quick summary of the evaluation results and the kinds of machine learning algorithms that were tried.
Presentations on Tuesday, March 7
- about 10-15 mins.
- Annotation Goal
- Task Description
- Description of corpus
- Annotation Guidelines and your experience applying them
- Results: inter-annotator agreement, evaluation results, 1-2 examples of ML algorithms used on the data (not all of these will be available for every project; just include what you can find)
- What you learned that applies to your project
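If your project's paper reports inter-annotator agreement, it may help to try the calculation on your own in-class annotations. The sketch below computes Cohen's kappa for two annotators, which is one common agreement measure and not necessarily the one used by any of the projects listed here. The function name `cohen_kappa` and the toy labels are made up for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    labels_a and labels_b are equal-length sequences of category labels.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both annotators must label the same non-empty set of items.")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal probabilities of using it, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example (invented labels, not taken from any of the corpora below).
annotator_1 = ["pos", "pos", "neg", "neg", "neu", "pos"]
annotator_2 = ["pos", "neg", "neg", "neg", "neu", "pos"]
print(round(cohen_kappa(annotator_1, annotator_2), 3))  # 0.739
```

Here the annotators agree on 5 of 6 items (observed agreement 0.833), while chance agreement from their label distributions is 13/36, giving kappa = 17/23. Check which measure your assigned paper actually reports before comparing numbers.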
SemEval 2014 Task 4: Aspect Based Sentiment Analysis
Overview Paper
Guidelines
Workshop Proceedings
Project Site
SemEval 2016 Task 5 Aspect Based Sentiment Analysis (ABSA-16)
Overview Paper
Guidelines
Workshop Proceedings
Project Site
ICSI corpus
Overview of the corpus
Paper about the CALO project
Project Site
Dialog Acts
Paper on Dialog Act Corpus
Dialog Act Guidelines
Paper on Dialog Act classification across multiple corpora
Topic Segmentation
Topic Annotation Guidelines
Paper on topic segmentation
Paper (from the AMI corpus work, but it’s pretty good)
Annotating dialogs in the AMI meeting corpus (for reference)
AMI Meeting Corpus
Overview
Overview paper
Dialog Act
Annotation Guidelines for Dialog Act and Addressee
Paper
Paper
Topic Segmentation
Annotation Guidelines for Topic Segmentation
Paper
Assignment N (maybe): Conference papers on annotation
We will be reading, discussing and presenting on these papers in the future.
Assignment Details
Pyry Takala, Pekka Malo, Ankur Sinha, Oskar Ahlgren, “Gold-standard for Topic-specific Sentiment Analysis of Economic Texts”, LREC 2015
Carlson, Lynn, Daniel Marcu, and Mary Ellen Okurowski. “Building a discourse-tagged corpus in the framework of rhetorical structure theory.” Current and new directions in discourse and dialogue. Springer Netherlands, 2003. 85-112.
Miltsakaki, Eleni, et al. “Annotating discourse connectives and their arguments.” Proceedings of the HLT/NAACL Workshop on Frontiers in Corpus Annotation. 2004.
Pustejovsky, James, Jessica L. Moszkowicz, and Marc Verhagen. “Using ISO-Space for annotating spatial information.” Proceedings of the International Conference on Spatial Information Theory. 2011.
Agarwal, Apoorv, et al. “Sentiment analysis of twitter data.” Proceedings of the workshop on languages in social media. Association for Computational Linguistics, 2011.
Poesio, Massimo. “Discourse annotation and semantic annotation in the GNOME corpus.” Proceedings of the 2004 ACL Workshop on Discourse Annotation. Association for Computational Linguistics, 2004.
Asher, Nicholas, et al. “Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus.”
Jennifer D’Souza, Vincent Ng, “Annotating Inter-Sentence Temporal Relations in Clinical Notes”, LREC 2014
Snow, Rion, et al. “Cheap and fast—but is it good?: evaluating non-expert annotations for natural language tasks.” Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics, 2008.