Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
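For reference, a minimal sketch of that setting, assuming the standard Jekyll future option in the site's _config.yml:

```yaml
# _config.yml (Jekyll site configuration)
# With future set to false, posts dated in the future are not built
# until their publication date has passed.
future: false
```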

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

WordNet::SenseRelate::AllWords: a broad coverage word sense tagger that maximizes semantic relatedness

Published in The Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2009

This paper is about a method to identify the meaning of words in a given context.

Recommended citation: Ted Pedersen and Varada Kolhatkar. 2009. WordNet::SenseRelate::AllWords - A Broad Coverage Word Sense Tagger that Maximizes Semantic Relatedness. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Demonstration Session, pages 17–20. http://www.aclweb.org/anthology/N09-5005

Resolving this-issue anaphora

Published in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

This paper is about resolving complex anaphora in the form of demonstratives followed by a noun phrase.

Recommended citation: Varada Kolhatkar and Graeme Hirst. 2012. Resolving "this-issue" anaphora. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 1255–1265, Jeju Island, Korea, July. Association for Computational Linguistics. http://www.aclweb.org/anthology/D12-1115

Annotating anaphoric shell nouns with their antecedents

Published in The 7th Linguistic Annotation Workshop and Interoperability with Discourse, 2013

This paper is about annotating complex anaphoric expressions such as this issue or this fact.

Recommended citation: Varada Kolhatkar, Heike Zinsmeister, and Graeme Hirst. 2013. Annotating anaphoric shell nouns with their antecedents. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 112–121, Sofia, Bulgaria, August. Association for Computational Linguistics. http://www.aclweb.org/anthology/W13-2314

Interpreting anaphoric shell nouns using antecedents of cataphoric shell nouns as training data

Published in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

This paper is about interpreting hard cases of anaphoric shell nouns (e.g., this issue) using relatively easy cases of cataphora-like shell nouns (e.g., the issue whether X).

Recommended citation: Varada Kolhatkar, Heike Zinsmeister, and Graeme Hirst. 2013. Interpreting anaphoric shell nouns using antecedents of cataphoric shell nouns as training data. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 300–310, Seattle, Washington, USA, October. Association for Computational Linguistics. http://www.anthology.aclweb.org/D/D13/D13-1030.pdf

Resolving shell nouns

Published in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Recommended citation: Varada Kolhatkar and Graeme Hirst. 2014. Resolving shell nouns. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pages 499–510, Doha, Qatar, October. Association for Computational Linguistics. http://www.aclweb.org/anthology/D14-1056

Constructive Language in News Comments

Published in Proceedings of the First Workshop on Abusive Language Online, 2017

This paper is about identifying constructive comments in news comments.

Recommended citation: Varada Kolhatkar and Maite Taboada. 2017. Constructive Language in News Comments. In Proceedings of the First Workshop on Abusive Language Online. Association for Computational Linguistics, Vancouver, BC, Canada, pages 11-17. http://aclweb.org/anthology/W17-3002

Using New York Times Picks to Identify Constructive Comments

Published in Proceedings of NLP meets journalism workshop, 2017

In this paper, we use New York Times Picks as training examples for constructiveness and build computational models to identify constructive comments in news comments.

Recommended citation: Varada Kolhatkar and Maite Taboada. 2017. Using New York Times Picks to Identify Constructive Comments. In Proceedings of NLP meets journalism workshop, Association for Computational Linguistics, Copenhagen, Denmark, pages 100-105. https://aclanthology.info/papers/W17-4218/w17-4218

Anaphora with Non-nominal Antecedents in Computational Linguistics: A Survey

Published in Computational Linguistics Journal, 2018

This article provides an extensive overview of the literature related to the phenomenon of non-nominal-antecedent anaphora (also known as abstract anaphora or discourse deixis).

Recommended citation: Varada Kolhatkar, Adam Roussel, Stefanie Dipper, and Heike Zinsmeister. 2018. Anaphora with non-nominal antecedents in computational linguistics: A survey. Computational Linguistics, 44(3). https://www.mitpressjournals.org/doi/pdf/10.1162/coli_a_00327

Talks

Journobots: friend or foe?

Published:

One of the key challenges facing online communities, from social networks to the comment sections of news sites, is low-quality contributions. To help with this, comment moderators are often employed to identify the most informative and constructive comments and to shield readers from low-quality or abusive content. For example, The New York Times employs a staff of full-time moderators to review comments; exemplary comments representing a range of views are highlighted and tagged as NYT Picks. With the vast number of online comments, and the growing challenge of how social networks manage toxic language, the role of moderators is becoming much more demanding. There is thus growing interest in developing automation to help filter and organize online comments for both moderators and readers.

We are developing computational methods to identify "constructive" comments automatically. In particular, we have created an annotated corpus of constructive comments, and we have been developing feature-based and deep-learning methods to identify constructive comments posted on opinion articles and editorials. Our models achieve up to 84% accuracy.
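As a rough illustration only (not the corpus or models described above), a minimal feature-based text classifier for this kind of task could be sketched with scikit-learn as follows; the comments and labels below are hypothetical placeholders:

```python
# Toy sketch of a feature-based comment classifier.
# This is NOT the models from the talk; it only illustrates the general setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical placeholder data: comment texts and labels (1 = constructive).
comments = [
    "The article overlooks the provincial data; see the 2016 census tables.",
    "This is garbage, why do you even publish this?",
    "A thoughtful point, though I'd add that the policy changed in 2018.",
    "lol",
]
labels = [1, 0, 1, 0]

# Bag-of-words (tf-idf) features feeding a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(comments, labels)

print(model.predict(["Consider the cost estimates released last week."]))
```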

Teaching

CPSC 322: Introduction to Artificial Intelligence

Computer Science, University of British Columbia, 2019

This course provides an introduction to the field of artificial intelligence (AI). The major topics covered include reasoning and representation, search, constraint satisfaction problems, planning, logic, reasoning under uncertainty, and planning under uncertainty.

DSCI 571: Supervised Machine Learning I

Master of Data Science, University of British Columbia, 2020

Welcome to DSCI 571, an introductory supervised machine learning course! In this course we will focus on basic machine learning concepts such as data splitting, cross-validation, generalization error, overfitting, the fundamental trade-off, the golden rule, and data preprocessing. You will also be exposed to common machine learning algorithms such as decision trees, K-nearest neighbours, SVMs, naive Bayes, and logistic regression using the scikit-learn framework.
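As a small illustrative sketch (not course material) of the data splitting and cross-validation workflow mentioned above, using scikit-learn's built-in iris data:

```python
# Illustrative sketch: train/test splitting and cross-validation in scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out a test set; validate only on the training portion (the golden rule).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

model = DecisionTreeClassifier(max_depth=3, random_state=123)

# 5-fold cross-validation on the training set estimates generalization performance.
print(cross_val_score(model, X_train, y_train, cv=5).mean())

# Final check on the untouched test set.
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```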

DSCI 573: Feature and Model Selection

Master of Data Science, University of British Columbia, 2020

This course is about evaluating and selecting features and models. It covers the following topics: evaluation metrics, feature engineering, feature selection, the role of regularization, loss functions, and feature importances.
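A minimal sketch (illustrative only, not course code) of how regularization and model-based feature selection fit together in scikit-learn:

```python
# Illustrative sketch: L1-regularized logistic regression used for feature selection.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C controls regularization strength (smaller C = stronger regularization);
# the L1 penalty drives some feature coefficients to exactly zero.
selector = SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=0.1))

model = make_pipeline(StandardScaler(), selector, LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Evaluate with a metric other than accuracy.
print(f1_score(y_test, model.predict(X_test)))
```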

DSCI 563: Unsupervised Learning

Master of Data Science, University of British Columbia, 2021

This course is about identifying underlying structure in data. We will talk about clustering, dimensionality reduction, word embeddings, and recommendation systems.
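An illustrative sketch (not course code) of the kind of dimensionality reduction and clustering discussed, again with scikit-learn:

```python
# Illustrative sketch: PCA for dimensionality reduction, then k-means clustering.
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)

# Reduce the 64-dimensional digit images to 2 principal components.
X_2d = PCA(n_components=2).fit_transform(X)

# Group the reduced data into 10 clusters (one hoped-for cluster per digit).
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X_2d)

print(cluster_labels[:20])
```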

CPSC 330: Applied Machine Learning

Computer Science, University of British Columbia, 2022

This course provides a broad introduction to applied machine learning. The topics covered include machine learning terminology and fundamentals; preprocessing; sklearn pipelines and column transformers; building supervised machine learning pipelines; an introduction to unsupervised machine learning; introductions to specialized areas such as natural language processing, computer vision, time series, and survival analysis; communication of results; responsible use of machine learning technology; and model deployment.
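As a small illustration (not course material) of the sklearn pipelines and column transformers mentioned above, with hypothetical column names:

```python
# Illustrative sketch: a ColumnTransformer feeding a supervised pipeline.
# The column names ("age", "salary", "city") are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame(
    {
        "age": [25, 40, 33, 51],
        "salary": [50_000, 90_000, 70_000, 120_000],
        "city": ["Vancouver", "Toronto", "Vancouver", "Calgary"],
        "target": [0, 1, 0, 1],
    }
)

# Scale the numeric columns and one-hot encode the categorical column.
preprocessor = ColumnTransformer(
    [
        ("numeric", StandardScaler(), ["age", "salary"]),
        ("categorical", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ]
)

pipe = make_pipeline(preprocessor, RandomForestClassifier(random_state=0))
pipe.fit(df.drop(columns=["target"]), df["target"])

print(pipe.predict(df.drop(columns=["target"])))
```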