nlp

WordBurner beta

Posted on 2022-04-18 at 23:08:52 UTC-0400

Update 2022-04-27: The beta is over, but the apk is still installable with the instructions below, and I will receive any feedback sent from inside the app. I’m going to be working on this more over the summer, and eventually publishing it on the app store. :)

Ever since learning Spanish, it has been a dream of mine to create a vocabulary study app that meets my needs. Duolingo won’t cover advanced vocabulary, Anki requires manually generated decks, and other apps have expensive subscription plans.

PaLM

Posted on 2022-04-11 at 12:17:25 UTC-0400

I presented this paper at Bang Liu’s research group meeting on 2022-04-11. You can view the slides I used here.

It's not just size that matters: small language models are also few-shot learners

Posted on 2022-02-18 at 13:13:54 UTC-0500

We presented this paper as a mini-lecture in Bang Liu’s IFT6289 course in winter 2022. You can view the slides we used here.

A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification

Posted on 2022-02-02 at 15:35:00 UTC-0500

This post was created as an assignment in Bang Liu’s IFT6289 course in winter 2022. The structure of the post follows the structure of the assignment: summarization followed by my own comments.

Paper summarization

Word embeddings have gotten so good that state-of-the-art sentence classification can often be achieved with just a one-layer convolutional network on top of those embeddings. This paper dials in on the specifics of training that convolutional layer for this downstream sentence classification task.
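To make the "one-layer convolutional network on top of embeddings" idea concrete, here is a minimal forward-pass sketch in plain Python. It is an illustration under assumptions, not the paper's implementation: the filter region sizes (3, 4, 5), ReLU activation, 1-max pooling, the toy dimensions, and the random weights are all stand-ins chosen for the example, and the function names are hypothetical. No training is shown.

```python
import math
import random


def conv_feature(embeddings, filt, bias=0.0):
    """Slide one filter (h x d) over the sentence matrix (n x d),
    apply ReLU at each position, then keep the maximum (1-max pooling)."""
    h = len(filt)
    activations = []
    for start in range(len(embeddings) - h + 1):
        window = embeddings[start:start + h]
        total = bias + sum(
            w * x
            for filt_row, emb_row in zip(filt, window)
            for w, x in zip(filt_row, emb_row)
        )
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # one scalar feature per filter


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def classify(embeddings, filters, out_weights):
    """Concatenate the pooled features and apply a linear softmax layer."""
    features = [conv_feature(embeddings, f) for f in filters]
    scores = [sum(w * f for w, f in zip(row, features)) for row in out_weights]
    return softmax(scores)


# Toy setup: random numbers stand in for pretrained embeddings and learned weights.
rng = random.Random(0)
dim, n_words, n_classes = 4, 7, 2
sentence = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_words)]
filters = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(h)]
    for h in (3, 4, 5)  # assumed region sizes, one filter each
]
out_weights = [[rng.uniform(-1, 1) for _ in filters] for _ in range(n_classes)]

probs = classify(sentence, filters, out_weights)
print(probs)  # class probabilities for the toy sentence
```

Each filter spans the full embedding dimension and a few consecutive words, so it acts roughly like a learned n-gram detector; pooling makes the feature independent of sentence length. The sentence must contain at least as many words as the largest region size.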

Learning transferable visual models from natural language supervision (CLIP)

Posted on 2022-02-02 at 12:35:03 UTC-0500

This post was created as an assignment in Irina Rish’s neural scaling laws course (IFT6167) in winter 2022. The post contains no summarization, only questions and thoughts.

This concept of wide vs. narrow supervision (rather than binary “supervised” and “unsupervised”) is an interesting and flexible way to think about how these training schemes leverage data.

Zero-shot CLIP matches the performance of 4-shot CLIP, which is a surprising result. What do the authors mean when they make this guess about zero-shot’s advantage: