Wikiann Prompts #772

Open
wants to merge 4 commits into
base: eval-hackathon

jzf2101 (Collaborator) commented May 13, 2022

No description provided.

jzf2101 requested a review from awebson May 25, 2022 16:20

awebson (Contributor) left a comment

Thanks Jess! NER is just hard to adapt to a prompt form because the task itself is kind of unnatural. Normal humans don't carry around a vector for each word that can be fed into a classifier. Recall that we talked about this with @yongzx on Slack:

Unless we are talking about entity classification (where we already know what the entity is and we want to figure out its type). NER, on the other hand, is entity extraction + entity classification.

Your prompts give the model the gold span as opposed to letting the model extract the span, so that's already a non-original task. Further, several of your prompts ask for a binary classification, not a multiple choice among location, person, and organization. So those are also non-original task prompts. Right?
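To make the distinction concrete, here is a rough sketch of the three prompt shapes under discussion: binary classification of a gold span, multiple-choice classification of a gold span, and full extraction plus classification. It is only an illustration, not the PR's templates. The record values, the `LABELS` mapping, and all function names are hypothetical; only the `tokens` and `spans` field names come from the templates in this PR.

```python
# Hypothetical WikiANN-style record. The field names (tokens, spans) are the
# ones the PR's templates reference; the values are invented for illustration.
example = {
    "tokens": ["Marie", "Curie", "worked", "in", "Paris", "."],
    "spans": ["PER: Marie Curie", "LOC: Paris"],
}

# Assumed mapping from tag prefixes to the three types named in the review.
LABELS = {"PER": "person", "LOC": "location", "ORG": "organization"}


def parse_span(span):
    """Split 'LOC: Paris' into ('LOC', 'Paris'), trimming whitespace."""
    tag, _, text = span.partition(":")
    return tag.strip(), text.strip()


def binary_prompt(ex, label="LOC"):
    """Non-original task: the gold span is given and the answer is Yes/No."""
    tag, text = parse_span(ex["spans"][0])
    sentence = " ".join(ex["tokens"])
    prompt = f'In the sentence "{sentence}", is {text} a {LABELS[label]}?'
    return prompt, "Yes" if tag == label else "No"


def multiple_choice_prompt(ex):
    """Entity classification: the gold span is given, the type is the answer."""
    tag, text = parse_span(ex["spans"][0])
    sentence = " ".join(ex["tokens"])
    choices = ", ".join(LABELS.values())
    prompt = (
        f'In the sentence "{sentence}", what type of entity is {text}? '
        f"Choose from: {choices}."
    )
    return prompt, LABELS[tag]


def extraction_prompt(ex):
    """Closest to full NER: the model must find the spans and type them."""
    sentence = " ".join(ex["tokens"])
    prompt = f'List every named entity in the sentence "{sentence}" with its type.'
    target = "; ".join(
        f"{text}: {LABELS[tag]}" for tag, text in map(parse_span, ex["spans"])
    )
    return prompt, target
```

Only the last shape asks the model to do both extraction and classification; the first two hand it the span, which is why they would count as non-original task prompts.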

Lastly, are you sure the metrics are "AUC, Accuracy, COQA F1, and Other"?

41b109ce-7504-4263-b455-315fa7a7f3c2: !Template
answer_choices: Yes ||| No
id: 41b109ce-7504-4263-b455-315fa7a7f3c2
jinja: 'Given, "{{'' ''.join(tokens)}}", is {{spans[0].split('':'')[1]}} a location?

Contributor

Why is the '' ''.join(tokens) here joining using multiple spaces?
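One likely explanation, offered as a guess rather than a confirmed diagnosis: the `jinja` value is a YAML single-quoted string, and inside such a string a literal `'` is written as `''`. Under that reading, `'' ''` in the file is just `' '` once the YAML is loaded, so the join uses a single space. The helper below is a hypothetical stand-in that mimics only this one escaping rule; it is not a YAML parser.

```python
def unescape_yaml_single_quoted(body):
    """Undo YAML single-quoted escaping, where '' stands for one literal '.

    Hypothetical helper for illustration; it handles only this one rule.
    """
    return body.replace("''", "'")


# The fragment as it appears inside the single-quoted YAML value.
raw = "{{'' ''.join(tokens)}}"

# What the template engine would see after YAML loading.
jinja_source = unescape_yaml_single_quoted(raw)

# The equivalent Python join, on made-up tokens.
tokens = ["Paris", "is", "big"]
joined = " ".join(tokens)
```

The same rule would apply to `split('':'')` in the template, which becomes `split(':')` after loading.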

awebson (Contributor) commented May 30, 2022

Also, the targets of your *_bool prompts should be, e.g., "False" as a single target sequence. Right now it's saying this example has "F", "a", "l", "s", "e" as five separate correct answers.
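A symptom like this usually appears when a bare string ends up where a list of target sequences is expected, because iterating over a Python string yields its characters. The sketch below shows that failure mode and one way to guard against it. It assumes the evaluation code iterates over the targets, which this thread does not confirm, and `as_target_list` and `exact_match` are hypothetical names, not part of any library.

```python
def as_target_list(target):
    """Wrap a bare string so it counts as exactly one target sequence."""
    if isinstance(target, str):
        return [target]
    return list(target)


def exact_match(prediction, targets):
    """Return True if the prediction equals any of the target sequences."""
    return any(prediction == t for t in targets)


# Iterating over a bare string yields one "target" per character.
buggy_targets = list("False")

# Wrapping the string first keeps it as a single target sequence.
fixed_targets = as_target_list("False")

buggy_hit = exact_match("False", buggy_targets)
fixed_hit = exact_match("False", fixed_targets)
```

With the bare string, a model that answers "False" is scored as wrong, while one that answers "F" would be scored as right.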

[Screenshot 2022-05-29 20 34 22]

53bd1751-f0e6-4ca8-84d4-80c1c09a230d: !Template
answer_choices: True ||| False
id: 53bd1751-f0e6-4ca8-84d4-80c1c09a230d
jinja: 'Given the following information:

awebson (Contributor) May 30, 2022

As opposed to "Given", or "Given the following information:", maybe it's more natural to say something like "In the sentence…"?

yongzx (Contributor) commented Jun 1, 2022

I am working on this right now, but I ran into a weird GH issue (I've been Googling the whole day to debug it): there are wikiann en templates in the folder, but streamlit doesn't load them. Do you know how to resolve this?

[Attached images: Screen Shot 2022-06-01 at 3 42 26 PM, image]

4 participants