Add null prompt for gsarti/flores_101 #768

@cjlovering cjlovering commented May 7, 2022

gsarti/flores_101 is a dataset with 102 languages used for language modeling.

  • Added the 102 language subsets
  • The prompt is empty -- just the sentence itself -- as this dataset will be used for language modeling (LM).
  • The metric is set to Other; downstream applications (eval harness) will select the LM metric.
  • It was necessary to add gsarti to the user list because the dataset is listed under that user in Hugging Face Datasets.
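
To illustrate the bullets above, here is a minimal sketch of what a null prompt amounts to, and of one LM metric a downstream harness might choose. It is not the code from this PR. The field name `sentence`, the helper names `null_prompt` and `perplexity`, and the choice of perplexity as the metric are all assumptions made for illustration.

```python
import math


def null_prompt(example, text_field="sentence"):
    """Return the raw text of an example with no instruction around it.

    `text_field` is an assumed column name, not confirmed by this PR.
    """
    return example[text_field]


def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities.

    Computed as exp of the negative mean log-probability. This is one
    possible LM metric; the PR leaves the choice to downstream code.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical example row; real rows come from the dataset itself.
example = {"id": 1, "sentence": "Hello world."}
text = null_prompt(example)  # the sentence, unchanged
```

Because the prompt adds nothing to the input, any score computed on `text` reflects the model's language modeling ability alone, not its response to a particular instruction wording.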

@cjlovering cjlovering marked this pull request as ready for review May 7, 2022 18:09
@cjlovering
Author

Given that this is a null prompt and we're currently using it as a language modeling dataset, we can skip this for now. Using promptsource isn't necessary for that.

@stephenbach
Member

Can this PR be closed if we're not prompting it?

@stephenbach stephenbach self-assigned this May 17, 2022
@rbawden
Contributor

rbawden commented May 23, 2022

Hi there! For your information, I have just prompted this dataset (automatically, for MT). Linking the PR here: #779

@cjlovering cjlovering closed this May 23, 2022