OpenELM-270M-Instruct-ft-imdb

This repository contains OpenELM-270M-Instruct-ft-imdb, a fine-tuned version of apple/OpenELM-270M-Instruct adapted for sentiment analysis of movie reviews using the IMDb dataset. Fine-tuning improves the model's ability to understand and generate text about movie reviews, with a focus on sentiment classification.

Model Details

Model Description

The OpenELM-270M-Instruct-ft-imdb model is a fine-tuned variant of apple/OpenELM-270M-Instruct. The original OpenELM models, developed by Apple Research, form a family of efficient language models designed for strong performance across a range of NLP tasks. This variant has been fine-tuned to better handle sentiment analysis, particularly in the domain of movie reviews.

  • Developed by: Apple Research (Original Model)
  • Fine-tuned by: Amirreza Mohseni
  • Model type: Transformer-based Language Model (Causal LM)
  • Language(s) (NLP): English
  • License: Apple Sample Code License
  • Finetuned from model: apple/OpenELM-270M-Instruct

Model Sources

  • Base model: apple/OpenELM-270M-Instruct (https://huggingface.co/apple/OpenELM-270M-Instruct)
  • Paper: https://arxiv.org/abs/2404.14619

Uses

Direct Use

This model can be used directly to generate or analyze text in the domain of movie reviews. It is particularly suited to sentiment analysis, where the goal is to determine whether the sentiment expressed in a review is positive or negative; a sketch of this usage follows.
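
A minimal sketch of sentiment classification framed as text generation. The Review:/Sentiment: prompt template below is an illustrative assumption, not a format the model is documented to expect:

from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "AmirMohseni/OpenELM-270M-Instruct-ft-imdb"
# OpenELM ships custom modeling code, so trust_remote_code=True is required
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(repo)

review = "A tedious plot rescued by a terrific final act."
# Hypothetical completion-style prompt; adjust the template to your use case
prompt = f"Review: {review}\nSentiment:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))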

Downstream Use

The model can be further fine-tuned for other tasks related to text generation or classification, especially those in the realm of sentiment analysis in media and entertainment.
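
A minimal sketch of such further fine-tuning with the transformers Trainer API; train_data is a hypothetical pre-tokenized dataset for the new task, and all unspecified settings are library defaults:

from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

repo = "AmirMohseni/OpenELM-270M-Instruct-ft-imdb"
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(repo)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA-style tokenizers define no pad token

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="openelm-270m-downstream"),
    train_dataset=train_data,  # hypothetical tokenized dataset for the new task
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()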

Out-of-Scope Use

Applying this model outside the domain of movie-review sentiment analysis is out of scope; in such settings it may produce unreliable, biased, or inappropriate outputs.

Bias, Risks, and Limitations

The model inherits potential biases from both the IMDb dataset and the original training data of the OpenELM models. As a result, it might generate content that is biased or inappropriate, particularly if used in unintended contexts. Users should employ appropriate content filtering and thoroughly evaluate the model before deploying it in sensitive applications.

Recommendations

Users should be aware of the model's limitations and biases. It is recommended to use content filtering mechanisms and to test the model's outputs extensively before deploying it in sensitive or public-facing applications.

How to Get Started with the Model

To use this model, load it with the Hugging Face transformers library. OpenELM models ship custom modeling code, so pass trust_remote_code=True when loading:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "AmirMohseni/OpenELM-270M-Instruct-ft-imdb",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("AmirMohseni/OpenELM-270M-Instruct-ft-imdb")

prompt = "The movie was absolutely fantastic because"
inputs = tokenizer(prompt, return_tensors="pt")
# Cap generation length; otherwise generate() stops only at EOS or the model maximum
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

Training Data

The model was fine-tuned on the IMDb dataset, which consists of 50,000 highly polarized movie reviews labeled as positive or negative. This dataset is widely used for sentiment analysis tasks.
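
For reference, the dataset can be loaded with the Hugging Face datasets library; the "imdb" hub id provides the standard 25,000/25,000 train/test split:

from datasets import load_dataset

imdb = load_dataset("imdb")
print(imdb)                            # DatasetDict with train/test/unsupervised splits
print(imdb["train"][0]["text"][:200])  # raw review text
print(imdb["train"][0]["label"])       # 0 = negative, 1 = positive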

Training Procedure

Preprocessing

Text data from the IMDb dataset was tokenized using the LLaMA tokenizer, focusing on retaining sentiment-rich content to improve the analysis of opinions in movie reviews.
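
A minimal sketch of this tokenization step; the max_length of 512 is an assumed value, not a documented training setting:

from datasets import load_dataset
from transformers import AutoTokenizer

imdb = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("AmirMohseni/OpenELM-270M-Instruct-ft-imdb")

def tokenize(batch):
    # Truncate long reviews; max_length=512 is an assumption
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = imdb.map(tokenize, batched=True)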

Training Hyperparameters

  • Training regime: Mixed Precision (fp16)
  • Batch size: 1
  • Learning rate: 1e-5
  • Epochs: 1
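
Expressed as transformers TrainingArguments, these settings would look roughly as follows; everything not listed above (optimizer, scheduler, weight decay) is left at library defaults:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="openelm-270m-imdb",
    per_device_train_batch_size=1,  # batch size: 1
    learning_rate=1e-5,             # learning rate: 1e-5
    num_train_epochs=1,             # epochs: 1
    fp16=True,                      # mixed precision (fp16)
)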

Citation

If you find this work useful, please cite the original work by Apple Research:

@article{mehtaOpenELMEfficientLanguage2024,
  title   = {{OpenELM}: An Efficient Language Model Family with Open Training and Inference Framework},
  author  = {Mehta, Sachin and Sekhavat, Mohammad Hossein and Cao, Qingqing and Horton, Maxwell and Jin, Yanzi and Sun, Chenfan and Mirzadeh, Iman and Najibi, Mahyar and Belenko, Dmitry and Zatloukal, Peter and Rastegari, Mohammad},
  journal = {arXiv preprint arXiv:2404.14619},
  year    = {2024},
  url     = {https://arxiv.org/abs/2404.14619},
}

Model Card Authors

This model card was generated by Amirreza Mohseni.

Model Card Contact

For questions or comments about this model card, please contact Amirreza Mohseni.
