Model Card for gherke/mistral-7b-quantized-lora-finetuned

This is a financial sentiment analysis model, fine-tuned to return a sentiment score between -1 (very negative) and 1 (very positive). It was trained specifically to analyze financial news and assess its likely impact on financial market trends.

Model Details

Model Description

This is a quantized version of the Mistral-7B model, fine-tuned with LoRA (Low-Rank Adaptation) for sentiment analysis. The model was trained on the takala/financial_phrasebank dataset to detect sentiment in economic or market-relevant text. The output is a single sentiment score between -1 (very negative) and 1 (very positive).

  • Developed by: Gabriella Herke
  • Funded by [optional]: [More Information Needed]
  • Shared by [optional]: Gabriella Herke
  • Model type: Causal Language Model fine-tuned for Sentiment Analysis
  • Language(s) (NLP): English
  • License: [More Information Needed]
  • Finetuned from model [optional]: thesven/Mistral-7B-Instruct-v0.3-GPTQ

Model Sources [optional]

Uses

Direct Use

The model can be used for analyzing financial news and producing a sentiment score that indicates the potential impact on financial market trends.

Downstream Use [optional]

The model can be incorporated into larger financial analysis pipelines or trading bots to assess market sentiment.

Out-of-Scope Use

The model should not be used for general sentiment analysis outside of the financial context, as it was specifically trained on financial news.

Bias, Risks, and Limitations

The model is limited by the nature of its training dataset (takala/financial_phrasebank), which may not be representative of all financial scenarios or market conditions. It may produce biased results if applied to other sectors.

Recommendations

Users (both direct and downstream) should be aware of the risks, biases, and limitations of the model. Careful consideration is needed when applying the sentiment scores in automated trading decisions, as biases in the data can lead to incorrect assessments.

How to Get Started with the Model

Use the code below to get started with the model:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub repository that hosts the fine-tuned model
model_name = "gherke/mistral-7b-quantized-lora-finetuned"

# Load the model weights and the matching tokenizer
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
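
Because this is a causal language model, the sentiment score arrives as generated text and has to be parsed into a number. The helper below is a minimal sketch, assuming the generated continuation contains the score as a plain decimal. The name `parse_sentiment_score` and the regular expression are illustrative and are not part of the released model or of Transformers.

```python
import re
from typing import Optional

# Hypothetical helper: matches an optionally signed integer or decimal.
_NUMBER = re.compile(r"[-+]?\d*\.?\d+")


def parse_sentiment_score(generated_text: str) -> Optional[float]:
    """Return the first number found in the text, clamped to [-1, 1].

    Returns None when the text contains no number at all.
    """
    match = _NUMBER.search(generated_text)
    if match is None:
        return None
    # Clamp so that malformed generations cannot leave the documented range.
    return max(-1.0, min(1.0, float(match.group())))
```

Pass only the newly generated text, not the prompt, so that numbers in the news item itself are not mistaken for the score.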

Training Details

Training Data

The model was fine-tuned on the takala/financial_phrasebank dataset, which contains financial news phrases labeled for sentiment (positive, negative, neutral).
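
The dataset provides categorical labels, while the model returns a score between -1 and 1. This card does not state how labels were converted into training targets. The sketch below shows one plausible mapping, offered purely as an assumption; `LABEL_TO_SCORE` and `label_to_score` are hypothetical names.

```python
# Assumed mapping: the card does not document how labels became scores.
LABEL_TO_SCORE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def label_to_score(label: str) -> float:
    """Convert a text sentiment label to a numeric target in [-1, 1]."""
    # Normalize case and whitespace; unknown labels raise KeyError.
    return LABEL_TO_SCORE[label.strip().lower()]
```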

Training Procedure

Preprocessing [optional]

The financial news phrases were tokenized and preprocessed using the Hugging Face tokenizer, with truncation applied for long texts.

Training Hyperparameters

  • Training regime: 4-bit quantized training with LoRA adaptation
  • Learning Rate: 2e-4
  • Batch Size: 8
  • Number of Epochs: 20

Speeds, Sizes, Times [optional]

Training was performed using an 8-bit paged AdamW optimizer, with gradient accumulation steps set to 4.
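
Gradient accumulation changes the number of examples that contribute to each optimizer step. The short calculation below is a sketch, assuming the batch size of 8 listed under Training Hyperparameters is the per-device batch and that training ran on one GPU; `effective_batch_size` is an illustrative helper.

```python
def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


# Batch size 8 with 4 accumulation steps on a single device.
print(effective_batch_size(8, 4))  # 32
```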

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on the test split of the takala/financial_phrasebank dataset.

Factors

The evaluation measured how accurately the model predicts sentiment in financial contexts.

Metrics

Mean Squared Error (MSE) and correlation with human-labeled sentiment scores were used to evaluate model performance.
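
Both metrics can be computed directly from paired lists of reference and predicted scores. The sketch below uses only the standard library and assumes Pearson correlation, since the card does not say which correlation coefficient was used. The function names are illustrative.

```python
import math
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between reference and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two score lists."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A lower MSE means predictions sit closer to the reference scores, while a correlation near 1 means the model ranks items in the same order as the human labels even if the scale differs.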

Results

The model predicted sentiment scores with reasonable accuracy within the financial domain. It performed well on positive and negative examples but had some difficulty identifying neutral cases.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA A100 GPU
  • Hours used: 15 hours
  • Cloud Provider: AWS
  • Compute Region: Europe (London)
  • Carbon Emitted: Approximately 25 kg CO2eq
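
An estimate of this kind is usually the energy consumed multiplied by the carbon intensity of the local grid. The function below is a sketch of that arithmetic only. The average power draw, the grid intensity, and the data-center overhead factor (PUE) are not stated in this card, so they are left as parameters rather than filled in; `estimate_co2eq_kg` is a hypothetical name.

```python
def estimate_co2eq_kg(hours: float,
                      avg_power_kw: float,
                      grid_kg_co2eq_per_kwh: float,
                      pue: float = 1.0) -> float:
    """Estimate emissions as energy (kWh) times grid carbon intensity.

    pue is the data-center overhead multiplier; 1.0 means no overhead
    (an assumption, since the card does not report it).
    """
    energy_kwh = hours * avg_power_kw * pue
    return energy_kwh * grid_kg_co2eq_per_kwh
```

Supplying the measured power draw and the published grid intensity for the compute region gives a figure that can be compared with the value reported above.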

Technical Specifications [optional]

Model Architecture and Objective

The model is based on the Mistral-7B architecture, with LoRA applied for efficient fine-tuning in the sentiment analysis task.
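
For reference, the standard LoRA formulation keeps each adapted weight matrix $W_0 \in \mathbb{R}^{d \times k}$ frozen and learns a low-rank update in its place. The rank $r$ and scaling factor $\alpha$ used for this model are not stated in this card.

```latex
h = W_0 x + \Delta W x = W_0 x + \frac{\alpha}{r} B A x,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)
```

Only $A$ and $B$ receive gradients, so each adapted matrix adds $r(d + k)$ trainable parameters instead of $d \cdot k$.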

Compute Infrastructure

Hardware

The model was trained on a single NVIDIA A100 GPU with 40 GB VRAM.

Software

  • Transformers Library: Hugging Face Transformers v4.31.0
  • PEFT Library: v0.12.0

More Information [optional]

For further questions or inquiries about this model, please reach out to Gabriella Herke.

Model Card Contact

For more information, contact Gabriella Herke at [contact information].

Framework versions

  • PEFT 0.12.0