results

This model is a fine-tuned version of defog/llama-3-sqlcoder-8b on an unknown dataset. Its results on the evaluation set at each logged checkpoint appear in the training results table at the end of this card.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training hyperparameters

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/chosen	Rewards/rejected	Rewards/accuracies	Rewards/margins	Logps/rejected	Logps/chosen	Logits/rejected	Logits/chosen	Nll Loss	Log Odds Ratio	Log Odds Chosen
0.9481	0.2	72	0.9541	-0.0776	-0.0797	0.7143	0.0021	-0.7975	-0.7765	-0.3031	-0.3167	0.8875	-0.6703	0.0480
0.7313	0.4	144	0.7089	-0.0551	-0.0596	0.8292	0.0045	-0.5962	-0.5513	-0.1005	-0.1135	0.6459	-0.6312	0.1330
0.547	0.6	216	0.4407	-0.0292	-0.0367	0.8882	0.0075	-0.3670	-0.2924	-0.0064	-0.0109	0.3866	-0.5408	0.3609
0.2547	0.8	288	0.3018	-0.0164	-0.0250	0.8882	0.0085	-0.2498	-0.1644	0.0633	0.0592	0.2551	-0.4664	0.5805
0.3407	1.0	360	0.2568	-0.0126	-0.0217	0.8944	0.0091	-0.2167	-0.1258	0.1307	0.1283	0.2132	-0.4354	0.6768
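
The table can be sanity-checked with a short script. The sketch below copies a subset of its columns by hand and checks two things: that each reported Rewards/margins value matches Rewards/chosen minus Rewards/rejected up to rounding, and which logged step has the lowest validation loss. The names `Row`, `margin_gaps`, and `best_checkpoint` are hypothetical helpers introduced for illustration and are not part of the training code. The column names resemble the metrics logged by TRL's `ORPOTrainer`, but this card does not state which trainer was used, so treat that as an assumption.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    """One logged evaluation step, copied by hand from the table above."""
    step: int
    validation_loss: float
    rewards_chosen: float
    rewards_rejected: float
    rewards_margin: float
    rewards_accuracy: float


ROWS: List[Row] = [
    Row(72, 0.9541, -0.0776, -0.0797, 0.0021, 0.7143),
    Row(144, 0.7089, -0.0551, -0.0596, 0.0045, 0.8292),
    Row(216, 0.4407, -0.0292, -0.0367, 0.0075, 0.8882),
    Row(288, 0.3018, -0.0164, -0.0250, 0.0085, 0.8882),
    Row(360, 0.2568, -0.0126, -0.0217, 0.0091, 0.8944),
]


def margin_gaps(rows: List[Row]) -> List[float]:
    """Absolute gap between the reported margin and chosen minus rejected."""
    return [
        abs(r.rewards_margin - (r.rewards_chosen - r.rewards_rejected))
        for r in rows
    ]


def best_checkpoint(rows: List[Row]) -> Row:
    """Row with the lowest validation loss among the logged steps."""
    return min(rows, key=lambda r: r.validation_loss)


if __name__ == "__main__":
    for row, gap in zip(ROWS, margin_gaps(ROWS)):
        print(f"step {row.step}: margin gap = {gap:.4f}")
    best = best_checkpoint(ROWS)
    print(f"lowest validation loss {best.validation_loss} at step {best.step}")
```

Because the table values are rounded to four decimals, a gap of about 0.0001 (as at step 288) is consistent with rounding rather than a logging error. Validation loss falls at every logged step, so the last step (360, epoch 1.0) is also the best one by this measure.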