Is it possible to make the model return only the response, without the prompt?
#61
by mduran159 · opened
With the example code you posted, I can only make it return the entire prompt with the model's response appended at the end, as is usual with these models. But when you use a pipeline you can avoid all that. Is there a way to make it work like an LLM with a pipeline, so it returns only the model's response/answer?
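(For reference: with a plain text-generation pipeline, the return_full_text flag drops the prompt from the output. A minimal sketch, where the checkpoint name is only a placeholder and may not match the model in this discussion:)

from transformers import pipeline

# return_full_text=False makes the pipeline return only the newly
# generated text, not the prompt. The checkpoint name is a placeholder.
pipe = pipeline("text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct")
out = pipe("What is the capital of France?", max_new_tokens=32, return_full_text=False)
print(out[0]["generated_text"])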
I'm not sure what's conventional, as this is the most I've used transformers, but you can always strip it based on the special tokens, right?
start_token = "<|start_header_id|>assistant<|end_header_id|>"
end_token = "<|eot_id|>"

# find() returns -1 when a token is missing, so check for that before
# offsetting past the assistant header; the original check ran after
# adding len(start_token), so it could never catch a missing token.
start_index = processor_output.find(start_token)
end_index = processor_output.rfind(end_token)
if start_index != -1 and end_index != -1:
    start_index += len(start_token)
    if start_index < end_index:
        content = processor_output[start_index:end_index].strip()
I know skip_special_tokens can be flagged while decoding, but the special tokens seem to provide good structure for extracting just the answer.
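(If you'd rather avoid string matching entirely, another common approach is to slice off the prompt tokens before decoding. A rough sketch, assuming inputs is the processor output you passed to generate and output_ids is what generate returned:)

# Decode only the tokens generated after the prompt; everything up to
# prompt_len belongs to the input and is skipped.
prompt_len = inputs["input_ids"].shape[-1]
response = processor.decode(output_ids[0][prompt_len:], skip_special_tokens=True)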