Here is the full thesis if you're interested: https://research.vu.nl/ws/portalfiles/portal/355675396/dissertationlaurerfinal+-+66c885c7e9d0b.pdf
Here is the collection of my most recent models: https://huggingface.co/collections/MoritzLaurer/zeroshot-classifiers-6548b4ff407bb19ff5c3ad6f
#!pip install "huggingface_hub>=0.25.0"
from huggingface_hub import InferenceClient

client = InferenceClient(
    base_url="https://huggingface.co/api/integrations/dgx/v1",
    api_key="MY_FINEGRAINED_ENTERPRISE_ORG_TOKEN",  # see docs: https://huggingface.co/blog/inference-dgx-cloud#create-a-fine-grained-token
)

output = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-405B-Instruct-FP8",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count to 10"},
    ],
    max_tokens=1024,
)

print(output)
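The client returns an OpenAI-compatible chat completion object, so the generated text sits in `output.choices[0].message.content` rather than at the top level. A minimal sketch of pulling it out, using a plain dict as an illustrative stand-in for the real response (calling the actual endpoint requires a valid token):

```python
# Illustrative dict mirroring the shape of an OpenAI-compatible
# chat completion response; stands in for the object returned by
# client.chat.completions.create (assumed content for the example).
fake_output = {
    "choices": [
        {"message": {"role": "assistant", "content": "1 2 3 4 5 6 7 8 9 10"}}
    ]
}

# With the real InferenceClient response object you would write:
#   text = output.choices[0].message.content
text = fake_output["choices"][0]["message"]["content"]
print(text)
```

Printing the whole `output` object is useful for debugging, but in practice you usually only want the assistant message shown above.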
@HAMRONI can you share the full inference code that caused this error? You can open a discussion in the model repo.