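Key point: MLX is a framework for training large models, optimized for the ARM processors in Macs (Apple silicon).
After reading through a lot of material, setup mostly comes down to one pip install: huggingface_hub mlx-lm transformers torch numpy
Then fine-tune against the corresponding model files.
Training data comes in three types: completion (Q&A, one question and one answer), chat (role-based dialogue), and text (a single text field, for training on specific text). All three are JSON; mlx-lm reads them from .jsonl files, one JSON object per line.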
chat format:
{
"messages": [
{
"role": "user",
"content": "How do I use this product?"
},
{
"role": "assistantA",
"content": "To use this product, first insert the batteries and then press the power button."
},
{
"role": "assistantB",
"content": "To operate this product, make sure it's charged, and then follow the instructions in the manual."
}
]
}
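Everything sits under the messages key. The entry whose role is user carries the question in content, and assistantA and assistantB each give a different content in reply. Note that chat templates generally expect the role names system, user and assistant, so custom names such as assistantA and assistantB may not be handled by a given model's template.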
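A chat record like the one above can be serialized to JSONL with a few lines of Python. This is a minimal sketch, not mlx-lm's own tooling: the helper name `dumps_jsonl` is hypothetical, the single `assistant` role is an assumption made so the record matches common chat templates, and the `train.jsonl` file name in the usage comment should be checked against the mlx-lm version in use.

```python
import json

def dumps_jsonl(records):
    """Serialize records as JSONL: one compact JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)

# Assumption: a single "assistant" reply instead of assistantA/assistantB.
chat_record = {
    "messages": [
        {"role": "user", "content": "How do I use this product?"},
        {
            "role": "assistant",
            "content": "To use this product, first insert the batteries "
                       "and then press the power button.",
        },
    ]
}

# Usage (writes a file, so left as a comment here):
#   from pathlib import Path
#   Path("data/train.jsonl").write_text(dumps_jsonl([chat_record]), encoding="utf-8")
```

Keeping serialization separate from file writing makes it easy to inspect the output before it goes into the training directory.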
completion format:
[{
"prompt": "What is the capital of France?",
"completion": "Paris."
},
{
"prompt": "A?",
"completion": "BBBB"
}]
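The example above is a JSON array, while a .jsonl file holds one object per line. A small sketch of the conversion follows, assuming each item carries exactly the `prompt` and `completion` keys shown; the function name `completions_to_jsonl` is hypothetical.

```python
import json

def completions_to_jsonl(pairs):
    """Turn a list of prompt/completion dicts into JSONL text."""
    lines = []
    for i, pair in enumerate(pairs):
        missing = {"prompt", "completion"} - pair.keys()
        if missing:
            raise ValueError(f"record {i} is missing keys: {sorted(missing)}")
        # Keep only the two expected keys, one object per line.
        record = {"prompt": pair["prompt"], "completion": pair["completion"]}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)

pairs = [
    {"prompt": "What is the capital of France?", "completion": "Paris."},
    {"prompt": "A?", "completion": "BBBB"},
]
jsonl_text = completions_to_jsonl(pairs)
```

Validating the keys up front gives a clear error at conversion time rather than a failure in the middle of training.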
text format:
{"text": "table: 1-1000181-1\ncolumns: State/territory, Text/background colour, Format, Current slogan, Current series, Notes\nQ: Tell me what the notes are for South Australia \nA: SELECT Notes FROM 1-1000181-1 WHERE Current slogan = 'SOUTH AUSTRALIA'"}
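Since the three types differ only in their top-level keys, a quick check before training can catch mixed or malformed records. The sketch below is an illustration based on the examples in these notes, not mlx-lm's own detection logic; `detect_format` is a hypothetical helper.

```python
import json

def detect_format(record):
    """Classify a record as 'chat', 'completion', or 'text' by its keys."""
    if isinstance(record.get("messages"), list):
        return "chat"
    if "prompt" in record and "completion" in record:
        return "completion"
    if isinstance(record.get("text"), str):
        return "text"
    raise ValueError(f"unrecognized record keys: {sorted(record)}")

def formats_in_jsonl(jsonl_text):
    """Return the set of formats found in JSONL text, skipping blank lines."""
    return {
        detect_format(json.loads(line))
        for line in jsonl_text.splitlines()
        if line.strip()
    }

sample = (
    '{"text": "Q: Tell me what the notes are for South Australia"}\n'
    '{"prompt": "A?", "completion": "BBBB"}\n'
)
found = formats_in_jsonl(sample)
```

A file meant for one training run would normally yield a single format, so a result set with more than one element is a sign the data needs sorting out.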