Edge LLMs Challenge: Training from Scratch
Task Description
Train a language model from scratch without using pre-trained LLMs.
Participation Requirements
- The model must run on a device with 12 GB of DRAM
- The model must be submitted in FP16 or FP32 format; quantization is not allowed
- Only the C4 and Alpaca datasets may be used for training and fine-tuning
- Pre-trained LLMs may not be used; the model must be initialized from scratch (see the sketch after this list)
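Since pre-trained weights are disallowed, the model has to be built from a randomly initialized configuration. Below is a minimal sketch using Hugging Face Transformers; the Llama-style architecture and every hyperparameter in it are illustrative assumptions, not competition requirements.

```python
# Sketch: initialize a causal LM from scratch (no pre-trained weights)
# and save it in FP16. All config values are illustrative assumptions.
import torch
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=32000,              # assumed tokenizer vocabulary size
    hidden_size=1024,
    intermediate_size=2816,
    num_hidden_layers=16,
    num_attention_heads=16,
    max_position_embeddings=2048,
)

# Constructing from a config gives random weights -- no from_pretrained call.
model = LlamaForCausalLM(config)

print(f"Parameters: {model.num_parameters() / 1e6:.1f}M")
print(f"Approx. memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")

# Submissions must be FP16 or FP32; cast and save in FP16 here.
model.to(torch.float16)
model.save_pretrained("my_edge_llm")
```

The memory-footprint check is a quick sanity test against the 12 GB DRAM limit before investing GPU hours.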
Datasets
Only the C4 and Alpaca datasets are allowed for training and fine-tuning:
- load_dataset('AlgorithmicResearchGroup/edge_llm_training', 'c4_combined_dataset')
- load_dataset('AlgorithmicResearchGroup/edge_llm_training', 'alpaca_cleand')
Dataset structure:
alpaca:

    DatasetDict({
        train: Dataset({
            features: ['output', 'input', 'instruction'],
            num_rows: 51760
        })
    })

c4_combined_dataset:

    Dataset({
        features: ['text'],
        num_rows: 989000
    })
Evaluation Process
You may run the following command to evaluate your model:
    lm_eval --model hf \
        --model_args pretrained="<path_to_your_model>" \
        --tasks mmlu \
        --device cuda:0 \
        --batch_size 8
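The harness also exposes a programmatic entry point if you prefer to evaluate from Python. This is a sketch assuming lm-eval v0.4+ (installed via `pip install lm-eval`); replace the placeholder path with your checkpoint.

```python
# Sketch: programmatic MMLU evaluation (lm-eval v0.4+ API assumed).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=<path_to_your_model>",
    tasks=["mmlu"],
    device="cuda:0",
    batch_size=8,
)
print(results["results"])
```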
Hardware Constraints
- One NVIDIA A100 40 GB GPU
Time Constraints
- 24-hour time limit
Additional Resources
Starter code: https://github.com/TianjinYellow/EdgeDeviceLLMCompetition-Starting-Kit?tab=readme-ov-file#submission-requirements
Recommended Libraries
- Hugging Face Transformers
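To tie the pieces together, a minimal pre-training setup with the Trainer API might look like the sketch below. Everything here is a placeholder to be tuned against the single-A100, 24-hour budget: the path `my_tokenizer` assumes a tokenizer you trained and saved yourself, and the split name and hyperparameters are guesses.

```python
# Sketch: minimal from-scratch pre-training loop with the HF Trainer.
# All hyperparameters are illustrative; tune to the 24 h / A100 budget.
from datasets import load_dataset
from transformers import (AutoTokenizer, DataCollatorForLanguageModeling,
                          LlamaConfig, LlamaForCausalLM, Trainer,
                          TrainingArguments)

# Hypothetical path to a tokenizer you trained yourself (e.g., on C4).
tokenizer = AutoTokenizer.from_pretrained("my_tokenizer")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Split name assumed; inspect the loaded object if it differs.
c4 = load_dataset("AlgorithmicResearchGroup/edge_llm_training",
                  "c4_combined_dataset", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = c4.map(tokenize, batched=True, remove_columns=["text"])

# Same small, illustrative architecture as the initialization sketch above.
model = LlamaForCausalLM(LlamaConfig(
    vocab_size=len(tokenizer), hidden_size=1024, intermediate_size=2816,
    num_hidden_layers=16, num_attention_heads=16,
))

args = TrainingArguments(
    output_dir="checkpoints",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,
    learning_rate=3e-4,
    max_steps=10_000,
    bf16=True,                     # train in bf16; submit in FP16/FP32
    logging_steps=100,
    save_steps=1000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```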