Edge LLMs Challenge: Training from Scratch

Task Description

Train a language model from scratch, without using any pre-trained LLMs.

Participation Requirements

  • The model must run on a device with 12 GB of DRAM (see the sizing sketch after this list)
  • The model must be submitted in FP16 or FP32 format (no quantization allowed)
  • Only the C4 and Alpaca datasets may be used for training and fine-tuning
  • You may not use pre-trained LLMs
  • You may not quantize the model
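
As a rough, unofficial sizing check for the 12 GB DRAM constraint: FP16 weights cost 2 bytes per parameter, so the weights alone of an N-parameter model need about 2N bytes, before activations and runtime overhead. The sketch below is plain Python; the example model sizes are illustrative assumptions, not competition guidance:

# Illustrative, unofficial sizing check: FP16 stores 2 bytes per parameter.
def fp16_weight_gib(num_params: float) -> float:
    """Approximate FP16 weight memory in GiB for a given parameter count."""
    return num_params * 2 / 1024**3

# Example sizes (assumptions for illustration only).
for n in (125e6, 350e6, 1.3e9, 2.7e9):
    print(f"{n / 1e9:.2f}B params -> {fp16_weight_gib(n):.2f} GiB of FP16 weights")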

Datasets

Only the C4 and Alpaca datasets are allowed for training and fine-tuning:

  • load_dataset('AlgorithmicResearchGroup/edge_llm_training', 'c4_combined_dataset')
  • load_dataset('AlgorithmicResearchGroup/edge_llm_training', 'alpaca_cleand')

Dataset structure:

alpaca: 
DatasetDict({
    train: Dataset({
        features: ['output', 'input', 'instruction'],
        num_rows: 51760
    })
})
c4_combined_dataset:
Dataset({
    features: ['text'],
    num_rows: 989000
})

Evaluation Process

You may run the following command to evaluate your model:

lm_eval --model hf \
        --model_args pretrained="<path_to_your_model>" \
        --tasks mmlu  \
        --device cuda:0 \
        --batch_size 8
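
Because the submission must be unquantized FP16 or FP32, the checkpoint can be exported with standard Hugging Face Transformers calls before running the harness. A sketch under that assumption; "my_run/checkpoint" and "my_submission" are hypothetical placeholder paths:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Reload the trained checkpoint in FP16 (quantized formats are not allowed).
# "my_run/checkpoint" is a hypothetical path to your training output.
model = AutoModelForCausalLM.from_pretrained("my_run/checkpoint", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("my_run/checkpoint")

# Save to a directory that lm_eval can target via pretrained="<path_to_your_model>".
model.save_pretrained("my_submission")
tokenizer.save_pretrained("my_submission")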

Hardware Constraints

  • One NVIDIA A100 40 GB GPU

Time Constraints

  • 24-hour time limit

Additional Resources

Starter code: https://github.com/TianjinYellow/EdgeDeviceLLMCompetition-Starting-Kit?tab=readme-ov-file#submission-requirements

Hugging Face Transformers: https://github.com/huggingface/transformers