bc
← tags

tag

#pytorch

projects
GenLMcompleted

Genomic language model pretrained from scratch on 13 Drosophila species genomes — 12-layer dilated CNN with a custom character-level tokenizer, trained to predict variant effects across the genome.

ChottaLLMcompleted

Small-scale LLM built from scratch in PyTorch — character-level tokenizer, transformer architecture, trained on custom text corpora.

GPT-style autoregressive language model implemented in PyTorch — full training loop, attention, positional encoding, and character-level generation.

Distributed Data Parallel training implemented from first principles in PyTorch — gradient synchronization, process groups, and multi-GPU scaling without using the DDP wrapper.