Sebastian Jaszczur
Uniwersytet Warszawski
H-Index: 3
Europe-Poland
Top articles of Sebastian Jaszczur
Scaling Laws for Fine-Grained Mixture of Experts
arXiv preprint arXiv:2402.07871
2024/2/12
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
arXiv preprint arXiv:2401.04081
2024/1/8
Sebastian Jaszczur
H-Index: 2
Structured Packing in LLM Training Improves Long Context Utilization
arXiv preprint arXiv:2312.17296
2023/12/28
Sebastian Jaszczur
H-Index: 2
Piotr Miłoś
H-Index: 9
Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation
arXiv preprint arXiv:2310.15961
2023/10/24
Sebastian Jaszczur
H-Index: 2
Marek Cygan
H-Index: 25
Sparse attention neural networks
2022/8/11
Sparse is Enough in Scaling Transformers
Advances in Neural Information Processing Systems
2021/12/6
Sebastian Jaszczur
H-Index: 2
Neural heuristics for SAT solving
arXiv preprint arXiv:2005.13406
2020/5/27
Sebastian Jaszczur
H-Index: 2