Dr V. Holiday 2023 Viewing Guide
Charles F. Vardeman II
Center for Research Computing, University of Notre Dame
2023-12-01
Goal: To help prepare you for those difficult holiday conversations…
Like: How does ChatGPT work?
Pay attention to the section on LLM security at the end of the talk.
Making LLMs “uncool” (Language Warning)
Making Large Language Models Uncool Again: YouTube
Uncool “takeaways”
- ~30B-parameter models are a missed opportunity
- We are “fine-tuning” wrong
- Uncertain future directions for small (fine-tuned) vs large (API) models
- LLM architecture not the path to AG(S)I
Deep dive into understanding LLMs
What is ChatGPT doing…and why does it work? YouTube
One Year of ChatGPT!
Chen, Hailin, Fangkai Jiao, Xingxuan Li, Chengwei Qin, Mathieu Ravaut, Ruochen Zhao, Caiming Xiong, and Shafiq Joty. 2023. “ChatGPT’s One-Year Anniversary: Are Open-Source Large Language Models Catching Up?” arXiv. http://arxiv.org/abs/2311.16989.
LLM Capabilities
Chen et al. 2023 (full citation under “One Year of ChatGPT!”).
Agent Capabilities
Chen et al. 2023 (full citation under “One Year of ChatGPT!”).
Mixture of Experts (MoE)
MoE
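For readers who want something concrete to go with the MoE slide, here is a minimal sketch of top-k expert routing, the idea most Mixture-of-Experts layers are built around: a gate scores every expert for an input, only the k best experts run, and their outputs are mixed by softmax weights. Everything here is an illustrative assumption (the names `top_k_route` and `moe_forward`, the scalar toy experts, and the hand-written gate); real models use learned gates over vectors, and this is not a description of any particular model.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected scores only, so they sum to 1.
    """
    order = sorted(range(len(gate_scores)),
                   key=lambda i: gate_scores[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, gate, k=2):
    """Run only the selected experts and mix their outputs."""
    routes = top_k_route(gate(x), k)
    return sum(w * experts[i](x) for i, w in routes)

# Toy setup (hypothetical): four scalar "experts" and a hand-written gate.
experts = [
    lambda x: x + 1.0,
    lambda x: 2.0 * x,
    lambda x: -x,
    lambda x: x * x,
]
gate = lambda x: [0.1 * x, 0.5 * x, -1.0, 0.2]

routes = top_k_route(gate(3.0), k=2)
y = moe_forward(3.0, experts, gate, k=2)
print(routes, y)
```

With k=2 out of four experts, only half of the expert functions are evaluated per input, which is the source of the compute savings usually attributed to MoE.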
Determinism vs Stochasticity?
- Determinism through RAG and tool use
- Tool code created by LLM-based co-pilots or agents
- What “Programming Language” should we use for the deterministic part?
- We don’t have an integrated paradigm for what “systems engineering” means in an age of AI
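The split in the bullets above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any specific product works: `sample_token` stands in for the stochastic part (sampling from a temperature-scaled softmax), while `run_tool_call` stands in for the deterministic part (ordinary code invoked with structured arguments). The tool registry `TOOLS`, the call format, and all function names are hypothetical.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Stochastic part: sample an index from softmax(logits / temperature)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]  # unnormalized softmax
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def greedy_token(logits):
    """Deterministic decoding: always pick the highest-scoring index."""
    return max(range(len(logits)), key=lambda i: logits[i])

def multiply(a, b):
    """A deterministic tool: plain code, same inputs give the same output."""
    return a * b

# Hypothetical tool registry; a model would only choose the name and arguments.
TOOLS = {"multiply": multiply}

def run_tool_call(call):
    """Execute a structured tool call such as {"tool": ..., "args": [...]}."""
    return TOOLS[call["tool"]](*call["args"])

logits = [0.1, 2.0, 0.5]
draws = [sample_token(logits, 1.0, random.Random(seed)) for seed in range(5)]
result = run_tool_call({"tool": "multiply", "args": [12345, 6789]})
print(draws, greedy_token(logits), result)
```

The sampled indices can differ from seed to seed, whereas the tool result is fixed by its arguments. Which language the tool side should be written in is exactly the open question the slide raises; Python is used here only for convenience.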