Center for Research Computing, University of Notre Dame
2023-09-05
Jeremy Jordan. “Effective Testing for Machine Learning Systems.”
August 19, 2020. https://www.jeremyjordan.me/testing-ml/.
Building Stuff?
Building Agents based on Large Language Models!
Zhou, Ce, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, et al.
“A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT.”
arXiv, May 1, 2023. https://doi.org/10.48550/arXiv.2302.09419.
“An autoregressive large language model (AR-LLM) is a type of neural network model that can generate natural language text. It has a very large number of parameters (billions or trillions) that are trained on a huge amount of text data from various sources. The main goal of an AR-LLM is to predict the next word or token based on the previous words or tokens in the input text. For example, if the input text is “The sky is”, the AR-LLM might predict “blue” as the next word. AR-LLMs can also generate text from scratch by sampling words from a probability distribution. For example, if the input text is empty, the AR-LLM might generate “Once upon a time, there was a princess who lived in a castle.” as the output text.”1
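The next-token loop described in the quote can be sketched with a toy model. This is only an illustration: the bigram table `NEXT_TOKEN_PROBS`, its probabilities, and the helper names are invented for this example, and a real AR-LLM conditions on the whole context with a neural network rather than on the last token alone.

```python
import random

# Hypothetical toy "model": maps the previous token to a probability
# distribution over the next token. "<s>" and "</s>" mark start and end.
NEXT_TOKEN_PROBS = {
    "<s>": {"The": 1.0},
    "The": {"sky": 1.0},
    "sky": {"is": 1.0},
    "is": {"blue": 0.8, "grey": 0.2},
    "blue": {"</s>": 1.0},
    "grey": {"</s>": 1.0},
}


def greedy_next(token):
    """Pick the most probable next token."""
    dist = NEXT_TOKEN_PROBS[token]
    return max(dist, key=dist.get)


def sample_next(token, rng):
    """Sample the next token from the distribution."""
    dist = NEXT_TOKEN_PROBS[token]
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(prompt, rng=None, max_tokens=10):
    """Extend `prompt` one token at a time until the end marker appears."""
    tokens = list(prompt)
    last = tokens[-1] if tokens else "<s>"
    for _ in range(max_tokens):
        nxt = greedy_next(last) if rng is None else sample_next(last, rng)
        if nxt == "</s>":
            break
        tokens.append(nxt)
        last = nxt
    return tokens


# Continue a prompt greedily, then generate from an empty prompt by sampling.
print(" ".join(generate(["The", "sky", "is"])))
print(" ".join(generate([], rng=random.Random(0))))
```

Passing an `rng` switches from greedy decoding to sampling, which is what lets the same model produce different continuations from the same input.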
Abstract: We show that transformer-based large language models are computationally universal when augmented with an external memory. Any deterministic language model that conditions on strings of bounded length is equivalent to a finite automaton, hence computationally limited. However, augmenting such models with a read-write memory creates the possibility of processing arbitrarily large inputs and, potentially, simulating any algorithm. We establish that an existing large language model, Flan-U-PaLM 540B, can be combined with an associative read-write memory to exactly simulate the execution of a universal Turing machine, \(U_{15,2}\). A key aspect of the finding is that it does not require any modification of the language model weights. Instead, the construction relies solely on designing a form of stored instruction computer that can subsequently be programmed with a specific set of prompts.
Schuurmans, Dale. “Memory Augmented Large Language Models Are Computationally Universal.”
arXiv, January 9, 2023. https://doi.org/10.48550/arXiv.2301.04589.
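The separation the abstract describes, a fixed model that only sees bounded-length strings plus an external read-write memory driven by an outer loop, can be illustrated with a much smaller machine. The sketch below is not the paper's construction (which uses Flan-U-PaLM 540B and the universal machine \(U_{15,2}\)): `bounded_model` is a hypothetical lookup table standing in for a frozen language model, and the binary-increment task is chosen only because it is easy to check by hand.

```python
from collections import defaultdict

BLANK = "_"


def bounded_model(prompt):
    """Stand-in for a frozen model: a pure function of a short string.

    Input:  "state,symbol"
    Output: "symbol_to_write,head_move,next_state"
    """
    state, symbol = prompt.split(",")
    table = {
        ("carry", "1"): "0,L,carry",    # 1 + carry = 0, keep carrying left
        ("carry", "0"): "1,N,halt",     # 0 + carry = 1, done
        ("carry", BLANK): "1,N,halt",   # ran off the left edge: new digit
    }
    return table[(state, symbol)]


def run(bits, max_steps=1000):
    """Increment a binary string using the model plus an external tape."""
    tape = defaultdict(lambda: BLANK)   # unbounded associative memory
    for i, bit in enumerate(bits):
        tape[i] = bit
    head, state = len(bits) - 1, "carry"
    for _ in range(max_steps):
        if state == "halt":
            break
        # The model never sees the whole tape, only one cell and the state.
        write, move, state = bounded_model(f"{state},{tape[head]}").split(",")
        tape[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
    cells = sorted(tape)
    return "".join(tape[i] for i in range(cells[0], cells[-1] + 1))


print(run("1011"))
print(run("111"))
```

The model itself never changes and never receives more than a few characters; the ability to handle inputs of any length comes entirely from the loop and the tape.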
We will focus on Conversational Agents…
Andrej Karpathy, “State of GPT” | BRK216HFS, Microsoft Build, 2023.
“Prompt engineering is the process of designing and refining the prompts or input stimuli for a language model to generate specific types of output. Prompt engineering involves selecting appropriate keywords, providing context, and shaping the input in a way that encourages the model to produce the desired response and is a vital technique to actively shape the behavior and output of foundation models.”1
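As a minimal sketch of "providing context and shaping the input", the helper below assembles an instruction, optional context, and a few worked examples into one prompt string. The function name, field labels, and sample text are all assumptions made for illustration; they do not come from any particular model's documentation.

```python
def build_prompt(task, context, examples, query):
    """Assemble a few-shot prompt from its parts (hypothetical layout)."""
    lines = [f"Instruction: {task}"]
    if context:
        lines.append(f"Context: {context}")
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    # End with an open answer slot for the model to complete.
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)


prompt = build_prompt(
    task="Answer in one short sentence.",
    context="The audience is new users of a computing cluster.",
    examples=[("What is a job scheduler?",
               "It decides when and where submitted jobs run.")],
    query="What is a compute node?",
)
print(prompt)
```

Keeping the parts separate makes it easy to vary one of them (for example, the examples) while holding the rest fixed when refining a prompt.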
Ouyang, Long, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, et al.
“Training Language Models to Follow Instructions with Human Feedback.”
arXiv, March 4, 2022. https://doi.org/10.48550/arXiv.2203.02155.
Henry Zeng, Lauryn Gayhardt, Jill Grant, “What is Azure Machine Learning prompt flow
(preview) - Azure Machine Learning,” Jul. 02, 2023.
http://tiny.cc/kelavz (accessed Sep. 04, 2023).
“NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. Guardrails (or “rails” for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more.”1
Building Trustworthy, Safe, and Secure LLM Conversational Systems: The core value of using NeMo Guardrails is the ability to write rails to guide conversations. You can choose to define the behavior of your LLM-powered application on specific topics and prevent it from engaging in discussions on unwanted topics.
Connect models, chains, services, and more via actions: NeMo Guardrails provides the ability to connect an LLM to other services (a.k.a. tools) seamlessly and securely.
“NeMo Guardrails.” NVIDIA Corporation, Sep. 05, 2023. Accessed: Sep. 05, 2023. [Online].
Available: https://github.com/NVIDIA/NeMo-Guardrails
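The idea of a rail, a check that sits between the user and the model and can override the reply, can be shown without any toolkit. The sketch below is deliberately library-free and does not use the NeMo Guardrails API; NeMo Guardrails defines rails in its own configuration files. Every name here (`BLOCKED_TOPICS`, `guarded_reply`, `echo_model`) is hypothetical, and keyword matching is a crude stand-in for real topic detection.

```python
# Hypothetical topic list: topic name -> keywords that signal it.
BLOCKED_TOPICS = {
    "politics": ("election", "senator", "political party"),
}
REFUSAL = "I'm sorry, I can't discuss that topic."


def echo_model(message):
    """Stand-in for an LLM call."""
    return f"You said: {message}"


def matched_topic(text):
    """Return the first blocked topic mentioned in `text`, or None."""
    lowered = text.lower()
    for topic, keywords in BLOCKED_TOPICS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return None


def guarded_reply(message, model=echo_model):
    """Apply an input rail, call the model, then apply an output rail."""
    if matched_topic(message) is not None:   # input rail
        return REFUSAL
    reply = model(message)
    if matched_topic(reply) is not None:     # output rail
        return REFUSAL
    return reply


print(guarded_reply("How do I submit a batch job?"))
print(guarded_reply("Who will win the election?"))
```

Checking both the user message and the model's reply matters because a harmless-looking question can still lead the model onto an unwanted topic.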
Kojima, Takeshi, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa.
“Large Language Models Are Zero-Shot Reasoners.” arXiv, January 29, 2023.
Besta, Maciej, Nils Blach, Ales Kubicek, Robert Gerstenberger, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, et al.
“Graph of Thoughts: Solving Elaborate Problems with Large Language Models.” arXiv, August 21, 2023.
“Prompt Engineering Guide – Nextra.”
https://www.promptingguide.ai/ (accessed Sep. 04, 2023).
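The zero-shot reasoning approach of Kojima et al., cited above, prompts in two stages: first elicit reasoning with a trigger phrase such as "Let's think step by step.", then ask for the final answer given that reasoning. A minimal sketch follows. The `complete` argument is an assumed text-completion callable, `fake_complete` is a canned stand-in so the example runs offline, and the exact wording of the answer trigger is illustrative.

```python
REASONING_TRIGGER = "Let's think step by step."
ANSWER_TRIGGER = "Therefore, the answer is"


def reasoning_prompt(question):
    """Stage 1: ask the model to reason before answering."""
    return f"Q: {question}\nA: {REASONING_TRIGGER}"


def answer_prompt(question, reasoning):
    """Stage 2: append the reasoning and ask for the final answer."""
    return f"{reasoning_prompt(question)} {reasoning}\n{ANSWER_TRIGGER}"


def zero_shot_cot(question, complete):
    """Run both stages with any completion function `complete(prompt)`."""
    reasoning = complete(reasoning_prompt(question))
    return complete(answer_prompt(question, reasoning)).strip()


def fake_complete(prompt):
    """Canned stand-in for a model call (hypothetical outputs)."""
    if prompt.endswith(ANSWER_TRIGGER):
        return " 4"
    return "There are 2 apples and 2 more are added, so 2 + 2 = 4."


print(zero_shot_cot("I have 2 apples and get 2 more. How many do I have?",
                    fake_complete))
```

Splitting reasoning and answer extraction into separate calls keeps the final answer short and easy to parse, at the cost of a second model call.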
“Foundation models are computationally expensive and trained on a large, unlabeled corpus. Fine-tuning a pre-trained foundation model is an affordable way to take advantage of their broad capabilities while customizing a model on your own small corpus. Fine-tuning is a customization method that involves further training and does change the weights of your model…”
“…There are two main approaches that you can take for fine-tuning depending on your use case and chosen foundation model. If you’re interested in fine-tuning your model on domain-specific data, see Domain adaptation fine-tuning. If you’re interested in instruction-based fine-tuning using prompt and response examples, see Instruction-based fine-tuning.”1
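Instruction-based fine-tuning starts from prompt and response pairs. The sketch below shows one way such pairs might be rendered into training text and written as JSON Lines. The field names (`prompt`, `response`, `text`) and the `### Instruction:` template are assumptions for illustration only, not the format required by SageMaker or by any specific model.

```python
import json

# Hypothetical template; real models each document their own format.
TEMPLATE = "### Instruction:\n{prompt}\n\n### Response:\n{response}"


def to_training_text(example, template=TEMPLATE):
    """Render one prompt/response pair into a single training string."""
    return template.format(
        prompt=example["prompt"].strip(),
        response=example["response"].strip(),
    )


def to_jsonl(examples):
    """Serialize examples as JSON Lines, one training record per line."""
    return "\n".join(
        json.dumps({"text": to_training_text(example)}) for example in examples
    )


# Invented sample data.
examples = [
    {"prompt": "Summarize: The job failed because the disk was full.",
     "response": "The job failed due to a full disk."},
    {"prompt": "Translate to French: Hello", "response": "Bonjour"},
]
print(to_jsonl(examples))
```

Domain adaptation fine-tuning, by contrast, would typically use raw domain text without the prompt and response structure.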
“Foundation models are usually trained offline, making the model agnostic to any data that is created after the model was trained. Additionally, foundation models are trained on very general domain corpora, making them less effective for domain-specific tasks. You can use Retrieval Augmented Generation (RAG) to retrieve data from outside a foundation model and augment your prompts by adding the relevant retrieved data in context. For more information about RAG model architectures”1
“Retrieval Augmented Generation (RAG) - Amazon SageMaker.” Accessed September 4, 2023.
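The retrieve-then-augment step described in the quote can be sketched with a toy retriever. Production systems usually rank documents with learned embeddings and a vector store; the bag-of-words cosine similarity below is a simple substitute chosen so the example runs with the standard library alone. The documents, function names, and prompt wording are all invented for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)),
                    reverse=True)
    return ranked[:k]


def augment(query, documents, k=1):
    """Build a prompt that places the retrieved text in context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (f"Use the context to answer.\nContext:\n{context}\n"
            f"Question: {query}\nAnswer:")


# Invented sample documents.
DOCS = [
    "The cluster maintenance window is every Tuesday at 6 am.",
    "Storage quotas are 500 GB per user.",
    "GPU nodes require a separate allocation request.",
]
print(augment("When is the maintenance window?", DOCS))
```

The model's weights are untouched; only the prompt changes, which is why this approach can surface data created after the model was trained.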
“Custom Retriever Combining KG Index and VectorStore Index”
S. Patil, “Gorilla: Large Language Model Connected with Massive APIs [Project Website].” Sep. 04, 2023.
Accessed: Sep. 04, 2023. [Online]. Available: https://github.com/ShishirPatil/gorilla
JSON-Grammar
root   ::= object
value  ::= object | array | string | number | ("true" | "false" | "null") ws

object ::=
  "{" ws (
    string ":" ws value
    ("," ws string ":" ws value)*
  )? "}" ws

array  ::=
  "[" ws (
    value
    ("," ws value)*
  )? "]" ws

string ::=
  "\"" (
    [^"\\] |
    "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
  )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= ([ \t\n] ws)?
“speculative : add grammar support by ggerganov · Pull Request #2991 · ggerganov/llama.cpp,”
GitHub. https://github.com/ggerganov/llama.cpp/pull/2991 (accessed Sep. 04, 2023).
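A grammar like the one above constrains generation by ruling out, at every step, any token that could not lead to a valid string. The sketch below illustrates that masking idea in a heavily simplified form: instead of a context-free grammar it uses a small finite set of allowed outputs, and instead of a model it uses a fixed score table. It is not how llama.cpp implements grammar sampling, and all names and values are hypothetical.

```python
def is_valid_prefix(text, language):
    """True if `text` can still be extended into an allowed string."""
    return any(candidate.startswith(text) for candidate in language)


def constrained_decode(scores_fn, vocab, language, max_steps=20):
    """Greedy decoding restricted to tokens that keep the output valid."""
    out = ""
    for _ in range(max_steps):
        if out in language:
            return out   # stop at the first complete allowed string
        scores = scores_fn(out)
        # Mask: keep only tokens whose addition is still a valid prefix.
        allowed = [t for t in vocab if is_valid_prefix(out + t, language)]
        if not allowed:
            raise ValueError(f"no valid continuation after {out!r}")
        out += max(allowed, key=lambda t: scores.get(t, float("-inf")))
    raise ValueError("no complete string within max_steps")


# Allowed outputs and a vocabulary containing off-grammar tokens.
LANGUAGE = {'{"ok":true}', '{"ok":false}'}
VOCAB = ["{", '"ok"', ":", "true", "false", "}", "maybe", "Sure! "]


def toy_scores(prefix):
    """Stand-in for model logits; it prefers chatty, invalid tokens."""
    return {"Sure! ": 5, "maybe": 4, "true": 3, "false": 2,
            "{": 1, '"ok"': 1, ":": 1, "}": 1}


print(constrained_decode(toy_scores, VOCAB, LANGUAGE))
```

Even though the score table favours "Sure! " and "maybe", those tokens are masked out at every step, so the output is always one of the allowed JSON strings.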
Andrej Karpathy, “State of GPT” | BRK216HFS, Microsoft Build, 2023.
Matt Bronstein and Rajko Radovanovic, “Supporting the Open Source AI Community,”
Andreessen Horowitz, Aug. 30, 2023.
http://tiny.cc/uflavz (accessed Sep. 03, 2023).