DEFINITIONS

Term Description
GPT: ChatGPT stands for Chat Generative Pre-Trained Transformer and was developed by the AI research company OpenAI. It is an artificial intelligence (AI) chatbot technology that can process natural human language and generate a response. GPT is central to ChatGPT: it generates text based on the datasets it was trained on and produces outputs to users' questions in a human-like form. An initial training process on a large text corpus enabled GPT to learn to predict the next word in a passage. GPT is a Transformer-based architecture and training procedure for natural language processing tasks. Source - mylearning.org
LLaMA: LLaMA (Large Language Model Meta AI) is a large language model (LLM) released by Meta AI in February 2023. A variety of model sizes were trained, ranging from 7 billion to 65 billion parameters. LLaMA's developers reported that the 13-billion-parameter model's performance on most NLP benchmarks exceeded that of the much larger GPT-3 (with 175 billion parameters) and that the largest model was competitive with state-of-the-art models such as PaLM and Chinchilla. Whereas the most powerful LLMs have generally been accessible only through limited APIs (if at all), Meta released LLaMA's model weights to the research community under a noncommercial license. LLaMA uses the transformer architecture, the standard architecture for language modelling since 2018. LLaMA's developers focused their effort on scaling the model's performance by increasing the volume of training data rather than the number of parameters, reasoning that the dominating cost for LLMs comes from doing inference on the trained model rather than from the computational cost of the training process. LLaMA was trained on 1.4 trillion tokens drawn from publicly available data sources. Source - Wikipedia
LLM: A large language model (LLM) is a language model consisting of a neural network with many parameters (typically billions of weights or more), trained on large quantities of unlabeled text using self-supervised learning or semi-supervised learning. LLMs emerged around 2018 and perform well at a wide variety of tasks. This has shifted the focus of natural language processing research away from the previous paradigm of training specialized supervised models for specific tasks. Though the term large language model has no formal definition, it often refers to deep learning models having a parameter count on the order of billions or more. LLMs are general-purpose models which excel at a wide range of tasks, as opposed to being trained for one specific task (such as sentiment analysis, named entity recognition, or mathematical reasoning). The skill with which they accomplish tasks, and the range of tasks at which they are capable, seems to be a function of the amount of resources (data, parameter size, computing power) devoted to them, in a way that is not dependent on additional breakthroughs in design. Though trained on simple tasks along the lines of predicting the next word in a sentence, neural language models with sufficient training and parameter counts are found to capture much of the syntax and semantics of human language. In addition, large language models demonstrate considerable general knowledge about the world, and are able to "memorize" a great quantity of facts during training. A short next-word / text-generation sketch is given after this table. Source - Wikipedia
Sentiment Analysis: Sentiment analysis (also known as opinion mining or emotion AI) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice-of-the-customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. A minimal sentiment-analysis sketch follows the table. Source - Wikipedia
Transformer: A transformer is a deep learning model. It is distinguished by its adoption of self-attention, differentially weighting the significance of each part of the input data (which includes the recursive output). It is used primarily in the fields of natural language processing (NLP) and computer vision (CV). Like recurrent neural networks (RNNs), transformers are designed to process sequential input data, such as natural language, with applications towards tasks such as translation and text summarization. However, unlike RNNs, transformers process the entire input all at once. The attention mechanism provides context for any position in the input sequence. For example, if the input data is a natural language sentence, the transformer does not have to process one word at a time. This allows for more parallelization than RNNs and therefore reduces training times. Transformers were introduced in 2017 by a team at Google Brain and are increasingly becoming the model of choice for NLP problems, replacing RNN models such as long short-term memory (LSTM). Compared to RNN models, transformers are more amenable to parallelization, allowing training on larger datasets. This led to the development of pretrained systems such as BERT (Bidirectional Encoder Representations from Transformers) and the original GPT (generative pre-trained transformer). A small self-attention sketch appears after this table. Source - Wikipedia
NER: Named-entity recognition (NER) (also known as (named) entity identification, entity chunking, and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc. Most research on NER/NEE systems has been structured as taking an unannotated block of text, such as this one: "Jim bought 300 shares of Acme Corp. in 2006.", and producing an annotated block of text that highlights the names of entities: "[Jim]Person bought 300 shares of [Acme Corp.]Organization in [2006]Time." In this example, a person name consisting of one token, a two-token company name, and a temporal expression have been detected and classified. State-of-the-art NER systems for English produce near-human performance. A short NER sketch using this example sentence appears after the table. Source - Wikipedia
RNN: A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes can create a cycle, allowing output from some nodes to affect subsequent input to the same nodes. This allows it to exhibit temporal dynamic behavior. Derived from feedforward neural networks, RNNs can use their internal state (memory) to process variable-length sequences of inputs. This makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. Recurrent neural networks are theoretically Turing complete and can run arbitrary programs to process arbitrary sequences of inputs. A minimal recurrent-cell sketch appears after the table. Source - Wikipedia
Tokenization: Tokenization is a fundamental pre-processing step for most natural language processing (NLP) applications. It involves splitting text into smaller units called tokens (e.g., words or word segments) in order to turn an unstructured input string into a sequence of discrete elements that is suitable for a machine learning (ML) model. In deep learning–based models (e.g., BERT), each token is mapped to an embedding vector to be fed into the model. A short tokenization sketch follows the table. Source - ai.googleblog.com
Tokenizer: The GPT family of models process text using tokens, which are common sequences of characters found in text. The models understand the statistical relationships between these tokens, and excel at producing the next token in a sequence of tokens. A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words). It is much easier for the program to handle numbers rather than text strings, which is why text is converted into tokens instead. A short token-counting sketch follows the table.
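
The following short code sketches illustrate several of the terms defined above. They are hedged, minimal examples rather than authoritative implementations; each one names the libraries and models it assumes, none of which are prescribed by the definitions themselves.

Next-word prediction (GPT, LLM): a minimal text-generation sketch, assuming the Hugging Face transformers library and its freely downloadable GPT-2 model (both are assumptions made for illustration only).

```python
# Minimal next-word / text-generation sketch (assumes: pip install transformers torch).
# GPT-2 is used here only because it is small and freely downloadable.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "A large language model is trained to predict the next"
result = generator(prompt, max_new_tokens=10, num_return_sequences=1)
print(result[0]["generated_text"])  # the prompt plus the model's continuation
```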
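
Sentiment analysis: a minimal sketch, assuming the Hugging Face transformers library and its default English sentiment model; the definition above does not prescribe any particular tool.

```python
# Minimal sentiment-analysis sketch (assumes: pip install transformers torch).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default English model on first use

reviews = [
    "The support team resolved my issue quickly, great service!",
    "The product arrived broken and nobody answered my emails.",
]
for review in reviews:
    result = classifier(review)[0]  # e.g. {'label': 'POSITIVE', 'score': 0.99}
    print(result["label"], round(result["score"], 2), review)
```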
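
Transformer self-attention: a small NumPy sketch of scaled dot-product self-attention, the mechanism that lets every position in the input attend to every other position at once. The weights and dimensions are toy values chosen for illustration.

```python
# Scaled dot-product self-attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

rng = np.random.default_rng(0)

def self_attention(x):
    d_model = x.shape[-1]
    # In a real Transformer, Q, K and V are learned linear projections of the input.
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(d_model)               # similarity of every position to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the sequence
    return weights @ V                                # each row is a context-weighted mix of all positions

tokens = rng.random((5, 16))          # a toy "sentence": 5 tokens, 16-dimensional embeddings
print(self_attention(tokens).shape)   # (5, 16) -- one contextualised vector per token
```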
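
NER: a short sketch of the annotated-text example above, assuming the spaCy library and its small English model en_core_web_sm (the library is an assumption; the entry itself is tool-agnostic). spaCy's label names (PERSON, ORG, DATE) differ slightly from the Person/Organization/Time labels used in the example.

```python
# Minimal NER sketch (assumes: pip install spacy, then: python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Jim bought 300 shares of Acme Corp. in 2006.")

for ent in doc.ents:
    print(f"[{ent.text}] -> {ent.label_}")
# Typical output: [Jim] -> PERSON, [300] -> CARDINAL, [Acme Corp.] -> ORG, [2006] -> DATE
```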
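
RNN: a tiny NumPy sketch of a single recurrent cell, showing how the hidden state (the internal memory mentioned above) is carried from one step of the input sequence to the next. The weights are random and untrained; the point is only the recurrence.

```python
# A single vanilla RNN cell unrolled over a sequence: h_t = tanh(W_x x_t + W_h h_{t-1} + b)
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 6

W_x = rng.normal(size=(hidden_size, input_size))
W_h = rng.normal(size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

xs = rng.normal(size=(seq_len, input_size))  # a toy input sequence of 6 steps
h = np.zeros(hidden_size)                    # initial hidden state (the "memory")

for t, x_t in enumerate(xs):
    h = np.tanh(W_x @ x_t + W_h @ h + b)     # new state depends on the current input AND the previous state
    print("step", t, np.round(h, 2))
```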
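
Tokenization: a short sketch, assuming the Hugging Face transformers library and the bert-base-uncased tokenizer (the specific tokenizer is an assumption), showing text being split into sub-word tokens and mapped to the integer ids that index embedding vectors.

```python
# Minimal tokenization sketch (assumes: pip install transformers).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Tokenization splits unstructured text into discrete tokens."
tokens = tokenizer.tokenize(text)               # sub-word tokens; rare words are split into '##'-marked pieces
ids = tokenizer.convert_tokens_to_ids(tokens)   # each id indexes an embedding vector inside the model

print(tokens)
print(ids)
```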
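
Tokenizer: a short sketch of the ~4-characters-per-token rule of thumb, assuming OpenAI's open-source tiktoken library and its cl100k_base encoding (both assumptions; any GPT-style tokenizer would illustrate the same point).

```python
# Token-counting sketch (assumes: pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")    # an encoding used by several GPT-family models

text = "Models do not read raw text; they read sequences of integer token ids."
token_ids = enc.encode(text)

print(token_ids)                                              # integers, not strings
print(len(text), "characters ->", len(token_ids), "tokens")   # roughly 4 characters per token for English
print(enc.decode(token_ids) == text)                          # True: the mapping is reversible
```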

Compiled on 06-08-2023 18:42:45