Machine Learning - Learning/Language Models
- ChatGPT gets code questions wrong 52% of the time (www.theregister.com)
But its suggestions are so annoyingly plausible.
- Med-PaLM (sites.research.google)
Med-PaLM is a large language model (LLM) from Google Research, adapted to the medical domain and designed to provide high-quality answers to medical questions.
Med-PaLM harnesses the power of Google’s large language models, which we have aligned to the medical domain and evaluated using medical exams, medical research, and consumer queries. Our first version of Med-PaLM, preprinted in late 2022 and published in Nature in July 2023, was the first AI system to surpass the pass mark on US Medical License Exam (USMLE) style questions. Med-PaLM also generates accurate, helpful long-form answers to consumer health questions, as judged by panels of physicians and users.
We introduced our latest model, Med-PaLM 2, at Google Health's annual health event The Check Up in March 2023. Med-PaLM 2 achieves an accuracy of 86.5% on USMLE-style questions, a 19% leap over our own state-of-the-art results from Med-PaLM. According to physicians, the model's long-form answers to consumer medical questions improved substantially. In the coming months, Med-PaLM 2 will also be made available to a select group of Google Cloud customers for limited testing, to explore use cases and share feedback, as we investigate safe, responsible, and meaningful ways to use this technology.
- NousResearch/Nous-Hermes-Llama2-13b · Hugging Face (huggingface.co)
Nous-Hermes-Llama2-13b is currently the highest ranked 13B LLaMA finetune on the Open LLM Leaderboard.
Model Description
Nous-Hermes-Llama2-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Emozilla leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors.
This Hermes model uses the exact same dataset as Hermes on Llama-1. This is to ensure consistency between the old Hermes and the new, for anyone who wants Hermes kept as similar to the old one as possible, just more capable.
This model stands out for its long responses, lower hallucination rate, and absence of OpenAI censorship mechanisms. The fine-tuning process was performed with a 4096 sequence length on an 8x A100 80GB DGX machine.
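For reference, a minimal sketch of loading the model with Hugging Face transformers. The Alpaca-style prompt shown is an assumption based on common Hermes usage, not quoted from the model card, and the generation settings are illustrative.

```python
# Minimal sketch: load Nous-Hermes-Llama2-13b with Hugging Face transformers.
# The Alpaca-style prompt below is an assumption, not quoted from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Nous-Hermes-Llama2-13b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "### Instruction:\nExplain what a LoRA adapter is in two sentences.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```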
Announcements
- https://twitter.com/NousResearch/status/1682458324804009987
- https://twitter.com/Teknium1/status/1682459395853279232
- InvokeAI 3.0 released
cross-posted from: https://lemmy.world/post/1954892
> It's looking really good! Major features include ControlNet, support for SDXL, and a whole bunch of other cool things.
>
> Download: https://github.com/invoke-ai/InvokeAI/releases/tag/v3.0.0
- georgesung/llama2_7b_chat_uncensored · Hugging Face (huggingface.co)
- Llama 2 - Meta AI (ai.meta.com)
Llama 2 — The next generation of our open source large language model, available for free for research and commercial use.
- Microsoft LongNet: One BILLION Tokens LLM — David Shapiro ~ AI (06.07.2023)
cross-posted from: https://lemmy.fmhy.ml/post/649641
> We could have AI models in a couple years that hold the entire internet in their context window.
- GitHub - wgryc/phasellm: Large language model evaluation and workflow framework from Phase AI. (github.com)
Docs: https://phasellm.com/docs/phasellm/eval.html
This project provides a unified framework to test generative language models on a large number of different evaluation tasks.
Features:
- 200+ tasks implemented. See the task-table for a complete list.
- Support for models loaded via transformers (including quantization via AutoGPTQ), GPT-NeoX, and Megatron-DeepSpeed, with a flexible tokenization-agnostic interface.
- Support for commercial APIs including OpenAI, goose.ai, and TextSynth.
- Support for evaluation on adapters (e.g. LoRA) supported in HuggingFace's PEFT library.
- Evaluating with publicly available prompts ensures reproducibility and comparability between papers.
- Task versioning to ensure reproducibility when tasks are updated.
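As a point of reference for the adapter-evaluation feature above, a minimal sketch of loading a LoRA adapter with HuggingFace's PEFT library. This is generic PEFT usage rather than code from this project's docs, and the model and adapter IDs are hypothetical placeholders.

```python
# Minimal sketch of loading a LoRA adapter via HuggingFace PEFT for evaluation.
# Model and adapter IDs are hypothetical placeholders, not from this project's docs.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "huggyllama/llama-7b"            # base model (placeholder)
adapter_id = "your-org/your-lora-adapter"  # LoRA adapter (placeholder)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Wrap the base model with the adapter weights; generation then runs
# through the combined (base + LoRA) forward pass.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```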
- NousResearch/Redmond-Hermes-Coder · Hugging Face (huggingface.co)
Model Description
Redmond-Hermes-Coder 15B is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors.
This model was trained with a WizardCoder base, which itself uses a StarCoder base model.
The model is truly great at code, but it does come with a tradeoff. While far better at code than the original Nous-Hermes built on Llama, it is worse than WizardCoder on pure code benchmarks like HumanEval.
It comes in at 39% on HumanEval, with WizardCoder at 57%. This is a preliminary experiment, and we are exploring improvements now.
However, it does seem better than WizardCoder at non-code tasks, including writing.
Model Training
The model was trained almost entirely on synthetic GPT-4 outputs. This includes data from diverse sources such as GPTeacher, the general, roleplay v1&2, code instruct datasets, Nous Instruct & PDACTL (unpublished), CodeAlpaca, Evol_Instruct Uncensored, GPT4-LLM, and Unnatural Instructions.
Additional data inputs came from Camel-AI's Biology/Physics/Chemistry and Math Datasets, Airoboros' (v1) GPT-4 Dataset, and more from CodeAlpaca. The total volume of data encompassed over 300,000 instructions.
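For context on the HumanEval numbers above, a minimal sketch of how such a score is typically produced with OpenAI's human-eval harness (pip install human-eval). The generation settings and prompt handling are illustrative assumptions, not the exact setup Nous Research used.

```python
# Minimal sketch of a HumanEval run with OpenAI's human-eval harness.
# Generation settings are illustrative assumptions, not the exact
# configuration behind the scores quoted above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from human_eval.data import read_problems, write_jsonl

model_id = "NousResearch/Redmond-Hermes-Coder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

samples = []
for task_id, problem in read_problems().items():
    inputs = tokenizer(problem["prompt"], return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.2)
    # Keep only the generated continuation, not the echoed prompt.
    completion = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    samples.append({"task_id": task_id, "completion": completion})

write_jsonl("samples.jsonl", samples)
# Then score pass@1 with: evaluate_functional_correctness samples.jsonl
```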
- OpenChat_8192 - The first model to beat 100% of ChatGPT-3.5
@Yampeleg: The first model to beat 100% of ChatGPT-3.5, available on Hugging Face.
🔥 OpenChat_8192
🔥 105.7% of ChatGPT (Vicuna GPT-4 Benchmark)
Less than a month ago, the world watched as ORCA [1] became the first model ever to outpace ChatGPT on Vicuna's benchmark.
Today, the race to replicate these results open-source comes to an end.
Minutes ago OpenChat scored 105.7% of ChatGPT.
But wait! There is more!
Not only did OpenChat beat Vicuna's benchmark, it did so while pulling off a LIMA [2] move!
Training was done using 6K GPT-4 conversations out of the ~90K ShareGPT conversations.
The model comes in three versions: the basic OpenChat model, OpenChat-8192 and OpenCoderPlus (Code generation: 102.5% ChatGPT)
This is a significant achievement considering that it's the first (released) open-source model to surpass the Vicuna benchmark. 🎉🎉
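To make the LIMA-style data curation concrete, here is a hypothetical sketch of the kind of filtering involved: keeping only the GPT-4 conversations out of a ShareGPT dump. The field names are assumptions about the dump's layout, not taken from the OpenChat repo.

```python
# Hypothetical sketch of LIMA-style curation: keep only the GPT-4 subset of a
# ShareGPT dump (~90K conversations -> ~6K). The "model" and "conversations"
# field names are assumptions about the dump's layout, not from the OpenChat repo.
import json

with open("sharegpt_dump.json") as f:
    conversations = json.load(f)

gpt4_subset = [
    conv for conv in conversations
    if conv.get("model") == "gpt-4" and conv.get("conversations")
]

print(f"kept {len(gpt4_subset)} of {len(conversations)} conversations")
with open("openchat_gpt4_subset.json", "w") as f:
    json.dump(gpt4_subset, f)
```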
- OpenChat: https://huggingface.co/openchat/openchat
- OpenChat_8192: https://huggingface.co/openchat/openchat_8192 (best chat)
- OpenCoderPlus: https://huggingface.co/openchat/opencoderplus (best coder)
- Dataset: https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset
- Code: https://github.com/imoneoi/openchat
Congratulations to the authors!!
---
[1] Orca: The first model to cross 100% of ChatGPT: https://arxiv.org/pdf/2306.02707.pdf
[2] LIMA: Less Is More for Alignment. TL;DR: Using a small number of VERY high quality samples (1,000 in the paper) can be as powerful as much larger datasets: https://arxiv.org/pdf/2305.11206
- Model Catalog
https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYpb63e1ZR3aePczz3zlbJW-Y4/edit?usp=sharing
- GitHub - Stability-AI/stablediffusion: High-Resolution Image Synthesis with Latent Diffusion Models (github.com)
- ChatGLM models on Hugging Face
- https://huggingface.co/THUDM/chatglm2-6b
- https://huggingface.co/THUDM/chatglm-6b
- yifever/sleeper-agent · Hugging Face (huggingface.co)
- DragGan - Interactive Point-based Manipulation on the Generative Image Manifold - a Hugging Face Space by radames (huggingface.co)
- TheBloke/orca_mini_13B-GPTQ · Hugging Face (huggingface.co)
https://huggingface.co/TheBloke/orca_mini_13B-GGML
https://huggingface.co/lmsys/vicuna-33b-v1.3
- 101 fundamentals for aspiring model makers
https://twitter.com/FrnkNlsn/status/1520585408215924736
https://www.researchgate.net/publication/327304999_An_Elementary_Introduction_to_Information_Geometry
https://www.researchgate.net/publication/357097879_The_Many_Faces_of_Information_Geometry
https://franknielsen.github.io/IG/index.html
https://franknielsen.github.io/GSI/
https://www.youtube.com/watch?v=w6r_jsEBlgU
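Since these references center on information geometry, one standard object from that field for orientation: the Fisher information metric, which turns a parametric family of distributions into a Riemannian manifold. This is the textbook definition, not a formula quoted from the linked materials.

```latex
% Fisher information metric on a parametric family p(x;\theta) --
% textbook definition, not quoted from the linked references.
g_{ij}(\theta) = \mathbb{E}_{x \sim p(x;\theta)}\!\left[
    \frac{\partial \log p(x;\theta)}{\partial \theta^{i}}
    \frac{\partial \log p(x;\theta)}{\partial \theta^{j}}
\right]
```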
- Chatbot Arena Leaderboard Week 8: Introducing MT-Bench and Vicuna-33B | LMSYS Org
https://lmsys.org/blog/2023-06-22-leaderboard/
- TheBloke/mpt-30B-instruct-GGML · Hugging Face (huggingface.co)
- MPT-30B: Raising the bar for open-source foundation models (www.mosaicml.com)
Introducing MPT-30B, a new, more powerful member of our Foundation Series of open-source models, trained with an 8k context length on NVIDIA H100 Tensor Core GPUs.
https://huggingface.co/mosaicml
https://twitter.com/MosaicML/status/1671894543070035970
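A minimal sketch of loading MPT-30B-Instruct with transformers. MPT ships a custom architecture on the Hub, so trust_remote_code=True is required; the generation settings here are illustrative assumptions.

```python
# Minimal sketch of loading MPT-30B-Instruct. MPT uses custom model code
# hosted on the Hub, hence trust_remote_code=True. Settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mosaicml/mpt-30b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # required for MPT's custom architecture
    device_map="auto",
)

inputs = tokenizer(
    "Summarize the MPT-30B release in one sentence.", return_tensors="pt"
).to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```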
- Stability AI launches SDXL 0.9: A Leap Forward in AI Image Generation (stability.ai)
Discover SDXL 0.9, Stability AI's cutting-edge release in the Stable Diffusion suite. Unleashing remarkable image and composition precision, this upgrade revolutionizes generative AI imagery. From hyper-realistic media production to design and industrial advancements, explore the limitless possibilities.
- artificialguybr/Liberte at main (huggingface.co)
- Commercial use allowed! openlm-research/open_llama_13b · Hugging Face (huggingface.co)
These really are like Pokémon.
- conceptofmind/Hermes-Open-Llama-7b-8k · Hugging Face (huggingface.co)
- Robin V2 Launches: Achieves Unparalleled Performance on OpenLLM! (medium.com)
A stunning arrival! The fully upgraded Robin Series V2 language model is ready and eagerly awaiting your exploration.
- A model that can create synthetic speech that matches a speaker's lip movements (techxplore.com)
Machine learning models can help solve several real-world problems faster and more efficiently. One of these problems is synthesizing speech for both animated characters and human speakers based on the movements of their lips.
- WizardLM's WizardCoder 15B 1.0 GPTQ and GGML
https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GGML
https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GPTQ
https://huggingface.co/WizardLM
- Leaked photo shows new ChatGPT features: collaborative spaces, ability to remember your info and a feature to upload files!
https://twitter.com/aakashg0/status/1668860064772521985?s=46&t=RTJOTWgrpXDFlCqmE07zMw
- Landmark Attention Oobabooga Support + GPTQ Quantized Models!
Models: https://huggingface.co/TheBloke/WizardLM-7B-Landmark
https://huggingface.co/TheBloke/Minotaur-13B-Landmark
Repo: https://github.com/eugenepentland/landmark-attention-qlora
Notes on using the models:
- Trust-remote-code must be enabled for the attention model to work correctly.
- Add bos_token must be disabled in the Parameters tab.
- Truncate the prompt must be increased to allow for a larger context. The slider goes up to a max of 8192, but the models can handle larger contexts as long as you have enough memory. If you want to go higher, open text-generation-webui/modules/shared.py and increase truncation_length_max to whatever you want it to be (see the sketch below).
- You may need to set the repetition_penalty when asking questions about a long context to get the correct answer.
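A sketch of the shared.py tweak mentioned above. The exact layout of the settings dict changes between text-generation-webui versions, so treat the keys shown as assumptions to verify against your copy of the file.

```python
# In text-generation-webui/modules/shared.py: raise the upper bound of the
# "Truncate the prompt" slider. Key names are assumptions to verify against
# your version of the webui.
settings = {
    # ... other settings unchanged ...
    "truncation_length": 8192,        # default slider position
    "truncation_length_max": 32768,   # raise the slider's maximum
}
```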
Performance Notes:
Inference over a long context is slow. On the Quadro RTX 8000 I'm testing with, it takes about a minute to get an answer for a 10k context. Work is underway to improve this.
Remember that the model is only as good as its base model on complex queries. If you don't get the answer you are looking for, it's worth testing whether the base model could answer the question within its 2k context.
- An absolutely stunning example of video created with runway.ml
https://twitter.com/_akhaliq/status/1668330282485948417?s=20
- OpenAI API response time tracker after today's update
https://twitter.com/stanmarion/status/1667085002536828929?s=20
- If Breaking Bad was in France, a Midjourney experiment
https://twitter.com/_akhaliq/status/1668667025823084576?s=20