AI Models : a Primer – Augment Your Experience

Last updated: July 2, 2024

AI is as hot right now as the web was in the Dot com boom in the 90s, when buzzwords such as e-commerce and SEO were hot off the press.. Keeping up with advancements of AI right now is like drinking from the proverbial firehose.

Over the past decade we’ve seen some major trends in tech, and the job market has reflected these trends. As storage has increased and costs have come down, accompanied by advancements in processing power, we’ve unlocked the ability to work with massive amounts of data. First we had Big Data, then Machine Learning and now we’re hot for Artificial Intelligence. GPTs (Generative pre-trained transformers) are neural networks and a type of large language models (LLMs) leveraging natural language processing to power Generative Artificial Intelligence (GenAI). They are trained on large data sets and designed to evaluate input and generate output (text, images, audio, video, 3d, etc) from prompts and a variety of other inputs. GenAI, while controversial in some areas, is a transformative productivity tool, unlocking the ability to prototype and create content in lightning speed. These AI tools can be used to build chatbots, summarize long-form text, provide academic support, generate social media content, stylize content, data analysis, image analysis, image creation, animating images, creating memes, generate quizzes, coding assistance /websites, blog posts, travel plans and more. While GenAI can outperform human speed capabilities, it still requires human intervention from crafting the perfect prompt to final edits. It’s not perfect. I quote a speaker from AWE XR ’24 “AI is not generative, it is derivative”. It still requires human creativity, human input, and most importantly, human context. It is great for prototyping and quickly analyzing information.

GenAI Overview

Whether you are looking for a model or an application that leverages LLMs, it helps to understand the various tools available in the AI Tech Stack. Here’s a good source for finding some tools for your use case: TopAI Tools . I also hopped over to eWeek for news and updates and found a ranking of the top 150 AI tools.

Play with AI models

You can access most of the models below natively from their organization or in APIs, or you can use a number of cloud solutions providers such as AWS Bedrock, Google Cloud, Microsoft, etc.

Kaggle for over 350k public data sets and over 1M public notebooks.
Hugging Face

Open AI

AI/Model	Features/Costs	NOTES
Chat GPT	New in ChatGPT GPT-4o Personal or enterprise Desktop and mobile apps can hear, see and speak Pricing: $19.99 / month Free tier (3.5)	You can now show ChatGPT one or more images. Troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data. To focus on a specific part of the image, you can use the drawing tool in the mobile app.
DALL-E 3	image from text
Sora	video from text

Anthropic

AI/Model	Features/Costs	NOTES
Claude	New in Claude * Convert UI design to front end code * Extract text from Images * Transcribe handwritten notes API access • API integration; independently interact with APIs and the web; Developers can setup a toolbox. Pricing depends on how much text you work with, measured in tokens (1,000 tokens are about 750 words)	Check out the GPT Store Update: 6/8 https://www.youtube.com/watch?v=oGFh62u-5dY • for general use; natural language processing, Claude figures out what it needs from the web, tells the API what it needs Accessible through: • Anthropic Messages API • Amazon Bedrock • Google Vertex AI
Claude 3.5 Sonnet	balance skill and speed; efficient; AI interact with people $3.00 for every million tokens
Claude Opus	most advanced; complex tasks; deep thinking $15.00 for every million tokens
Claude Haiku	fastest and compact model; rapid response, efficient resource utilization $0.25 for every million tokens

Meta

AI/Model	Features/Costs	Notes
Llama 3 > Meta AI Assistant	Model details Card Text Input/output	Available through Hugging Face or Kaggle

Google

AI/Model	Features/Costs
Gemini > Chat with Gemini

Mistral

AI/Model	Features/Costs
Mistral 8x22B chat console	Open Source Mixtral 8x22B is currently the most performant open model. A 22B sparse Mixture-of-Experts (SMoE). Uses only 39B active parameters out of 141B. Fluent in English, French, Italian, German, Spanish, and strong in code 64k context window Native function calling capacities Function calling and json mode available on our API endpoint $2 /1M tokens $6 /1M tokens
Mistral 8x7B	A 7B sparse Mixture-of-Experts (SMoE). Uses 12.9B active parameters out of 45B total. Fluent in English, French, Italian, German, Spanish, and strong in code 32k context window input: $0.7 /1M tokens output: $0.7 /1M tokens
Mistral 7B	A 7B transformer model, fast-deployed and easily customisable. Small, yet very powerful for a variety of use cases. Performant in English and code 32k context window input: $0.25 /1M tokens output: $0.25 /1M tokens
mistral-small-2402 codestral-2405 mistral-large-2402	Mistral also has several optimized models. See their pricing page for details

Multimedia

Organization	AI/Model	Features/Costs	Notes
Stability.ai	Stable Diffusion 3 Stable Assistant, Stable Artisan Stable Audio	Image Generation from text Video Generation from text Music Generation from text of audio samples 3d Models Multilingual Language Models Pricing is per credit. Credits are priced at $10 per 1,000 credits, which is enough credits for roughly 5,000 SDXL 1.0 images. First 25 credits are free.	Self-hosted, developer platform, cloud hosted AWS, Google Cloud, NVidia, Intel Developer Cloud
Spline		3D object and app generation. iOs, iPad, Mac, Apple Vision Pro

Cohere https://cohere.com/

3D from Text

https://spline.design/ai-generate

3DFY ai

NVideo get3d and 3d tools

Midjourney

Computer Vision

Organization	Model	Features/Costs	Notes

Synthesia		Computer Vision – Security, Identity verification, AR/VR/XR, Virtual Try on, Driver Monitoring, pedestrian detection

Image/Video

Organization	Model	Features/Costs
Adobe	Adobe Creative Suite	F
Luma	Luma Dream Machine
Morph Studio
Runway	Gen 3 Alpha, Gen 2, Gen 1 video to video	text to image, image to image, frame interpolation, upscale image, video to video
Topaz		photo and video editing

Audio

Organization	Model	Features/Costs	Notes

Suno
Udio

Avatars

Generative AI video platform with AI avatars, text to video

Animation

Organization	Model	Features/Costs	Notes
Kaiber

Marketing & Creator Tools

FBRC.ai – storybuildlng

Fabric.space

https://www.snackshop.app – TikTok for Graphic Novels

Recent Posts