AI Models : a Primer

Last updated: July 2, 2024

AI is as hot right now as the web was in the Dot com boom in the 90s, when buzzwords such as e-commerce and SEO were hot off the press.. Keeping up with advancements of AI right now is like drinking from the proverbial firehose.

Over the past decade we’ve seen some major trends in tech, and the job market has reflected these trends. As storage has increased and costs have come down, accompanied by advancements in processing power, we’ve unlocked the ability to work with massive amounts of data. First we had Big Data, then Machine Learning and now we’re hot for Artificial Intelligence. GPTs (Generative pre-trained transformers) are neural networks and a type of large language models (LLMs) leveraging natural language processing to power Generative Artificial Intelligence (GenAI). They are trained on large data sets and designed to evaluate input and generate output (text, images, audio, video, 3d, etc) from prompts and a variety of other inputs. GenAI, while controversial in some areas, is a transformative productivity tool, unlocking the ability to prototype and create content in lightning speed. These AI tools can be used to build chatbots, summarize long-form text, provide academic support, generate social media content, stylize content, data analysis, image analysis, image creation, animating images, creating memes, generate quizzes, coding assistance /websites, blog posts, travel plans and more. While GenAI can outperform human speed capabilities, it still requires human intervention from crafting the perfect prompt to final edits. It’s not perfect. I quote a speaker from AWE XR ’24 “AI is not generative, it is derivative”. It still requires human creativity, human input, and most importantly, human context. It is great for prototyping and quickly analyzing information.

GenAI Overview

Whether you are looking for a model or an application that leverages LLMs, it helps to understand the various tools available in the AI Tech Stack. Here’s a good source for finding some tools for your use case: TopAI Tools . I also hopped over to eWeek for news and updates and found a ranking of the top 150 AI tools.

Play with AI models

You can access most of the models below natively from their organization or in APIs, or you can use a number of cloud solutions providers such as AWS Bedrock, Google Cloud, Microsoft, etc.

Open AI

AI/ModelFeatures/CostsNOTES
Chat GPT
New in ChatGPT
GPT-4o
Personal or enterprise

Desktop and mobile apps
can hear, see and speak

Pricing: $19.99 / month
Free tier (3.5)
You can now show ChatGPT one or more images.
Troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data.
To focus on a specific part of the image, you can use the drawing tool in the mobile app.
DALL-E 3image from text
Soravideo from text

Anthropic

Meta

AI/ModelFeatures/CostsNotes
Llama 3 >
Meta AI Assistant
Model details Card
Text Input/output

Available through Hugging Face or Kaggle

Google

AI/ModelFeatures/Costs
Gemini > Chat with Gemini

Mistral

AI/ModelFeatures/Costs
Mistral 8x22B
chat
console
Open Source
Mixtral 8x22B is currently the most performant open model. A 22B sparse Mixture-of-Experts (SMoE). Uses only 39B active parameters out of 141B.
Fluent in English, French, Italian, German, Spanish, and strong in code
64k context window
Native function calling capacities
Function calling and json mode available on our API endpoint
$2 /1M tokens
$6 /1M tokens
Mistral 8x7BA 7B sparse Mixture-of-Experts (SMoE). Uses 12.9B active parameters out of 45B total.
Fluent in English, French, Italian, German, Spanish, and strong in code
32k context window
input: $0.7 /1M tokens
output: $0.7 /1M tokens
Mistral 7BA 7B transformer model, fast-deployed and easily customisable. Small, yet very powerful for a variety of use cases.
Performant in English and code
32k context window
input: $0.25 /1M tokens
output: $0.25 /1M tokens
mistral-small-2402
codestral-2405
mistral-large-2402
Mistral also has several optimized models. See their pricing page for details

Multimedia

OrganizationAI/ModelFeatures/CostsNotes
Stability.ai



Stable Diffusion 3
Stable Assistant, Stable Artisan
Stable Audio
Image Generation from text
Video Generation from text
Music Generation from text of audio samples
3d Models
Multilingual Language Models

Pricing is per credit. Credits are priced at $10 per 1,000 credits, which is enough credits for roughly 5,000 SDXL 1.0 images. First 25 credits are free.
Self-hosted, developer platform, cloud hosted
AWS, Google Cloud, NVidia, Intel Developer Cloud
Spline3D object and app generation. iOs, iPad, Mac, Apple Vision Pro

Cohere https://cohere.com/

3D from Text

https://spline.design/ai-generate

3DFY ai

NVideo get3d and 3d tools

Midjourney

Computer Vision

OrganizationModelFeatures/CostsNotes
SynthesiaComputer Vision – Security, Identity verification, AR/VR/XR, Virtual Try on, Driver Monitoring, pedestrian detection

Image/Video

OrganizationModelFeatures/CostsNotes
AdobeAdobe Creative SuiteF
LumaLuma Dream Machine
Morph Studio
RunwayGen 3 Alpha, Gen 2, Gen 1 video to videotext to image, image to image, frame interpolation, upscale image, video to video
Topaz photo and video editing

Audio

OrganizationModelFeatures/CostsNotes

Suno
Udio

Avatars

Generative AI video platform with AI avatars, text to video

Animation

OrganizationModelFeatures/CostsNotes
Kaiber


Marketing & Creator Tools

FBRC.ai – storybuildlng

Fabric.space

https://www.snackshop.app – TikTok for Graphic Novels